Differences
This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
ref_manual:db_layer [2009/05/04 22:59] janakj created |
ref_manual:db_layer [2009/05/05 01:33] (current) janakj |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Database Abstraction Layer ====== | ====== Database Abstraction Layer ====== | ||
+ | // | ||
+ | |||
+ | The database subsystem in SER 2.1 was significantly changed from previous | ||
+ | versions. A few major reasons for extensive overhaul of the database subsystem | ||
+ | include: | ||
+ | |||
+ | * Support for multiple databases at the same time (even within a single SER module). | ||
+ | * Support for high-available database setups, in other words the possibility to use multiple database instances at the same time. | ||
+ | * Support for database load balancing and fail-scenarios without the need to re-implement it in every database driver. | ||
+ | * Support for pre-compiled database commands/ | ||
+ | * Support for per-driver specific options. | ||
+ | * Better re-use of code commonly shared among database drivers, such as connection pooling and database URI parsing. | ||
+ | |||
+ | ===== Design Goals ===== | ||
+ | * Try to avoid using SQL queries assembled at runtime. If the database driver supports pre-compiled SQL queries, that is queries that are assembled and planned once and executed many times, try to use this and assemble queries in mod_init or child_init (during the SER initialization phase). | ||
+ | * The database abstraction layer should be designed in a way which would make integration with non-SQL based database systems easy and straightforward. | ||
+ | * Support for command executed on multiple database servers at the same time with various load-balancing and high-availability strategies. | ||
+ | |||
+ | ===== Naming Conventions ===== | ||
+ | Names of all core data structures in the database API start with `db_` | ||
+ | prefix. For example `db_con` is a core db structure that represents a database | ||
+ | connection, `db_uri` is a structure representing database URI, and so on. The | ||
+ | `db_` prefix is reserved for the data structures in the SER core only and | ||
+ | should not be used in data structures defined in database drivers to avoid | ||
+ | naming conflicts. | ||
+ | |||
+ | Driver specific data structures, that is the data structures defined in SER | ||
+ | database drivers for various database servers, such as MySQL, PostgreSQL, | ||
+ | should have names starting with a two letter prefix that is unique for | ||
+ | the driver. For example all data structures defined in the mysql SER module | ||
+ | start with prefix " | ||
+ | is the mysql-specific equivalent of structure `db_con`. | ||
+ | |||
+ | ===== Extending Data Structures ===== | ||
+ | TBD | ||
+ | |||
+ | ===== Core Data Structures ===== | ||
+ | Core data structures is a set of data structures that are defined in the | ||
+ | database abstraction layer in SER core. They are defined in the header fields | ||
+ | under ser/db directory. | ||
+ | |||
+ | ==== '' | ||
+ | Database context is a data structure used to bind multiple database | ||
+ | connections and an algorithm which controls how database commands are executed | ||
+ | over those connections. Database contexts are used by database commands. Every | ||
+ | database command contains a reference to one context which is used to | ||
+ | communicate with database servers(s). | ||
+ | |||
+ | The purpose of the database context is to abstract database commands from | ||
+ | database connections. This allows to separate fail-over and high-availability | ||
+ | logic from database commands. | ||
+ | |||
+ | Every database context contains an array of connections and a textual | ||
+ | identifier that is used to identify it. | ||
+ | |||
+ | ==== '' | ||
+ | |||
+ | ==== '' | ||
+ | Physical database connections, | ||
+ | |||
+ | ==== '' | ||
+ | Physical database connections, | ||
+ | |||
+ | ==== '' | ||
+ | |||
+ | ==== '' | ||
+ | Database record, there may be multiple records per result | ||
+ | |||
+ | ==== '' | ||
+ | |||
+ | ==== '' | ||
+ | Pools of database connections shared across SER modules. | ||
+ | |||
+ | ==== '' | ||
+ | Database record, there may be multiple records per result. | ||
+ | |||
+ | ==== '' | ||
+ | Physical database connections, | ||
+ | |||
+ | ===== Writing a New Database Driver ===== | ||
+ | Database drivers are just ordinary SER modules. They typically do not provide | ||
+ | any functions which would be directly usable from the configuration | ||
+ | script. Instead they export an API which exports their functions and data | ||
+ | structures to the data abstraction layer in SER. | ||
+ | |||
+ | A database driver is basically a collection of data structures and functions | ||
+ | which in form of a SER module. Database drivers can be loaded into SER just | ||
+ | like any other SER modules. There is nothing special about them in this | ||
+ | respect. | ||
+ | |||
+ | The first thing you need to do if you want to create a new database driver for | ||
+ | SER is to create a new SER module. The name of the module should be the same | ||
+ | as the string identifying the database in database URI used in the | ||
+ | configuration script of SER. If you name your new module " | ||
+ | module will be used when the administrator uses a URI with scheme " | ||
+ | the configuration file. | ||
+ | |||
+ | As the next step you should try to compile your new (yet empty) database | ||
+ | driver. Make sure that you link it with the database client library that you | ||
+ | are going to use and that you include at least one header file from the | ||
+ | library. | ||
+ | |||
+ | If your driver compiles fine then you can define a minimalistic SER module | ||
+ | interface (without any functions and parameters). Here is one example of such | ||
+ | minimal SER module interface: | ||
+ | |||
+ | <code c> | ||
+ | #include " | ||
+ | |||
+ | MODULE_VERSION | ||
+ | |||
+ | static cmd_export_t cmds[]={ | ||
+ | {0,0,0,0,0} | ||
+ | }; | ||
+ | |||
+ | |||
+ | /* Exported parameters | ||
+ | static param_export_t params[] = { | ||
+ | {0, 0, 0} | ||
+ | }; | ||
+ | |||
+ | |||
+ | struct module_exports exports= { | ||
+ | " | ||
+ | cmds, | ||
+ | 0, /* RPC methods */ | ||
+ | params, | ||
+ | 0, /* module initialization function */ | ||
+ | (response_function) 0, | ||
+ | 0, | ||
+ | 0, | ||
+ | 0 /* per-child init function */ | ||
+ | }; | ||
+ | </ | ||
+ | |||
+ | After that you can compile your module again and then you can try to load the | ||
+ | module into SER. The module wouldn' | ||
+ | then your skeleton driver module is working and you can proceed to | ||
+ | implementation of various data structures and functions your driver must | ||
+ | contain in order to be useful. | ||
+ | |||
+ | ==== Extending Core Data Structures ==== | ||
+ | The database related core data structures only contain a minimal set of | ||
+ | variables and attributes. Typically it is the minimim amount of information | ||
+ | that every database driver will use. | ||
+ | |||
+ | If you take a look at db_uri as an example, the data structure that represents | ||
+ | databse URIs: | ||
+ | |||
+ | <code c> | ||
+ | typedef struct db_uri { | ||
+ | db_gen_t gen; /* Generic part of the structure */ | ||
+ | str scheme; | ||
+ | str body; /* Entire URI body */ | ||
+ | db_uri_cmp_t* cmp; /* Comparison function */ | ||
+ | } db_uri_t; | ||
+ | </ | ||
+ | |||
+ | You will see that except for some auxiliary variables the data structure | ||
+ | contains only two fields related to the URI, the scheme and the body. It does | ||
+ | not parse the body of the URI because it does not know anything about the | ||
+ | format, different database drivers may use different formats of the body. | ||
+ | |||
+ | For example, if the URI was %%'' | ||
+ | part would contain '' | ||
+ | '' | ||
+ | |||
+ | But the database driver, which would be mysql in this example, would most | ||
+ | likely need to parse the URI body even more and store various parts of it | ||
+ | separately in separate variables because this is who the corresponding | ||
+ | function of the mysqlclient library expects them. | ||
+ | |||
+ | And because there is no space for additional variables in '' | ||
+ | the driver has to create its own--driver specific--data structure and attach | ||
+ | is to the generic one. We say that the driver **extends** the generic db_uri | ||
+ | structure with **driver-specific payload**. | ||
+ | |||
+ | ==== Define Payload Structure ==== | ||
+ | In our example the mysql driver would define a structure called my_uri which | ||
+ | would contain some additional variables. | ||
+ | |||
+ | <code c> | ||
+ | struct my_uri { | ||
+ | db_drv_t drv; | ||
+ | char* username; | ||
+ | char* password; | ||
+ | char* host; | ||
+ | unsigned short port; | ||
+ | char* database; | ||
+ | }; | ||
+ | </ | ||
+ | |||
+ | And this data structure will the extend the generic structure. Note the first | ||
+ | variable in the data structure: | ||
+ | |||
+ | <code c> | ||
+ | db_drv_t drv; | ||
+ | </ | ||
+ | |||
+ | This is an auxiliary data structure which contains some data that has to be | ||
+ | present in every structure with extends a core structure. We will not go into | ||
+ | details yet, all you need to remember here is that: | ||
+ | |||
+ | * Every extension (that is structures that are supposed to extend the core db api structures) structure that you define must have this variable. | ||
+ | * Furthermore this variable has to be the first variable defined in the structure! | ||
+ | |||
+ | The rest of the variables contains various parts of the URI body. | ||
+ | |||
+ | ==== Extending Core Structures ==== | ||
+ | We now have the '' | ||
+ | make the core data layer use it? This section describes the process of | ||
+ | extending the core structure in the data abstraction layer. | ||
+ | |||
+ | First of all, all core data structure in the data abstraction layer are | ||
+ | created whenever the data layer needs them to be created and destroyed | ||
+ | whenever the data layer does not need them anymore. The database driver does | ||
+ | not know anything about it nor it can affect it any way. | ||
+ | |||
+ | This is one of the bigger changes from the previous versions of the SER data | ||
+ | abstraction layer. In previous versions SER would just call functions in | ||
+ | database drivers and the database drivers had to create/ | ||
+ | structures themselves. | ||
+ | |||
+ | Starting with SER 2.1, it is the data abstraction layer in the core who | ||
+ | decideds when are the core data structures created/ | ||
+ | driver cannot affect it (we simplify here a little bit), but it can define a | ||
+ | function which will be called when a data structure is created and it can also | ||
+ | define a function which will be called when the very same data structure is | ||
+ | about to be destroyed. These two functions can be used to extend the core | ||
+ | structure. | ||
+ | |||
+ | In our previous example we defined a data structure called '' | ||
+ | want to have the core structure '' | ||
+ | SER creates a new object from the '' | ||
+ | of functions exported by the database driver for a function with the same | ||
+ | name. In this case it would search the list of all functions exported by the | ||
+ | mysql module and if it finds a function called '' | ||
+ | be called immediately after the object is created. | ||
+ | |||
+ | Because the mysql driver wants to extend the structure, it will export the | ||
+ | function in its module API: | ||
+ | |||
+ | <code c> | ||
+ | static cmd_export_t cmds[] = { | ||
+ | {" | ||
+ | </ | ||
+ | |||
+ | and defines the function to be called: | ||
+ | |||
+ | <code c> | ||
+ | int my_uri(db_uri_t* uri) | ||
+ | { | ||
+ | struct my_uri* res; | ||
+ | |||
+ | res = (struct my_uri*)pkg_malloc(sizeof(struct my_uri)); | ||
+ | if (res == NULL) { | ||
+ | ERR(" | ||
+ | goto error; | ||
+ | } | ||
+ | memset(res, ' | ||
+ | if (db_drv_init(& | ||
+ | if (parse_mysql_uri(res, | ||
+ | |||
+ | DB_SET_PAYLOAD(uri, | ||
+ | uri->cmp = my_uri_cmp; | ||
+ | return 0; | ||
+ | |||
+ | | ||
+ | if (res) { | ||
+ | db_drv_free(& | ||
+ | if (res) pkg_free(res); | ||
+ | } | ||
+ | return -1; | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | The function gets the newly created db_uri object, which contains the body of | ||
+ | the URI for the function to parse. As you can see above the function does a | ||
+ | few steps worth explaining. | ||
+ | |||
+ | First of all a new structure '' | ||
+ | object is reset. | ||
+ | |||
+ | After that the data structure must be initialized by calling db_drv_init | ||
+ | function. This function is used to initialize the mysterious db_drv_t drv | ||
+ | variable which we mentioned in the section describing '' | ||
+ | |||
+ | The second parameter is a pointer to another function, '' | ||
+ | that later. | ||
+ | |||
+ | '' | ||
+ | |||
+ | And then the structure is ready to be attached to the '' | ||
+ | extend it, this is done by calling '' | ||
+ | ponter to the generic '' | ||
+ | to the newly created '' | ||
+ | structure. | ||
+ | |||
+ | The newly created structure is now attached to it and contains all the | ||
+ | additional information that the mysql driver needs. | ||
+ | |||
+ | ==== Destroying Payload ==== | ||
+ | When the core data layer destroys the generic structure it also has to destroy | ||
+ | the driver specific payload which we just attached to it. Because it does not | ||
+ | know anything about its variables and how/if they need to be destroyed, it | ||
+ | will call a function which we provided to it and which will take care of the | ||
+ | payload. Here is how the function can look like: | ||
+ | |||
+ | <code c> | ||
+ | static void my_uri_free(db_uri_t* uri, struct my_uri* payload) | ||
+ | { | ||
+ | if (payload == NULL) return; | ||
+ | db_drv_free(& | ||
+ | if (payload-> | ||
+ | if (payload-> | ||
+ | if (payload-> | ||
+ | if (payload-> | ||
+ | pkg_free(payload); | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | ==== Accessing Payload ==== | ||
+ | Accessing the driver specific payload attached to a core structure is | ||
+ | easy. A function called by the data abstraction layer would receive a pointer | ||
+ | to a generic structure as a parameter and it can call '' | ||
+ | to obtain a pointer to the payload: | ||
+ | |||
+ | <code c> | ||
+ | /* | ||
+ | * Compare two connection identifiers | ||
+ | */ | ||
+ | static unsigned char my_uri_cmp(db_uri_t* uri1, db_uri_t* uri2) | ||
+ | { | ||
+ | struct my_uri* muri1, *muri2; | ||
+ | |||
+ | if (!uri1 || !uri2) return 0; | ||
+ | |||
+ | muri1 = DB_GET_PAYLOAD(uri1); | ||
+ | muri2 = DB_GET_PAYLOAD(uri2); | ||
+ | if (muri1-> | ||
+ | |||
+ | if (cmpstr(muri1-> | ||
+ | if (cmpstr(muri1-> | ||
+ | if (cmpstr(muri1-> | ||
+ | if (cmpstr(muri1-> | ||
+ | return 1; | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | The example above shows function my_uri_cmp which is used to compare two mysql | ||
+ | URIs. The function would extract the payload from both of them and then | ||
+ | compare various fields in the payloads. | ||