Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Next revision
Previous revision
ref_manual:db_layer [2009/05/04 22:59]
janakj created
ref_manual:db_layer [2009/05/05 01:33] (current)
janakj
Line 1: Line 1:
 ====== Database Abstraction Layer ====== ====== Database Abstraction Layer ======
 +//(preliminary text by Jan, needs refactoring)//
 +
 +The database subsystem in SER 2.1 was significantly changed from previous
 +versions. A few major reasons for extensive overhaul of the database subsystem
 +include:
 +
 +  * Support for multiple databases at the same time (even within a single SER module).
 +  * Support for high-available database setups, in other words the possibility to use multiple database instances at the same time.
 +  * Support for database load balancing and fail-scenarios without the need to re-implement it in every database driver.
 +  * Support for pre-compiled database commands/queries in databases that support it.
 +  * Support for per-driver specific options.
 +  * Better re-use of code commonly shared among database drivers, such as connection pooling and database URI parsing.
 +
 +===== Design Goals =====
 +  * Try to avoid using SQL queries assembled at runtime. If the database driver supports pre-compiled SQL queries, that is queries that are assembled and planned once and executed many times, try to use this and assemble queries in mod_init or child_init (during the SER initialization phase).
 +  * The database abstraction layer should be designed in a way which would make integration with non-SQL based database systems easy and straightforward.
 +  * Support for command executed on multiple database servers at the same time with various load-balancing and high-availability strategies.
 +
 +===== Naming Conventions =====
 +Names of all core data structures in the database API start with `db_`
 +prefix. For example `db_con` is a core db structure that represents a database
 +connection, `db_uri` is a structure representing database URI, and so on. The
 +`db_` prefix is reserved for the data structures in the SER core only and
 +should not be used in data structures defined in database drivers to avoid
 +naming conflicts.
 +
 +Driver specific data structures, that is the data structures defined in SER
 +database drivers for various database servers, such as MySQL, PostgreSQL,
 +should have names starting with a two letter prefix that is  unique for
 +the driver. For example all data structures defined in the mysql SER module
 +start with prefix "my_", thus the data structure `my_con`
 +is the mysql-specific equivalent of structure `db_con`.
 +
 +===== Extending Data Structures =====
 +TBD
 +
 +===== Core Data Structures =====
 +Core data structures is a set of data structures that are defined in the
 +database abstraction layer in SER core. They are defined in the header fields
 +under ser/db directory.
 +
 +==== ''db_ctx'': Context ====
 +Database context is a data structure used to bind multiple database
 +connections and an algorithm which controls how database commands are executed
 +over those connections. Database contexts are used by database commands. Every
 +database command contains a reference to one context which is used to
 +communicate with database servers(s).
 +
 +The purpose of the database context is to abstract database commands from
 +database connections. This allows to separate fail-over and high-availability
 +logic from database commands.
 +
 +Every database context contains an array of connections and a textual
 +identifier that is used to identify it.
 +
 +==== ''db_uri'': URI ====
 +
 +==== ''db_con'': Connections ====
 +Physical database connections, pooled, shared across modules.
 +
 +==== ''db_cmd'': Commands ====
 +Physical database connections, pooled, shared across modules
 +
 +==== ''db_drv'': Driver Structure Prototype ====
 +
 +==== ''db_fld'': Fields ====
 +Database record, there may be multiple records per result
 +
 +==== ''db_gen'': Shared Structure Members ====
 +
 +==== ''db_pool_entry'': Connection Pools ====
 +Pools of database connections shared across SER modules.
 +
 +==== ''db_rec'': Records ====
 +Database record, there may be multiple records per result.
 +
 +==== ''db_res'': Results ====
 +Physical database connections, pooled, shared across modules
 +
 +===== Writing a New Database Driver =====
 +Database drivers are just ordinary SER modules. They typically do not provide
 +any functions which would be directly usable from the configuration
 +script. Instead they export an API which exports their functions and data
 +structures to the data abstraction layer in SER.
 +
 +A database driver is basically a collection of data structures and functions
 +which in form of a SER module. Database drivers can be loaded into SER just
 +like any other SER modules. There is nothing special about them in this
 +respect.
 +
 +The first thing you need to do if you want to create a new database driver for
 +SER is to create a new SER module. The name of the module should be the same
 +as the string identifying the database in database URI used in the
 +configuration script of SER. If you name your new module "abcde" then the
 +module will be used when the administrator uses a URI with scheme "abcde:" in
 +the configuration file.
 +
 +As the next step you should try to compile your new (yet empty) database
 +driver. Make sure that you link it with the database client library that you
 +are going to use and that you include at least one header file from the
 +library.
 +
 +If your driver compiles fine then you can define a minimalistic SER module
 +interface (without any functions and parameters). Here is one example of such
 +minimal SER module interface:
 +
 +<code c>
 +#include "../../sr_module.h"
 +
 +MODULE_VERSION
 +
 +static cmd_export_t cmds[]={
 +    {0,0,0,0,0}
 +};
 +
 +
 +/* Exported parameters
 +static param_export_t params[] = {
 +    {0, 0, 0}
 +};
 +
 +
 +struct module_exports exports= {
 +    "abcde",
 +    cmds,
 +    0,       /* RPC methods */
 +    params,  /* param exports */
 +    0,       /* module initialization function */
 +    (response_function) 0,
 +    0,
 +    0,
 +    0  /* per-child init function */
 +};
 +</code>
 +
 +After that you can compile your module again and then you can try to load the
 +module into SER. The module wouldn't do anything yet but if SER starts fine
 +then your skeleton driver module is working and you can proceed to
 +implementation of various data structures and functions your driver must
 +contain in order to be useful.
 +
 +==== Extending Core Data Structures ====
 +The database related core data structures only contain a minimal set of
 +variables and attributes. Typically it is the minimim amount of information
 +that every database driver will use.
 +
 +If you take a look at db_uri as an example, the data structure that represents
 +databse URIs:
 +
 +<code c>
 +typedef struct db_uri {
 +    db_gen_t gen;      /* Generic part of the structure */
 +    str scheme;        /* URI scheme */
 +    str body;          /* Entire URI body */
 +    db_uri_cmp_t* cmp; /* Comparison function */
 +} db_uri_t;
 +</code>
 +
 +You will see that except for some auxiliary variables the data structure
 +contains only two fields related to the URI, the scheme and the body. It does
 +not parse the body of the URI because it does not know anything about the
 +format, different database drivers may use different formats of the body.
 +
 +For example, if the URI was %%''mysql://ser:heslo@localhost/ser''%% then the scheme
 +part would contain ''mysql'' and the body part would contain
 +''ser:heslo@localhost/ser''.
 +
 +But the database driver, which would be mysql in this example, would most
 +likely need to parse the URI body even more and store various parts of it
 +separately in separate variables because this is who the corresponding
 +function of the mysqlclient library expects them.
 +
 +And because there is no space for additional variables in ''db_uri'' structure,
 +the driver has to create its own--driver specific--data structure and attach
 +is to the generic one. We say that the driver **extends** the generic db_uri
 +structure with **driver-specific payload**.
 +
 +==== Define Payload Structure ====
 +In our example the mysql driver would define a structure called my_uri which
 +would contain some additional variables.
 +
 +<code c>
 +struct my_uri {
 +    db_drv_t drv;
 +    char* username;
 +    char* password;
 +    char* host;
 +    unsigned short port;
 +    char* database;
 +};
 +</code>
 +
 +And this data structure will the extend the generic structure. Note the first
 +variable in the data structure:
 +
 +<code c>
 +db_drv_t drv;
 +</code>
 +
 +This is an auxiliary data structure which contains some data that has to be
 +present in every structure with extends a core structure. We will not go into
 +details yet, all you need to remember here is that:
 +
 +  * Every extension (that is structures that are supposed to extend the core db api structures) structure that you define must have this variable.
 +  * Furthermore this variable has to be the first variable defined in the structure!
 +
 +The rest of the variables contains various parts of the URI body.
 +
 +==== Extending Core Structures ====
 +We now have the ''my_uri'' structure declared in the mysql module, but how do we
 +make the core data layer use it? This section describes the process of
 +extending the core structure in the data abstraction layer.
 +
 +First of all, all core data structure in the data abstraction layer are
 +created whenever the data layer needs them to be created and destroyed
 +whenever the data layer does not need them anymore. The database driver does
 +not know anything about it nor it can affect it any way.
 +
 +This is one of the bigger changes from the previous versions of the SER data
 +abstraction layer. In previous versions SER would just call functions in
 +database drivers and the database drivers had to create/destroy all data
 +structures themselves.
 +
 +Starting with SER 2.1, it is the data abstraction layer in the core who
 +decideds when are the core data structures created/destroyed. The database
 +driver cannot affect it (we simplify here a little bit), but it can define a
 +function which will be called when a data structure is created and it can also
 +define a function which will be called when the very same data structure is
 +about to be destroyed. These two functions can be used to extend the core
 +structure.
 +
 +In our previous example we defined a data structure called ''my_uri'' and we
 +want to have the core structure ''db_uri'' extended with this structure. When
 +SER creates a new object from the ''db_uri'' structure, it would search the list
 +of functions exported by the database driver for a function with the same
 +name. In this case it would search the list of all functions exported by the
 +mysql module and if it finds a function called ''db_uri'' then the function will
 +be called immediately after the object is created.
 +
 +Because the mysql driver wants to extend the structure, it will export the
 +function in its module API:
 +
 +<code c>
 +static cmd_export_t cmds[] = {
 +    {"db_uri",    (cmd_function)my_uri,       0, 0, 0},
 +</code>
 +
 +and defines the function to be called:
 +
 +<code c>
 +int my_uri(db_uri_t* uri)
 +{
 +    struct my_uri* res;
 +
 +    res = (struct my_uri*)pkg_malloc(sizeof(struct my_uri));
 +    if (res == NULL) {
 + ERR("mysql: No memory left\n");
 +        goto error;
 +    }
 +    memset(res, '\0', sizeof(struct my_uri));
 +    if (db_drv_init(&res->drv, my_uri_free) < 0) goto error;
 +    if (parse_mysql_uri(res, &uri->body) < 0) goto error;
 +
 +    DB_SET_PAYLOAD(uri, res);
 +    uri->cmp = my_uri_cmp;
 +    return 0;
 +
 + error:
 +    if (res) {
 +        db_drv_free(&res->drv);
 +        if (res) pkg_free(res);
 +    }
 +    return -1;
 +}
 +</code>
 +
 +The function gets the newly created db_uri object, which contains the body of
 +the URI for the function to parse. As you can see above the function does a
 +few steps worth explaining.
 +
 +First of all a new structure ''my_uri'' is allocated and the memory of the
 +object is reset.
 +
 +After that the data structure must be initialized by calling db_drv_init
 +function. This function is used to initialize the mysterious db_drv_t drv
 +variable which we mentioned in the section describing ''my_uri''.
 +
 +The second parameter is a pointer to another function, ''my_uri_free'', more on
 +that later.
 +
 +''parse_mysql_uri'' parses the URI body stored in ''db_uri->body''.
 +
 +And then the structure is ready to be attached to the ''db_uri'' structure to
 +extend it, this is done by calling ''DB_SET_PAYLOAD'' macro. The macro takes a
 +ponter to the generic ''db_uri'' structure as the first parameter and a pointer
 +to the newly created ''my_uri'' as the second parameter and extends the generic
 +structure.
 +
 +The newly created structure is now attached to it and contains all the
 +additional information that the mysql driver needs.
 +
 +==== Destroying Payload ====
 +When the core data layer destroys the generic structure it also has to destroy
 +the driver specific payload which we just attached to it. Because it does not
 +know anything about its variables and how/if they need to be destroyed, it
 +will call a function which we provided to it and which will take care of the
 +payload. Here is how the function can look like:
 +
 +<code c>
 +static void my_uri_free(db_uri_t* uri, struct my_uri* payload)
 +{
 +    if (payload == NULL) return;
 +    db_drv_free(&payload->drv);
 +    if (payload->username) pkg_free(payload->username);
 +    if (payload->password) pkg_free(payload->password);
 +    if (payload->host) pkg_free(payload->host);
 +    if (payload->database) pkg_free(payload->database);
 +    pkg_free(payload);
 +}
 +</code>
 +
 +==== Accessing Payload ====
 +Accessing the driver specific payload attached to a core structure is
 +easy. A function called by the data abstraction layer would receive a pointer
 +to a generic structure as a parameter and it can call ''DB_GET_PAYLOAD'' macro
 +to obtain a pointer to the payload:
 +
 +<code c>
 +/*
 + * Compare two connection identifiers
 + */
 +static unsigned char my_uri_cmp(db_uri_t* uri1, db_uri_t* uri2)
 +{
 +    struct my_uri* muri1, *muri2;
 +
 +    if (!uri1 || !uri2) return 0;
 +
 +    muri1 = DB_GET_PAYLOAD(uri1);
 +    muri2 = DB_GET_PAYLOAD(uri2);
 +    if (muri1->port != muri2->port) return 0;
 +
 +    if (cmpstr(muri1->username, muri2->username, strcmp)) return 0;
 +    if (cmpstr(muri1->password, muri2->password, strcmp)) return 0;
 +    if (cmpstr(muri1->host, muri2->host, strcasecmp)) return 0;
 +    if (cmpstr(muri1->database, muri2->database, strcmp)) return 0;
 +    return 1;
 +}
 +</code>
 +
 +The example above shows function my_uri_cmp which is used to compare two mysql
 +URIs. The function would extract the payload from both of them and then
 +compare various fields in the payloads.
  

Navigation

Wiki

Other

QR Code
QR Code ref_manual:db_layer (generated for current page)