PostgreSQL
Chapter 43. ecpg - Embedded SQL in C

For the Developer

This section is for those who want to develop the ecpg interface. It describes how things work. The aim is for this section to cover what is needed by those who want to look inside, while the section on how to use ecpg should be enough for all normal questions. So read this before looking at the internals of ecpg. If you are not interested in how it really works, skip this section.

ToDo List

This version of the preprocessor has some flaws:

Preprocessor output

The variables should be static.

Preprocessor cannot do syntax checking on your SQL statements

Whatever you write is copied more or less exactly to Postgres, and you will not be able to locate your errors until run-time.

no restriction to strings only

The PQ interface, and most of all the PQexec function, which ecpg uses, relies on the request being built up as a string. In some cases, such as when the data contains the null character, this will be a serious problem.

error codes

There should be different error numbers for the different errors instead of just -1 for them all.

library functions

to_date et al.

records

Possibility to define records or structures in the declare section in a way that the record can be filled from one row in the database.

This is a simpler way to handle an entire row at a time.

array operations

Oracle has array operations that enhance speed. When they are implemented in ecpg, it is for compatibility reasons only. For them to improve speed would require a lot more insight into the Postgres internal mechanisms than I possess.

indicator variables

Oracle has indicator variables that tell if a value is null or if it is empty. This largely simplifies array operations and provides for a way to hack around some design flaws in the handling of VARCHAR2 (like that an empty string isn't distinguishable from a null value). I am not sure if this is an Oracle extension or part of the ANSI standard.

typedefs

As well as complex types like records and arrays, typedefs would be a good thing to take care of.

conversion of scripts

To set up a database you need a few scripts with table definitions and other configuration parameters. If you have these scripts for an old database you would like to just apply them to get a Postgres database that works in the same way.

The functionality could be accomplished with some conversion scripts. Speed will never be accomplished in this way. To do this you need a bigger insight into the database construction and the use of the database than could be realised in a script.

The Preprocessor

First, four lines are written to the output: two comments and two include lines necessary for the interface to the library.

Then the preprocessor works in one pass only, reading the input file and writing to the output as it goes along. Normally it just echoes everything to the output without looking at it further.

When it comes to an EXEC SQL statement it intervenes and changes it depending on what it is. The EXEC SQL statement can be one of these:

Declare sections

A declare section begins with

exec sql begin declare section;
and ends with
exec sql end declare section;
In the section only variable declarations are allowed. Every variable declared within this section is also entered in a list of variables indexed by name, together with the corresponding type.

The declaration is also echoed to the file to make the variable a normal C variable.

The special types VARCHAR and VARCHAR2 are converted into a named struct for every variable. A declaration like:

VARCHAR var[180];
is converted into
struct varchar_var { int len; char arr[180]; } var;
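
As a minimal sketch of how application code might fill the generated struct, assuming the layout shown above: the helper set_varchar is hypothetical and not part of ecpg. The sketch treats len as the authoritative length of the data in arr, so arr is not required to end in a null character.

```c
#include <string.h>

/* The struct the preprocessor emits for: VARCHAR var[180]; */
struct varchar_var { int len; char arr[180]; } var;

/* Hypothetical helper (not part of ecpg): copy a C string into the
 * struct, truncating to the declared size, and record its length. */
static void set_varchar(struct varchar_var *v, const char *s)
{
    size_t n = strlen(s);

    if (n > sizeof v->arr)
        n = sizeof v->arr;      /* truncate to the 180 declared bytes */
    memcpy(v->arr, s, n);       /* no terminating null is stored */
    v->len = (int) n;
}
```

For instance, calling set_varchar(&var, "hello") leaves var.len at 5 and the five bytes of "hello" at the start of var.arr.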

Include statements

An include statement looks like:

exec sql include filename;
It is converted into
#include <filename.h>

Connect statement

A connect statement looks like:

exec sql connect 'database';
That statement is converted into
ECPGconnect("database");

Open cursor statement

An open cursor statement looks like:

exec sql open cursor;
and is ignored and not copied to the output.

Commit statement

A commit statement looks like

exec sql commit;
and is translated on the output to
ECPGcommit(__LINE__);

Rollback statement

A rollback statement looks like

exec sql rollback;
and is translated on the output to
ECPGrollback(__LINE__);

Other statements

Other SQL statements are all other statements that start with exec sql and end with ;. Everything in between is treated as an SQL statement and parsed for variable substitution.

Variable substitution occurs when a symbol starts with a colon (:). A variable with that name is then looked up among the variables that were previously declared within a declare section. Depending on whether the SQL statement knows it to be a variable for input or for output, the pointers to the variables are written to the output to allow access by the function.

For every variable that is part of the SQL request the function gets another five arguments.

The type as a special symbol
A pointer to the value
The size of the variable if it is a varchar
Number of elements in the array (for array fetches)
The offset to the next element in the array (for array fetches)

Since array fetches are not implemented yet, the last two arguments are not really important. They could perhaps have been left out.
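
To illustrate what the element count and the offset would be for, here is a sketch of how a fetch routine might step through host memory. Everything in it is an assumption for illustration: store_ints and struct row are hypothetical names, and nothing like this exists in the library described here.

```c
#include <stddef.h>

/* Hypothetical helper, not part of ecpg: store `count` integer values
 * into host memory starting at `first`, advancing `offset` bytes from
 * one element to the next. */
static void store_ints(void *first, long count, long offset, const int *values)
{
    char *p = first;
    long i;

    for (i = 0; i < count; i++)
        *(int *) (p + i * offset) = values[i];
}

/* Example host data: the target field is embedded in a struct, so the
 * distance between consecutive elements is sizeof(struct row), not
 * sizeof(int). */
struct row { int id; double other; };
```

With a plain int array the offset would be sizeof(int), as in the arguments the preprocessor writes for an int variable. With an array of struct row, passing &rows[0].id and sizeof(struct row) fills only the id fields.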

A Complete Example

Here is a complete example describing the output of the preprocessor:

exec sql begin declare section;
int index;
int result;
exec sql end declare section;
...
    exec sql select res into :result from mytable where index = :index;
is translated into:
/* These two include files are added by the preprocessor */
#include <ecpgtype.h>
#include <ecpglib.h>
/* exec sql begin declare section */

 int index;
 int result;
/* exec sql end declare section */

...
    ECPGdo(__LINE__, "select res from mytable where index = ;;", 
           ECPGt_int,&index,0,0,sizeof(int), 
           ECPGt_EOIT, 
           ECPGt_int,&result,0,0,sizeof(int), 
           ECPGt_EORT );
(The indentation in this manual is added for readability and is not something that the preprocessor can do.)
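
The query string in the translated output carries a marker where the input variable belongs. The following is a rough sketch of how such a marker could be replaced by the text of a value before the request is sent. It is not the library's actual code: fill_placeholder is a hypothetical helper, and the marker is passed in as a parameter, with ";;" taken from the example output above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the actual ecpg library code: return a newly
 * allocated copy of `query` with the first occurrence of `marker`
 * replaced by `value`, or NULL if the marker is absent or allocation
 * fails. The caller frees the result. */
static char *fill_placeholder(const char *query, const char *marker,
                              const char *value)
{
    const char *hit = strstr(query, marker);
    const char *rest;
    size_t head;
    char *out;

    if (hit == NULL)
        return NULL;

    head = (size_t) (hit - query);      /* bytes before the marker */
    rest = hit + strlen(marker);        /* text after the marker */
    out = malloc(head + strlen(value) + strlen(rest) + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, query, head);
    strcpy(out + head, value);
    strcat(out, rest);
    return out;
}
```

Called with the example query, the marker ";;" and the value "42", the helper returns "select res from mytable where index = 42".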

The Library

The most important function in the library is the ECPGdo function. It takes a variable number of arguments. Hopefully we won't run into machines with limits on the number of arguments that can be accepted by a vararg function. This could easily add up to 50 or so arguments.

The arguments are:

A line number

This is a line number for the original line used in error messages only.

A string

This is the SQL request that is to be issued. This request is modified by the input variables, i.e. the variables that were not known at compile time but are to be entered in the request. Where the variables should go, the string contains “;”.

Input variables

As described in the section about the preprocessor every input variable gets five arguments.

ECPGt_EOIT

An enum telling that there are no more input variables.

Output variables

As described in the section about the preprocessor every output variable gets five arguments. These variables are filled by the function.

ECPGt_EORT

An enum telling that there are no more variables.
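
The argument layout listed above can be sketched as a variadic function that walks the list until it reaches the two end markers. This is an illustration only and not ECPGdo itself: the enum values, struct var_counts and count_vars are hypothetical, the real type symbols come from ecpgtype.h and may differ, and the sketch assumes the caller passes the pointer as void * and the three numeric arguments as long.

```c
#include <stdarg.h>

/* Hypothetical stand-ins for the type symbols in ecpgtype.h. */
enum sketch_type { SK_int, SK_EOIT, SK_EORT };

struct var_counts { int inputs; int outputs; };

/* Hypothetical sketch, not ECPGdo: count the input and output variables.
 * Each variable contributes five arguments; SK_EOIT ends the input
 * variables and SK_EORT ends the output variables. */
static struct var_counts count_vars(int lineno, const char *query, ...)
{
    struct var_counts c = {0, 0};
    int in_outputs = 0;
    va_list ap;

    (void) lineno;              /* only used for error messages */
    va_start(ap, query);
    for (;;) {
        int t = va_arg(ap, int);        /* the type symbol */

        if (t == SK_EOIT) {
            in_outputs = 1;             /* output variables follow */
            continue;
        }
        if (t == SK_EORT)
            break;                      /* no more variables */

        (void) va_arg(ap, void *);      /* pointer to the value */
        (void) va_arg(ap, long);        /* size, if it is a varchar */
        (void) va_arg(ap, long);        /* number of elements */
        (void) va_arg(ap, long);        /* offset to the next element */

        if (in_outputs)
            c.outputs++;
        else
            c.inputs++;
    }
    va_end(ap);
    return c;
}
```

A real implementation would read or fill each variable where this sketch merely skips over its arguments. The explicit casts in the caller matter, because a vararg function cannot check that the argument types match what it reads.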

All the SQL statements are performed in one transaction unless you issue a commit. This works so that the first statement, or the first statement after a commit or rollback, always begins a transaction.
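
The transaction rule can be pictured as a single state flag. The following is only a sketch under assumptions: sketch_statement, sketch_end_transaction and the counter are hypothetical names, and the library may keep track of this differently.

```c
#include <stdbool.h>

/* Hypothetical state, not the actual library internals. */
static bool in_transaction = false;
static int transactions_begun = 0;

/* Called for every SQL statement: the first statement, or the first
 * one after a commit or rollback, begins a new transaction. */
static void sketch_statement(void)
{
    if (!in_transaction) {
        transactions_begun++;   /* a library could start the transaction here */
        in_transaction = true;
    }
    /* ... the statement itself would be executed here ... */
}

/* Called for commit and for rollback: either one ends the transaction. */
static void sketch_end_transaction(void)
{
    in_transaction = false;
}
```

Two statements in a row share one transaction, while a statement issued after sketch_end_transaction starts a second one.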

To be completed: entries describing the other entries.

