Friday, December 15, 2006

Implementation of Foreign Keys for the PBXT storage engine

I have just committed the initial implementation of foreign keys for the PrimeBase XT storage engine, whew! The implementation is similar to that of InnoDB and supports the RESTRICT, CASCADE and SET NULL options.

Currently the PBXT implementation requires that the data types of the referenced columns are identical. InnoDB allows references between CHAR and VARCHAR and strings of different length. PBXT also requires that the index used by foreign keys have the exact number of components. InnoDB can use indexes that have more components than the number of columns specified in the foreign key declaration.

On the other hand, PBXT supports all SQL syntax that can be used to create foreign keys. This includes the following syntax which is not supported by InnoDB:

CREATE TABLE t2 (
id INT PRIMARY KEY,
fk int REFERENCES t1(id)
) ENGINE=pbxt;

Maybe you are wondering why one engine supports a certain syntax, and another does not. Well, this was one thing that surprised me as well: MySQL parses but ignores the foreign key definitions. In fact, when you execute SHOW CREATE TABLE, MySQL calls the storage engine for the text of the foreign key definitions. So one of the biggest jobs has been to parse the SQL statements and extract the foreign key information.

Unfortunately, its not just done with CREATE TABLE. The engine also has to support ALTER TABLE, and all other statements that modify a table (e.g. CREATE/DROP INDEX) have to be parsed to determine which table is being altered. The reason for this is, when MySQL alters a table it creates a new table and copies the data from the old to the new table. So when the new table is created the engine has to identify the source table and copy over the foreign key definitions or they will be lost!

One thing to watch out for when parsing the SQL statements is that column and table names have to be converted to UTF-8. The conversion depends on the charset of the text which in turn depends on the current charset of the database connection. Table names have to be further converted into file names which use a special encoding schema to avoid non-US ASCII characters. All this requires calling certain character conversion routines in MySQL.

The other major problem that needed to be solve involved the MySQL st_table structure. This structure contains all the information about a table, including: column definitions, data types, character sets and index descriptions. This is actually the data from the .frm file of the table. Since the engine has access to this structure when a table is opened by MySQL, there is normally no need for the engine to store this information itself.

However, foreign keys force the engine to access tables that are not necessarily open by MySQL. For example if MySQL does an INSERT on table t2 (see definition above), then MySQL will open t2 before it performs the operation. But the engine is required by the foreign key definition to check column id on table t1. This table may not be open by MySQL (depending on the table cache). And without the st_table structure accessing the table is not possible, unless the engine stores all the information itself.

It took a little while to figure out, but my solution was to open the MySQL table myself. This means that PBXT calls routines in MySQL that load the information from the .frm file as needed.

Of course the engine will have a bit of a problem if the definition of any of the functions change. Since the engine is linked dynamically to MySQL this could occur without notice. So I think there is a general need to make these routines part of an official engine callback API.

Next on my list is CHECK and REPAIR table but before I get on to that, I will be on holiday in South Africa until the 12th of January! :)

If you would like to test PBXT foreign keys, checkout the trunk of the subversion tree directly as follows:

$ svn co https://pbxt.svn.sourceforge.net/svnroot/pbxt/trunk/pbxt pbxt

Then, assuming you have your MySQL tree in the directory /home/build/mysql-5.1.12 you can build PBXT as follows:

$ cd pbxt
$ ./configure --with-mysql=/home/build/mysql-5.1.12 --libdir=/home/build/mysql-5.1.12/mysql-test/lib/mysql
$ make install


This will install the PBXT plug-in (libpbxt.so) in the directory required by mysql-test-run. If you built MySQL as a debug version then remember to include --with-debug=full when configuring PBXT.

Instructions about how to test PBXT with mysql-test-run are given in the README file in the mysql-test-update directory. I have modified the InnoDB test scripts that test foreign keys to work with PBXT.