Tuesday, November 28, 2006

PBXT: Experience with testing Storage Engines

Many users may be unaware of the fact that there is a great tool for automatically testing MySQL called mysql-test-run. The most recent version of this program is written in perl and resides in the mysql-test directory of the MySQL source code tree.

Tests are written as scripts (.test files) which contains SQL statements and a number of meta-commands. Each .test file has a corresponding .result file which is used to verify the test output. A text-based "diff" is used to confirm that the output generated by the test is identical to the expected result.

To run a test mysql-test-run automatically sets up a clean environment and starts the MySQL server. It then executes the test scripts specified (by default all tests are executed) and verifies the output.

Most of the tests shipped with MySQL are intended to run with the MyISAM storage engine. There are also tests specifically created for other MySQL engines like NDB and InnoDB.

But what about testing 3rd party engines like PBXT? Firstly, its easy enough to start the test, just execute the following:

$ ./mysql-test-run --mysqld=--default-storage-engine=pbxt

This tells mysql-test-run to start mysqld with the option that sets PBXT to the default engine. So all CREATE TABLE statements that do not explicitly specify an engine now use PBXT. After this the real work starts:

There are over 630 tests in MySQL 5.1.12 and currently PBXT requires 223 individual changes to the test scripts for them to run. Of the 630 test results, 187 need to be changed. This is a lot of work given that verifying a difference is not a bug can be quite time consuming.

Meanwhile many differences are trivial, so ideally they would not appear at all. Lets have a look at some of the most common test differences:

SHOW CREATE TABLE:

The SHOW CREATE TABLE statement output depends on the default engine. So CREATE TABLE (...) ENGINE=MyISAM must be changed to CREATE TABLE (...) ENGINE=PBXT in over 360 instances.

The correct way to solve this problem would be to use the mysql-test-run command --replace-regex to transform the output to something like ENGINE=?. But, of course, this change would need to be checked into the MySQL source code tree to be of long term help.

MERGE, FULLTEXT or GEOMETRY:

All these features are only supported by MyISAM tables. When using these features the test should explicitly specify ENGINE=MyISAM. An even better solution would be if MySQL could automatically pick an engine that supported the features used in the CREATE TABLE. In this case the default engine will take preference.

ENGINE=InnoDB:

In order to use the InnoDB tests to test some other transactional engine, all the CREATE TABLE (...) ENGINE=InnoDB statements must be changed. Ideally MySQL would support syntax of the form: ENGINE=TRANSACTIONAL, which would automatically pick a transactional engine from those available. If the default engine is transactional then that would obviously be used.

ORDER BY Required:

By far the biggest problem when testing PBXT is that the "natural" SELECT order often differs from that of MyISAM. This means that a SELECT without an ORDER BY returns the correct rows, but in a different order to MyISAM. This is due to the way PBXT manages free space.

Making this problem even worse is the fact that the order can differ from one run to the next because PBXT manages the free space asynchronously. The only thing that helps is to add an ORDER BY clause to the SELECT, but I am wondering if the problem cannot be solved in general by mysql-test-run. For example this could be done by a more intelligent diff function.

EXPLAIN Differences:

Each engine has its own way of calculating the heuristics used by the optimizer. This causes many differences in the result files due to slightly different EXPLAIN output. On the other hand there is a need to verify that queries are being optimized correctly. I'm clueless on this one at the moment, so suggestions are welcome.

I think other engine developers will agree, its a major task so any help in testing will be greatly appreciated!

Monday, November 06, 2006

The PBXT pluggable storage engine and the MySQL 5.1.12 Beta release

OK, it took a while, but it has been worth the wait. At last I can now guarantee that when you download the PBXT source code, it will compile! But seriously, with the released of MySQL 5.1.12 Beta we now have an excellent platform for testing the pluggable storage engine API.

These are interesting times because we are seeing the future of storage engines in MySQL. The PBXT 0.9.73 release demonstrates the ease with which externally built storage engines can be combined with MySQL in the future. And, as I have mentioned before, PrimeBase XT is the first engine to take full advantage of this new feature in MySQL.

And while I am singing my praises let me remind you of 2 other reasons why you should try out PBXT: the engine achieves high concurrency using a pure MVCC implementation (MVCC stands for multi-generational concurrency control, if you want to know why, ask Jim! ;) and great performance with a write-once update strategy.

Download and further details can be found at http://www.primebase.com/xt as usual. The PBXT project is now hosted by SourceForge.net so for SVN access, bug reports and tracking, please go to: http://sourceforge.net/projects/pbxt.

To help you get started I have included 6 steps to build and test PBXT with MySQL 5.1.12 below.

Building and testing PBXT for MySQL 5.1.12

This description assumes the following:
  • The user's name is myname.

  • The home directory of myname is: /home/myname

  • myname is using a working directory called: /home/myname/mysql-dev

1. Download the MySQL 5.1.12 source code tree, and unpack it in the /home/myname/mysql-dev directory.

2. Build MySQL as follows:

$ cd /home/myname/mysql-dev/mysql-5.1.12-beta
$ ./configure --prefix=/home/myname/mysql-dev/test --with-debug=full --with-mysqld-user=myname --with-extra-charsets=complex --with-plugins=max-no-ndb --with-embedded-server --enable-thread-safe-client --with-big-tables
$ make


3. Download the source code of PBXT 0.9.73 (or later), and unpack it in the /home/myname/mysql-dev directory.

4. Build and install PBXT as follows:

$ cd /home/myname/mysql-dev/pbxt-0.9.73-beta
$ ./configure --with-mysql=/home/myname/mysql-dev/mysql-5.1.12-beta --libdir=/home/myname/mysql-dev/mysql-5.1.12-beta/mysql-test/lib/mysql --with-debug=full
$ make install


5. Install the PBXT version of the mysql-test-run scripts:

$ cd mysql-test-update
$ ./update_mysql_test.sh /home/myname/mysql-dev/mysql-5.1.12-beta


6. Run the MySQL test scripts:

$ cd /home/myname/mysql-dev/mysql-5.1.12-beta/mysql-test
$ ./mysql-test-run --mysqld=--default-storage-engine=pbxt --force


NOTES:
  • Almost all the tests should run through without error (I am still working on 2 or 3 tests that fail).

  • Step 4 installs the PBXT shared library, libpbxt.so, in the mysql-test/lib/mysql directory where it is required by mysqld, which it is started by mysql-test-run.

  • Step 5 modifies the file mysql-test/lib/init_db.sql, adding the statement: INSERT INTO plugin VALUES ('pbxt', 'libpbxt.so');, which will cause mysqld to load the plug-in when started.

  • Both MYSQL and PBXT must be compiled with the same value for the --with-debug switch (on/off/full). If not, the plug-in will not load.

Normally you would install a plug-in using the command: INSTALL PLUGIN pbxt SONAME 'libpbxt.so';. However, unless you add this statement to every test script, this will not work when using mysql-test-run because the server environment is re-installed every time mysql-test-run starts.

By the way, the new perl based mysql-test-run is much faster on startup than the old UNIX shell based program. Makes testing much more fun ... thanks MySQL :)