Friday, May 18, 2007

The Scalable BLOB Streaming discussion so far

Having discussed BLOB streaming for MySQL with a number of people I have gathered quite a bit of input on the subject.

So here are some details of the issues raised so far:

Server-side API

One of the first things that Brian Aker pointed out to me was that the server-side API must make it possible to use the sendfile() system call. The call performs a direct disk-to-network transfer and can speed up delivery of a BLOB stream significantly.
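To illustrate what such a zero-copy path looks like, here is a minimal sketch using Linux's sendfile(2) as exposed by Python's os.sendfile. The function and file names are my own for illustration; nothing here is part of any MySQL API.

```python
import os
import socket

def stream_blob(sock, blob_path):
    """Send a BLOB file over a socket without copying it through userspace."""
    with open(blob_path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            # os.sendfile may send fewer bytes than requested; loop until done.
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                raise ConnectionError("peer closed connection")
            offset += sent
    return size
```

The point of the loop is that sendfile(), like write(), is allowed to transfer only part of the requested range per call.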

The server-side API must also include a mechanism to inform the stream-enabled engine that an upload is complete. Assuming the streaming protocol used is HTTP, there are two ways of determining the end of an upload: either the connection is closed when the upload is complete, or the length of the BLOB data is specified in an HTTP header.
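The two end-of-upload signals can be sketched as a single decision function. The Content-Length header is standard HTTP; the function itself is illustrative, not part of any proposed API.

```python
def upload_complete(headers, bytes_received, connection_closed):
    """Decide whether an HTTP BLOB upload is finished.

    headers -- dict of request headers (lower-cased keys)
    """
    if "content-length" in headers:
        # Fixed-length upload: done once the declared byte count has arrived.
        return bytes_received >= int(headers["content-length"])
    # No length declared: the close of the connection marks the end.
    return connection_closed
```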

ODBC Issues

Monty was concerned about compatibility with the MySQL ODBC driver. The ODBC function SQLPutData() is used to send BLOB data, and SQLGetData() is used to retrieve BLOB data. Both functions transfer data in chunks. The implementation of these functions would have to be made aware of streamable BLOB values.
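The chunked transfer pattern behind SQLPutData()/SQLGetData() might look roughly like this, assuming a driver that can hand each chunk straight on to a BLOB stream instead of buffering the whole value. The class and function names are hypothetical stand-ins, not the actual ODBC driver internals.

```python
class BlobStream:
    """Stand-in for a server-side streamable BLOB."""
    def __init__(self):
        self.data = b""

    def put_chunk(self, chunk):
        # What a SQLPutData() implementation would call per chunk.
        self.data += chunk

    def get_chunk(self, offset, size):
        # What a SQLGetData() implementation would call per chunk.
        return self.data[offset:offset + size]

def put_data(stream, payload, chunk_size=4096):
    """Feed a BLOB to the stream in SQLPutData-sized pieces."""
    for i in range(0, len(payload), chunk_size):
        stream.put_chunk(payload[i:i + chunk_size])

def get_data(stream, chunk_size=4096):
    """Read a BLOB back chunk by chunk, as SQLGetData does."""
    out, offset = b"", 0
    while True:
        chunk = stream.get_chunk(offset, chunk_size)
        if not chunk:
            return out
        out += chunk
        offset += len(chunk)
```

Because the application already supplies and consumes the data in pieces, the driver could redirect those pieces to a BLOB stream without the application noticing.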

This can probably be done in a way that is transparent to the user. It may be necessary to designate certain database types as streaming BLOBs. One suggestion is to add two new types to MySQL: LONGBIN and LONGCHAR. These keywords are not yet used by MySQL.

However, on the ODBC side the data types LONGVARBINARY and LONGVARCHAR are already mapped to the MySQL types BLOB and TEXT respectively. This simply means that it would be transparent to an ODBC application whether a BLOB is being streamed or retrieved over the standard client/server connection.

Upload Before or After INSERT

There is some debate about how exactly a BLOB should be inserted. The two main possibilities are:

1) Upload the BLOB. The upload returns a handle (or key). Perform the INSERT setting the BLOB column to the key value.

2) Perform an INSERT, specifying the size of the BLOB to be inserted into the column using a CREATE_BLOB() function. The caller then SELECTs back the inserted row and finds a handle in the BLOB column. The handle can then be used to upload the data.

For method (1) it was also suggested that the client application specify a name, instead of using a handle returned by the server. The name could then be used as input to a "create_blob" function which is specified in the INSERT statement. The advantage of this is that the text of the insert statement is readable.

One advantage of method (1), as pointed out by Monty, is that the client does not have to wait for the upload to complete before doing the INSERT. The server could instead wait for the upload to complete. This would require a client-generated identifier for each uploaded BLOB.
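Here is a small walk-through of method (1) with a client-chosen identifier, along the lines suggested above. The upload store and the create_blob() SQL function are assumptions made for the sketch; neither exists in MySQL today.

```python
import uuid

class BlobStore:
    """Stand-in for the engine's hypothetical upload endpoint."""
    def __init__(self):
        self.blobs = {}

    def upload(self, name, data):
        self.blobs[name] = data
        return name

store = BlobStore()

# The client generates its own identifier, so the INSERT need not wait
# for the upload to be acknowledged.
blob_id = "img-" + uuid.uuid4().hex
store.upload(blob_id, b"\x89PNG...")

# Using a name rather than a server-issued handle keeps the INSERT readable:
insert_sql = f"INSERT INTO photos (id, image) VALUES (1, create_blob('{blob_id}'))"
```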

Note that method (2) requires use of a transaction to prevent the row with the BLOB from being selected before the BLOB data has been uploaded.

Storage of BLOBs

Stewart Smith suggested that the BLOBs be stored by the streaming engine itself. This is something to be considered, and it will probably be necessary in most scale-out scenarios. Otherwise, the current plan is that each engine that is streaming-BLOB enabled (i.e. that supports the server-side streaming API) will store its own BLOBs.

Scaling Writes

Also mentioned in regard to scaling is the fact that read scale-out is usually more important than write scale-out. However, there are websites that have a heavy write load. It remains to be investigated how NDB could be used in combination with the streaming engine to scale out the uploading of BLOBs.

mysqldump & mysql

As pointed out by Monty, the output of mysqldump is a readable stream, so any changes made to mysqldump in order to support streamable BLOBs should ensure that this remains the case. A related question is how to upload BLOBs using the mysql client. One idea would be to provide a command in mysql to upload a file or a block of binary data or text.
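One way a dump could stay readable while still carrying BLOB data is to emit the bytes as a hex literal, much as mysqldump's existing --hex-blob option does for ordinary BLOB columns. A trivial sketch of that encoding:

```python
def dump_blob(data):
    """Render BLOB bytes as a readable SQL hex literal."""
    return "0x" + data.hex().upper()
```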


Access Control

In some cases it may be necessary to check whether a client is authorized to download or upload a BLOB. The most straightforward method would be for the server to issue authorization tokens which are submitted with an upload or download. These tokens expire after a certain amount of time, or when the associated server connection is closed. However, if the streaming protocol is to be HTTP, then we need to investigate the possibility of using standard HTTP authentication when retrieving and sending BLOBs.
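The token scheme could be sketched as follows. The token format, lifetime, and class names are invented for illustration.

```python
import secrets
import time

class TokenIssuer:
    def __init__(self, lifetime=60.0):
        self.lifetime = lifetime
        self.tokens = {}   # token -> (blob_ref, expiry time)

    def issue(self, blob_ref, now=None):
        """Issue a short-lived token authorizing access to one BLOB."""
        now = time.time() if now is None else now
        token = secrets.token_hex(16)
        self.tokens[token] = (blob_ref, now + self.lifetime)
        return token

    def check(self, token, now=None):
        """Return the BLOB reference if the token is valid, else None."""
        now = time.time() if now is None else now
        entry = self.tokens.get(token)
        if entry is None or now > entry[1]:
            self.tokens.pop(token, None)   # expired tokens are discarded
            return None
        return entry[0]

    def revoke_all(self):
        """Called when the associated server connection is closed."""
        self.tokens.clear()
```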

In-place Updates

In-place updates of BLOBs are allowed by some SQL servers (for example via the UPDATETEXT and WRITETEXT functions supported by MS SQL Server). In this case the streaming-enabled engine would be responsible for locking, or for using some other method, to ensure that concurrent reads of the BLOB remain consistent.
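A minimal sketch of an in-place BLOB patch in the style of UPDATETEXT, with a lock so concurrent readers always see a consistent value. The locking granularity and the API are assumptions for illustration only.

```python
import threading

class StoredBlob:
    def __init__(self, data):
        self._data = bytearray(data)
        self._lock = threading.Lock()

    def write_at(self, offset, patch):
        """Overwrite part of the BLOB without rewriting the whole value."""
        with self._lock:
            self._data[offset:offset + len(patch)] = patch

    def read(self):
        with self._lock:
            return bytes(self._data)
```

A real engine would more likely use row- or page-level locking, or a versioning scheme, but the consistency requirement is the same.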

PHP Upload

Georg Richter, who is responsible for the PHP Native Driver at MySQL, noted that using BLOB streaming, data could be uploaded from the Web directly into the database. The PHP extension to the standard MySQL client software should include a function to make this possible.


Georg also pointed out that it should be possible to convert BLOBs currently stored in MySQL tables into streamable BLOBs using ALTER TABLE. This would enable developers to quickly try out streaming BLOBs and test the performance characteristics, in combination with their applications.

These are the main issues raised so far. Any further ideas, comments and questions are welcome! :)

Tuesday, May 01, 2007

PBXT and the MySQL Conference & Expo 2007

The conference is over, and it was a great week, but pretty exhausting! From what I saw, and from what I have heard, the sessions were of a very high standard this year. I have come back with a number of new ideas and quite a few things I would like to implement in PBXT. One of these is the new online backup API for MySQL, which will enable you to make a snapshot backup of data stored by all engines, and do point-in-time recovery.

For me the week started off with the BoF on scalable BLOB streaming on Tuesday evening. The BoF was well attended and there was significant interest in the topic. I will be reporting on some of the issues discussed soon.

On Wednesday morning I presented: PrimeBase XT - Design and Implementation of a Transactional Storage Engine. I was pleased with the number of questions, interest and feedback I received.

When I mentioned to Mårten Mickos that someone had said it was the best session they had heard so far at the conference, Mårten told me that they pay guys to say this to the speakers so that they feel good! So I must thank MySQL for this very thoughtful gesture - joke, of course ;)

Thanks also to Sheeri Kritzer for video recording the talk; I hope to get a link to that soon. In the meantime I have posted my slides to the PBXT home page:

After that I also took part in two lightning-round sessions: Top MySQL Community Contributors and State of the Partner Engines. Both sessions were interesting in their diversity. One of the top contributors, Ask Bjørn Hansen, described how meeting his goal of filing a bug a week was a lot easier in the early days of MySQL. More recently, however, he has been helped by the current state of the GUI tools!

During the Partner Engines session, it came as a surprise to some people in the audience that not all engines are free. In fact, there is a wide spectrum, from GPL through partially free to highly priced. One of the developers of a data-warehousing engine found the question of whether it might be free quite amusing. Solid has not decided to what extent its high-availability offering will be commercial, and the Amazon S3 engine is free, but the service behind it is not. So that's something to watch out for in general.

I took the opportunity during the Partner Engines session to mention our plans for the future of PBXT, which involve building a scalable BLOB streaming infrastructure for MySQL. This is relevant to a number of engine developers, as we will be providing an open system that will enable all engines to stream BLOBs directly in and out of the server. So please check out for more details.