"GET"ing a BLOB from the database with the BLOB Streaming Engine
Current plans call for the use of the HTTP protocol to upload and retrieve BLOBs to and from the database. The BLOB Streaming Engine makes this possible by integrating a lightweight HTTP server directly into the MySQL server.
I am currently working on an alpha implementation of this, and would like to give a short demonstration of what is possible using this system.
We start by creating a table using any streaming enabled storage engine (a streaming enable storage engine, is an engine that supports the server-side streaming API):
use test;
CREATE TABLE notes_tab (
n_id INTEGER PRIMARY KEY,
n_text BLOB
) ENGINE=pbxt;
INSERT notes_tab VALUES (1, "This is a BLOB streaming test!");
Now assuming the MySQL server is on the localhost, and the BLOB streaming engine has been set to port 8080, you can open your browser, and enter this URL:
http://localhost:8080/test/notes_tab/n_text/n_id=1
With the following result:

So without even doing a
SELECT, you can GET a BLOB directly out of the database!Note that there is no need for the BLOB in the database to be explicitly "streamable" for this to work.

10 Comments:
Wooohooo! Wicked cool, Paul!
Thanks Jay! A response like that makes the effort worthwhile! :)
This is very interesting. Especially for user uploads, it would be nice to junk the asset server and just hit the DB, assuming it's fast.
How would horizontal scaling work? Would each DB shard somehow register itself as a proxy listener? Any thoughts?
Nice idea, but I'm curious about how the MIME type and character set for the response will be set, and how will security be handled?
Hi Anon,
Yes, one of the goals of the BLOB Streaming Project is to enable upload of BLOB data directly into the database, without any intermediate storage.
With regard to horizontal scaling: are you referring to registration with a reverse proxy? With the idea that the proxy would take the hits and then distribute them amongst the available database servers?
Hi Richard,
Funny, I have been wondering about that too! :)
Yes, I am not so sure myself. This project is still very much in the planning stage. Although I am working on an alpha version, this only includes components that I am fairly sure of at the moment.
So I welcome any input and ideas you may have, especially on the subject of security.
Here are my thoughts on the matter so far:
There are various levels to consider here: database level, application level and web-server level.
Web-server level security is probably the easiest to achieve. This can be done, at least initially, by using a proxy (like the Apache mod_proxy).
Database level security requires that a request be related to a particular database connection. In this way, an engine can determine if the database user has SELECT (for example) privileges on the column for which the BLOB is requested.
Application level security is probably the most tricky. It would involve knowing whether an application session is still active, and which BLOBs have been authorized for viewing. One idea would involve encoding an expiry time into the link.
With regard to MIME types and character sets (i.e "meta" data associated with a BLOB):
The character set is already part of the column definition, my idea at the moment is to put the rest of the meta data in the column comment.
I would be glad to hear if you have any suggestions or comments in this regard!
What if we take this further and allows read access to any data, returnable in JSON? That would provide a very nice REST access method into the data without having to go through the overhead of logging in and dealing with a SQL statement.
looks great! I can't wait to see the application (& source code) and play with it.
One question, though -- would it be possible to have some kind of a randomly picked unique identifier? Imagine YouTube running this, what's to stop folks from retriving record 1, then record 2, then record 3.....
Hi Francois,
As you have noted, the BLOB streaming engine turns MySQL into a "RESTful" application. I am sure this will please Roy Fielding, and fits perfectly into the MySQL Web 2.0 strategy!
What you are now suggesting is that we enable retrieval of entire records (or a selection of records) in JSON. Then, as you say, an application could just GET the data it wants out of the database (it could also use PUT to insert, POST to update and DELETE to ... delete records, for that matter).
Excellent idea! :) I will make sure that the server-side streaming API makes this possible!
Hi Sheeri,
I hope to post the first version within a month or 2. So we be able to try it out and kick some ideas around at the MySQL Camp this year!
You are right about the sequential IDs. It would make it easy for someone on the internet to retrieve all the BLOBs out of the database. This is another good point which belongs under the heading of security.
One possible solution is to restrict search access to an alternative key, which is generated randomly. For example, we could add a column to the notes_tab table, as follows:
CREATE TABLE notes_tab (
n_id INTEGER PRIMARY KEY,
n_random INTEGER UNIQUE KEY,
n_text BLOB
) ENGINE=pbxt;
A trigger could be used to generate a random number and insert this value in n_random on INSERT.
Then we need to be able to restrict HTTP access to the columns notes_tab.n_random and notes_tab.n_text. This could be done by defining an "anonymous_streaming" user. If no other authorization is given, then this user is assumed.
So anonymous web access will be forced to use the following URL (for example) for the notes_tab table:
http://localhost:8080/test/notes_tab/
n_text/n_random=327156188
Post a Comment
<< Home