April 18th, 2008


At the MySQL Storage Engine Summit

The complaint is the same as last year:

MySQL's internal data structures change from version to version without notice. Everyone wants abstract APIs. But MySQL engineering management executives dont want to spend the effort to implement them.

In their defense, MySQL is already so late in delivery that it's hard to justify tasking people to work on that instead.

To demolish that argument, one of the main reasons that MySQL is so slow to ship, is BECAUSE we dont have workable abstract interfaces to our internal data structures.

Comments on the Slashdot article "Sun to Begin Close Sourcing MySQL"

A number of people have asked me about the Slashdot article Sun to Begin Close Sourcing MySQL.

The software that was proposed to be closed source are portions of the online backup drivers. Each such driver has to be written in close cooperation with the developers of each storage engine. Well...

InnoDB already has an online backup tool, and even if/when they revise their tool to use this new API, it's still going to be theirs, open or closed, not the property of the MySQL Group.

Online backup of the engines for Archive, CSV, Blackhole, and Memcached doesn't even make sense, and even if it did, BrianA will flat out refuse to write crippleware into his own software.

Similarly, while online backup makes sense for Maria, I don't see MontyW writing crippleware into his work.

How about MyISAM? I think that work is already done, but, the horse is already out of the barn, in that the online backup drivers for it just went up on bkbits.

Looking even closer, the part that was going to be closed was not even the entire online backup driver set, but just compression and encryption. Any halfway competent developer would be able to hook in the necessary calls to azio, zlib, and openssl, and replicate the work.

So this is a big tempest over something that's not going to happen, and doesn't matter anyway.

Plus, best practices for backup dont even use or want online backup. The Right Way to backup a real production MySQL instances is via filesystem snapshot, using something like LVM or ZFS.

As a small aside, the Slashdot headline was not entirely accurate. It wasn't the Sun executives who decided this. It was the MySQL executives. What that means, especially in light of the keynote speeches given by CEO Jonathan Schwartz and VP Rich Green, is interesting, and remains to be publically seen.

Varchar patch for MySQL memory engine

MySQL has a storage engine slash table type of "memory" or "heap". The memory engine keeps the table in memory instead of on a disk. It can be very fast. It has a couple of problems. One problem is that it handles varchar types poorly and wastefully.

This was a problem for eBay. So they paid one of their own smart employees to fix that problem.

Other people have heard about that patch, and asked for it. I couldn't give it to them, and it all started bogging down in pointless legal discussions. At least by the lawyer people and people who are into lawyer people.

So the author of the patch took a sword to the knot, and released the patch to the public.

Here it is: http://code.google.com/p/mysql-heap-dynamic-rows/

This project provides support for variable size records (aka dynamic row format) in MySQL Heap (Memory) Engine.

Contributed under GPL2 by eBay, Inc April 2008


As of March 2008, MySQL Heap Engine of any version is limited to fixed row format. It allocates fixed memory size for each record stored in a given Heap Table. For example, if table A has a VARCHAR(4000) column, MySQL will use at least 4000 bytes (plus other columns and per-record overhead) for every record regardless of whether it has that much user data. In this example, the table will use 4GB memory per 1M records.

Assuming that actual data in a VARCHAR varies (average data length is less than maximum), it would be best if Heap Engine could optimize its memory use. In the example above, if average VARCHAR data was 1000 bytes, the memory consumption would be 1GB rather than 4GB.

This project changes memory allocation mechanisms used by the MySQL Heap Engine. Each Heap Table uses separate area of memory for its own private use. When inserting new records, previous versions of MySQL would always allocate areas of "reclength" size. Reclength equals to maximum space that could be ever needed by one record in a given table.

The new approach adds a "create table" option to set chunk sizes. When inserting a new record, Heap Engine will analyze actual record data and calculate the number of "chunks" needed for a compact form of that record. It will then allocate the necessary chunks, link them together into a chunkset and store data in these chunks.

Variable size records provide "dynamic" row format, as opposed to "fixed" row format. For backwards compatibility, Heap Tables default to fixed row format. The project's scope is Heap (aka Memory) Engine/Table only, MyISAM and other engines do not have similar limitations.

For added flexibility, project introduces new HP_DATASPACE construct to MySQL Heap Engine, which should greatly simplify future BLOB support. Essentially, all BLOB's in a table would share their own HP_DATASPACE instance. Unfortunately, the author of this project did not have time to add BLOB implementation.

All the interesting technical details are provided in a big comment section in the beginning of hp_dspace.c, which documents design. The same information has been copied to the project's Wiki.

This project has been implemented by Igor Chernyshev of eBay Kernel Team.