Log in

No account? Create an account
entries friends calendar profile My Website Previous Previous Next Next
Mark Atwood
Varchar patch for MySQL memory engine
MySQL has a storage engine slash table type of "memory" or "heap". The memory engine keeps the table in memory instead of on a disk. It can be very fast. It has a couple of problems. One problem is that it handles varchar types poorly and wastefully.

This was a problem for eBay. So they paid one of their own smart employees to fix that problem.

Other people have heard about that patch, and asked for it. I couldn't give it to them, and it all started bogging down in pointless legal discussions. At least by the lawyer people and people who are into lawyer people.

So the author of the patch took a sword to the knot, and released the patch to the public.

Here it is: http://code.google.com/p/mysql-heap-dynamic-rows/

This project provides support for variable size records (aka dynamic row format) in MySQL Heap (Memory) Engine.

Contributed under GPL2 by eBay, Inc April 2008


As of March 2008, MySQL Heap Engine of any version is limited to fixed row format. It allocates fixed memory size for each record stored in a given Heap Table. For example, if table A has a VARCHAR(4000) column, MySQL will use at least 4000 bytes (plus other columns and per-record overhead) for every record regardless of whether it has that much user data. In this example, the table will use 4GB memory per 1M records.

Assuming that actual data in a VARCHAR varies (average data length is less than maximum), it would be best if Heap Engine could optimize its memory use. In the example above, if average VARCHAR data was 1000 bytes, the memory consumption would be 1GB rather than 4GB.

This project changes memory allocation mechanisms used by the MySQL Heap Engine. Each Heap Table uses separate area of memory for its own private use. When inserting new records, previous versions of MySQL would always allocate areas of "reclength" size. Reclength equals to maximum space that could be ever needed by one record in a given table.

The new approach adds a "create table" option to set chunk sizes. When inserting a new record, Heap Engine will analyze actual record data and calculate the number of "chunks" needed for a compact form of that record. It will then allocate the necessary chunks, link them together into a chunkset and store data in these chunks.

Variable size records provide "dynamic" row format, as opposed to "fixed" row format. For backwards compatibility, Heap Tables default to fixed row format. The project's scope is Heap (aka Memory) Engine/Table only, MyISAM and other engines do not have similar limitations.

For added flexibility, project introduces new HP_DATASPACE construct to MySQL Heap Engine, which should greatly simplify future BLOB support. Essentially, all BLOB's in a table would share their own HP_DATASPACE instance. Unfortunately, the author of this project did not have time to add BLOB implementation.

All the interesting technical details are provided in a big comment section in the beginning of hp_dspace.c, which documents design. The same information has been copied to the project's Wiki.

This project has been implemented by Igor Chernyshev of eBay Kernel Team.


3 comments or Leave a comment
intrepid_reason From: intrepid_reason Date: April 18th, 2008 11:02 pm (UTC) (Link)
That is really cool that Ebay allowed that.

Edited at 2008-04-18 11:02 pm (UTC)
From: jeffrey_mcmanus Date: April 19th, 2008 04:24 pm (UTC) (Link)
Not totally fair to lay the blame for this at the feet of eBay legal...when my team and I were releasing open source tools there back in 2005 I found the lawyers (especially the senior guys) were remarkably clued-in to open source. It was all the Harvard MBAs who needed convincing; legal arguments were sort of a fig leaf for them.
From: ebayspider Date: April 28th, 2008 05:58 am (UTC) (Link)
eBay's intention all along was to make it available to open source via GPL2. eBay thought Igor had posted it prior to eBay's announcement of it at the MySQL User Conference, but unfortunately there was a mixup on the timing, so people were looking for it and it wasn't there yet, but now its out there. The legal blah blah was around having it put in the enterprise product, but nothing too intriguing there either.
3 comments or Leave a comment