I recently had a conversation with Lalit Sarna of Oxylabs about scalability and he introduced me to Tokyo Cabinet, a key/value store (database). This category of databases, referred to as DBM, differs from an RDBMS (such as MySQL) in that there are no tables and therefore no concept of rows. Instead, you solely provide keys and get, set, or delete the value stored under each key.
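To make the get/set/delete model concrete, here is a minimal sketch using Python's standard-library `dbm.dumb` module. It is only a stand-in for a DBM-style store, not Tokyo Cabinet itself, and the key name `user:42:name` is made up for illustration.

```python
import dbm.dumb
import os
import tempfile

# dbm.dumb is a simple DBM-style store from the standard library.
# It is used here only as a stand-in; it is NOT Tokyo Cabinet.
path = os.path.join(tempfile.mkdtemp(), "example_db")

with dbm.dumb.open(path, "c") as db:
    db[b"user:42:name"] = b"Ada"             # set a value for a key
    name = db[b"user:42:name"]               # get the value back by key
    del db[b"user:42:name"]                  # delete the key
    still_there = b"user:42:name" in db      # the key is gone now

print(name, still_there)
```

Notice there is no schema, no table and no query language: the key is the only way in.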
The advantage is lightning speed, and Tokyo Cabinet is apparently the king of the category. We are talking speeds reportedly 10-50x faster than MySQL. Tokyo Cabinet supports multiple underlying database engines, each with its own advantages:
- Hash – Hash tables provide O(1) insert and lookup on average, which cannot be beaten, so this is your fastest general-purpose option
- B+ Tree – The underlying data is sorted, allowing for prefix and range matching. Speed is not quite as great as Hash since B+ trees are O(log_b n) for insert and lookup, where b is the branching factor of the tree.
- Fixed Length – Your values are stored in one large array, which is as fast as it gets since access is O(1) and the data is contiguous. However, your keys have to be natural numbers.
- Table – Attempts to replicate a traditional table database, but no fixed data schema or data types are required. Built on top of the hash database for speed.
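The trade-offs between the first three engines can be sketched in plain Python. This is only an analogy built from a `dict`, a sorted list and a `bytearray`; it does not use Tokyo Cabinet, and the record names, the slot width and the helper functions are all made up for illustration.

```python
import bisect

records = {"user:1": "Ada", "user:2": "Grace", "order:7": "book", "user:10": "Linus"}

# Hash-style: direct lookup by exact key, but no ordering to exploit.
hash_db = dict(records)

# B+ tree-style: keys are kept sorted, so a prefix match is a binary
# search to the first candidate followed by a short sequential walk.
sorted_keys = sorted(records)

def prefix_keys(prefix):
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches

# Fixed-length-style: every value gets a slot of WIDTH bytes in one array,
# so the position of record n is pure arithmetic. Keys must be 1, 2, 3, ...
WIDTH = 8
fixed_db = bytearray(WIDTH * 4)  # room for four records

def fixed_put(index, value):
    if not 1 <= index <= len(fixed_db) // WIDTH or len(value) > WIDTH:
        raise ValueError("index out of range or value too long")
    offset = (index - 1) * WIDTH
    fixed_db[offset:offset + WIDTH] = value.ljust(WIDTH, b"\0")

def fixed_get(index):
    offset = (index - 1) * WIDTH
    return bytes(fixed_db[offset:offset + WIDTH]).rstrip(b"\0")

fixed_put(3, b"abc")
print(hash_db["user:2"], prefix_keys("user:"), fixed_get(3))
```

The hash version cannot answer "all keys starting with `user:`" without scanning everything, which is exactly what the sorted version buys you at the cost of slower inserts and lookups.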
Tokyo Tyrant is the network interface that sits on top of Tokyo Cabinet, allowing your software to communicate with Tokyo Cabinet over the network. Tokyo Cabinet is often referred to as the “database library” while Tokyo Tyrant is the “database server”. Tokyo Tyrant supports the memcached and HTTP protocols.
I love that it supports memcached. That means you can plug and play with Tokyo Cabinet using the many existing memcached clients/libraries. Tokyo Cabinet is not meant to be a replacement for memcached, but you could theoretically use it as one with minimal setup time. You would get the benefits of persistent data and much cheaper storage, with some loss in performance.
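To show what speaking memcached means on the wire, here is a rough sketch of the memcached text protocol's `set` and `get` commands, built and parsed by hand. The helper names are hypothetical, the host and port in the usage note are assumptions (1978 is, as far as I know, Tokyo Tyrant's default port), and the single `recv` call is a simplification that a real client would replace with proper buffering.

```python
import socket

def build_set(key, value):
    # Text protocol: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
    # Flags and exptime are sent as 0 in this sketch.
    header = f"set {key} 0 0 {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"

def build_get(key):
    return f"get {key}\r\n".encode("ascii")

def parse_get(response):
    """Return the value from a single-key get response, or None on a miss."""
    if response.startswith(b"END"):
        return None
    header, _, rest = response.partition(b"\r\n")
    parts = header.split()
    # A hit looks like: VALUE <key> <flags> <bytes>\r\n<data>\r\nEND\r\n
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected response: %r" % header)
    size = int(parts[3])
    return rest[:size]

def roundtrip(host, port, payload):
    # Simplification: assumes the whole reply arrives in one recv call.
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(payload)
        return sock.recv(65536)

print(build_set("greeting", b"hello"))
print(parse_get(b"VALUE greeting 0 5\r\nhello\r\nEND\r\n"))
```

With a server listening locally, something like `roundtrip("localhost", 1978, build_set("greeting", b"hello"))` should come back with `STORED`, and any off-the-shelf memcached client pointed at the same host and port would be doing essentially this for you.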
In conclusion, targeted and appropriate use of Tokyo Cabinet would allow load to be removed from your traditional RDBMS in cases where the full functionality of an RDBMS is not needed, resulting in major performance improvements.