Friday, 16 August 2013

Redis vs BangDB - Performance Comparison

This post is not part of our series on "distributed computing". I have digressed a bit: since I recently released BangDB in a master-slave cluster configuration, I thought of doing a simple performance comparison with the very popular db Redis. This post is about that comparison of Redis and BangDB (server).

Redis: ( http://redis.io/topics/introduction )

Redis is an open source, BSD licensed, advanced key value store. It is also referred to as a data structure server, since keys can contain strings, hashes, lists, sets, and sorted sets.

In order to achieve its outstanding performance, Redis works with an in-memory dataset. Depending on the use case, one can persist it either by dumping the dataset to disk every once in a while, or by appending each command to a log.

Redis also supports trivial-to-setup master-slave replication, with very fast non-blocking first synchronization and automatic reconnection. Other features include transactions, Pub/Sub, Lua scripting, keys with a limited time-to-live, and configuration settings to make Redis behave like a cache.

BangDB: ( www.iqlect.com )

BangDB is a multi-flavored, BSD licensed key value store. The goal of BangDB is to be a fast, reliable, robust, scalable and easy to use data store for the various data management services required by applications.

BangDB is a transactional key value store which supports full ACID by implementing optimistic concurrency control with parallel verification for high performance and concurrency. BangDB implements its own buffer pool and write ahead log with crash recovery, and provides users with many configuration options to control the execution environment, including the memory budget.

BangDB works as an embedded db, a stand-alone server and a cluster db. It is very simple to set up a master-slave configuration, with high-performance non-blocking slave synchronization that never brings the server to a standstill or takes it down.



The following machine (commodity hardware) was used for the test:
  • Model : 4 CPU cores, Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz, 64bit
  • CPU cache : 6MB
  • OS : Linux, 3.2.0-51-generic, Ubuntu, x86_64
  • RAM : 8GB
  • Disk : 500GB, 7200 RPM, 16MB cache
  • File System: ext4
The BangDB configuration:
  • Key size : 12 bytes  (unique, random)
  • Val size : 20 bytes - randomly picked
  • Page size : 8KB
  • Write ahead log : ON
  • Log split check : ON, check every 30 ms
  • Buffer pool size : 512MB
  • Background log flush : every 50 ms
  • Checkpointing : every ~4 sec
  • Buffer pool background work : ON, every 60 ms
  • Number of concurrent threads : 4
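
The key and value settings above can be pictured with a small generator. This is only a sketch of how such a workload could be produced, not the code used for these tests: the helpers `make_key`, `make_value` and `shuffled_indices`, the zero-padded counter used to guarantee uniqueness, and the lowercase alphabet are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <numeric>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: a 12-byte key built from a zero-padded counter, so
// every index below 10^12 maps to a distinct key.
std::string make_key(std::uint64_t index) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%012llu",
                  static_cast<unsigned long long>(index));
    return std::string(buf, 12);
}

// Hypothetical helper: a 20-byte value of random lowercase letters.
std::string make_value(std::mt19937_64& rng) {
    std::uniform_int_distribution<int> letter(0, 25);
    std::string value(20, 'a');
    for (char& c : value) c = static_cast<char>('a' + letter(rng));
    return value;
}

// Returns the indices 0..n-1 in random order, so that the unique keys are
// inserted in a random sequence rather than in sorted order.
std::vector<std::uint64_t> shuffled_indices(std::uint64_t n,
                                            std::mt19937_64& rng) {
    std::vector<std::uint64_t> order(n);
    std::iota(order.begin(), order.end(), std::uint64_t{0});
    std::shuffle(order.begin(), order.end(), rng);
    return order;
}
```

A client would walk the shuffled indices and send `make_key(i)` with `make_value(rng)` for each one; for very large runs the index vector itself becomes sizeable, so a real test app may well use a different scheme.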

The Redis configuration:


All default values as provided in redis.conf were used. However, the save setting was switched OFF or ON as stated for each individual test below. Note that with save OFF Redis performs better, as it does not have to dump data to disk frequently.

Goal:


The goal of the exercise is to test Redis and BangDB as servers. I will not go into the details of slave synchronization and replication, or test them as clusters (master-slaves). However, I will cover this area too in a future post.

The client and server run on the same machine in order to keep network latency out of the tests.

Test 1

10 M (key, val) put and get, with save = ON and save = OFF. For save = ON, Redis saves with its default values and BangDB flushes the log every 50 ms. In both cases fsync is off.
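
A put and get run of this kind can be timed roughly as sketched here. This is an assumption about the shape of such a harness, not the actual test app: `ops_per_second` is a hypothetical helper, and `InMemoryStore` is a stand-in for the real client calls, which are not shown. The sketch is single threaded for brevity, whereas the tests used 4 concurrent threads.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical helper: runs op(i) for i in [0, count) and returns the
// observed throughput in operations per second.
template <typename Op>
double ops_per_second(std::uint64_t count, Op op) {
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < count; ++i) op(i);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    // Guard against a zero reading on very short runs.
    const double seconds = elapsed.count() > 0.0 ? elapsed.count() : 1e-9;
    return static_cast<double>(count) / seconds;
}

// Stand-in store (assumption): an in-process map instead of a real server.
struct InMemoryStore {
    std::unordered_map<std::string, std::string> data;

    void put(const std::string& key, const std::string& val) {
        data[key] = val;
    }

    bool get(const std::string& key, std::string& out) const {
        auto it = data.find(key);
        if (it == data.end()) return false;
        out = it->second;
        return true;
    }
};
```

Calling `ops_per_second(n, [&](std::uint64_t i) { store.put(...); })` and then the same with `store.get(...)` gives one put rate and one get rate per run, which is the kind of figure the graphs for each test report.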




Test 2

50 M (key, val) put and get. Here Redis completes the test when save is OFF, that is, when Redis is not writing anything to disk, not even the append-only log. But when save is ON, it takes a very long time to finish the test: more than 10 times as long as BangDB. Here are the graphs:







Test 3

100 M (key, val) put and get. Here I was unable to complete the test for Redis even with save = OFF. Redis works well until 67M keys, but then the performance drops sharply and it starts taking a second to put 50 keys. I had to halt the test in the interest of time, hence the graph shows only a partial figure for Redis. Note that this happens when save is OFF and the db is doing no writes to disk at all. Redis performance degrades further with save enabled, and it gets stuck well before 50M keys when save = ON. For get, Redis performs well even for the higher numbers, but since I was unable to insert 100M keys and values, I could not run the get test for 100M keys.

BangDB, on the other hand, behaves as expected and performs well for both read and write.


Test 4

The last test is to put and get 1 billion keys and values. As expected, I was unable to complete the Redis test, as it was not done even after running for more than 10 - 12 hours, hence I am presenting the results of BangDB only.



Conclusion:

BangDB and Redis are both high-performance dbs and serve well for both read and write of key value sets. BangDB, however, performed better than Redis in these tests, for both read and write, in save mode as well as in non-save mode (as a cache). We also note that beyond a certain point the performance of Redis suffers badly and drops drastically, to about 100 keys per second for write, which is too low from any perspective. A case in point is the 100M key value insert test, which could not be completed because Redis was too slow after 67M inserts, whereas BangDB continued with the expected performance even for a billion key inserts. The test was done with 20-byte values; with larger values, Redis would be expected to slow down well before the 67M keys seen here.

BangDB implements its own buffer pool with semi-adaptive page prefetch and page flush algorithms. BangDB also implements a write ahead log which is append only and hence optimizes disk writes by avoiding random seeks as much as possible. An interesting point to note is that the log flush frequency can be set in microseconds; for our test it was 50 ms.
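
The idea of an append-only log with a timed background flush can be sketched as follows. This is a generic illustration under stated assumptions and not BangDB's implementation: the class `AppendOnlyLog`, its record format and its locking scheme are all hypothetical. Note that `flush()` here only hands data to the operating system and does not call fsync, in line with the fsync-off setting used in the tests.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <fstream>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical illustration: writers append to an in-memory tail, and a
// background thread writes that tail to the end of the file on a timer.
class AppendOnlyLog {
public:
    AppendOnlyLog(const std::string& path, std::chrono::milliseconds interval)
        : out_(path, std::ios::binary | std::ios::app),
          interval_(interval),
          flusher_([this] { run(); }) {}

    ~AppendOnlyLog() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();
        flusher_.join();
        flush();  // write whatever arrived after the last timed flush
    }

    // Appends one record to the in-memory tail; no disk I/O on this path.
    void append(const std::string& record) {
        std::lock_guard<std::mutex> lock(mutex_);
        buffer_ += record;
        buffer_ += '\n';
    }

    // Writes the buffered tail sequentially to the end of the file.
    void flush() {
        std::lock_guard<std::mutex> io_lock(io_mutex_);
        std::string pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending.swap(buffer_);
        }
        if (!pending.empty()) {
            out_ << pending;
            out_.flush();  // hands data to the OS; this is not an fsync
        }
    }

private:
    // Background loop: wake up every interval (or on stop) and flush.
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!stop_) {
            cv_.wait_for(lock, interval_, [this] { return stop_; });
            lock.unlock();
            flush();
            lock.lock();
        }
    }

    std::ofstream out_;
    std::chrono::milliseconds interval_;
    std::mutex mutex_;     // guards buffer_ and stop_
    std::mutex io_mutex_;  // serializes writes to out_
    std::condition_variable cv_;
    std::string buffer_;
    bool stop_ = false;
    std::thread flusher_;  // declared last so it starts after the rest
};
```

Constructing it as `AppendOnlyLog log("wal.log", std::chrono::milliseconds(50))` mirrors the 50 ms flush frequency; because every write lands at the end of the file, the disk sees sequential rather than random I/O.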

Redis has many features and is therefore also called a data structure server, whereas the focus of BangDB is on handling high volumes of data with predictably high performance. If specific features (like lists, sets etc.) are required, then Redis works very well and BangDB does not support them. But when large amounts of data must be handled, survival in highly stressed scenarios is important, and performance is an important criterion, then BangDB suits better. Note that transactions, sync and replication are other features which are supported by both these dbs.

Note that BangDB comes in various flavors, namely embedded db, client-server model and p2p-based clustered elastic data space. Interestingly, the BangDB client is the same for all flavors, which means that once code is written for any one flavor, the user can potentially switch from one model to another, based on requirements, in a few minutes.

The test apps were written in C++ and used hiredis for Redis and bangdb-client for BangDB. Both dbs are available free of cost under the BSD license, hence one is free to download and test them accordingly. Also note that for BangDB we can specify a memory budget for the server to use; that is, if we allocate 2GB on an 8GB RAM machine, it will use only 2GB and not go beyond that. For the above tests we used 5GB as the memory budget on an 8GB RAM machine. Redis, on the other hand, was using all available memory, as no limit was set.
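
To put the memory budget in perspective, the raw payload of each test can be worked out from the 12-byte keys and 20-byte values. The sketch below does only that arithmetic. It ignores all per-entry overhead (indexes, pointers, page headers), for which no figures are given here, so real memory use will be higher. `raw_payload_bytes` and `fits_in_budget` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kKeyBytes = 12;  // key size used in the tests
constexpr std::uint64_t kValBytes = 20;  // value size used in the tests

// Raw key + value bytes for `pairs` entries, with no overhead included.
constexpr std::uint64_t raw_payload_bytes(std::uint64_t pairs) {
    return pairs * (kKeyBytes + kValBytes);
}

// True when the raw payload alone fits inside the given budget.
constexpr bool fits_in_budget(std::uint64_t pairs,
                              std::uint64_t budget_bytes) {
    return raw_payload_bytes(pairs) <= budget_bytes;
}
```

At 32 bytes per pair this gives 320 MB for 10M pairs, 1.6 GB for 50M, 3.2 GB for 100M and 32 GB for 1 billion. The 1 billion pair data set therefore cannot fit within a 5GB budget, or within the machine's 8GB of RAM, even before any overhead is counted.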

Please post your comments and thoughts.

Enjoy! 
Best,
Sachin Sinha