I really wanted to feature a WALL-E pun in this post, but I couldn't come up with anything.
I spent some time adding code for a commit log this weekend. The log is disabled by default and can only be enabled via a code change and recompile.
The commit log hooks into SiteMailbox.deliver() and gives the mailbox an opportunity to hand messages that require durable delivery to a commit log. When the log is disabled, a dummy commit log is used that immediately marks messages as durable and delivers them.
The change set is annoyingly large because it adds a parameter to the constructor for SiteMailbox as well as a bool requiresDurability() method that needs to be implemented by all subclasses of VoltMessage. The impact also propagates to test cases and mock classes that need to adapt to the new constructor as well as the new methods they are expected to implement.
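To make the shape of the hook concrete, here is a minimal sketch. It is not the actual VoltDB code: Message, CommitLog, DummyCommitLog, and Mailbox are hypothetical stand-ins for VoltMessage, the commit log, and SiteMailbox, and the callback style is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for VoltMessage: each message type says whether it must be durable.
interface Message {
    boolean requiresDurability();
}

// Hypothetical commit log interface: log the message, then invoke onDurable once it is safe.
interface CommitLog {
    void logMessage(Message m, Consumer<Message> onDurable);
}

// Used when logging is disabled: every message is treated as durable right away.
class DummyCommitLog implements CommitLog {
    @Override
    public void logMessage(Message m, Consumer<Message> onDurable) {
        onDurable.accept(m);
    }
}

// Hypothetical stand-in for SiteMailbox: the commit log arrives through the constructor.
class Mailbox {
    private final CommitLog log;
    private final List<Message> delivered = new ArrayList<>();

    Mailbox(CommitLog log) {
        this.log = log;
    }

    void deliver(Message m) {
        if (m.requiresDurability()) {
            // Delivery happens only when the log reports the message as durable.
            log.logMessage(m, delivered::add);
        } else {
            delivered.add(m);
        }
    }

    List<Message> delivered() {
        return delivered;
    }
}
```

With the dummy log in place, delivery behaves as it did before the change, which is why the log can stay disabled by default.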
As far as I can tell only initiate tasks, membership notices, and heartbeat messages need to be durable. There is a fast path for heartbeat messages that delivers them before they are durable so that the heartbeat response can be sent back immediately. This helps ensure that once initiate tasks are durable they will be ready to leave the restricted priority queue. I am 99% sure this is okay because the heartbeat responses don't affect the durability of information sent back to the client.
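The heartbeat fast path can be sketched roughly as follows. The class, the Kind enum, and the onSync() callback are all hypothetical, and the sketch assumes heartbeats are still written to the log after being delivered early.

```java
import java.util.ArrayList;
import java.util.List;

class FastPathSketch {
    enum Kind { INITIATE_TASK, MEMBERSHIP_NOTICE, HEARTBEAT, OTHER }

    // Per the text above, only these three kinds need to be durable.
    static boolean requiresDurability(Kind k) {
        return k == Kind.INITIATE_TASK
            || k == Kind.MEMBERSHIP_NOTICE
            || k == Kind.HEARTBEAT;
    }

    final List<Kind> delivered = new ArrayList<>();
    final List<Kind> awaitingSync = new ArrayList<>();

    void deliver(Kind k) {
        if (!requiresDurability(k)) {
            delivered.add(k);
            return;
        }
        if (k == Kind.HEARTBEAT) {
            // Fast path: deliver now so the heartbeat response is not delayed.
            delivered.add(k);
        }
        // Assumption: the heartbeat is still logged even though it was already delivered.
        awaitingSync.add(k);
    }

    // Called once the current batch has reached disk.
    void onSync() {
        for (Kind k : awaitingSync) {
            if (k != Kind.HEARTBEAT) {
                delivered.add(k); // heartbeats went out on the fast path already
            }
        }
        awaitingSync.clear();
    }
}
```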
Right now, the commit log implementation doesn't support truncating or replaying the log. It is just there to measure the overhead of waiting for messages to go to disk. The smallest truncatable portion of the commit log is called a commit quantum. Each commit quantum is a file named by the txn id of the snapshot that created it, or 0 if it is the first quantum.
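As a small illustration of the naming scheme, assuming the file name is simply the txn id in decimal (the helper class and the directory layout are hypothetical):

```java
import java.nio.file.Path;

class CommitQuantumNames {
    // Returns the file for the quantum created by the snapshot with the given txn id.
    // Assumption: the name is the decimal txn id with no prefix or extension.
    static Path fileFor(Path logDir, long snapshotTxnId) {
        if (snapshotTxnId < 0) {
            throw new IllegalArgumentException("txn id must be non-negative");
        }
        return logDir.resolve(Long.toString(snapshotTxnId));
    }

    // The first quantum has no snapshot behind it, so it is named 0.
    static Path firstQuantum(Path logDir) {
        return fileFor(logDir, 0L);
    }
}
```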
I ran the code on my laptop, which only gets a paltry 2400 TPC-C txns/sec. That number used to be up in the 10k range, so the drop is a little disappointing. Scaling down is not something we are targeting, but it is sad to have regressed. I didn't see a performance impact from the commit log, even with synchronous batch commits every 20 milliseconds. TPC-C doesn't report latency to the console, so I don't have any information on that. I assume that the file system is lying about msyncs. I will try to run some numbers on a real cluster with replication this coming weekend, but our cluster may be occupied preparing for the 1.2 release.
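For readers who have not used memory mapped files from Java, a batch commit can be sketched like this. It is an illustration rather than the real implementation: MappedBatchLog, the length-prefixed record format, and the callback list are assumptions. MappedByteBuffer.force() is the standard Java call that asks for mapped changes to be written to storage, which is the closest equivalent to an msync.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

class MappedBatchLog implements AutoCloseable {
    private final FileChannel channel;
    private final MappedByteBuffer buffer;
    private final List<Runnable> pending = new ArrayList<>();

    MappedBatchLog(Path file, int capacityBytes) throws IOException {
        channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE);
        // Mapping READ_WRITE grows the file to capacityBytes if it is smaller.
        buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, capacityBytes);
    }

    // Copies the record into the mapping; it is not durable until the next sync().
    // Throws BufferOverflowException if the mapping is full (no rollover in this sketch).
    synchronized void append(byte[] payload, Runnable onDurable) {
        buffer.putInt(payload.length);
        buffer.put(payload);
        pending.add(onDurable);
    }

    // Forces the mapping to storage, then releases everything written since the last sync.
    // Returns the number of records released.
    synchronized int sync() {
        buffer.force();
        int released = pending.size();
        for (Runnable callback : pending) {
            callback.run();
        }
        pending.clear();
        return released;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

A timer, for example a ScheduledExecutorService, could call sync() every 20 milliseconds so that all messages appended during the interval share one trip to disk.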
This is my first experience using memory mapped files, and I have to say that it is the most civilized method of file IO that I have ever had the pleasure of experiencing. I regret not taking advantage of memory mapped IO when writing the snapshot code.
Statements and opinions presented here are not those of VoltDB Inc. unless specifically presented as such.