Monday, September 27, 2010

fsync/fdatasync from a separate thread

I spent some time last week trying to resolve the issue of snapshots sucking up the entire page cache. Based on what I read in this blog post by Antirez, I was concerned that calling fsync/fdatasync might block concurrent writes and leave the disk idle. I created this test case to simulate the snapshot use case and got the following output (timestamps are in epoch seconds, durations in seconds):
Starting first writes at 1285341921 
Finishing first writes with duration 18 at 1285341939 
Starting first sync at 1285341939 
Beginning second writes at 1285341939 
Finished second writes with duration 22 at 1285341961 
Finished first sync with duration 26 at 1285341965 
Doing final sync 1285341965 
Did final sync 1285341965 
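This shows that an fdatasync doesn't block concurrent writes: the second batch of writes started while the first sync was in flight and finished in 22 seconds, not far off the 18 seconds the first batch took with no sync running. The fdatasync call itself took 26 seconds and returned only after the second batch had finished, so it appears to block until the data buffered by concurrent writers is flushed as well.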
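For readers who want to try something similar, here is a minimal Java sketch of that kind of test. It is not the original test case: the class name, method names, and sizes are made up for illustration, and it uses FileChannel.force(false), which only asks for file content (not metadata) to be forced and which I believe maps to fdatasync on Linux with OpenJDK.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: write, start a sync on another thread, keep writing.
public class SyncFromThreadSketch {

    // Appends `chunks` buffers of `chunkBytes` zero bytes at the channel's position.
    static void writeChunks(FileChannel ch, int chunks, int chunkBytes) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(chunkBytes);
        for (int i = 0; i < chunks; i++) {
            buf.clear(); // reset position/limit so the whole buffer is written again
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }

    static long millisSince(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }

    // Runs the write / sync-in-thread / write sequence and returns the final file size.
    static long run(Path file, int chunks, int chunkBytes) throws Exception {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long t = System.nanoTime();
            writeChunks(ch, chunks, chunkBytes);
            System.out.println("First writes took " + millisSince(t) + " ms");

            // The sync runs on its own thread while this thread keeps writing.
            Thread syncer = new Thread(() -> {
                long s = System.nanoTime();
                try {
                    ch.force(false); // content only, no metadata
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                System.out.println("First sync took " + millisSince(s) + " ms");
            });
            syncer.start();

            t = System.nanoTime();
            writeChunks(ch, chunks, chunkBytes);
            System.out.println("Second writes took " + millisSince(t) + " ms");

            syncer.join();
            t = System.nanoTime();
            ch.force(false); // final sync, should have little left to flush
            System.out.println("Final sync took " + millisSince(t) + " ms");
            return ch.size();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("sync-sketch", ".dat");
        try {
            run(file, 256, 1 << 20); // 2 x 256 MiB; use more than RAM can buffer for a real test
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The sizes in main are far too small to reproduce the timings above. To see the effect, the amount written has to be large enough that the page cache cannot simply absorb it.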

I made the following changes to DefaultSnapshotDataTarget in order to have it perform an fdatasync every second, and to cap the number of outstanding bytes (written but not yet covered by a completed fdatasync) at 256 megabytes. The change set this diff is from also includes a bug fix to the statistics for snapshots so that they now report accurate throughput information. During tests I found that an 8-core AMD box with 4 7.5 disks in RAID-0 could snapshot at 220 megabytes/sec, which is a new record. Previously throughput on that box was only 120 megabytes/sec, but I can't recall whether that figure was affected by the measurement errors. A naive IO benchmark that writes 0s to a file can achieve a throughput of 260 megabytes/sec on that box.
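To make the idea concrete, here is a rough Java sketch of a periodic sync with a cap on unsynced bytes. This is not the DefaultSnapshotDataTarget code. BoundedUnsyncedWriter and all of its members are hypothetical names, it assumes a single writer thread, and it again relies on FileChannel.force(false) as the stand-in for fdatasync.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: sync on a timer and block the writer once too much is unsynced.
public class BoundedUnsyncedWriter implements AutoCloseable {
    private final FileChannel channel;
    private final long maxOutstandingBytes;
    private final ScheduledExecutorService syncer =
            Executors.newSingleThreadScheduledExecutor();
    private long written; // bytes whose write() has completed; guarded by this
    private long synced;  // bytes covered by a completed force(); guarded by this

    public BoundedUnsyncedWriter(FileChannel channel, long maxOutstandingBytes,
                                 long syncPeriodMillis) {
        this.channel = channel;
        this.maxOutstandingBytes = maxOutstandingBytes;
        // Catch inside the task: an uncaught exception would cancel later runs.
        syncer.scheduleWithFixedDelay(() -> {
            try {
                sync();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }, syncPeriodMillis, syncPeriodMillis, TimeUnit.MILLISECONDS);
    }

    // Blocks while the cap is reached, then writes. The check happens before the
    // write, so the unsynced total can exceed the cap by at most one buffer.
    public void write(ByteBuffer buf) throws IOException, InterruptedException {
        int n = buf.remaining();
        synchronized (this) {
            while (written - synced >= maxOutstandingBytes) {
                wait();
            }
        }
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        synchronized (this) {
            written += n;
        }
    }

    // Forces file content to disk and credits only the bytes written before the call.
    public void sync() throws IOException {
        long target;
        synchronized (this) {
            target = written;
        }
        channel.force(false);
        synchronized (this) {
            if (target > synced) {
                synced = target;
            }
            notifyAll(); // wake a writer blocked on the cap
        }
    }

    public synchronized long outstandingBytes() {
        return written - synced;
    }

    // Stops the timer and does a final sync; the caller still owns the channel.
    @Override
    public void close() throws IOException, InterruptedException {
        syncer.shutdown();
        syncer.awaitTermination(10, TimeUnit.SECONDS);
        sync();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("bounded-writer", ".dat");
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE);
             BoundedUnsyncedWriter w = new BoundedUnsyncedWriter(ch, 256L << 20, 1000)) {
            for (int i = 0; i < 64; i++) {
                w.write(ByteBuffer.allocate(1 << 20)); // 64 MiB of zeros
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The counter is snapshotted before force() is called so that bytes written while the sync is in flight are not credited as synced, even though the test output above suggests the kernel may flush them anyway. Being conservative here only means the writer might block slightly earlier than strictly necessary.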

I have some further changes that I would like to push to DefaultSnapshotDataTarget that remove the unnecessary counter, but I am going to hold them back until after 1.2.


Statements and opinions presented here are not those of VoltDB Inc. unless specifically presented as such.
