RC: W2 D4 — How Bitcask avoids running out of disk space

February 22, 2024

In a previous article, I explained that, because Bitcask is a log-structured storage engine, it keeps on writing new key-value pairs at the end of a datafile. Each recently written value for a given key is the one that will be returned when looking up that key. As a consequence, all the values that have been previously recorded in one of the datafiles for that key will never be used anymore.

Because we keep on appending to datafiles, at some point, we will run out of disk space. To avoid that, log-structured engines typically perform a “merge/compact” process. The compaction process means that we discard the obsolete key-value pairs to keep only the relevant ones (i.e. the most recent ones) in order to reclaim disk space. The merge process means that we merge datafiles together so that we reduce the number of files that need to be managed.

In practice, this means we will need to create a new “merged datafile” that displays only the most recent value for each key. We want to start reading in this new merged file only once it has been written fully (so that we avoid returning values that have only been partially written). Because Bitcask uses an in-memory hash index (the “keydir”) to know in which datafile it should read the value, this should be easy to do. Here is what I did:

  1. First, read old datafiles (that will be merged) from oldest to newest and record only the most recent value for each key. We can do this by using an in-memory hashmap to record the value for each key (everytime a new value for an already existing key is read, we replace the old value with the new value in the hashmap);
  2. Second, write the new merged file, by flushing the content of the hashmap (i.e. the most recent value for each key) to disk;
  3. Third, once the new merged file has been fully written, update the in-memory hash index (“keydir”) to retrieve values in the new merged file instead of the old datafiles;
  4. Last, delete the old datafiles, since the “keydir” does not point towards any of them anymore.

With this, we have a fully working merge worker that can perform the merge/compact process. Hurray ! 🎉