RC: W3 D1 — Tombstones
February 26, 2024Today, I decided I would add one missing feature to my storage engine: the ability to delete records.
But wait: this is a log-structured storage engine. Which means that we only ever append to a file. So how can we delete a record? The answer is: we add a special kind of record that is called a tombstone. So basically, whenever we add a tombstone, the storage engine has to understand that the key no longer has any value associated to it.
In the case of Bitcask, I figured this would have to be handled at two places:
- When we insert a tombstone, we want to remove the corresponding key from the keydir right away, so that we don’t even bother reading the datafile (an alternative would have been to point to the tombstone record, seek it and return ‘None’ when decoding it — but this seemed like unnecessary overhead)
- When we perform the merge process, every time we come across a tombstone, we need to:
- Update the keydir so that we stop looking for this key when it is a tombstone
- Nonetheless record the tombstone in the merged file (this is in case the key is still present in older data files)
Now, how can we encode a tombstone? I guess we have several options (this list is likely not exhaustive):
- We could enter a particular string in lieu of a value (for example: “__TOMBSTONE__”). Though, the problem with this approach is that you could never distinguish between an actual tombstone and a value “__TOMBSTONE__” ( arguably, the more specific the string, the less likely it is to collide with an actual value).
- Alternatively, we could add a field in the metadata that indicates whether the record is a tombstone or not. Though, this would take an extra 1 byte when only 1 bit is really necessary. David, whom I paired with on this, very rightly told me that I could later reclaim this extra 7 bits to store another piece of information later.
- I could also decide that a tombstone would be a record whose value size is 0 bytes.
I decided to go for the third options because it seemed to be the most efficient in terms of space. Of course, now that I have finished it and that am writing it down, I am wondering if this was actually a good idea. Answer: probably not, because I can’t differentiate between a value that is an empty string and a tombstone then… Well, I’ll leave it as is for now and figure it out later!