RC: W9 D2 — Serializing manifest events
April 9, 2024Once I knew I could reconstruct the state from a list of events, I needed to serialize them to disk. There are two kinds of events:
- Flush events
- Compaction events
To distinguish between the two, I prepended each record with 1 byte stating whether the event is of the former (code: 0) or latter (code: 1) kind. Both these events basically need to store the path to the SSTable files they created or deleted. When deserializing them, I need to know at which offset the path encoding ends, so I prepended each path by a 1-byte integer indicating the size of the path (1 byte seemed sufficient, as I assumed that no path would be longer than 255 bytes). Each such “ManifestSSTable” is thus formatted as follows:
+-----------------------------+
| MANIFEST SSTABLE |
+-----------+-----------------+
| path_size | path |
+-----------+-----------------+
| 1 byte | path_size bytes |
+-----------+-----------------+
Each flush event stores only one such SSTable path, so its format is relatively straightforward:
+------------------------------------------------------------------------+
| FLUSH EVENT |
+------------------+-----------------------+-----------------------------+
| Category (FLUSH) | manifest_sstable_size | ManifestSSTable |
+------------------+-----------------------+-----------------------------+
| 1 byte | 1 byte | manifest_sstable_size bytes |
+------------------+-----------------------+-----------------------------+
However, compaction events are slightly more complex as they require storing an array of input SSTables and another array of output SSTables, as well as the level to which they are applied. I thus ended up using the following format for them:
+--------------------------------------------------------------------------------------------+
| COMPACTION EVENT |
+----------+---------------------------------+-----------------------+-----------------------+
| Category | Extra | Input SSTables | Output SSTables |
+----------+---------------------------------+-----------------------+-----------------------+
| COMPACT | level | iSSTs_size | oSSTs_size | iSST_1 | ... | iSST_n | oSST_1 | ... | oSST_n |
+----------+---------------------------------+-----------------------+-----------------------+
(iSST_k = input SSTable k, oSST_k = output SSTable k - encoded as ManifestSSTable)
With each input (iSST) and each output (oSST) SSTable formatted as a “ManifestSSTablePath” defined above, and with the Extra section having the following format:
+--------+------------+------------+
| level | iSSTs_size | oSSTs_size |
+--------+------------+------------+
| 1 byte | 2 bytes | 2 bytes |
+--------+------------+------------+
"""
On top of this, I also defined a format for the header of the manifest, whose goal is to contain the configuration of the engine, namely:
- The number of levels;
- The thresholds for the size of SSTable levels that triggers a compaction for this level (ratio of size between one level and the next, as well as maximum number of files in the L0 level);
- The max sizes for a SSTable and a block.
With all of this, both configuration and events can be serialized into the manifest file, and thus persisted so that the state can be recreated from them upon restart.