RC: W3 D5 — Red-black trees

March 1, 2024

Today, I started re-implementing a red-black tree as the in-memory component of my LSM storage engine.

I had already heard about red-black trees in the past, but had never quite taken the time to understand how they work. After spending a day implementing one, I finally know what they are for, and that will be the topic for today’s post.

Red-black trees are binary search trees with the extra property of self-balancing. Being “self-balancing” means that it automatically re-balances upon insertion or deletion of a node. Mathematically, if n is the number of nodes in the tree, its height will always a function of log(n). Visually, it means that the tree will always have roughly the same number of nodes on its right and left sides.

In red-black trees, all nodes have a color (red or black) and there are a few properties that must hold true. For example: the root must be black, there cannot be two red nodes that are parent-child of one another, all leaf/nil nodes (nodes that contain no data) are black and so on… Whenever one such property is violated, the tree re-balances automatically. What is nice with red-black trees is that there is a small number of scenarios describing the topology that can be summarized solely based on the color of 4 nodes (one node, its parent, its grandparent and its uncle). Depending on the scenario we are in, we can either “rotate” a node or recolor some, and the tree is balanced.

“Rotating” a node corresponds visually to doing the following:

        Y                                X
       / \       Right-rotate 'X'       / \
      X   c     =================>     a   Y
     / \                                  / \
    a   b                                b   c

Here, X right rotates, and thus, Y which was its parent, now becomes its right child. The b node is also moved from being X’s child to Y’s child. This is a bit tedious to explain with words, but the drawing speaks for itself.

In practice, when we insert a new node in a red-black tree, here is how we proceed: