05 · Self-Balancing Trees
BST vs AVL vs Red-Black · same insertion sequence · watch the rotations.
What you're watching
Each panel is a different binary search tree implementation. All
three receive the same insertion sequence one key at a time;
after each insert the panels re-render, with the freshly inserted
key briefly ringed in red. Stats above each panel report
height (depth of the deepest leaf) and
nodes (total node count).
The discriminating metric is height. For n nodes, the theoretical minimum height is ⌈log₂(n+1)⌉; a self-balancing tree gets you within a small constant factor of that. The plain BST gives no such guarantee.
Insert 1, 2, 3, ..., 10 and the plain BST degenerates
into a 10-deep linked list — it's a "tree" only in name. AVL
and Red-Black both stay around height 4–5, exactly as their
invariants promise.
The three trees
BST no rebalancing
The naive binary search tree. Insert by walking from the root, going left if the new key is smaller, right if larger. No effort to balance — when the input is sorted (or almost sorted) the tree degenerates to a linked list with O(n) lookup. The PATHOLOGICAL warning on the stat line lights up when height ≥ 0.7 × node count, i.e. when the tree is mostly a chain.
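The insert described above is only a few lines. A minimal sketch (identifiers are illustrative, not the page's actual source):

```typescript
// Naive BST insert — no rebalancing (illustrative names).
interface BstNode { key: number; left: BstNode | null; right: BstNode | null; }

function bstInsert(root: BstNode | null, key: number): BstNode {
  if (root === null) return { key, left: null, right: null };
  if (key < root.key) root.left = bstInsert(root.left, key);
  else if (key > root.key) root.right = bstInsert(root.right, key);
  return root; // duplicate keys are ignored
}

// Height = number of nodes on the deepest root-to-leaf path.
function bstHeight(n: BstNode | null): number {
  return n === null ? 0 : 1 + Math.max(bstHeight(n.left), bstHeight(n.right));
}

// Sorted input degenerates to a chain: height equals node count,
// which trips the 0.7 × node-count PATHOLOGICAL threshold.
let chain: BstNode | null = null;
for (let k = 1; k <= 10; k++) chain = bstInsert(chain, k);
```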
AVL strict height balance
Each node stores its height (shown as a small
hk badge below the node). After every insert, walk
back up to the root, recompute heights, and rotate whenever the
balance factor leaves the range [−1, +1]. This costs slightly
more rotations than Red-Black per insert, but maintains a tighter
height bound: a 1.44 × log₂(n+2) worst case.
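The insert-then-rebalance loop can be sketched like this — a sketch under assumed names, not the page's actual `avlInsert`, with nodes caching their height as described above:

```typescript
// AVL insert with the four rebalance cases (illustrative names).
interface AvlNode { key: number; left: AvlNode | null; right: AvlNode | null; h: number; }

const h = (n: AvlNode | null): number => (n === null ? 0 : n.h);
const update = (n: AvlNode): AvlNode => {
  n.h = 1 + Math.max(h(n.left), h(n.right)); return n;
};
const bf = (n: AvlNode): number => h(n.left) - h(n.right); // balance factor

function rotRight(y: AvlNode): AvlNode {
  const x = y.left!;
  y.left = x.right; x.right = y;
  update(y); return update(x);
}
function rotLeft(x: AvlNode): AvlNode {
  const y = x.right!;
  x.right = y.left; y.left = x;
  update(x); return update(y);
}

function avlInsert(n: AvlNode | null, key: number): AvlNode {
  if (n === null) return { key, left: null, right: null, h: 1 };
  if (key < n.key) n.left = avlInsert(n.left, key);
  else if (key > n.key) n.right = avlInsert(n.right, key);
  else return n; // duplicate
  update(n);
  const b = bf(n);
  if (b > 1 && key < n.left!.key) return rotRight(n);                            // LL
  if (b > 1 && key > n.left!.key) { n.left = rotLeft(n.left!); return rotRight(n); }   // LR
  if (b < -1 && key > n.right!.key) return rotLeft(n);                           // RR
  if (b < -1 && key < n.right!.key) { n.right = rotRight(n.right!); return rotLeft(n); } // RL
  return n;
}
```

Sorted input — the plain BST's worst case — comes out at the AVL minimum: inserting 1..10 yields height 4.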
Red-Black looser balance, fewer rotations
Each node is coloured red
or black; the
invariants are: root is black, no two consecutive reds, every
root-to-leaf path has the same number of black nodes. Looser
balance than AVL — height bound is 2 log₂(n+1) — but
fewer rotations per insert on average, which is why
most language standard libraries (std::map,
Java TreeMap, Linux scheduler's CFS) use Red-Black.
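The invariants are mechanical to check on a finished tree. A sketch of such a checker — illustrative types, not the page's source, with null leaves counted as black:

```typescript
// Red-Black invariant check (illustrative names).
type Color = "red" | "black";
interface RbNode { key: number; color: Color; left: RbNode | null; right: RbNode | null; }

// Returns the subtree's black-height if the invariants hold, else -1.
function blackHeight(n: RbNode | null): number {
  if (n === null) return 1; // NIL leaves count as black
  if (n.color === "red" &&
      ((n.left !== null && n.left.color === "red") ||
       (n.right !== null && n.right.color === "red")))
    return -1; // two consecutive reds
  const l = blackHeight(n.left);
  const r = blackHeight(n.right);
  if (l === -1 || r === -1 || l !== r) return -1; // unequal black counts
  return l + (n.color === "black" ? 1 : 0);
}

const isValidRb = (root: RbNode | null): boolean =>
  (root === null || root.color === "black") && blackHeight(root) !== -1;
```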
Why three trees instead of just one balanced version
Every working programmer eventually internalises that "balanced trees are O(log n)" — but the cost of balance isn't usually visualised. AVL is more rigid, RB is more elastic; both achieve the same asymptotic class with very different rotation profiles. Showing them side-by-side on the same input makes the trade-off concrete: AVL trees tend to be a level shorter, Red-Black trees rebalance less aggressively.
- Insert 1, 2, 3, ... 30 — BST height 30, AVL ≈ 5, RB ≈ 6.
- Insert a random permutation of 1..15 — all three trees end up similarly balanced; randomness saves the BST.
- Insert 10, 5, 15, 3, 7, 12, 18 — all three trees produce similar shapes since this input is already balanced.
- Toggle between ascending and descending presets — Red-Black recolours its root after each, AVL flips between left- and right-heavy rotations.
What I learned writing this
- Red-Black is annoying to implement and pleasant to verify. The CLRS pseudocode is right but unforgiving — case 1 / 2 / 3 mirror for left-uncle vs right-uncle, parent pointer bookkeeping during rotations, root recolour at the end. Once it's right, the visual test is easy: pick any path from root to leaf and count black nodes. Every path should have the same count.
- AVL's rebalance code reads like a four-case match statement: balance factor > 1 and key < left.key → right rotation; balance factor > 1 and key > left.key → left-right rotation; and the mirrored pair for a balance factor < −1. The four lines at the bottom of avlInsert are the entire algorithm; the rest is just height bookkeeping.
- The "in-order x" tree layout is dumb but correct. Each node gets its own column equal to its in-order rank. Subtrees can never overlap because every node has a unique x slot. More sophisticated layouts (Reingold–Tilford) pack tighter but require two passes; for visualization, the simple version is fine.
- The "PATHOLOGICAL" callout pays for itself. A stat that says "BST height 10, 10 nodes" is technically informative but easy to overlook. "PATHOLOGICAL · degraded to chain" in red immediately tells the visitor what to actually take away.
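The in-order layout described above fits in one pass. A sketch with assumed names, not the page's actual layout code:

```typescript
// "In-order x" layout: x = in-order rank, y = depth (illustrative names).
interface TreeNode { key: number; left: TreeNode | null; right: TreeNode | null; }
interface Placed { key: number; x: number; y: number; }

function layout(root: TreeNode | null): Placed[] {
  const out: Placed[] = [];
  let nextX = 0; // each visited node claims the next column
  const walk = (n: TreeNode | null, depth: number): void => {
    if (n === null) return;
    walk(n.left, depth + 1);
    out.push({ key: n.key, x: nextX++, y: depth });
    walk(n.right, depth + 1);
  };
  walk(root, 0);
  return out;
}
```

Because x coordinates are assigned in strictly increasing in-order sequence, a node's left subtree always sits entirely to its left and its right subtree entirely to its right — overlap is impossible by construction.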