Self-balancing BST

As the performance of operations on any BST depends on the height of the tree, it is desirable to keep the average height as small as possible. A self-balancing BST is a BST that automatically keeps its height balanced in the process of arbitrary item insertions and deletions. The most popular self-balancing BSTs are AVL trees and red-black trees.

Red-black tree

2-3 tree

  • 2-nodes and 3-nodes

    A 2-node is a node with one key and two children while a 3-node is a node with two keys and three children. A 2-3 tree are always balanced.

  • Search - straight forward with two comparisons on each node

  • Insertion

    • If the tree is empty, create a 2-node with the key and return.

    • To insert to a 2-node, create a 3-node with the new key added.

    • To insert to a 3-node, create a temporary 4-node with the key added and move the middle key to the parent node. This operation may propagate up.

    • If there is no parent node (root), create a new root node with the middle key.

  • Delete (Optional content)

  • It is hard to implement certain programming languages as is because it is hard to maintain many node types.

Red-black tree

History: The original red-black was invented by Guibas and Sedgewick in 1978. Which is derived from the 2-3-4 tree invented by Bayer in 1972. It was improved several times by Sedgewick and others. The current version is called left-leaning red-black tree. A new version is proposed in 2008 by Robert Sedgewick and Kevin Wayne in the Textbook “Algorithms”, 4th edition. New constraints are added to the original left-leaning red-black tree to make it easier to implement and more efficient. It is now representing the 2-3 tree instead of the 2-3-4 tree.

A red-black tree is a representation of a 2-3 tree as a BST. It uses the red and black colors to help representing the 3-nodes and 2-nodes respectively. The red-black tree is always balanced as a 2-3 tree but not really balanced as a BST.

We will use left-leaning red-black tree as an example in which the red link is always leaning left by definition.

  • Representation

    • The nodes is colored rather than the links as each link can only point to one node and one-to-one mapping exists.

    • Each node is either red or black as stored in a boolean instance variable.

  • Search - same as BST

  • Insert

    • Insert as BST first

    • Abnormal cases created in BST insertion

      • right-leaning 3-node

      • 4-node

        • both children are red

        • left-left 4-node

        • left-right 4-node

    • Adjust the tree to make it a valid red-black tree

      • left rotation

      • right rotation

      • split

AVL Tree

Note

Optional reading for COP5417

A BST is an AVL tree if and only if for every node in the tree, the heights of the left and right subtrees differ by at most 1. It keeps track of the balance factor of each node in the tree and uses this information to decide whether to rotate the tree to balance it.

Comparison between AVL tree and red-black tree

Aspect

AVL Trees | Red-black Trees

Balance Factor

Maintains balance using height difference (balance factor)

Maintains balance using color (red or black)

Height

More strictly balanced (better worst-case performance)

Less strictly balanced (slightly worse performance)

Memory Overhead

May have higher memory overhead due to balance factor storage

Generally has lower memory overhead due to color storage

Insertion/Deletion Performance

Can be slower due to stricter balancing requirements

Generally faster due to fewer rotations and looser balancing