Module 5 Symbol Table, Binary Search tree

In this module we will start approaching the topic of searching. Searching problems are a group of problems that require the algorithm to locate a certain record by a key. You should already know methods to search in a sequential data structure such as an array or a linked-list. In this module, we will discuss the topic in more complicate scenarios.

Symbol Table

  • Symbol table is a term used historically in compiler to refer to a data structure that keeps track of all the identifiers in a program and how they are used. Under this context, this term refers to a data structure.

  • It is defined as an ADT to store key-value pairs by some authors.

  • Alternative names are dictionary and map.

  • Key must be unique

  • Behaviors

    • put: put a key-value pair into the table

    • get: return value with a given key

    • contains: is there a value paired with a given key?

    • delete: remove key-value pair with a given key

    • size: number of key-value pairs

    • isEmpty: is the table empty?

  • Conventions for Java

    • use of null

    • equals()

    • key should be immutable (use final keyword)

    • key should be comparable

Elementary implementations

  • Store key-value pairs in a sequential data structure (array or linked-list)

  • put, get, contains and remove are all based on key lookup

    • put: search for key, if found, replace value, else add new key-value pair

    • get: search for key, if found, return associated value, else return null

    • remove: search for key, if found, remove key-value pair, else do nothing

  • Lookup algorithm

    • Unsorted list + Sequential search: \(\Theta(n)\)

    • Sorted list + Binary search: \(\Theta(\log n)\)

Advanced implementations of symbol table

  • Binary search tree implementation

    • Tree basics

    • Binary search tree basics

    • Binary search tree (BST) is a binary tree in symmetric order.

    • Symmetric order: each node has a key, and every node’s key is

      • larger than all keys in its left subtree

      • smaller than all keys in its right subtree

    • BST implementation

      • Node class: key, value, left, right

      • BST class: root node

      • put

      • get

      • delete

      • size

      • isEmpty

      • min, max, floor, ceiling, rank, select (optional)

    • Efficiency depends on the height of the tree

      • Worst case: \(\Theta(n)\) (unbalanced tree)

      • Average case: \(\Theta(\log n)\) (balanced tree)

    • Self-balancing BST to guarantee Theta(log n) performance

      • AVL tree (optional)

      • Red-black tree (next module)

  • Hash table (next module)