***************************************** Module 5 Symbol Table, Binary Search tree ***************************************** In this module we will start approaching the topic of ``searching``. Searching problems are a group of problems that require the algorithm to locate a certain record by a ``key``. You should already know methods to search in a sequential data structure such as an array or a linked-list. In this module, we will discuss the topic in more complicate scenarios. Symbol Table ============ + ``Symbol table`` is a term used historically in compiler to refer to a data structure that keeps track of all the identifiers in a program and how they are used. Under this context, this term refers to a **data structure**. + It is defined as an **ADT** to store key-value pairs by some authors. + Alternative names are ``dictionary`` and ``map``. + Key must be unique + Behaviors * put: put a key-value pair into the table * get: return value with a given key * contains: is there a value paired with a given key? * delete: remove key-value pair with a given key * size: number of key-value pairs * isEmpty: is the table empty? + Conventions for Java * use of ``null`` * ``equals()`` * key should be immutable (use final keyword) * key should be comparable Elementary implementations ========================== + Store key-value pairs in a sequential data structure (array or linked-list) + put, get, contains and remove are all based on key lookup * put: search for key, if found, replace value, else add new key-value pair * get: search for key, if found, return associated value, else return null * remove: search for key, if found, remove key-value pair, else do nothing + Lookup algorithm * Unsorted list + Sequential search: :math:`\Theta(n)` * Sorted list + Binary search: :math:`\Theta(\log n)` Advanced implementations of symbol table ======================================== + Binary search tree implementation * :doc:`Tree basics ` * :doc:`Binary search tree basics ` * ``Binary search tree`` (BST) is a binary tree in symmetric order. * Symmetric order: each node has a key, and every node's key is - larger than all keys in its left subtree - smaller than all keys in its right subtree * BST implementation - Node class: key, value, left, right - BST class: root node - put - get - delete - size - isEmpty - min, max, floor, ceiling, rank, select (optional) * Efficiency depends on the height of the tree - Worst case: :math:`\Theta(n)` (unbalanced tree) - Average case: :math:`\Theta(\log n)` (balanced tree) * Self-balancing BST to guarantee \Theta(\log n) performance - AVL tree (optional) - Red-black tree (next module) + Hash table (next module)