Hash Table¶

Hash table uses hashing to implement a data structure that maps a key to an index so the value can be stored in the corresponding location (a bucket) in a sequential data structure like an array.

Quick question: why linked-list is not a good way to store buckets in a hash table? (Because the most frequent usage is to access by index, which is slow with linked-list data structure)

Characteristics¶

fast lookup \(\Theta(1)\)
redundant storage (certain ratio of empty buckets)
- performance degradation

Hash function for hash tables¶

\(h(x)\)
a key of any type mapping to integer from 0 to N - 1 (indices in a size N array)
modulo operator on any integer generating hash function

Collision resolution¶

Methods to handle collision when multiple keys are mapped to a same bucket.

Chaining¶

each bucket is a list to store multiple values (chained)
Nodes in the linked-list keeps key, value and next pointer
unlimited storage space
Performance degradation when over-filled

Open addressing¶

find another available bucket to store the value when a collision occurs
probing sequence: the sequence of the buckets to check when collision happens
simplest linear probing:

\((h1(key) + i) \% NUM\_BUCKETS, i = 0, 1, 2, 3 \dots\)
other probing (optional): linear probing that skips, quadratic probing, and double hashing
implementation
- each bucket keeps both key and value
- search in the probing sequence until find the key or an empty bucket.
- insertion
  - if the key is already in the hash table, update the value
  - if the key is not in the hash table, find an empty bucket in the probing sequence and store the key and value there
- deletion
  - will create a hole in the probing sequence that interferes with future searches, insertion and deletion
  - rearrange after every deletion
  - differentiate empty-since-start vs empty-after-removal (Optional)

Hash Table Glossary¶

Hash Table¶: An associative data structure that models mapping from keys to values. It is usually implemented as an array. A hash function is employed to map the key to the index of the array element where the value is stored
Key (hash table)¶: The input to the hash function to calculate the index of the buckets in the hash table
Value (hash table)¶: The values stored in the buckets in tha hash table
Collision (hash table)¶: A collision occurs when multiple keys result in the same index in the array
Collision resolution¶: The approach to resolve collisions in hash tables
Chaining¶: A collision resolution approach in which every element in an array is a linked-list
Open addressing¶: A collision resolution approach in which an empty bucket elsewhere in the hash table is used when collision happens
Probing¶: The mechanism to determine the next index of bucket to check in the open addressing collision resolution approach
Hash map¶: An implementation of the Map ADT based on the hash table data structure
Hash set¶: An implementation of the Set ADT based on the hash table data structure

Hash Table¶

Characteristics¶

Hash function for hash tables¶

Collision resolution¶

Chaining¶

Open addressing¶

Hash Table Glossary¶

Table of Contents

Previous topic

Next topic

This Page