Hash Table¶
Hash table uses hashing to implement a data structure that maps a key to an index so the value can be stored in the corresponding location (a bucket) in a sequential data structure like an array.
Quick question: why linked-list is not a good way to store buckets in a hash table? (Because the most frequent usage is to access by index, which is slow with linked-list data structure)
Characteristics¶
fast lookup \(\Theta(1)\)
redundant storage (certain ratio of empty buckets)
performance degradation
Hash function for hash tables¶
\(h(x)\)
a key of any type mapping to integer from 0 to N - 1 (indices in a size N array)
modulo operator on any integer generating hash function
Collision resolution¶
Methods to handle collision when multiple keys are mapped to a same bucket.
Chaining¶
each bucket is a list to store multiple values (chained)
Nodes in the linked-list keeps key, value and next pointer
unlimited storage space
Performance degradation when over-filled
Open addressing¶
find another available bucket to store the value when a collision occurs
probing sequence: the sequence of the buckets to check when collision happens
simplest linear probing:
\((h1(key) + i) \% NUM\_BUCKETS, i = 0, 1, 2, 3 \dots\)
other probing (optional): linear probing that skips, quadratic probing, and double hashing
implementation
each bucket keeps both key and value
search in the probing sequence until find the key or an empty bucket.
insertion
if the key is already in the hash table, update the value
if the key is not in the hash table, find an empty bucket in the probing sequence and store the key and value there
deletion
will create a hole in the probing sequence that interferes with future searches, insertion and deletion
rearrange after every deletion
differentiate empty-since-start vs empty-after-removal (Optional)
Hash Table Glossary¶
- Hash Table¶
An associative data structure that models mapping from keys to values. It is usually implemented as an array. A hash function is employed to map the key to the index of the array element where the value is stored
- Key (hash table)¶
The input to the hash function to calculate the index of the buckets in the hash table
- Value (hash table)¶
The values stored in the buckets in tha hash table
- Collision (hash table)¶
A collision occurs when multiple keys result in the same index in the array
- Collision resolution¶
The approach to resolve collisions in hash tables
- Chaining¶
A collision resolution approach in which every element in an array is a linked-list
- Open addressing¶
A collision resolution approach in which an empty bucket elsewhere in the hash table is used when collision happens
- Probing¶
The mechanism to determine the next index of bucket to check in the open addressing collision resolution approach
- Hash map¶
An implementation of the Map ADT based on the hash table data structure
- Hash set¶
An implementation of the Set ADT based on the hash table data structure