.. highlight:: c++ :linenothreshold: 5 ******* Hashing ******* Hashing is an algorithm to map an input value (key) to an output value. It is usually defined as a function with a single parameter and a return value. The mappings are usually expressed using arrows like ``string -> int``, ``int -> int``. Hash function ============= + Domain of a hash function: The set of all possible inputs to a hash function + Range of a hash function: The set of all possible outputs of a hash function + Required properties * deterministic (required!) * uniform distribution - hash function that maps all possible input values (keys) with roughly equal probability to each possible output value (hash code) - ensures minimal collision - hard to guarantee and prove + Other properties (application specific) * variable length key to fixed length hash value * constant time computation :math:`\Theta(1)` * squeeze a large domain to a smaller range * chance of collision * asymmetric + Wellknown examples: MD5, SHA-1, SHA-2 + Make your own hash function Applications ============ + Hash table data structure * Hash Map (Map ADT, dictionary, key-value store) * Hash Set (Set ADT) + Fingerprinting * Short summary of a large data set * Checksum * Password management + Cryptography + Advanced data structures * Bloom filter * Cuckoo filter * Merkle tree (e.g. BitCoin) Hashing Glossary ================ .. glossary:: Hash Function A function that maps a variable sized value to a fixed sized value. Key (hashing) The input value to the hash function. Value (hashing) A.k.a hash value, hash code or simply hash. The output of the hash function. Key space / Domain (hashing) The set of all possible keys. Value space / Range (hashing) The set of all possible values. Collision (hashing) Multiple distinct keys maps to a same output value.