EduC++ / STL Unordered (Hash) Containers

STL Unordered (Hash) Containers

Demonstrates: unordered_map, unordered_set, unordered_multimap, and how to write custom hash functions for user-defined types. Hash-based containers provide O(1) average lookup/insert/erase. Elements are NOT sorted -- use when you need speed over order.

Prereqs basic understanding of hash tables, big-O notation.
Standard available since C++11. This file uses C++20 (std::format, contains()). Custom hash support available since C++11.

Frequently Asked Questions

QWhat are the requirements for a hash function in C++?
AA hash function must be deterministic (same input always produces the same output) and must be compatible with operator== (if two objects compare equal, their hashes must be identical). It should also distribute values uniformly across buckets to minimize collisions.
QWhat is hash flooding and how can it affect my program?
AHash flooding is a denial-of-service attack where an adversary crafts inputs that all hash to the same bucket, degrading O(1) operations to O(n). Mitigations include using randomized hash seeds (some implementations do this), switching to ordered containers for untrusted input, or using collision-resistant hash functions like SipHash.
QWhen should I prefer an ordered container (map/set) over an unordered one?
AUse ordered containers when you need sorted iteration, range queries (lower_bound / upper_bound), or guaranteed O(log n) worst-case performance. Use unordered containers when you only need fast lookup/insert and can tolerate non-deterministic iteration order.
QWhy does unordered_map require operator== in addition to a hash?
ABecause hash collisions are inevitable -- multiple distinct keys can hash to the same bucket. The container uses operator== to distinguish between different keys that landed in the same bucket and to confirm an exact match during lookup.
QHow can I tell if my hash function is performing well?
ACheck load_factor() (average elements per bucket) and bucket_count(). A load factor near max_load_factor() (default 1.0) is normal. If specific buckets are overloaded, use bucket_size(i) to detect hotspots. Ideally, bucket sizes should be roughly uniform.
C++
#include <iostream>
#include <format>
#include <unordered_map>
#include <unordered_set>
#include <string>
#include <vector>

Custom hash function for user-defined types.

Required to use your type as a key in unordered containers.

Watch out: a bad hash function (e.g., always returns 0) degrades all operations to O(n). Aim for uniform distribution across buckets by combining member hashes properly.

C++
struct Point {
    int x, y;

    bool operator==(const Point& other) const {
        return x == other.x && y == other.y;
    }
};

// Custom hash: combine hashes of x and y
struct PointHash {
    std::size_t operator()(const Point& p) const {
        // Combine hashes using a common technique
        auto h1 = std::hash<int>{}(p.x);
        auto h2 = std::hash<int>{}(p.y);
        return h1 ^ (h2 << 1);  // Simple hash combine
    }
};

C++
int main() {

1 std::unordered_map -- hash map (O(1) average lookup)

What

std::unordered_map -- hash map (O(1) average lookup).

When

Use instead of std::map when you don't need sorted order.

Why

Use instead of std::map when you don't need sorted order.

Use

Use instead of std::map when you don't need sorted order.

C++ Version Not explicitly specified in this example.

Use instead of std::map when you don't need sorted order.

Never depend on element order.

Watch out: iteration order can change after rehashing.

C++
std::cout << "--- std::unordered_map ---\n";
    std::unordered_map<std::string, double> prices;

    prices["apple"] = 1.29;
    prices["banana"] = 0.79;
    prices["cherry"] = 3.49;
    prices.emplace("date", 5.99);

    // Lookup
    if (auto it = prices.find("banana"); it != prices.end()) {
        std::cout << std::format("Banana: ${:.2f}\n", it->second);
    }

    // contains() -- C++20
    std::cout << std::format("Has grape? {}\n", prices.contains("grape"));

    // Iterate (order is NOT guaranteed!)
    std::cout << "All prices:\n";
    for (const auto& [item, price] : prices) {
        std::cout << std::format("  {}: ${:.2f}\n", item, price);
    }

    // Hash table internals
    std::cout << std::format("Buckets: {}\n", prices.bucket_count());
    std::cout << std::format("Load factor: {:.2f}\n", prices.load_factor());
    std::cout << std::format("Max load factor: {:.2f}\n",
                              prices.max_load_factor());

2 std::unordered_set -- hash set (unique elements)

What

The standard library provides good hashes for built-in types and std::string, but for custom types you must supply your own (see PointHash above).

When

The standard library provides good hashes for built-in types and std::string, but for custom types you must supply your own (see PointHash above).

Why

O(1) average membership testing.

Use

Use it in code like: std::cout << "\n--- std::unordered_set ---\n";.

C++ Version Not explicitly specified in this example.

O(1) average membership testing.

The standard library provides good hashes for built-in types and std::string, but for custom types you must supply your own (see PointHash above).

Watch out: hash quality directly affects performance.

C++
std::cout << "\n--- std::unordered_set ---\n";
    std::unordered_set<std::string> seen;

    // Detect duplicates in a stream of words
    std::vector<std::string> words = {
        "hello", "world", "hello", "foo", "world", "bar"
    };

    std::cout << "Unique words: ";
    for (const auto& w : words) {
        auto [it, inserted] = seen.insert(w);
        if (inserted) {
            std::cout << w << ' ';
        }
    }
    std::cout << '\n';
    std::cout << std::format("Total unique: {}\n", seen.size());

3 Using custom types as keys

What

Using custom types as keys.

When

Use this when it cleanly solves the problem in front of you.

Why

You must provide a hash function and operator==.

Use

Use it in code like: std::cout << "\n--- Custom Key Type ---\n";.

C++ Version Not explicitly specified in this example.

You must provide a hash function and operator==.

degrades all operations to O(n). Test with realistic data and check load_factor() to confirm even distribution.

Watch out: a bad hash function (e.g., always returns 0)

C++
std::cout << "\n--- Custom Key Type ---\n";
    std::unordered_map<Point, std::string, PointHash> labels;
    labels[{0, 0}] = "origin";
    labels[{1, 0}] = "east";
    labels[{0, 1}] = "north";

    Point query{0, 0};
    if (auto it = labels.find(query); it != labels.end()) {
        std::cout << std::format("({},{}) = {}\n",
                                  query.x, query.y, it->second);
    }

4 std::unordered_multimap -- allows duplicate keys

What

std::unordered_multimap -- allows duplicate keys.

When

Use this when it cleanly solves the problem in front of you.

Why

It improves correctness, clarity, and maintainability.

Use

Use it in code like: std::cout << "\n--- std::unordered_multimap ---\n";.

C++ Version Not explicitly specified in this example.
C++
std::cout << "\n--- std::unordered_multimap ---\n";
    std::unordered_multimap<std::string, int> scores;
    scores.insert({"Alice", 95});
    scores.insert({"Alice", 87});
    scores.insert({"Alice", 92});
    scores.insert({"Bob", 88});

    // Get all scores for Alice
    auto [begin, end] = scores.equal_range("Alice");
    std::cout << "Alice's scores: ";
    for (auto it = begin; it != end; ++it) {
        std::cout << it->second << ' ';
    }
    std::cout << '\n';

5 Performance comparison note

What

practice due to O(1) average operations AND better cache locality (the bucket array is a contiguous allocation).

When

However, worst-case is O(n) when many keys collide.

Why

practice due to O(1) average operations AND better cache locality (the bucket array is a contiguous allocation).

Use

Use it in code like: std::cout << "\n--- When to use what ---\n";.

C++ Version Not explicitly specified in this example.

practice due to O(1) average operations AND better cache locality (the bucket array is a contiguous allocation). However, worst-case is O(n) when many keys collide. Ordered containers guarantee O(log n) in all cases.

Watch out: unordered containers are often faster in

C++
std::cout << "\n--- When to use what ---\n";
    std::cout << "unordered_map/set: O(1) avg, use when order doesn't matter\n";
    std::cout << "map/set:           O(log n), use when you need sorted order\n";
    std::cout << "Worst case for unordered: O(n) if many hash collisions\n";

    return 0;
}