After hashing, every input becomes a binary number that behaves like a sequence of random coin flips. Each bit is independent and equally likely to be 0 or 1.
This randomness unlocks something surprising: we can learn about the size of a stream without counting items directly.
Consider a hashed value written in binary:
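```
0 0 0 1 0 1 1 0 ...
      ↑
      first 1-bit at position 4 (3 leading zeros before it)
```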
HyperLogLog works with the position of this first 1-bit, though it’s often easier to think in terms of how many leading zeros come before it. The two viewpoints are interchangeable: a first 1 at position k means exactly k - 1 leading zeros.
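To make this concrete, here is a minimal Python sketch (the function name and the 32-bit width are illustrative choices, not anything fixed by HyperLogLog):

```python
def first_one_position(h: int, width: int = 32) -> int:
    """Return the 1-indexed position (from the left) of the first 1-bit
    in a width-bit hash value; this equals the number of leading zeros + 1."""
    for pos in range(1, width + 1):
        if (h >> (width - pos)) & 1:  # test the bit at this position
            return pos
    return width + 1  # all bits zero: use width + 1 by convention

print(first_one_position(0b00010110, width=8))  # prints 4
```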
Because each bit is random, long runs of leading zeros are rare. Think of it like flipping a coin: getting heads once is common, but getting 10 heads in a row happens only about once in 1,024 tries.
Let’s call P the probability (chance) of an event happening. If each bit is like a fair coin flip, the probability of seeing the first 1 at position k is:

P(first 1 at position k) = (1/2)^k = 1/2^k

That is because it takes k - 1 zeros in a row followed by a single 1, and each of those k outcomes has probability 1/2.
Here’s what this means in practice:
- About half of all hashed values start with a 1 (probability 1/2), and another quarter start with 01 (probability 1/4).
- Seeing 0001 (first 1 at position 4) is uncommon: probability 1/16.
- Seeing 0000001 (first 1 at position 7) is extremely unlikely: only about a 1 in 128 chance!
If, while scanning a stream, the rarest pattern you observe has its first 1-bit at position k, the stream must have been large enough to produce an event with probability 1/2^k. In other words, the most extreme observation hints at how many items you must have seen.
Example: If you’ve seen a hash starting with “0000001” (first 1 at position 7), that event has only a 1/128 chance. You probably needed to see around 128 distinct items before encountering something this rare.
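If you want to check this empirically, a small simulation sketch like the following (the function name is made up for this example) tends to land near 128:

```python
import random

def draws_until_position(position: int, width: int = 32) -> int:
    """Draw random width-bit values until one has its first 1-bit at
    exactly `position`; return how many draws that took."""
    draws = 0
    while True:
        draws += 1
        h = random.getrandbits(width)
        # first 1-bit position of a width-bit value (width + 1 when h == 0)
        if width - h.bit_length() + 1 == position:
            return draws

# The average over many runs settles near 2^7 = 128, matching the 1/128 chance.
runs = [draws_until_position(7) for _ in range(500)]
print(sum(runs) / len(runs))
```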
This is the core idea behind HyperLogLog: hash every item, remember the largest first 1-bit position k you have ever observed, and estimate the number of distinct items as roughly 2^k.
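As a rough preview, a single-register estimator built on this idea might look like the sketch below (the hash choice and names are illustrative, and this is not yet the real HyperLogLog):

```python
import hashlib

def naive_cardinality_estimate(items) -> int:
    """Track the largest first 1-bit position seen over all hashed items
    and return 2**max_position as the distinct-count estimate."""
    max_position = 0
    for item in items:
        # 32-bit hash taken from the first 4 bytes of SHA-256 (arbitrary choice)
        h = int.from_bytes(hashlib.sha256(str(item).encode()).digest()[:4], "big")
        position = 32 - h.bit_length() + 1  # 33 when h == 0, by convention
        max_position = max(max_position, position)
    return 2 ** max_position

print(naive_cardinality_estimate(range(1000)))  # noisy: might print 512 or 4096
```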
Relying on a single rare observation is noisy and unreliable. But this intuition is the foundation. Next, we’ll turn it into the simplest possible algorithm and see why it needs improvement.