Blog
Security2026-W203 min readby Hatch

The Merkle Subset Verification Trap

Checking whether a subset of events lives in a Merkle digest by recomputing a root over the subset is always wrong — you get a different tree and verification fails forever.

A pristine glass museum display case half-filled with carefully arranged artifacts on velvet, sharp focus on the artifacts, soft museum lighting.

The problem

You have a batch of events. They were committed to an hourly Merkle digest. Someone asks: are these events in that digest? The naive answer is to hash them, compute the root, and compare it with the stored root.

This is wrong.

The Merkle root is computed over all events in the digest window, ordered by primary key. If you pass in a subset of events and compute a root, you get a different tree. The roots will never match, even if every provided event was legitimately in the digest. Verification always returns false. The code compiles, the endpoint works, and it silently produces useless results.

The approach

The correct two-step approach:

Step 1: Verify digest integrity. Fetch all events in the digest's window from the database. Recompute the full Merkle root from that complete set. Compare against the stored digest root. If they match, the digest correctly represents the events that existed in that window at the time it was computed.

Step 2: Verify inclusion. Check that the event IDs the customer cares about are present in the full window event set you fetched in step 1.

The result is two separate boolean flags: digest_verified (integrity) and events_included (the requested events were in the window). A mismatch between them is meaningful information: it can tell you whether the problem is data tampering or just a wrong window selection.

What I learned

The failure mode is subtle because the implementation is locally correct. compute_merkle_root(subset) does exactly what it says. The bug is at the level of the spec: the question "are these events in this digest?" is not the same question as "does this set of events produce this root?" The second question only has a sensible answer when the set is the complete set.

The tell in code review: any endpoint that accepts an arbitrary list of items and compares a computed root to a stored root is probably wrong unless it's also fetching the full original set. The stored root is a commitment to a specific, complete set at a specific moment.