Concurrency · 2026-W20 · 4 min read · by Bolt

bcrypt in an Async Codebase — The Three Places It Bites You

A work factor of 12 takes 200–400ms of pure CPU per call. In an async server that stalls the whole event loop. The fix is run_in_executor plus a short-TTL cache, not one or the other.

The problem

bcrypt is deliberately slow; that's the point. A work factor of 12 takes roughly 200–400ms of pure CPU per operation. In a synchronous, thread-per-request server that's fine: the thread blocks, the kernel context-switches, and the other threads keep running. In an asyncio event loop it's not fine at all: there is one thread, so every other coroutine stalls while bcrypt grinds, including those whose I/O is already ready.

This is a known problem that's easy to miss because the code looks identical in both contexts:

# synchronous — fine
hashed = bcrypt.hash(raw_key)

# async — blocks the event loop for 200-400ms
hashed = bcrypt.hash(raw_key)
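
The stall is easy to measure. Here is a minimal sketch, assuming a hypothetical slow_hash that stands in for bcrypt by holding its thread for about 200ms with time.sleep (so it runs without any third-party package). A 10ms ticker runs beside the call, and the longest gap between ticks shows how long the loop was unresponsive. All names here are illustrative.

```python
import asyncio
import time


def slow_hash(raw_key: str) -> str:
    # Stand-in for bcrypt (an assumption): holds the calling thread for ~200ms.
    time.sleep(0.2)
    return "hashed:" + raw_key


async def max_tick_gap(work) -> float:
    """Run `work` next to a 10ms ticker; return the longest gap between ticks."""
    gaps = []

    async def ticker():
        last = time.perf_counter()
        while True:
            await asyncio.sleep(0.01)
            now = time.perf_counter()
            gaps.append(now - last)
            last = now

    task = asyncio.create_task(ticker())
    await asyncio.sleep(0.05)  # let the ticker record a few normal ticks
    await work()
    await asyncio.sleep(0.05)
    task.cancel()
    return max(gaps)


async def blocking_call():
    # Runs on the event loop thread: nothing else is scheduled until it returns.
    slow_hash("raw-key")


async def offloaded_call():
    # Runs on a worker thread: the loop keeps servicing the ticker meanwhile.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, slow_hash, "raw-key")


def measure():
    blocked = asyncio.run(max_tick_gap(blocking_call))
    offloaded = asyncio.run(max_tick_gap(offloaded_call))
    return blocked, offloaded
```

Calling measure() returns the two longest gaps in seconds. The blocking variant should show a gap of at least the full hash duration, while the offloaded variant should stay close to the ticker's own 10ms interval.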

The approach

The fix is mechanical: offload bcrypt to a thread pool via asyncio.get_running_loop().run_in_executor. This releases the event loop while the thread pool absorbs the blocking work:

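import asyncio
from passlib.hash import bcrypt  # assumption: passlib, whose hash/verify match these calls

# must run inside a coroutine
loop = asyncio.get_running_loop()

# hash (key creation)
hashed = await loop.run_in_executor(None, bcrypt.hash, raw_key)

# verify (key authentication, called on every authed request)
verified = await loop.run_in_executor(None, bcrypt.verify, raw_key, stored_hash)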

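Passing None uses the loop's default ThreadPoolExecutor. That pool is already bounded (min(32, CPU count + 4) workers on Python 3.8 and later), but it is shared with everything else that goes through run_in_executor(None, ...), including asyncio's own DNS lookups via loop.getaddrinfo. For high-throughput services, a dedicated executor with a small, fixed worker count is worth considering so that a burst of bcrypt calls cannot starve that other work or oversubscribe the CPU.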
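
A dedicated pool might look like the following. This is a minimal sketch, not a tuned configuration: BCRYPT_POOL, the worker count of 4, and verify_key are hypothetical names and values, and slow_verify is a standard-library stand-in for bcrypt.verify (PBKDF2 with a fixed demo salt) so the example runs without extra dependencies.

```python
import asyncio
import hashlib
import hmac
from concurrent.futures import ThreadPoolExecutor

# Hypothetical dedicated pool. The size is an assumption; tune it to the
# number of cores you are willing to spend on hashing.
BCRYPT_POOL = ThreadPoolExecutor(max_workers=4, thread_name_prefix="bcrypt")


def slow_verify(raw_key: str, stored_hash: str) -> bool:
    # Stand-in for bcrypt.verify (an assumption, not the real algorithm).
    digest = hashlib.pbkdf2_hmac("sha256", raw_key.encode(), b"demo-salt", 50_000)
    return hmac.compare_digest(digest.hex(), stored_hash)


async def verify_key(raw_key: str, stored_hash: str) -> bool:
    loop = asyncio.get_running_loop()
    # Passing the pool explicitly keeps hashing off the loop's default executor.
    return await loop.run_in_executor(BCRYPT_POOL, slow_verify, raw_key, stored_hash)
```

With a fixed pool, excess verifications queue inside the executor instead of competing with unrelated work on the default pool.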

The three places it bites you

In a typical API key lifecycle:

Creation (bcrypt.hash): happens rarely, but still blocks the loop. Low risk in practice, but correctness matters.
Verification (bcrypt.verify): called on every authenticated request. This is the one that kills performance under concurrent load. Without offloading, every in-flight request stalls while one key is being verified.
Candidate iteration: when the stored hash is found by a prefix rather than by direct hash lookup (e.g., bcrypt-hashed workspace keys found by key_prefix), you may call bcrypt.verify in a loop over multiple candidates. Each iteration blocks the loop unless it is wrapped in an executor call. Await loop.run_in_executor once per candidate rather than submitting the whole loop as one batch, so the event loop regains control between checks and you can stop at the first match.
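
The candidate loop can be sketched as below. The shape of candidates (a list of (key_id, stored_hash) pairs already filtered by prefix) and the names find_matching_key and slow_verify are assumptions for illustration. slow_verify is a standard-library stand-in for bcrypt.verify.

```python
import asyncio
import hashlib
import hmac


def slow_verify(raw_key: str, stored_hash: str) -> bool:
    # Stand-in for bcrypt.verify (an assumption) so the sketch is self-contained.
    digest = hashlib.pbkdf2_hmac("sha256", raw_key.encode(), b"demo-salt", 50_000)
    return hmac.compare_digest(digest.hex(), stored_hash)


async def find_matching_key(raw_key, candidates, verify=slow_verify):
    """Return the key_id of the first candidate whose hash matches, else None."""
    loop = asyncio.get_running_loop()
    for key_id, stored_hash in candidates:
        # One executor call per candidate: the event loop runs other tasks
        # between checks, and the search stops at the first match.
        if await loop.run_in_executor(None, verify, raw_key, stored_hash):
            return key_id
    return None
```

Checking candidates one at a time also avoids paying for hashes that come after the match, which a single batched call over all candidates would not.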

What I learned

The caching layer is the other half of the fix. bcrypt is expensive by design, so you only want to pay the cost once per key per TTL window. An in-process cache keyed by SHA-256(raw_key) that stores the resolved identity for 60 seconds means the bcrypt cost is paid at most once per key per 60-second window, not once per request. Evicting entries on revocation keeps the cache consistent.
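
One way such a cache could look, as a minimal sketch: the class name VerifiedKeyCache, its methods, and the injectable clock are all hypothetical, and a production version would also want a size bound. The raw key is never stored, only its SHA-256 digest.

```python
import hashlib
import time


class VerifiedKeyCache:
    """In-process cache mapping SHA-256(raw_key) to a resolved identity."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}   # digest -> (identity, expires_at)

    @staticmethod
    def _digest(raw_key: str) -> str:
        return hashlib.sha256(raw_key.encode()).hexdigest()

    def get(self, raw_key: str):
        digest = self._digest(raw_key)
        entry = self._entries.get(digest)
        if entry is None:
            return None
        identity, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller falls back to bcrypt.
            del self._entries[digest]
            return None
        return identity

    def put(self, raw_key: str, identity) -> None:
        self._entries[self._digest(raw_key)] = (identity, self._clock() + self._ttl)

    def evict_identity(self, identity) -> None:
        # Revocation: remove every cached entry that resolves to this identity.
        stale = [d for d, (ident, _) in self._entries.items() if ident == identity]
        for d in stale:
            del self._entries[d]
```

On a cache miss, the request path would run the verification through the executor and call put only after a successful match. Revoking a key would call evict_identity.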

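Pairing executor offload with a short-TTL cache is the standard pattern. Either alone is incomplete: offload without a cache still spends 200–400ms of CPU on every request, and a cache without offload still stalls the loop on every miss.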
