Concurrency Races

Stale overwrites, out-of-order invalidation, and version-aware safeguards for cache refresh races.

Concurrency races and lost freshness appear when several actors read, write, invalidate, or refresh the same cached value concurrently. The cache may momentarily contain newer data and then regress to older data because a slower refresh finishes last. An invalidation may arrive out of order. A write may complete after a refresh started from a stale snapshot.

These bugs are subtle because the cache logic can look reasonable in isolated steps. The failure only appears when timing interleaves operations in an unlucky order.

    sequenceDiagram
        participant WorkerA
        participant WorkerB
        participant Source
        participant Cache

        WorkerA->>Source: read old version 7
        WorkerB->>Source: write new version 8
        WorkerB->>Cache: invalidate key
        WorkerA->>Cache: write cached value from version 7
        Note over Cache: stale value reappears after a correct invalidation

Why It Matters

Races undermine trust in the cache because stale values reappear after the system believed they were gone. The team thinks invalidation worked, but a delayed refresh or an out-of-order writer silently restores older state. These problems are common in asynchronous refill systems, queue-driven invalidation, and multi-instance refresh pipelines.

Typical race patterns include:

  • stale refresh writes after a newer write committed
  • older invalidation or event arriving after newer data is already cached
  • two writers competing to refresh the same key with different snapshots
  • blind cache sets that accept whichever write arrives last as truth, with no version comparison

Safer Coordination Patterns

The most reliable protections add ordering information to cache writes:

  • store version numbers, timestamps, or sequence tokens with the value
  • only overwrite when the incoming version is newer (or equal, if the policy allows rewriting the same version)
  • use compare-and-set semantics where the cache technology allows it
  • carry write ordering through invalidation events instead of sending blind deletes alone

The goal is not perfect synchronization everywhere. The goal is to stop older state from winning simply because it arrived later.
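
One way to carry ordering through invalidation is to have each invalidation event state which version was committed, and to have the cache remember that version as a floor. The sketch below is a minimal in-memory illustration; `VersionedCache`, its method names, and the per-key floor map are hypothetical and do not correspond to any particular cache product.

```typescript
// Sketch: an in-memory cache where invalidation carries the committed version.
// VersionedCache and its method names are hypothetical.
type Entry<T> = { version: number; payload: T };

class VersionedCache<T> {
  private entries = new Map<string, Entry<T>>();
  // Lowest version still acceptable per key, raised by invalidation events.
  private floors = new Map<string, number>();

  // Drop anything older than `committedVersion` and remember that floor.
  invalidate(key: string, committedVersion: number): void {
    const floor = Math.max(this.floors.get(key) ?? 0, committedVersion);
    this.floors.set(key, floor);
    const current = this.entries.get(key);
    if (current && current.version < floor) {
      this.entries.delete(key);
    }
  }

  // Accept a refill only if it is at or above the floor and not older than the cached entry.
  setIfNewer(key: string, incoming: Entry<T>): boolean {
    const floor = this.floors.get(key) ?? 0;
    const current = this.entries.get(key);
    if (incoming.version < floor) return false;
    if (current && incoming.version < current.version) return false;
    this.entries.set(key, incoming);
    return true;
  }

  get(key: string): Entry<T> | undefined {
    return this.entries.get(key);
  }
}

const versioned = new VersionedCache<string>();
versioned.setIfNewer("user:1", { version: 7, payload: "old" });
versioned.invalidate("user:1", 8); // a writer committed version 8
const lateRefill = versioned.setIfNewer("user:1", { version: 7, payload: "old" }); // rejected
const freshRefill = versioned.setIfNewer("user:1", { version: 8, payload: "new" }); // accepted
versioned.invalidate("user:1", 6); // an older, delayed invalidation changes nothing
```

A blind delete would have left the cache unable to tell the late version 7 refill from a legitimate one. In a real deployment the floor would need its own expiry so the map does not grow without bound; that detail is left out here.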

Example

This example refuses to overwrite the cache with an older snapshot.

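    // A cached value tagged with the version of the source record it came from.
    type VersionedValue = {
      version: number;
      payload: object;
    };

    // Writes `incoming` only if the cache is empty or holds the same or an older version.
    // Note: the read and the write are separate calls, so this check is not atomic.
    // Two concurrent callers can both pass it; a backend compare-and-set closes that gap.
    async function setIfNewer(
      key: string,
      incoming: VersionedValue,
      getCurrent: (key: string) => Promise<VersionedValue | null>,
      put: (key: string, value: VersionedValue, ttlSeconds: number) => Promise<void>
    ): Promise<void> {
      const current = await getCurrent(key);

      if (!current || incoming.version >= current.version) {
        await put(key, incoming, 300); // 300-second TTL
      }
    }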

What to notice:

  • the cache write is no longer blind
  • freshness depends on an ordering signal the system trusts
  • this does not replace invalidation, but it prevents stale overwrite races during refill
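
In `setIfNewer`, the read through `getCurrent` and the write through `put` are separate calls, so another writer can slip in between them. Where the backend offers an atomic conditional write, the check can be folded into the write itself and retried on conflict. The sketch below assumes a hypothetical `CasStore` interface with a `compareAndSwap` method; the interface is synchronous only to keep the example short, and `MemoryCasStore` is an in-memory stand-in so the code runs.

```typescript
type Versioned = { version: number; payload: object };

// Hypothetical store interface: compareAndSwap writes `next` only if the stored
// value is still exactly `expected` (null meaning "no entry"), and reports success.
interface CasStore {
  get(key: string): Versioned | null;
  compareAndSwap(key: string, expected: Versioned | null, next: Versioned): boolean;
}

// In-memory stand-in. It compares by object identity; a real backend would
// compare a version token or similar instead.
class MemoryCasStore implements CasStore {
  private data = new Map<string, Versioned>();

  get(key: string): Versioned | null {
    return this.data.get(key) ?? null;
  }

  compareAndSwap(key: string, expected: Versioned | null, next: Versioned): boolean {
    const current = this.data.get(key) ?? null;
    if (current !== expected) return false; // someone else wrote in between
    this.data.set(key, next);
    return true;
  }
}

// Returns true if `incoming` was stored, false if it was stale or kept losing races.
function setIfNewerAtomic(
  store: CasStore,
  key: string,
  incoming: Versioned,
  maxAttempts = 3
): boolean {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const current = store.get(key);
    if (current && incoming.version < current.version) return false; // stale, give up
    if (store.compareAndSwap(key, current, incoming)) return true;
    // Lost a race: loop to re-read and re-check against the newer entry.
  }
  return false;
}

const casStore = new MemoryCasStore();
const wroteV8 = setIfNewerAtomic(casStore, "user:1", { version: 8, payload: {} });
const wroteV7 = setIfNewerAtomic(casStore, "user:1", { version: 7, payload: {} });
```

Here `wroteV8` is `true` and `wroteV7` is `false`. The retry limit is an arbitrary choice for the sketch; a caller that exhausts it can simply skip the refill, since a newer writer has already populated the key.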

Trade-Offs

Race protection adds coordination and metadata costs.

  • Writers need comparable version semantics.
  • Some cache backends do not support atomic compare-and-set, so ordering checks may need extra round trips.
  • Timestamp-based ordering can fail if clocks drift or if logical ordering differs from wall time.
  • Stronger coordination lowers race risk but can reduce throughput or increase complexity.

Still, on hot or correctness-sensitive keys, blind last-writer-wins behavior is usually too risky.
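
The timestamp caveat is easy to reproduce. In the sketch below, two writers commit in a known order, but the second writer's clock runs 500 ms behind. The field names, the sequence numbers, and the size of the skew are illustrative assumptions.

```typescript
// Two writers commit in this logical order: first w1, then w2.
// `seq` is a sequence number assigned by the source at commit time;
// `wallClockMs` is each writer's local clock, and w2's clock runs 500 ms behind.
type StampedWrite = { writer: string; seq: number; wallClockMs: number; payload: string };

const first: StampedWrite = { writer: "w1", seq: 41, wallClockMs: 1000200, payload: "older" };
const second: StampedWrite = { writer: "w2", seq: 42, wallClockMs: 1000300 - 500, payload: "newer" };

// Pick the write a cache would keep under a given ordering key.
function winner(a: StampedWrite, b: StampedWrite, key: "seq" | "wallClockMs"): StampedWrite {
  return b[key] >= a[key] ? b : a;
}

const byTimestamp = winner(first, second, "wallClockMs").payload; // "older": skew inverts the order
const bySequence = winner(first, second, "seq").payload; // "newer": matches commit order
```

A token assigned by the source at commit time tracks the order the source actually applied, which is why it is usually a safer ordering signal than wall-clock time from individual writers.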

Common Mistakes

  • treating invalidation as enough even when stale refresh workers can still write afterward
  • using timestamps as if they were always safe ordering tokens
  • forgetting that queue delivery order and source commit order may differ
  • assuming one process-local mutex protects a multi-instance race

Design Review Question

What is the difference between “cache invalidation happened” and “older state cannot reappear”?

The stronger answer is that invalidation removes trust in a current value, but it does not by itself constrain later writes to the cache. Preventing older state from reappearing requires ordering-aware refill logic, such as version checks, compare-and-set updates, or stronger sequencing guarantees.

Revised on Thursday, April 23, 2026