Shared and Distributed Caches

Networked caches shared by many instances and the trade-offs between coherence, latency, and operational overhead.

A shared distributed cache sits on the network and is reused by many application instances or services. Unlike per-process caches, it provides one logically shared lookup layer. That often makes invalidation, warm-up, and reuse more predictable across a fleet. If one instance populates a key, other instances can benefit from the same cached result.

The trade-off is that the cache is now its own distributed system. It introduces network latency, availability concerns, memory management, eviction behavior, and operational dependencies that local caches do not have. A shared cache is usually slower than an in-process one but much easier to reason about across multiple app instances.

    flowchart LR
        A["App instance A"] --> C["Shared distributed cache"]
        B["App instance B"] --> C
        D["App instance C"] --> C
        C --> E["Origin database or service"]
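The populate-once, reuse-everywhere behavior is usually implemented as a read-through lookup: check the cache, and on a miss fetch from the origin and write the result back. A minimal sketch follows; `SharedCacheStub` and `get_or_fill` are illustrative names, and the in-memory dict stands in for a networked client such as a Redis or Memcached library, where each `get` and `set` would be a network call.

    ```python
    import time

    class SharedCacheStub:
        """In-memory stand-in for a networked cache shared by many
        instances. A real client would make a network call per operation."""

        def __init__(self):
            self._store = {}  # key -> (value, expires_at)

        def get(self, key):
            entry = self._store.get(key)
            if entry is None:
                return None
            value, expires_at = entry
            if time.monotonic() >= expires_at:
                del self._store[key]  # lazily expire stale entries
                return None
            return value

        def set(self, key, value, ttl_seconds):
            self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get_or_fill(cache, key, fetch_from_origin, ttl_seconds=300):
        """Read-through lookup: whichever instance fills the key first
        makes the cached result visible to every other instance."""
        value = cache.get(key)
        if value is not None:
            return value
        value = fetch_from_origin(key)
        cache.set(key, value, ttl_seconds)
        return value
    ```

With this shape, a second instance calling `get_or_fill` for the same key within the TTL never touches the origin at all.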

Why It Matters

This cache type appears in many production architectures because it balances two competing goals:

  • it is slower than local memory, but still much cheaper than hitting the origin repeatedly
  • it is shared, so the fleet does not need to duplicate one independent cache per application instance

That makes it especially useful for horizontally scaled services, API gateways, session-like reusable state, and shared read-heavy data that several workers or services need.

What Shared Caches Solve Well

Shared distributed caches are especially good when:

  • the same answers are reused by many instances
  • a local cache would duplicate too much memory
  • cache warm-up should benefit the whole fleet
  • invalidation needs one central cache keyspace rather than many local copies

They are often used for:

  • shared object and query caches
  • response caches behind APIs
  • reference data reused by several services
  • coordination primitives such as rate-limit counters or idempotency keys, though those require extra care because not every cache product is ideal for correctness-sensitive coordination
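The rate-limit-counter case from the last bullet can be sketched briefly. This assumes a cache offering an atomic increment with a TTL (Redis exposes this via INCR plus EXPIRE); `CounterCacheStub` and `allow_request` are hypothetical names, and the stub only models the semantics, not the network or atomicity guarantees a real deployment must verify.

    ```python
    import time

    class CounterCacheStub:
        """In-memory stand-in for a cache with atomic increment-with-TTL
        semantics, as provided by Redis INCR plus EXPIRE."""

        def __init__(self):
            self._counters = {}  # key -> (count, expires_at)

        def incr_with_ttl(self, key, ttl_seconds):
            now = time.monotonic()
            count, expires_at = self._counters.get(key, (0, now + ttl_seconds))
            if now >= expires_at:
                count, expires_at = 0, now + ttl_seconds  # window rolled over
            count += 1
            self._counters[key] = (count, expires_at)
            return count

    def allow_request(cache, client_id, limit=100, window_seconds=60):
        """Fixed-window rate limit shared by all app instances: every
        instance increments the same counter key for the current window."""
        window = int(time.time() // window_seconds)
        key = f"ratelimit:{client_id}:{window}"
        return cache.incr_with_ttl(key, window_seconds) <= limit
    ```

Because the counter lives in the shared cache rather than in each process, the limit holds across the whole fleet; the correctness caveat in the bullet above is that the increment must really be atomic in the chosen cache product.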

What They Cost

A shared cache is not free just because it is faster than the database. It adds:

  • network hops on every lookup
  • cluster availability and failover concerns
  • its own metrics, alerts, and capacity planning
  • noisy-neighbor and hot-key problems inside the cache itself

It can also become a critical dependency. If the cache fails badly and the origin cannot absorb the redirected load, the whole system may degrade.
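One common way to keep the cache from being a hard dependency is a bypass-to-origin fallback: treat a cache error as a miss and serve from the origin directly. A minimal sketch, where `cached_lookup` is an illustrative name and `ConnectionError` stands in for whatever exception the real cache client raises on an outage:

    ```python
    def cached_lookup(cache, key, fetch_from_origin, ttl_seconds=300):
        """Bypass-to-origin fallback: a cache outage degrades requests to
        origin latency instead of failing them outright. This only helps
        if the origin can absorb the redirected load."""
        try:
            value = cache.get(key)
            if value is not None:
                return value
        except ConnectionError:
            # Cache unreachable: serve from origin, skip the write-back.
            return fetch_from_origin(key)
        value = fetch_from_origin(key)
        try:
            cache.set(key, value, ttl_seconds)
        except ConnectionError:
            pass  # best-effort write-back; the value is still returned
        return value
    ```

Note what this does not solve: if the cache normally absorbs most of the read traffic, bypassing it shifts that entire load onto the origin, which is exactly the degradation scenario described above.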

Example

This configuration sketch shows a shared cache policy for a service fleet. The point is that the cache is a shared infrastructure component with its own limits and fallback decisions.

    cache:
      kind: distributed
      key_prefix: catalog
      default_ttl_seconds: 300
      connection_pool_size: 100
      on_cache_unavailable: bypass-to-origin
      hot_key_protection:
        jitter_ttl: true
        singleflight_fill: true

What to notice:

  • the cache has connection and availability behavior of its own
  • fallback policy matters because the shared cache can fail independently
  • hot-key protection belongs in the design from the start, not only after an incident
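The two hot-key protections named in the config sketch can be illustrated in a few lines. This is an in-process sketch under stated assumptions: `jittered_ttl` spreads expirations so copies of a popular key do not all expire at once, and `singleflight_fill` collapses concurrent misses within one process into a single origin fetch. A cross-process version would need a lease or lock held in the cache itself; all names here are illustrative.

    ```python
    import random
    import threading

    _fill_locks = {}  # key -> Lock (per-process singleflight state)
    _fill_locks_guard = threading.Lock()

    def jittered_ttl(base_ttl_seconds, jitter_fraction=0.1):
        """Randomize TTLs slightly so a hot key's entries do not all
        expire in the same instant and stampede the origin."""
        jitter = base_ttl_seconds * jitter_fraction
        return base_ttl_seconds + random.uniform(-jitter, jitter)

    def singleflight_fill(cache, key, fetch_from_origin, ttl_seconds=300):
        """Collapse concurrent misses on the same key into one origin
        fetch; later callers find the value already cached."""
        with _fill_locks_guard:
            lock = _fill_locks.setdefault(key, threading.Lock())
        with lock:
            value = cache.get(key)  # re-check: an earlier holder may have filled it
            if value is not None:
                return value
            value = fetch_from_origin(key)
            cache.set(key, value, jittered_ttl(ttl_seconds))
            return value
    ```

The re-check inside the lock is the essential step: without it, every waiting caller would still hit the origin once the lock was released.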

Common Mistakes

  • treating a distributed cache as “just memory somewhere”
  • assuming shared caches eliminate all invalidation complexity
  • forgetting that a cache outage may become an origin overload event
  • using one cluster for unrelated workloads without considering noisy-neighbor effects

Design Review Question

Why might a shared distributed cache be preferable to purely local caches in a service with dozens of replicas?

The stronger answer is not only memory efficiency. A shared cache lets a single fill benefit the whole fleet, reduces duplicated warm-up, and gives invalidation one central keyspace to target. The cost is that the cache becomes a network dependency with its own failure modes.

Revised on Thursday, April 23, 2026