Shared and Distributed Caches

Networked caches shared by many instances and the trade-offs between coherence, latency, and operational overhead.

A shared distributed cache sits on the network and is reused by many application instances or services. Unlike per-process caches, it provides one logically shared lookup layer. That often makes invalidation, warm-up, and reuse more predictable across a fleet. If one instance populates a key, other instances can benefit from the same cached result.

The trade-off is that the cache is now its own distributed system. It introduces network latency, availability concerns, memory management, eviction behavior, and operational dependencies that local caches do not have. A shared cache is usually slower than an in-process one but much easier to reason about across multiple app instances.

    flowchart LR
        A["App instance A"] --> C["Shared distributed cache"]
        B["App instance B"] --> C
        D["App instance C"] --> C
        C --> E["Origin database or service"]
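The populate-once, reuse-everywhere behavior is usually implemented as a read-through lookup: check the cache, and on a miss fetch from the origin and write the result back. A minimal sketch follows; `SharedCacheStub` and `get_or_fill` are illustrative names, and the in-memory dict stands in for a networked client such as a Redis or Memcached library, where each `get` and `set` would be a network call.

    ```python
    import time

    class SharedCacheStub:
        """In-memory stand-in for a networked cache shared by many
        instances. A real client would make a network call per operation."""

        def __init__(self):
            self._store = {}  # key -> (value, expires_at)

        def get(self, key):
            entry = self._store.get(key)
            if entry is None:
                return None
            value, expires_at = entry
            if time.monotonic() >= expires_at:
                del self._store[key]  # lazily expire stale entries
                return None
            return value

        def set(self, key, value, ttl_seconds):
            self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get_or_fill(cache, key, fetch_from_origin, ttl_seconds=300):
        """Read-through lookup: whichever instance fills the key first
        makes the cached result visible to every other instance."""
        value = cache.get(key)
        if value is not None:
            return value
        value = fetch_from_origin(key)
        cache.set(key, value, ttl_seconds)
        return value
    ```

With this shape, a second instance calling `get_or_fill` for the same key within the TTL never touches the origin at all.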

Why It Matters

This cache type appears in many production architectures because it balances two competing goals:

  • it is slower than local memory, but still much cheaper than hitting the origin repeatedly
  • it is shared, so the fleet does not need to duplicate one independent cache per application instance

That makes it especially useful for horizontally scaled services, API gateways, session-like reusable state, and shared read-heavy data that several workers or services need.

What Shared Caches Solve Well

Shared distributed caches are especially good when:

  • the same answers are reused by many instances
  • a local cache would duplicate too much memory
  • cache warm-up should benefit the whole fleet
  • invalidation needs one central cache keyspace rather than many local copies

They are often used for:

  • shared object and query caches
  • response caches behind APIs
  • reference data reused by several services
  • coordination primitives such as rate-limit counters or idempotency keys, though those require extra care because not every cache product is ideal for correctness-sensitive coordination
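The rate-limit-counter case from the last bullet can be sketched briefly. This assumes a cache offering an atomic increment with a TTL (Redis exposes this via INCR plus EXPIRE); `CounterCacheStub` and `allow_request` are hypothetical names, and the stub only models the semantics, not the network or atomicity guarantees a real deployment must verify.

    ```python
    import time

    class CounterCacheStub:
        """In-memory stand-in for a cache with atomic increment-with-TTL
        semantics, as provided by Redis INCR plus EXPIRE."""

        def __init__(self):
            self._counters = {}  # key -> (count, expires_at)

        def incr_with_ttl(self, key, ttl_seconds):
            now = time.monotonic()
            count, expires_at = self._counters.get(key, (0, now + ttl_seconds))
            if now >= expires_at:
                count, expires_at = 0, now + ttl_seconds  # window rolled over
            count += 1
            self._counters[key] = (count, expires_at)
            return count

    def allow_request(cache, client_id, limit=100, window_seconds=60):
        """Fixed-window rate limit shared by all app instances: every
        instance increments the same counter key for the current window."""
        window = int(time.time() // window_seconds)
        key = f"ratelimit:{client_id}:{window}"
        return cache.incr_with_ttl(key, window_seconds) <= limit
    ```

Because the counter lives in the shared cache rather than in each process, the limit holds across the whole fleet; the correctness caveat in the bullet above is that the increment must really be atomic in the chosen cache product.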

What They Cost

A shared cache is not free just because it is faster than the database. It adds:

  • network hops on every lookup
  • cluster availability and failover concerns
  • its own metrics, alerts, and capacity planning
  • noisy-neighbor and hot-key problems inside the cache itself

It can also become a critical dependency. If the cache fails badly and the origin cannot absorb the redirected load, the whole system may degrade.
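One common way to keep the cache from being a hard dependency is a bypass-to-origin fallback: treat a cache error as a miss and serve from the origin directly. A minimal sketch, where `cached_lookup` is an illustrative name and `ConnectionError` stands in for whatever exception the real cache client raises on an outage:

    ```python
    def cached_lookup(cache, key, fetch_from_origin, ttl_seconds=300):
        """Bypass-to-origin fallback: a cache outage degrades requests to
        origin latency instead of failing them outright. This only helps
        if the origin can absorb the redirected load."""
        try:
            value = cache.get(key)
            if value is not None:
                return value
        except ConnectionError:
            # Cache unreachable: serve from origin, skip the write-back.
            return fetch_from_origin(key)
        value = fetch_from_origin(key)
        try:
            cache.set(key, value, ttl_seconds)
        except ConnectionError:
            pass  # best-effort write-back; the value is still returned
        return value
    ```

Note what this does not solve: if the cache normally absorbs most of the read traffic, bypassing it shifts that entire load onto the origin, which is exactly the degradation scenario described above.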

Example

This configuration sketch shows a shared cache policy for a service fleet. The point is that the cache is a shared infrastructure component with its own limits and fallback decisions.

    cache:
      kind: distributed
      key_prefix: catalog
      default_ttl_seconds: 300
      connection_pool_size: 100
      on_cache_unavailable: bypass-to-origin
      hot_key_protection:
        jitter_ttl: true
        singleflight_fill: true

What to notice:

  • the cache has connection and availability behavior of its own
  • fallback policy matters because the shared cache can fail independently
  • hot-key protection belongs in the design from the start, not only after an incident
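The two hot-key protections named in the config sketch can be illustrated in a few lines. This is an in-process sketch under stated assumptions: `jittered_ttl` spreads expirations so copies of a popular key do not all expire at once, and `singleflight_fill` collapses concurrent misses within one process into a single origin fetch. A cross-process version would need a lease or lock held in the cache itself; all names here are illustrative.

    ```python
    import random
    import threading

    _fill_locks = {}  # key -> Lock (per-process singleflight state)
    _fill_locks_guard = threading.Lock()

    def jittered_ttl(base_ttl_seconds, jitter_fraction=0.1):
        """Randomize TTLs slightly so a hot key's entries do not all
        expire in the same instant and stampede the origin."""
        jitter = base_ttl_seconds * jitter_fraction
        return base_ttl_seconds + random.uniform(-jitter, jitter)

    def singleflight_fill(cache, key, fetch_from_origin, ttl_seconds=300):
        """Collapse concurrent misses on the same key into one origin
        fetch; later callers find the value already cached."""
        with _fill_locks_guard:
            lock = _fill_locks.setdefault(key, threading.Lock())
        with lock:
            value = cache.get(key)  # re-check: an earlier holder may have filled it
            if value is not None:
                return value
            value = fetch_from_origin(key)
            cache.set(key, value, jittered_ttl(ttl_seconds))
            return value
    ```

The re-check inside the lock is the essential step: without it, every waiting caller would still hit the origin once the lock was released.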

Common Mistakes

  • treating a distributed cache as “just memory somewhere”
  • assuming shared caches eliminate all invalidation complexity
  • forgetting that a cache outage may become an origin overload event
  • using one cluster for unrelated workloads without considering noisy-neighbor effects

Design Review Question

Why might a shared distributed cache be preferable to purely local caches in a service with dozens of replicas?

The stronger answer is not only memory efficiency. A shared cache lets a single fill benefit the whole fleet, reduces duplicated warm-up, and gives invalidation one central keyspace to target. The cost is that the cache becomes a network dependency with its own failure modes.

Revised on Thursday, April 23, 2026