Networked caches shared by many instances and the trade-offs between coherence, latency, and operational overhead.
A shared distributed cache sits on the network and is reused by many application instances or services. Unlike per-process caches, it provides one logically shared lookup layer. That often makes invalidation, warm-up, and reuse more predictable across a fleet. If one instance populates a key, other instances can benefit from the same cached result.
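This fill-once, reuse-everywhere behavior is the cache-aside pattern. A minimal sketch, assuming a dict-backed `SharedCache` class as a stand-in for a real networked cache such as Redis or Memcached (all names here are illustrative):

```python
import time


class SharedCache:
    """In-memory stand-in for a networked cache shared by many instances."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # lazily drop expired entries
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def get_product(cache, product_id, load_from_db):
    """Cache-aside read: any instance that misses fills the shared cache."""
    key = f"catalog:product:{product_id}"
    value = cache.get(key)
    if value is None:
        value = load_from_db(product_id)        # only a miss touches the origin
        cache.set(key, value, ttl_seconds=300)  # later calls, from any instance, reuse this
    return value
```

Instance A's miss populates the key; an identical call from instance B is then served from the shared cache without touching the origin.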
The trade-off is that the cache is now its own distributed system. It introduces network latency, availability concerns, memory management, eviction behavior, and operational dependencies that local caches do not have. A shared cache is usually slower than an in-process one but much easier to reason about across multiple app instances.
```mermaid
flowchart LR
    A["App instance A"] --> C["Shared distributed cache"]
    B["App instance B"] --> C
    D["App instance C"] --> C
    C --> E["Origin database or service"]
```
This cache type appears in many production architectures because it balances two competing goals: sharing one coherent set of cached results across a whole fleet, and keeping latency and operational overhead low enough that the cache pays for itself. That makes it especially useful for horizontally scaled services, API gateways, session-like reusable state, and shared read-heavy data that several workers or services need.
Shared distributed caches are especially good when:

- many replicas would otherwise compute or fetch the same values independently
- warm-up is expensive and one fill should benefit the whole fleet, not one process
- invalidation needs a single keyspace to target rather than N per-process caches

They are often used for:

- read-heavy shared data needed by many workers or services
- responses cached at API gateways
- session-like reusable state behind horizontally scaled services
A shared cache is not free just because it is faster than the database. It adds:

- a network round trip on every lookup
- an availability dependency with its own failure modes
- memory management and eviction behavior to tune and monitor
- operational overhead: deployment, capacity planning, and upgrades
It can also become a critical dependency. If the cache fails badly and the origin cannot absorb the redirected load, the whole system may degrade.
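One way to keep the cache from becoming a hard dependency is to treat any cache error as a miss and go straight to the origin. A minimal sketch of that bypass behavior; the exception type and the injected `cache_get`/`origin_get` callables are assumptions, not a real client API:

```python
class CacheUnavailable(Exception):
    """Raised by a (hypothetical) cache client on timeouts or connection errors."""


def read_with_bypass(cache_get, origin_get, key):
    """Bypass-to-origin: a cache outage degrades latency, not availability.

    Caveat from the text above: when the cache is down, every request lands
    on the origin, so the origin must be able to absorb that redirected load.
    """
    try:
        value = cache_get(key)
        if value is not None:
            return value
    except CacheUnavailable:
        pass  # treat an unavailable cache exactly like a miss
    return origin_get(key)
```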
This configuration sketch shows a shared cache policy for a service fleet. The point is that the cache is a shared infrastructure component with its own limits and fallback decisions.
```yaml
cache:
  kind: distributed
  key_prefix: catalog
  default_ttl_seconds: 300
  connection_pool_size: 100
  on_cache_unavailable: bypass-to-origin
  hot_key_protection:
    jitter_ttl: true
    singleflight_fill: true
```
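Both hot-key protections can be approximated in application code: jitter spreads expirations so a popular key's refills do not synchronize, and singleflight collapses concurrent refills of one key into a single origin call. A minimal sketch under those assumptions (class and helper names are illustrative):

```python
import random
import threading


def jittered_ttl(base_ttl_seconds, spread=0.1):
    """Randomize a TTL by +/- spread so hot keys do not all expire at once."""
    return base_ttl_seconds * random.uniform(1 - spread, 1 + spread)


class SingleFlight:
    """Collapse concurrent fills of the same key into one origin call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (done event, result holder)

    def fill(self, key, loader):
        with self._lock:
            entry = self._inflight.get(key)
            if entry is None:
                event, holder = threading.Event(), {}
                self._inflight[key] = (event, holder)
                leader = True
            else:
                event, holder = entry
                leader = False
        if leader:
            try:
                holder["value"] = loader()  # only the leader hits the origin
            except Exception as exc:
                holder["error"] = exc       # propagate failures to waiters too
            finally:
                event.set()
                with self._lock:
                    del self._inflight[key]
        else:
            event.wait()  # reuse the leader's result instead of re-loading
        if "error" in holder:
            raise holder["error"]
        return holder["value"]
```

A fill would then be stored with `jittered_ttl(300)` rather than a fixed 300 seconds.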
What to notice:

- `default_ttl_seconds` bounds staleness: every entry expires within five minutes
- `on_cache_unavailable: bypass-to-origin` turns a cache outage into a latency problem rather than an availability problem, provided the origin can absorb the load
- `jitter_ttl` and `singleflight_fill` protect hot keys: jitter spreads expirations, and singleflight collapses concurrent refills into one origin call
- `connection_pool_size` caps how many connections each instance holds against the shared cache
Why might a shared distributed cache be preferable to purely local caches in a service with dozens of replicas?
The stronger answer is not only memory efficiency. A shared cache lets one instance's cache fill benefit the whole fleet, reduces duplicated warm-up, and gives invalidation one central keyspace to target. The cost is that the cache becomes a network dependency with its own failure modes.
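The "one central keyspace" point can be made concrete: because every instance reads through the same key namespace, deleting keys under a shared prefix invalidates that data for the entire fleet at once. A sketch against a plain dict standing in for the shared cache (a real deployment would use the cache's own delete or key-scan commands):

```python
def invalidate_prefix(store, prefix):
    """Delete every cached key under a prefix, e.g. after a catalog update.

    `store` is a dict standing in for the shared cache. One invalidation here
    affects every instance; there are no per-process caches to chase down.
    Returns the number of keys removed.
    """
    doomed = [key for key in store if key.startswith(prefix)]
    for key in doomed:
        del store[key]
    return len(doomed)
```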