How latency, traffic, errors, and saturation create a compact service-health model and where that model still needs local adaptation.
Golden signals are a compact way to describe service health through latency, traffic, errors, and saturation. The idea is valuable because it gives teams a small default set of questions to ask before every system invents its own health model. When a service degrades, responders usually need to know some version of the same things: are requests slower, has load changed, are failures increasing, and is the system running out of some constrained resource?
That compact model is useful, but it should not be treated as magic vocabulary. A background worker, a streaming pipeline, and an interactive API do not express “traffic” or “latency” in identical ways. The principle is stable; the concrete metric set still needs to reflect how the system creates user value.
```mermaid
flowchart LR
  A["User request or workload"] --> B["Traffic"]
  A --> C["Latency"]
  A --> D["Errors"]
  A --> E["Saturation"]
  B --> F["Service health view"]
  C --> F
  D --> F
  E --> F
```
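Because each signal means something different per workload, it can help to write the translation down explicitly. The sketch below is illustrative, not canonical: the workload names and metric phrasings are assumptions chosen to match the three system types mentioned above.

```python
# Illustrative mapping (an assumption, not a standard): what each golden
# signal typically means for three common workload shapes.
SIGNAL_MEANING = {
    "interactive_api": {
        "traffic": "requests per second",
        "latency": "request duration percentiles",
        "errors": "5xx responses / total responses",
        "saturation": "worker or connection pool utilization",
    },
    "background_worker": {
        "traffic": "jobs dequeued per second",
        "latency": "job completion time plus queue wait",
        "errors": "failed or retried jobs / total jobs",
        "saturation": "queue depth vs. drain rate",
    },
    "streaming_pipeline": {
        "traffic": "events consumed per second",
        "latency": "end-to-end event lag",
        "errors": "dropped or dead-lettered events / total events",
        "saturation": "consumer lag vs. partition throughput",
    },
}
```

A table like this makes it obvious when a team has silently skipped a signal for its workload.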
The practical value of the model is prioritization. Used together, these signals tell a better story than any single chart: rising latency without rising traffic suggests a different problem than rising latency during a sudden demand spike, and rising errors with flat saturation suggests a different failure mode than high error rates during resource exhaustion.
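The combinations described above can be sketched as a small triage function. This is a minimal illustration of the reasoning, not a production detector; the trend booleans and hypothesis strings are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Simplified trend summary of the four golden signals."""
    latency_rising: bool
    traffic_rising: bool
    errors_rising: bool
    saturation_high: bool


def first_hypothesis(s: Signals) -> str:
    """Map a combination of signal trends to a starting hypothesis."""
    if s.latency_rising and s.traffic_rising:
        return "demand spike: latency climbing under increased load"
    if s.latency_rising:
        return "internal slowdown: latency up while traffic is flat"
    if s.errors_rising and s.saturation_high:
        return "resource exhaustion: failures under saturation"
    if s.errors_rising:
        return "fault without exhaustion: errors up, saturation flat"
    return "no combined story yet; inspect each signal individually"
```

The point is not the specific rules but that pairs of signals, read together, narrow the search faster than any one chart.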
```yaml
service_health:
  latency:
    metric: http_request_duration_seconds
    focus: ["p50", "p95", "p99"]
  traffic:
    metric: http_requests_total
    derived_view: requests_per_second
  errors:
    metric: http_requests_total
    filter: 'status=~"5.."'
    derived_view: error_rate
  saturation:
    metric: worker_pool_in_use_ratio
    threshold: 0.85
```
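The two derived views named in the config reduce to simple arithmetic over counter samples. A minimal sketch, assuming `http_requests_total` is a monotonically increasing counter sampled at two points in time:

```python
def requests_per_second(total_now: float, total_before: float, window_s: float) -> float:
    """Traffic: rate over a window from two samples of a cumulative counter.
    The max() guards against a counter reset between samples."""
    return max(total_now - total_before, 0.0) / window_s


def error_rate(errors_delta: float, requests_delta: float) -> float:
    """Errors: fraction of requests in the window that failed (e.g. 5xx),
    guarding against a zero-traffic window."""
    return errors_delta / requests_delta if requests_delta > 0 else 0.0


rps = requests_per_second(10_500, 10_200, 60)  # 300 requests over 60s -> 5.0
err = error_rate(3, 300)                       # 3 failures of 300 -> 0.01
```

In practice a metrics backend computes these for you; the sketch only shows what the derived views mean.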
A good service-health panel often includes the golden signals plus a few service-specific metrics.
The mistake is not extending the model. The mistake is either treating the generic four as enough for every workload, or exploding the dashboard with dozens of equally important charts so that nothing is actually important.
If a queue consumer team tracks only CPU and memory, but never monitors backlog growth or message age, what part of service health is missing?
The missing piece is workload-specific meaning for traffic and latency. The team is watching machine resources, but not the flow and freshness of the work the service exists to process.
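For that queue consumer, the workload-specific signals reduce to two small calculations. The metric names here are hypothetical, chosen only to illustrate backlog growth and message age:

```python
def backlog_growth_per_s(depth_now: int, depth_before: int, window_s: float) -> float:
    """Traffic imbalance: net messages added per second. Sustained positive
    growth means consumers are falling behind producers."""
    return (depth_now - depth_before) / window_s


def oldest_message_age_s(oldest_enqueued_at: float, now: float) -> float:
    """Freshness: how long the head of the queue has been waiting.
    This plays the role of 'latency' for a queue consumer."""
    return now - oldest_enqueued_at


growth = backlog_growth_per_s(1_200, 600, 60)  # +600 messages in 60s -> 10.0/s
age = oldest_message_age_s(100.0, 145.0)       # head has waited 45.0s
```

CPU and memory can look perfectly healthy while both of these numbers climb, which is exactly the gap the question points at.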