How to preserve identity and causality across queues, topics, and background workers where no direct call stack survives.
Asynchronous context propagation is harder than synchronous propagation because the call stack disappears. A message is published, stored, delayed, retried, redelivered, or consumed by another worker long after the original request has finished. If context is not explicitly carried in the message or job envelope, the workflow becomes nearly impossible to reconstruct later.
This means async propagation needs more than a copied trace header. It needs clear rules about which identifiers are inherited, which are newly created for the background step, and how causality is recorded when one event produces another. A worker may legitimately start a new span or even a new trace segment, but it should still preserve enough linkage to tell responders how the work is related.
flowchart LR
A["Ingress request"] --> B["Publish event with context"]
B --> C["Queue or topic"]
C --> D["Worker consumes message"]
D --> E["Worker span or child workflow"]
E --> F["Logs, traces, and follow-on events stay correlated"]
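The flow above can be sketched with nothing but the Python standard library. This is a minimal illustration, not a production pattern: the names `current_context`, `publish`, `consume`, and `log` are hypothetical, and an in-memory `queue.Queue` stands in for a real broker. The publisher copies the active context into the envelope, and the worker restores it before running the handler so that anything logged during the job carries the same identifiers.

```python
import contextvars
import json
import queue
import uuid

# Hypothetical holder for the identifiers of the current unit of work.
current_context = contextvars.ContextVar("current_context", default=None)


def publish(q, event_type, payload):
    """Wrap the payload in an envelope that carries the caller's context."""
    ctx = current_context.get() or {}
    envelope = {
        "event_type": event_type,
        "message_id": f"msg_{uuid.uuid4().hex[:8]}",
        "correlation_id": ctx.get("correlation_id"),
        "trace_id": ctx.get("trace_id"),
        "payload": payload,
    }
    # Serialize so the worker cannot rely on shared in-process state.
    q.put(json.dumps(envelope))
    return envelope


def consume(q, handler):
    """Restore context from the envelope, run the handler, then clean up."""
    envelope = json.loads(q.get())
    token = current_context.set({
        "correlation_id": envelope["correlation_id"],
        "trace_id": envelope["trace_id"],
        # The consumed message is the cause of whatever the worker does next.
        "causation_id": envelope["message_id"],
    })
    try:
        return handler(envelope["payload"])
    finally:
        current_context.reset(token)


def log(message):
    """Return a log record stamped with whatever context is active."""
    return {"message": message, **(current_context.get() or {})}
```

Because the context is rebuilt from the serialized envelope rather than from memory, the same approach works when the worker runs in a different process or starts long after the request has finished.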
Strong async propagation usually preserves at least the identifiers carried in an envelope like this one:
{
  "event_type": "order.created",
  "message_id": "msg_9218",
  "correlation_id": "req_71bd",
  "causation_id": "msg_9134",
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "tenant_id": "tenant_42",
  "operation": "checkout",
  "payload": {
    "order_id": "ord_4441"
  }
}
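When a worker handling an envelope like this emits a follow-on event, the inheritance rules can be made explicit in code. The sketch below is one possible convention, not a standard: `derive_event` and `INHERITED_FIELDS` are hypothetical names, and the choice to inherit exactly `correlation_id`, `trace_id`, and `tenant_id` is an assumption for illustration. The new message gets its own `message_id`, and its `causation_id` points at the message that produced it.

```python
import uuid

# Assumption: these fields describe the whole workflow, so children inherit them.
INHERITED_FIELDS = ("correlation_id", "trace_id", "tenant_id")


def derive_event(parent, event_type, payload):
    """Build a follow-on envelope that records `parent` as its direct cause."""
    child = {name: parent[name] for name in INHERITED_FIELDS if name in parent}
    child["event_type"] = event_type
    # Newly created: every message needs an identity of its own.
    child["message_id"] = f"msg_{uuid.uuid4().hex[:8]}"
    # Causality: the parent's message_id, not the parent's causation_id.
    child["causation_id"] = parent["message_id"]
    child["payload"] = payload
    return child
```

Applied to the envelope above, a derived event would keep `correlation_id` `req_71bd` and carry `causation_id` `msg_9218`, so the chain from request to follow-on work stays traceable one hop at a time.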
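What to notice: message_id identifies this message alone, correlation_id ties it back to the originating request, and causation_id names the upstream message that directly produced it. The trace_id travels with the message so that the worker's spans can be linked to the original trace.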
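A message queue can delay work, reorder work, or replay work. That means responders often need to answer questions that rarely come up in synchronous systems, such as which request or upstream event produced a message, and whether a given delivery is a first attempt or a replay.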
If those links are absent, async systems feel especially opaque under pressure.
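When the links are present, those questions can be answered mechanically. The following sketch assumes a list of already-collected envelopes (for example, pulled from logs) that each carry `message_id` and `causation_id`. The helper names `causal_chain` and `redelivered` are hypothetical.

```python
from collections import Counter


def causal_chain(messages, message_id):
    """Walk causation_id links from a message back to its earliest known cause."""
    by_id = {m["message_id"]: m for m in messages}
    chain = []
    seen = set()  # guards against a malformed, cyclic causation chain
    current = message_id
    while current in by_id and current not in seen:
        seen.add(current)
        chain.append(current)
        current = by_id[current].get("causation_id")
    return chain


def redelivered(messages):
    """Return message_ids seen more than once, which indicates a retry or replay."""
    counts = Counter(m["message_id"] for m in messages)
    return sorted(mid for mid, n in counts.items() if n > 1)
```

The walk stops at the first message whose cause was never recorded, which is exactly where propagation broke down and where a responder loses the trail.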
If a worker emits errors about a failed order-processing job but the team cannot determine which original request or upstream event created that job, what is the deeper observability gap?
The stronger answer is weak asynchronous context propagation. The work is visible, but its place in the larger workflow is not.