Correlated Request/Reply

March 23, 2026

A practical lesson on request/reply over messaging: correlation IDs, reply channels, and when the pattern is a fit versus a disguised form of RPC.

Correlated request/reply uses messaging for an interaction where one party still expects a response related to a specific request. Instead of keeping an HTTP connection open, the requester sends a message with a correlation identifier and usually a reply channel or subject. The responder later publishes a matching reply that the requester can associate with the original request.

This pattern is useful when the environment is already message-centric, when temporary disconnects make direct request/response awkward, or when reply latency is variable enough that a broker-mediated interaction is operationally cleaner. It becomes a smell when teams use it for almost every service-to-service call even though the interaction still behaves like synchronous RPC.

    sequenceDiagram
        participant R as Requester
        participant B as Broker
        participant S as Responder

        R->>B: request with correlationId and replyTo
        B-->>S: request
        S->>B: reply with correlationId
        B-->>R: matched reply

What to notice:

the interaction is still request-shaped even though the transport is asynchronous
correlation data is part of the contract, not an implementation detail
timeout and reply-matching logic move into the application layer
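
The flow in the diagram can be sketched in a few lines of TypeScript. This is an in-memory illustration rather than a real broker client: `InMemoryBroker`, the `Message` shape, and the `inventory.requests` subject are assumptions made for the example, and delivery is synchronous so the sequence is easy to follow.

```typescript
// Hypothetical message shape: requests carry replyTo, replies do not.
type Message = {
  correlationId: string;
  replyTo?: string;
  body: unknown;
};

type Handler = (msg: Message) => void;

// Toy stand-in for a broker: subjects map to subscriber callbacks.
class InMemoryBroker {
  private handlers = new Map<string, Handler[]>();

  subscribe(subject: string, handler: Handler): void {
    const list = this.handlers.get(subject) ?? [];
    list.push(handler);
    this.handlers.set(subject, list);
  }

  publish(subject: string, msg: Message): void {
    for (const handler of this.handlers.get(subject) ?? []) {
      handler(msg);
    }
  }
}

const broker = new InMemoryBroker();

// Responder: answers on the requested reply subject and echoes
// the correlationId unchanged.
broker.subscribe("inventory.requests", (req) => {
  if (!req.replyTo) return; // nowhere to send a reply
  broker.publish(req.replyTo, {
    correlationId: req.correlationId,
    body: { reserved: true },
  });
});

// Requester: tracks in-flight requests by correlationId.
const pending = new Map<string, (body: unknown) => void>();
const received: unknown[] = [];

broker.subscribe("inventory-replies", (reply) => {
  const resolve = pending.get(reply.correlationId);
  if (!resolve) return; // unknown or already-handled reply: ignore it
  pending.delete(reply.correlationId);
  resolve(reply.body);
});

pending.set("corr-1", (body) => {
  received.push(body);
});
broker.publish("inventory.requests", {
  correlationId: "corr-1",
  replyTo: "inventory-replies",
  body: { sku: "SKU-19", quantity: 2 },
});
```

Even in this toy version the requester owns a `pending` map and the matching logic, which is the application-layer cost the third point refers to.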

Where the Pattern Helps

Correlated request/reply is strong when:

the caller needs an answer, but not necessarily on an open synchronous connection
transport-level buffering or broker routing is already part of the architecture
intermittent connectivity or workflow delay makes direct RPC inconvenient
request and reply should be observable in the event infrastructure

Examples include asynchronous inventory reservation, long-running document generation, or message-driven workflows where the next step depends on a discrete result.

Correlation IDs and Reply Channels

The pattern fails quickly if request/reply matching is vague. A robust request message usually includes:

request ID
correlation ID
reply destination
timeout or expiry expectation
requester identity where needed

    {
      "requestType": "inventory.reserve",
      "requestId": "req_01882",
      "correlationId": "ord_48392",
      "replyTo": "inventory-replies",
      "expiresAt": "2026-03-23T16:30:00Z",
      "data": {
        "sku": "SKU-19",
        "quantity": 2
      }
    }

The reply must preserve enough correlation context for the requester to match it safely, especially if many requests are in flight at once.
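
One way to make that requirement explicit is to build the reply directly from the request, so the correlation fields are copied and never retyped. The sketch below assumes a reply envelope that the lesson does not define: `ReplyEnvelope`, its `inReplyTo` and `status` fields, and the `.reply` type suffix are all hypothetical.

```typescript
// Request envelope modeled on the JSON example above.
type RequestEnvelope = {
  requestType: string;
  requestId: string;
  correlationId: string;
  replyTo: string;
  expiresAt: string; // ISO 8601 timestamp
  data: unknown;
};

// Assumed reply shape; field names are illustrative, not a standard.
type ReplyEnvelope = {
  replyType: string;
  correlationId: string; // copied from the request, never regenerated
  inReplyTo: string; // the requestId this reply answers
  status: "ok" | "rejected";
  data: unknown;
};

// Responder side: derive the reply's correlation context from the request.
function buildReply(
  request: RequestEnvelope,
  status: ReplyEnvelope["status"],
  data: unknown,
): ReplyEnvelope {
  return {
    replyType: `${request.requestType}.reply`,
    correlationId: request.correlationId,
    inReplyTo: request.requestId,
    status,
    data,
  };
}

// Requester side: accept a reply only if both identifiers match.
function matchesRequest(reply: ReplyEnvelope, request: RequestEnvelope): boolean {
  return (
    reply.correlationId === request.correlationId &&
    reply.inReplyTo === request.requestId
  );
}

const exampleRequest: RequestEnvelope = {
  requestType: "inventory.reserve",
  requestId: "req_01882",
  correlationId: "ord_48392",
  replyTo: "inventory-replies",
  expiresAt: "2026-03-23T16:30:00Z",
  data: { sku: "SKU-19", quantity: 2 },
};

const exampleReply = buildReply(exampleRequest, "ok", { reserved: 2 });
```

Checking both identifiers is a design choice for this sketch: it lets the requester tell apart two requests that share a correlation ID, such as a retry within the same interaction.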

Why It Can Become Disguised RPC

The pattern becomes weak when teams standardize on it for nearly every interaction. If every service sends a request message and then waits immediately for one reply before doing anything else, the architecture has not really become event-driven. It has rebuilt request/response with:

more hops
more timeout logic
more tracing complexity
more correlation-state handling

This is not automatically wrong. It is just not automatically simpler because the transport uses a broker.

Timeouts and Late Replies

One of the hardest operational parts of correlated request/reply is dealing with missing replies and late replies. The requester needs a policy:

how long to wait
what to do if no reply comes
how to treat a reply that arrives after the caller timed out
whether duplicate replies are possible and how they are detected and ignored

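    // Requester-side record for one in-flight request.
    type PendingRequest = {
      correlationId: string; // key used to match the incoming reply
      expiresAt: Date; // deadline after which the caller gives up
      status: "waiting" | "completed" | "timed_out";
    };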

This small state model illustrates the real cost of the pattern: the requester now owns request lifecycle tracking.
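
A minimal sketch of that lifecycle tracking follows, assuming an in-memory store and a policy of dropping late and duplicate replies. The `PendingRequests` class, its method names, and the `ReplyOutcome` labels are hypothetical; a real requester might persist this state or route late replies to compensation logic instead.

```typescript
type PendingRequest = {
  correlationId: string;
  expiresAt: Date;
  status: "waiting" | "completed" | "timed_out";
};

type ReplyOutcome = "accepted" | "late" | "duplicate" | "unknown";

class PendingRequests {
  private byId = new Map<string, PendingRequest>();

  // Register a request as in flight with its deadline.
  track(correlationId: string, expiresAt: Date): void {
    this.byId.set(correlationId, { correlationId, expiresAt, status: "waiting" });
  }

  // Mark every waiting request whose deadline has passed as timed out
  // and return the affected correlation IDs.
  expire(now: Date): string[] {
    const expired: string[] = [];
    this.byId.forEach((req) => {
      if (req.status === "waiting" && now.getTime() >= req.expiresAt.getTime()) {
        req.status = "timed_out";
        expired.push(req.correlationId);
      }
    });
    return expired;
  }

  // Classify an incoming reply instead of blindly accepting it.
  onReply(correlationId: string, now: Date): ReplyOutcome {
    const req = this.byId.get(correlationId);
    if (!req) return "unknown"; // never sent, or already forgotten
    if (req.status === "completed") return "duplicate";
    if (req.status === "timed_out" || now.getTime() >= req.expiresAt.getTime()) {
      req.status = "timed_out";
      return "late"; // the caller has already given up
    }
    req.status = "completed";
    return "accepted";
  }
}
```

Passing `now` in explicitly, rather than reading the clock inside the class, keeps the transitions deterministic and easy to test.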

Common Mistakes

using request/reply over messaging for interactions that are really just low-latency RPC
omitting expiry and timeout policy from the contract
failing to define how late replies are handled
confusing correlation IDs with business IDs and mixing their purposes
assuming messaging transport removes the need for clear caller dependency design

Design Review Question

A platform team requires all inter-service traffic to use correlated request/reply over the broker, even for simple read calls that need immediate answers. Why is that not automatically a good standard?

Because standardizing the transport does not change the shape of the dependency by itself. If most calls are still prompt request/response interactions, the caller still blocks on the responder, and the system may simply be rebuilding RPC with extra broker hops, reply routing, and timeout complexity instead of using the most direct integration style for the problem.

Revised on Wednesday, June 3, 2026
