Correlated Request/Reply

A practical lesson on request-reply over messaging, including correlation IDs, reply channels, and when the pattern is a fit versus a disguised form of RPC.

Correlated request/reply uses messaging for an interaction where one party still expects a response related to a specific request. Instead of keeping an HTTP connection open, the requester sends a message with a correlation identifier and usually a reply channel or subject. The responder later publishes a matching reply that the requester can associate with the original request.

This pattern is useful when the environment is already message-centric, when temporary disconnects make direct request/response awkward, or when reply latency is variable enough that a broker-mediated interaction is operationally cleaner. It becomes a smell when teams use it for almost every service-to-service call even though the interaction still behaves like synchronous RPC.

    sequenceDiagram
        participant R as Requester
        participant B as Broker
        participant S as Responder

        R->>B: request with correlationId and replyTo
        B-->>S: request
        S->>B: reply with correlationId
        B-->>R: matched reply

What to notice:

  • the interaction is still request-shaped even though the transport is asynchronous
  • correlation data is part of the contract, not an implementation detail
  • timeout and reply-matching logic move into the application layer
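The flow in the diagram can be sketched in a few lines. This is a minimal illustration, not a production client: `InMemoryBroker`, `Requester`, `startResponder`, and the channel names are hypothetical, and the broker here delivers messages synchronously in-process, which a real broker would not do.

```typescript
// Hypothetical in-memory stand-in for a real broker (assumption: synchronous delivery).
type Message = {
  correlationId: string;
  replyTo?: string;
  data: unknown;
};

type Handler = (msg: Message) => void;

class InMemoryBroker {
  private handlers = new Map<string, Handler[]>();

  subscribe(channel: string, handler: Handler): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }

  publish(channel: string, msg: Message): void {
    for (const handler of this.handlers.get(channel) ?? []) {
      handler(msg);
    }
  }
}

class Requester {
  // Callbacks for requests that have not been answered yet, keyed by correlation ID.
  private pending = new Map<string, (reply: Message) => void>();
  private nextId = 0;
  private broker: InMemoryBroker;
  private replyChannel: string;

  constructor(broker: InMemoryBroker, replyChannel: string) {
    this.broker = broker;
    this.replyChannel = replyChannel;
    broker.subscribe(replyChannel, (reply) => {
      const onReply = this.pending.get(reply.correlationId);
      if (!onReply) return; // unknown or already-answered correlation ID: ignore
      this.pending.delete(reply.correlationId);
      onReply(reply);
    });
  }

  send(channel: string, data: unknown, onReply: (reply: Message) => void): string {
    const correlationId = `corr_${++this.nextId}`;
    this.pending.set(correlationId, onReply); // register before publishing
    this.broker.publish(channel, { correlationId, replyTo: this.replyChannel, data });
    return correlationId;
  }
}

// The responder copies the correlation ID into its reply and publishes to replyTo.
function startResponder(broker: InMemoryBroker, channel: string): void {
  broker.subscribe(channel, (req) => {
    if (!req.replyTo) return; // nowhere to send a reply
    broker.publish(req.replyTo, {
      correlationId: req.correlationId,
      data: { reserved: true },
    });
  });
}
```

Even in this toy version, the requester has to keep a table of pending requests and decide what to do with replies it cannot match. That bookkeeping is the application-layer cost the list above refers to.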

Where the Pattern Helps

Correlated request/reply is strong when:

  • the caller needs an answer, but not necessarily on an open synchronous connection
  • transport-level buffering or broker routing is already part of the architecture
  • intermittent connectivity or workflow delay makes direct RPC inconvenient
  • request and reply should be observable in the event infrastructure

Examples include asynchronous inventory reservation, long-running document generation, or message-driven workflows where the next step depends on a discrete result.

Correlation IDs and Reply Channels

The pattern fails quickly if the rules for matching a reply to its request are vague. A robust request message usually includes:

  • request ID
  • correlation ID
  • reply destination
  • timeout or expiry expectation
  • requester identity where needed

    {
      "requestType": "inventory.reserve",
      "requestId": "req_01882",
      "correlationId": "ord_48392",
      "replyTo": "inventory-replies",
      "expiresAt": "2026-03-23T16:30:00Z",
      "data": {
        "sku": "SKU-19",
        "quantity": 2
      }
    }

The reply must preserve enough correlation context for the requester to match it safely, especially if many requests are in flight at once.
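One way to preserve that context is sketched below. The request type mirrors the example message above; the reply shape (`replyType`, `inReplyTo`) and the helper names `buildReply` and `matchReply` are hypothetical. The sketch also assumes that `requestId` is unique per request while `correlationId` may be shared by several requests in the same flow, which the example message suggests but does not state.

```typescript
type ReserveRequest = {
  requestType: "inventory.reserve";
  requestId: string;
  correlationId: string;
  replyTo: string;
  expiresAt: string; // ISO 8601 timestamp
  data: { sku: string; quantity: number };
};

// Hypothetical reply shape: it echoes both identifiers from the request.
type ReserveReply = {
  replyType: "inventory.reserve.result";
  inReplyTo: string; // requestId of the request being answered
  correlationId: string; // copied unchanged from the request
  data: { sku: string; reserved: boolean };
};

// Responder side: build the reply from the request so identifiers cannot drift.
function buildReply(req: ReserveRequest, reserved: boolean): ReserveReply {
  return {
    replyType: "inventory.reserve.result",
    inReplyTo: req.requestId,
    correlationId: req.correlationId,
    data: { sku: req.data.sku, reserved },
  };
}

// Requester side: accept a reply only if both identifiers agree with a request
// that is still in flight. Returns the matched request, or undefined.
function matchReply(
  inFlight: Map<string, ReserveRequest>,
  reply: ReserveReply,
): ReserveRequest | undefined {
  const req = inFlight.get(reply.inReplyTo);
  if (!req || req.correlationId !== reply.correlationId) return undefined;
  return req;
}
```

Checking both fields is a design choice for this sketch: with many requests in flight, a lookup by `requestId` finds the candidate, and the `correlationId` comparison rejects replies that were misrouted or built against the wrong request.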

Why It Can Become Disguised RPC

The pattern becomes weak when teams standardize on it for nearly every interaction. If every service sends a request message and then waits immediately for one reply before doing anything else, the architecture has not really become event-driven. It has rebuilt request/response with:

  • more hops
  • more timeout logic
  • more tracing complexity
  • more correlation-state handling

This is not automatically wrong. It is just not automatically simpler merely because the transport uses a broker.

Timeouts and Late Replies

One of the hardest operational parts of correlated request/reply is dealing with non-answers and late answers. The requester needs a policy:

  • how long to wait
  • what to do if no reply comes
  • how to treat a reply that arrives after the caller timed out
  • whether duplicate replies are possible and how they are ignored

    type PendingRequest = {
      correlationId: string;
      expiresAt: Date;
      status: "waiting" | "completed" | "timed_out";
    };

This small state model illustrates the real cost of the pattern: the requester now owns request lifecycle tracking.
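As a rough sketch of what that tracking can look like, the class below applies one possible policy to the `PendingRequest` model: replies after the deadline are reported as late, and a second reply to a completed request is reported as a duplicate. `RequestTracker`, `ReplyOutcome`, `sweep`, and `onReply` are hypothetical names, and the current time is passed in explicitly so the behavior is easy to test.

```typescript
type PendingRequest = {
  correlationId: string;
  expiresAt: Date;
  status: "waiting" | "completed" | "timed_out";
};

// Hypothetical classification of an incoming reply.
type ReplyOutcome = "accepted" | "late" | "duplicate" | "unknown";

class RequestTracker {
  private pending = new Map<string, PendingRequest>();

  register(correlationId: string, expiresAt: Date): void {
    this.pending.set(correlationId, { correlationId, expiresAt, status: "waiting" });
  }

  // Mark waiting requests whose deadline has passed; return their IDs so the
  // caller can apply its "no reply came" policy (retry, compensate, alert).
  sweep(now: Date): string[] {
    const timedOut: string[] = [];
    this.pending.forEach((req) => {
      if (req.status === "waiting" && now.getTime() >= req.expiresAt.getTime()) {
        req.status = "timed_out";
        timedOut.push(req.correlationId);
      }
    });
    return timedOut;
  }

  // Classify a reply without assuming it arrives on time or only once.
  onReply(correlationId: string, now: Date): ReplyOutcome {
    const req = this.pending.get(correlationId);
    if (!req) return "unknown"; // never sent, or already forgotten
    if (req.status === "completed") return "duplicate";
    if (req.status === "timed_out" || now.getTime() >= req.expiresAt.getTime()) {
      req.status = "timed_out";
      return "late"; // caller already gave up; log or compensate, do not resume
    }
    req.status = "completed";
    return "accepted";
  }
}
```

This sketch never removes entries, so a real requester would also need a retention rule for completed and timed-out requests. Keeping them for a while is what allows late and duplicate replies to be recognized rather than reported as unknown.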

Common Mistakes

  • using request/reply over messaging for interactions that are really just low-latency RPC
  • omitting expiry and timeout policy from the contract
  • failing to define how late replies are handled
  • confusing correlation IDs with business IDs and mixing their purposes
  • assuming messaging transport removes the need for clear caller dependency design

Design Review Question

A platform team requires all inter-service traffic to use correlated request/reply over the broker, even for simple read calls that need immediate answers. Why is that not automatically a good standard?

Because standardizing the transport does not by itself change the shape of a dependency. If most calls are still prompt request/response interactions, the system may simply be rebuilding RPC with extra broker hops, reply routing, and timeout complexity instead of using the most direct integration style for the problem.

Revised on Thursday, April 23, 2026