A practical lesson on request-reply over messaging, including correlation IDs, reply channels, and when the pattern is a fit versus a disguised form of RPC.
Correlated request/reply uses messaging for an interaction where one party still expects a response related to a specific request. Instead of keeping an HTTP connection open, the requester sends a message with a correlation identifier and usually a reply channel or subject. The responder later publishes a matching reply that the requester can associate with the original request.
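The exchange described above can be sketched with a toy in-memory broker. This is only an illustration under stated assumptions: `InMemoryBroker`, `BrokerMessage`, and the `inventory-requests` channel name are hypothetical, and delivery here is synchronous, which a real broker would not be.

```typescript
// Hypothetical message shape: just enough to show correlation and reply routing.
type BrokerMessage = { correlationId: string; replyTo?: string; body: string };
type Handler = (message: BrokerMessage) => void;

// Toy broker: synchronous, in-process delivery, for illustration only.
class InMemoryBroker {
  private readonly subscribers = new Map<string, Handler[]>();

  subscribe(channel: string, handler: Handler): void {
    const handlers = this.subscribers.get(channel) ?? [];
    handlers.push(handler);
    this.subscribers.set(channel, handlers);
  }

  publish(channel: string, message: BrokerMessage): void {
    (this.subscribers.get(channel) ?? []).forEach((handler) => handler(message));
  }
}

const broker = new InMemoryBroker();
const received: BrokerMessage[] = [];

// Responder: handles a request and publishes the reply to the channel the
// requester asked for, copying the correlationId unchanged.
broker.subscribe("inventory-requests", (request) => {
  if (!request.replyTo) return; // nowhere to send a reply
  broker.publish(request.replyTo, {
    correlationId: request.correlationId,
    body: `reserved:${request.body}`,
  });
});

// Requester: listens on its reply channel before sending the request.
broker.subscribe("inventory-replies", (reply) => {
  received.push(reply);
});

broker.publish("inventory-requests", {
  correlationId: "ord_48392",
  replyTo: "inventory-replies",
  body: "SKU-19",
});
```

The requester and responder never reference each other directly; the only shared knowledge is the request channel, the `replyTo` value, and the `correlationId`.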
This pattern is useful when the environment is already message-centric, when temporary disconnects make direct request/response awkward, or when reply latency is variable enough that a broker-mediated interaction is operationally cleaner. It becomes a smell when teams use it for almost every service-to-service call even though the interaction still behaves like synchronous RPC.
sequenceDiagram
participant R as Requester
participant B as Broker
participant S as Responder
R->>B: request with correlationId and replyTo
B-->>S: request
S->>B: reply with correlationId
B-->>R: matched reply
What to notice:
Correlated request/reply is strong when:
Examples include asynchronous inventory reservation, long-running document generation, or message-driven workflows where the next step depends on a discrete result.
The pattern fails quickly if request-to-reply matching is vague. A robust request message usually includes a request identifier, a correlation identifier, a reply channel, and an expiry, as in this example:
{
  "requestType": "inventory.reserve",
  "requestId": "req_01882",
  "correlationId": "ord_48392",
  "replyTo": "inventory-replies",
  "expiresAt": "2026-03-23T16:30:00Z",
  "data": {
    "sku": "SKU-19",
    "quantity": 2
  }
}
The reply must preserve enough correlation context for the requester to match it safely, especially if many requests are in flight at once.
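One way a responder might preserve that context is to copy the identifiers from the request into the reply rather than generating new ones. The sketch below assumes the request fields from the example message; the `ReplyEnvelope` shape, its `replyType` and `status` fields, and the `buildReply` helper are hypothetical, as is the choice to answer an expired request with an `expired` status instead of doing the work.

```typescript
// Request shape modeled on the example message in this lesson.
type RequestEnvelope = {
  requestType: string;
  requestId: string;
  correlationId: string;
  replyTo: string;
  expiresAt: string; // ISO 8601 timestamp
  data: unknown;
};

// Hypothetical reply shape: echoes both identifiers from the request.
type ReplyEnvelope = {
  replyType: string;
  requestId: string;
  correlationId: string;
  status: "ok" | "expired";
  data: unknown;
};

// Builds the reply and names the channel it should be published to.
// `now` is passed in so the expiry check is deterministic and testable.
function buildReply(
  request: RequestEnvelope,
  now: Date,
  result: unknown
): { channel: string; reply: ReplyEnvelope } {
  const expired = now.getTime() > Date.parse(request.expiresAt);
  return {
    channel: request.replyTo,
    reply: {
      replyType: `${request.requestType}.reply`,
      requestId: request.requestId,
      correlationId: request.correlationId,
      status: expired ? "expired" : "ok",
      data: expired ? null : result,
    },
  };
}
```

Echoing `requestId` alongside `correlationId` lets the requester tell apart two requests that belong to the same order, which matters when retries are in flight.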
The pattern becomes weak when teams standardize on it for nearly every interaction. If every service sends a request message and then waits immediately for one reply before doing anything else, the architecture has not really become event-driven. It has rebuilt request/response with extra broker hops, reply routing, and timeout complexity.
This is not automatically wrong. It is just not automatically simpler because the transport uses a broker.
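One of the hardest operational parts of correlated request/reply is dealing with non-answers and late answers. The requester needs a policy for both cases: how long to wait before marking a request as timed out, and what to do with a reply that arrives after that point.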
type PendingRequest = {
  correlationId: string;
  expiresAt: Date;
  status: "waiting" | "completed" | "timed_out";
};
This small state model illustrates the real cost of the pattern: the requester now owns request lifecycle tracking.
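A minimal sketch of that lifecycle tracking follows, built on the same `PendingRequest` shape. The `PendingRequestTracker` class, its method names, and the `ReplyOutcome` values are hypothetical, and the policy shown (late and duplicate replies are classified, not applied) is one assumption among several reasonable ones. Time is passed in explicitly so the behavior does not depend on real timers.

```typescript
type PendingRequest = {
  correlationId: string;
  expiresAt: Date;
  status: "waiting" | "completed" | "timed_out";
};

// Hypothetical classification of an incoming reply.
type ReplyOutcome = "accepted" | "late" | "duplicate" | "unknown";

class PendingRequestTracker {
  private readonly pending = new Map<string, PendingRequest>();

  // Record a request just before it is published.
  track(correlationId: string, expiresAt: Date): void {
    this.pending.set(correlationId, { correlationId, expiresAt, status: "waiting" });
  }

  // Mark waiting requests whose deadline has passed; return their ids.
  sweep(now: Date): string[] {
    const timedOut: string[] = [];
    this.pending.forEach((entry) => {
      if (entry.status === "waiting" && entry.expiresAt.getTime() <= now.getTime()) {
        entry.status = "timed_out";
        timedOut.push(entry.correlationId);
      }
    });
    return timedOut;
  }

  // Classify a reply; only "accepted" should be acted on by the caller.
  onReply(correlationId: string, now: Date): ReplyOutcome {
    const entry = this.pending.get(correlationId);
    if (!entry) return "unknown"; // never sent, or already forgotten
    if (entry.status === "completed") return "duplicate";
    if (entry.status === "timed_out" || entry.expiresAt.getTime() <= now.getTime()) {
      entry.status = "timed_out";
      return "late";
    }
    entry.status = "completed";
    return "accepted";
  }
}
```

A production version would also need to evict old entries and decide whether a late reply should trigger compensation, for example releasing an inventory reservation that the requester has already given up on.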
A platform team requires all inter-service traffic to use correlated request/reply over the broker, even for simple read calls that need immediate answers. Why is that not automatically a good standard?
Because transport standardization does not reduce dependency shape by itself. If most calls are still prompt request/response interactions, the system may simply be rebuilding RPC with extra broker hops, reply routing, and timeout complexity instead of using the most direct integration style for the problem.