Decision matrix for matching event-driven patterns to coordination, delivery, and scaling needs.
This appendix is a decision aid for pattern choice. It does not replace the chapter lessons, because pattern selection always depends on trade-offs. What it does provide is a fast way to move from a system problem to the small set of patterns most likely to fit, along with the warning signs that should slow the decision down.
The strongest way to use this matrix is to start from the actual problem shape, not from the pattern you already want to use. Many event-driven mistakes happen because teams begin with “we should use events here” rather than with “what dependency, failure, or scaling problem are we actually solving?”
```mermaid
flowchart TD
    A["Start with the problem"] --> B{"Is the problem fan-out, work distribution, reliability, workflow, read modeling, analytics, or governance?"}
    B --> C["Select a candidate pattern family"]
    C --> D["Check trade-offs and weak-fit signals"]
    D --> E["Choose the narrowest pattern that solves the real problem"]
```
The matrix below maps each problem shape to its strongest candidate pattern, why it fits, what to watch while operating it, and the signal that it is a weak fit.
| Problem Shape | Strong Candidate Pattern | Why It Fits | What to Watch | Weak-Fit Signal |
|---|---|---|---|---|
| One business fact should trigger many independent downstream reactions | Publish/subscribe | It lets several consumers react independently to the same fact | Schema stability and fan-out governance | Consumers are not independent and actually need work distribution or a reply |
| Background work needs to be spread across workers | Work queue or competing consumers | It distributes units of work across worker instances efficiently | Idempotency, retries, and hot partitions | Several consumers all need the same fact rather than one of them doing the work |
| State change and event publication can drift apart | Transactional outbox | It aligns local commit with later safe publication | Relay monitoring and duplicate publish handling | The team publishes “after commit” and hopes the broker call never fails |
| Consumers keep making repeated callbacks to the source service for the same details | Event-carried state transfer | It improves downstream autonomy by carrying useful state once | Payload growth and contract pressure | The event grows into a producer-internal dump rather than a useful domain contract |
| One system still needs a response, but messaging transport is already central | Correlated request/reply | It preserves a request-response shape over asynchronous transport | Timeouts, late replies, and correlation state | It is being used to hide normal low-latency RPC for almost every interaction |
| A long-running workflow spans several local transactions | Saga with choreography or orchestration | It models progress and recovery across distributed business steps | Compensation design and visibility | The team only describes the happy path and has no failure or compensation model |
| Read concerns need different shapes from write-side correctness concerns | Projections or CQRS-style read models | They let the read side optimize for query usefulness | Replay safety and eventual consistency | The read model is quietly being treated as the source of truth |
| Rolling metrics, joins, or near-real-time insight are needed | Stream processing with windows | It supports continuous computation over live event flow | Event-time modeling, state, and backpressure | The problem is simple reporting that a batch job or projection could solve more safely |
| A few bad events should not block normal processing | Dead-letter queue or quarantine path | It isolates repeated failures from the main path | Ownership, diagnosis, and replay discipline | The DLQ is treated as a trash bin with no review model |
| Contract changes risk breaking unknown or lagging consumers | Schema registry and compatibility checks | They make ownership and safe evolution explicit | Overly casual renames and semantic drift | The team still treats event changes like internal refactors |
| Tenant or data-sensitivity boundaries must be enforced | Scoped ACLs, stream governance, and tenant-isolation design | They keep shared event platforms from becoming leakage surfaces | Over-broad consume rights and shared-tooling risk | Tenant ID exists in payloads but no real isolation controls exist |
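Of the rows above, the transactional outbox is the one teams most often implement incorrectly, so a minimal sketch helps. This is an illustrative sketch only, assuming a SQLite store; the table and function names (`orders`, `outbox`, `place_order`, `publish_pending`) are hypothetical, and a real relay also needs monitoring and duplicate handling, as the "What to Watch" column notes.

```python
# Illustrative transactional-outbox sketch (hypothetical schema and names).
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " event_type TEXT, payload TEXT, published INTEGER DEFAULT 0)"
)

def place_order(order_id: int) -> None:
    """Commit the state change and the event record in ONE local transaction."""
    with conn:  # both rows commit together, or neither does
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'placed')",
                     (order_id,))
        conn.execute("INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
                     ("OrderPlaced", json.dumps({"order_id": order_id})))

def publish_pending(send) -> int:
    """Relay: publish unsent events, then mark them. 'send' may be retried,
    so delivery is at-least-once and consumers must tolerate duplicates."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        send(event_type, json.loads(payload))  # may raise; row stays unpublished
        conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    conn.commit()
    return len(rows)

place_order(42)
sent = []
publish_pending(lambda t, p: sent.append((t, p)))
```

The point of the sketch is the single local transaction: the event row cannot drift from the state change, which is exactly the failure mode of "publish after commit and hope."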
Some pattern choices repeat so often that short decision rules help.
- Use publish/subscribe when one fact should reach many independent consumers.
- Use a queue or competing-consumer model when one work item should be handled by one worker from a pool.
- Use event-carried state transfer when callback-heavy notification is preserving runtime coupling.
- Use correlated request/reply only when a response relationship is truly required and broker-mediated interaction is still useful.
- Use projections when read shape and read latency matter more than reusing the write model directly.
- Use a saga only when one business process really spans several local commits and failure recovery must be explicit.
- Use stream processing when the problem is continuous analytics or rolling computation, not just asynchronous integration.
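The first two rules, fan-out versus work distribution, are the ones most often confused, and a toy in-memory model makes the difference concrete. The class names below (`PubSubTopic`, `WorkQueue`) are hypothetical, and a real broker adds acknowledgement, persistence, and partitioning on top of this shape.

```python
# Toy contrast: fan-out (every subscriber sees every fact) versus
# work distribution (each item handled by exactly one worker).
from collections import defaultdict
from itertools import cycle

class PubSubTopic:
    """Publish/subscribe: many independent reactions to one fact."""
    def __init__(self):
        self.subscribers = []
    def subscribe(self, handler):
        self.subscribers.append(handler)
    def publish(self, event):
        for handler in self.subscribers:   # fan-out: all handlers get the fact
            handler(event)

class WorkQueue:
    """Competing consumers: each item goes to one worker from the pool
    (round-robin here purely for simplicity)."""
    def __init__(self, workers):
        self._next = cycle(workers)
    def dispatch(self, item):
        next(self._next)(item)             # exactly one worker handles the item

seen = defaultdict(list)
topic = PubSubTopic()
topic.subscribe(lambda e: seen["billing"].append(e))
topic.subscribe(lambda e: seen["email"].append(e))
topic.publish("OrderPlaced:42")            # both subscribers react

queue = WorkQueue([lambda i: seen["worker_a"].append(i),
                   lambda i: seen["worker_b"].append(i)])
for job in ["job-1", "job-2", "job-3"]:
    queue.dispatch(job)                    # each job handled once, by one worker
```

If consumers that were modeled as subscribers turn out to need exactly this one-worker-per-item behavior, that is the weak-fit signal from the matrix: the problem was work distribution all along.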
Some of the strongest event-driven designs combine patterns instead of treating them as mutually exclusive.
The key is not to combine patterns for sophistication. Combine them only when each one closes a different real risk.
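As one concrete combination, a competing-consumer loop often pairs with a dead-letter path: the queue closes the throughput risk, and the quarantine closes the poison-message risk, so each pattern earns its place. The sketch below is illustrative; `consume`, `MAX_ATTEMPTS`, and `dead_letters` are hypothetical names, not any broker's API.

```python
# Sketch: retries plus a dead-letter path, so a few bad events do not
# block normal processing. Quarantined events still need ownership,
# diagnosis, and a replay model -- a DLQ is not a trash bin.
MAX_ATTEMPTS = 3
dead_letters = []

def consume(events, process):
    """Process each event; after MAX_ATTEMPTS failures, quarantine it and move on."""
    processed = []
    for event in events:
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                process(event)
                processed.append(event)
                break
            except Exception as exc:
                if attempt == MAX_ATTEMPTS:
                    # isolate the poison event instead of blocking the stream
                    dead_letters.append({"event": event, "error": str(exc)})

    return processed

def flaky(event):
    """Stand-in handler that fails deterministically on one payload."""
    if event == "bad":
        raise ValueError("unparseable payload")

ok = consume(["a", "bad", "b"], flaky)
```

Here `"a"` and `"b"` flow through normally while `"bad"` lands in quarantine with its error, which is the behavior the matrix row describes.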
Before locking in a pattern, ask: What dependency, failure, or scaling problem is actually being solved? Do consumers genuinely act independently, or does one worker need to own each item? Is the failure and compensation path modeled, or only the happy path? These questions often eliminate weak-fit patterns quickly.
A team says they want one standard solution for all integration, so they plan to use correlated request/reply over the broker for user notifications, batch work dispatch, search aggregation, and normal service reads. What is the strongest challenge?
The strongest challenge is that transport standardization is being mistaken for architectural fit. Those workloads have different dependency shapes. Some need publish/subscribe, some need work distribution, some may need request/reply, and some may still be ordinary RPC. One pattern for everything usually means the actual problem shapes are being ignored.