Common Anti-Patterns in Event-Driven Architecture

March 23, 2026

A practical synthesis of recurring event-driven failure modes, including hidden choreography, callback storms, schema drift, unsafe replay, and platform-wide overengineering.

Anti-patterns are worth naming because they usually sound plausible in local design conversations. They often begin as reasonable optimizations: keep the event light, let services react independently, standardize on one transport, future-proof the domain. The problem is that these local choices can accumulate into systems that are hard to explain, unsafe to replay, or impossible to govern.

The strongest anti-pattern review is not moralistic. It asks what failure mode the design is quietly normalizing. Most recurring event anti-patterns boil down to one of five risks:

hidden coupling
hidden workflow logic
hidden duplication risk
hidden contract instability
hidden platform sprawl

    flowchart LR
        A["Thin notification + constant callbacks"] --> E["Hidden runtime coupling"]
        B["Many reacting services, no owner"] --> F["Hidden workflow logic"]
        C["Non-idempotent side effects"] --> G["Hidden duplication risk"]
        D["Schemas change like private code"] --> H["Hidden contract instability"]
        J["Specialized patterns as the default everywhere"] --> K["Hidden platform sprawl"]
        E --> I["Fragile event system"]
        F --> I
        G --> I
        H --> I
        K --> I

What to notice:

anti-patterns often hide their cost in another part of the system
the architecture can look modern while still failing in old ways
the most dangerous problems are frequently the least visible during happy-path demos

Hidden Choreography

One of the most common anti-patterns is hidden choreography: several services react across a workflow, but nobody can explain the end-to-end process, timeout policy, or compensation path. Each local handler looks reasonable. The overall business process becomes opaque.

This is especially dangerous in workflows that are:

business critical
long-running
branch-heavy
difficult to correct after partial success

Choreography is not the anti-pattern. Hidden choreography is.
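One way to make choreography visible is to write the workflow down as data and check it for unanswered questions. The following is a minimal sketch, not a prescribed tool: the `Step` and `Workflow` types, the `visibility_gaps` function, and the order-fulfilment example are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Step:
    """One reaction in a choreographed workflow (illustrative model)."""
    event: str                               # event that triggers the step
    service: str                             # service that reacts to it
    timeout_seconds: Optional[int] = None    # when the step counts as stuck
    compensation: Optional[str] = None       # event or action that undoes the step


@dataclass(frozen=True)
class Workflow:
    name: str
    owner: Optional[str]                     # team accountable end to end
    steps: tuple = field(default_factory=tuple)


def visibility_gaps(workflow: Workflow) -> list:
    """Return the questions nobody can currently answer about this workflow."""
    gaps = []
    if not workflow.owner:
        gaps.append(f"{workflow.name}: no end-to-end owner")
    for step in workflow.steps:
        if step.timeout_seconds is None:
            gaps.append(f"{step.service} on {step.event}: no timeout policy")
        if step.compensation is None:
            gaps.append(f"{step.service} on {step.event}: no compensation path")
    return gaps


# Hypothetical workflow: two services react, but nobody owns the whole flow.
order_flow = Workflow(
    name="order-fulfilment",
    owner=None,
    steps=(
        Step("OrderPlaced", "payments", timeout_seconds=30, compensation="RefundIssued"),
        Step("PaymentCaptured", "inventory"),
    ),
)

for gap in visibility_gaps(order_flow):
    print(gap)
```

The services still react independently here. The only change is that ownership, timeouts, and compensation are written down where a review can see what is missing.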

Notification-Only Callback Storms

Another recurring anti-pattern is overusing notification events with payloads so thin that most consumers immediately call back for the same details. The system looks asynchronous in diagrams but is still tightly coupled in availability and latency terms.

This is often a sign that event-carried state transfer, a projection, or a different read model would be healthier than repeated fetch-after-notify behavior.
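As a rough illustration of the alternative, the sketch below shows an event that carries the state consumers need and a consumer-side projection built only from those events. The event name, its fields, and the per-customer `version` counter are assumptions made for the example, not part of any particular platform.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CustomerAddressChanged:
    """State-carrying event: includes the fields consumers would otherwise fetch."""
    customer_id: str
    version: int          # assumed to increase with every change per customer
    street: str
    city: str
    postal_code: str


class AddressProjection:
    """Consumer-side read model built from events, with no callback to the producer."""

    def __init__(self) -> None:
        self._rows = {}   # customer_id -> latest CustomerAddressChanged

    def apply(self, event: CustomerAddressChanged) -> bool:
        """Store the event unless an equal or newer version is already known."""
        current = self._rows.get(event.customer_id)
        if current is not None and current.version >= event.version:
            return False  # stale or duplicate delivery; keep what we have
        self._rows[event.customer_id] = event
        return True

    def city_of(self, customer_id: str) -> Optional[str]:
        row = self._rows.get(customer_id)
        return row.city if row else None


projection = AddressProjection()
projection.apply(CustomerAddressChanged("c-1", 1, "1 Main St", "Leeds", "LS1 1AA"))
projection.apply(CustomerAddressChanged("c-1", 2, "9 High St", "York", "YO1 7HH"))
projection.apply(CustomerAddressChanged("c-1", 1, "1 Main St", "Leeds", "LS1 1AA"))  # late duplicate
print(projection.city_of("c-1"))  # York
```

The consumer can answer its own reads even when the producer is unavailable, which is the availability and latency decoupling that fetch-after-notify gives up. The cost is a larger contract surface, so this trade is only worthwhile when most consumers need the same fields.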

Schema Drift as Internal Refactor

A classic event anti-pattern is treating event contracts like internal code. Fields get renamed because domain language changed internally. Meanings shift quietly under the same event name. Older consumers “still parse” while becoming semantically wrong. This creates slow corruption rather than loud failure.

That is why schema drift is more dangerous than some visible breakages. Silent compatibility loss can live in production for a long time.
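Part of this drift can be caught mechanically by comparing a proposed schema against the one consumers were written for. The sketch below assumes schemas are represented as simple field-name-to-type mappings; the `breaking_changes` function and the `OrderPlaced` fields are hypothetical and stand in for whatever schema tooling a team already uses.

```python
def breaking_changes(old: dict, new: dict) -> list:
    """List changes that would break a consumer written against `old`.

    Added fields are treated as compatible; removed, renamed, or retyped
    fields are reported.
    """
    problems = []
    for name, old_type in old.items():
        if name not in new:
            problems.append(f"field removed or renamed: {name}")
        elif new[name] != old_type:
            problems.append(f"type changed: {name} {old_type} -> {new[name]}")
    return problems


# Hypothetical published contract.
order_placed_v1 = {"order_id": "string", "amount": "int", "customer_id": "string"}

# Hypothetical internal refactor: "customer_id" became "buyer_id",
# "amount" changed type, and "channel" was added.
order_placed_next = {
    "order_id": "string",
    "amount": "decimal",
    "buyer_id": "string",
    "channel": "string",
}

for problem in breaking_changes(order_placed_v1, order_placed_next):
    print(problem)
```

A structural check like this cannot detect the quieter case described above, where a field keeps its name and type but its meaning shifts. That case still depends on review and on treating a change of meaning as a new contract rather than an edit to the old one.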

Non-Idempotent External Effects

Systems often handle retries and replay safely until a consumer triggers an external side effect, such as:

billing call
partner webhook
customer email
entitlement grant

If those effects are not duplicate-safe, the event system’s reliability story is incomplete. This anti-pattern often appears in teams that rely on the transport’s delivery guarantees instead of making each application-level effect safe to repeat.
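The duplicate-safety requirement above can be sketched as a wrapper that runs an external effect at most once per idempotency key. This is a minimal illustration under stated assumptions: the `IdempotentEffect` class is hypothetical, events are plain dictionaries with a `type` and a stable `event_id`, and the in-memory set stands in for a durable store.

```python
from typing import Callable


class IdempotentEffect:
    """Run an external side effect at most once per idempotency key.

    The in-memory set is a stand-in. A real system would need a durable
    store and an atomic check-and-record step.
    """

    def __init__(self, effect: Callable[[dict], None]) -> None:
        self._effect = effect
        self._done = set()

    def handle(self, event: dict) -> bool:
        # Assumption: event_id stays the same across redelivery and replay.
        key = f"{event['type']}:{event['event_id']}"
        if key in self._done:
            return False          # duplicate delivery or replay: skip the effect
        self._effect(event)
        self._done.add(key)       # recorded only after the effect succeeds
        return True


sent = []
emails = IdempotentEffect(lambda e: sent.append(e["event_id"]))

event = {"type": "InvoiceIssued", "event_id": "evt-42", "customer": "c-1"}
emails.handle(event)
emails.handle(event)  # redelivered by the broker or replayed later
print(sent)  # ['evt-42']
```

Because the key is recorded after the effect, a crash between the two steps can still repeat the effect. Where the downstream system accepts an idempotency key of its own, passing the same key along closes that gap more reliably than local bookkeeping alone.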

    antiPatternChecklist:
      hiddenChoreography: true
      callbackStorms: true
      schemaAsPrivateCode: true
      nonIdempotentSideEffects: true
      eventSourcingEverywhere: true

Event Sourcing Everywhere

Another anti-pattern is treating specialized tools as the default architecture. Event sourcing, CQRS, or saga orchestration may be justified in some domains. Making them the default for every bounded context often creates platform-wide complexity without proportional business value.

This anti-pattern is usually motivated by architectural fashion rather than by domain need.

Design Review Question

A platform team is proud that nearly every service “uses events,” but most business flows still depend on callback-heavy notifications, hidden choreography, and non-idempotent partner integrations. What is the strongest critique?

The strongest critique is that the system has adopted event transport more successfully than it has adopted event-safe operating patterns. The architecture may look event-driven in diagrams while still preserving the same coupling and failure risks underneath.


Revised on Wednesday, June 3, 2026
