A practical synthesis of recurring event-driven failure modes, including hidden choreography, callback storms, schema drift, unsafe replay, and platform-wide overengineering.
Anti-patterns are useful because they usually sound plausible in local design conversations. They often begin as reasonable optimizations: keep the event light, let services react independently, standardize on one transport, future-proof the domain. The problem is that these local choices can accumulate into systems that are hard to explain, unsafe to replay, or impossible to govern.
The strongest anti-pattern review is not moralistic. It asks what failure mode the design is quietly normalizing. Most recurring event anti-patterns boil down to one of five risks:
```mermaid
flowchart LR
    A["Thin notification + constant callbacks"] --> E["Hidden runtime coupling"]
    B["Many reacting services, no owner"] --> F["Hidden workflow logic"]
    C["Non-idempotent side effects"] --> G["Hidden duplicate risk"]
    D["Schemas change like private code"] --> H["Hidden contract instability"]
    E --> I["Fragile event system"]
    F --> I
    G --> I
    H --> I
```
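What to notice: every path in the diagram runs through a hidden risk, and all of them converge on the same outcome, a fragile event system.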
One of the most common anti-patterns is hidden choreography: several services react across a workflow, but nobody can explain the end-to-end process, timeout policy, or compensation path. Each local handler looks reasonable. The overall business process becomes opaque.
This is especially dangerous in workflows that are:
Choreography is not the anti-pattern. Hidden choreography is.
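One way to keep choreography visible is to write the workflow down as data and check it for gaps. The sketch below is a minimal illustration under assumed names: `Step`, `Workflow`, `hidden_parts`, and the `order-fulfilment` flow are all hypothetical and do not come from any particular framework. It flags a workflow that has no end-to-end owner, and any step that lacks a timeout policy or compensation path.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    """One reaction in a choreographed workflow (all names are illustrative)."""
    event: str                       # event that triggers this step
    service: str                     # service that reacts to it
    timeout_seconds: Optional[int]   # how long before the step counts as stuck
    compensation: Optional[str]      # event or action that undoes the step


@dataclass(frozen=True)
class Workflow:
    name: str
    owner: Optional[str]             # team accountable for the end-to-end flow
    steps: Tuple[Step, ...]


def hidden_parts(workflow: Workflow) -> List[str]:
    """Return the parts of the workflow that nobody has written down."""
    gaps = []
    if not workflow.owner:
        gaps.append(f"{workflow.name}: no end-to-end owner")
    for step in workflow.steps:
        if step.timeout_seconds is None:
            gaps.append(f"{step.service} on {step.event}: no timeout policy")
        if step.compensation is None:
            gaps.append(f"{step.service} on {step.event}: no compensation path")
    return gaps


# Hypothetical flow: each step looks fine locally, the whole has gaps.
order_flow = Workflow(
    name="order-fulfilment",
    owner=None,
    steps=(
        Step("OrderPlaced", "payments", timeout_seconds=30, compensation="RefundIssued"),
        Step("PaymentCaptured", "inventory", timeout_seconds=None, compensation="StockReleased"),
        Step("StockReserved", "shipping", timeout_seconds=600, compensation=None),
    ),
)

for gap in hidden_parts(order_flow):
    print(gap)
```

The design choice here is that the workflow stays choreographed: nothing in the manifest orchestrates the services. It only makes the process explainable and reviewable.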
Another recurring anti-pattern is overusing notification events with payloads so thin that most consumers immediately call back for the same details. The system looks asynchronous in diagrams but is still tightly coupled in availability and latency terms.
This is often a sign that event-carried state transfer, a projection, or a different read model would be healthier than repeated fetch-after-notify behavior.
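The difference between fetch-after-notify and event-carried state transfer can be sketched by counting synchronous lookups. Everything in this example is an assumption made for illustration: `OrderService`, the `OrderPlaced` event shape, and the figure of five consumers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class OrderService:
    """Stand-in for the producing service; counts synchronous lookups."""
    orders: Dict[str, dict]
    lookups: int = 0

    def get_order(self, order_id: str) -> dict:
        self.lookups += 1  # every call here is runtime coupling to the producer
        return self.orders[order_id]


def thin_event(order_id: str) -> dict:
    # Notification only: consumers must call back for anything useful.
    return {"type": "OrderPlaced", "order_id": order_id}


def stateful_event(order_id: str, order: dict) -> dict:
    # Event-carried state transfer: the needed details travel with the event.
    return {
        "type": "OrderPlaced",
        "order_id": order_id,
        "total": order["total"],
        "currency": order["currency"],
    }


def handle(event: dict, producer: OrderService) -> str:
    """A consumer that needs the order total, whatever shape the event has."""
    if "total" in event:
        total, currency = event["total"], event["currency"]
    else:
        order = producer.get_order(event["order_id"])  # fetch-after-notify
        total, currency = order["total"], order["currency"]
    return f"{event['order_id']}: {total} {currency}"


CONSUMERS = 5  # assumed number of independent consumers
producer = OrderService(orders={"o-1": {"total": 120, "currency": "EUR"}})

for _ in range(CONSUMERS):
    handle(thin_event("o-1"), producer)
thin_lookups = producer.lookups

producer.lookups = 0
for _ in range(CONSUMERS):
    handle(stateful_event("o-1", producer.orders["o-1"]), producer)
stateful_lookups = producer.lookups

print(f"thin notification: {thin_lookups} callbacks")
print(f"event-carried state: {stateful_lookups} callbacks")
```

With the thin event, every consumer depends on the producer being available at handling time. With the richer event, none of them do, at the cost of a larger payload and a contract that now has to be kept stable.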
A classic event anti-pattern is treating event contracts like internal code. Fields get renamed because domain language changed internally. Meanings shift quietly under the same event name. Older consumers “still parse” while becoming semantically wrong. This creates slow corruption rather than loud failure.
That is why schema drift is more dangerous than some visible breakages. Silent compatibility loss can live in production for a long time.
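A contract check can make this kind of drift loud before it ships. The sketch below assumes a hypothetical schema format in which each field records both a type and a short statement of meaning. The `contract_changes` function and the example fields are illustrative and not tied to any schema registry.

```python
from typing import Dict, List

# Hypothetical schema format: field name -> {"type": ..., "meaning": ...}.
Schema = Dict[str, Dict[str, str]]


def contract_changes(old: Schema, new: Schema) -> List[str]:
    """List changes that can break or silently mislead existing consumers."""
    problems = []
    for name, spec in old.items():
        if name not in new:
            # A rename looks like a removal to anyone still reading the old name.
            problems.append(f"removed or renamed: {name}")
            continue
        if new[name]["type"] != spec["type"]:
            problems.append(
                f"type changed: {name} ({spec['type']} -> {new[name]['type']})"
            )
        if new[name]["meaning"] != spec["meaning"]:
            # Still parses, but consumers now compute the wrong thing.
            problems.append(f"meaning changed: {name}")
    return problems


order_placed_v1: Schema = {
    "customer_id": {"type": "string", "meaning": "buyer account id"},
    "amount": {"type": "integer", "meaning": "order total in cents"},
}

# Same event name, edited as if it were private code.
order_placed_v2: Schema = {
    "account_id": {"type": "string", "meaning": "buyer account id"},
    "amount": {"type": "integer", "meaning": "order total in whole currency units"},
}

for problem in contract_changes(order_placed_v1, order_placed_v2):
    print(problem)
```

Newly added fields are deliberately not flagged, on the assumption that additive changes are compatible for consumers that ignore unknown fields. The `meaning changed` case is the one a parser can never catch, which is why the sketch records meaning alongside type.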
Systems often handle retries and replay safely until the moment a consumer triggers an external side effect, such as a call to a partner integration.
If those effects are not duplicate-safe, the event system’s reliability story is incomplete. This anti-pattern often appears in teams that trust transport guarantees more than application effect boundaries.
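Making an effect duplicate-safe usually means deciding, at the application boundary, whether this event has already caused it. The following is a minimal sketch under stated assumptions: `IdempotentEffect` is a hypothetical helper, events are assumed to carry a stable `event_id`, and the set of processed keys lives in memory, where a real system would need durable storage.

```python
import hashlib
from typing import Callable, Set


class IdempotentEffect:
    """Wraps an external side effect so redelivered events do not repeat it."""

    def __init__(self, send: Callable[[dict], None]) -> None:
        self._send = send
        self._done: Set[str] = set()  # in-memory only; assumed durable in practice

    @staticmethod
    def key_for(event: dict) -> str:
        # Assumes the producer assigns a stable event_id that survives replay.
        raw = f"{event['type']}:{event['event_id']}"
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def apply(self, event: dict) -> bool:
        """Run the effect once per event; return False for duplicates."""
        key = self.key_for(event)
        if key in self._done:
            return False
        self._send(event)
        self._done.add(key)
        return True


sent = []  # stands in for a partner API, email gateway, or payment call
effect = IdempotentEffect(send=sent.append)
event = {"type": "InvoiceIssued", "event_id": "evt-42", "amount": 120}

# Original delivery plus two redeliveries of the same event.
results = [effect.apply(event) for _ in range(3)]
print(results, len(sent))
```

This sketch records the key only after the send succeeds, so a crash between the two steps could still produce a duplicate. Where the external party accepts an idempotency key, passing the same key along with the request closes that gap more reliably than local bookkeeping alone.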
```yaml
antiPatternChecklist:
  hiddenChoreography: true
  callbackStorms: true
  schemaAsPrivateCode: true
  nonIdempotentSideEffects: true
  eventSourcingEverywhere: true
```
Another anti-pattern is turning specialized tools into default architecture identity. Event sourcing, CQRS, or saga orchestration may be justified in some domains. Making them the default for every bounded context often creates platform-wide complexity without proportional business value.
This anti-pattern is usually motivated by architectural fashion rather than by domain need.
A platform team is proud that nearly every service “uses events,” but most business flows still depend on callback-heavy notification, hidden choreography, and non-idempotent partner integrations. What is the strongest critique?
The strongest critique is that the system has adopted event transport more successfully than it has adopted event-safe operating patterns. The architecture may look event-driven in diagrams while still preserving the same coupling and failure risks underneath.