Show a more advanced architecture with event routing, stream processing, workflow orchestration, replay safety, and stronger operational controls.
An event-heavy platform needs a different serverless shape than a small request-first product. Once many facts are being published, consumed, projected, replayed, and correlated across services, the architecture needs stronger controls around schema evolution, routing, workflow state, replay safety, and observability. The platform can still be serverless, but managed compute no longer makes it simple: it is now a distributed event system that happens to run on managed compute.
A healthy event-heavy architecture usually includes:

- an event routing layer with schema controls at the point of publication
- operational consumers that mutate transactional state
- projection consumers that build rebuildable read models
- a workflow engine for long-running, stateful processes
- an explicit replay and quarantine path
- tracing and lag metrics across every consumer class
```mermaid
flowchart LR
A["APIs and producers"] --> B["Event routing layer"]
B --> C["Operational consumers"]
B --> D["Projection consumers"]
B --> E["Workflow starter"]
E --> F["Workflow engine"]
C --> G["Transactional stores"]
D --> H["Read models"]
B --> I["Replay / quarantine path"]
C --> J["Tracing and metrics"]
D --> J
F --> J
```
What to notice:

- the routing layer is the center of the design, not an afterthought
- consumers are separated by job (operational, projection, workflow), not only by subscription
- workflow state lives in a dedicated engine rather than inside individual handlers
- replay and quarantine have their own explicit path instead of reusing the normal flow
- every consumer class reports into the same tracing and metrics sink
This shape is appropriate when the platform has:

- many producers and consumers integrating primarily through events
- read models that must be projected and occasionally rebuilt
- long-running workflows that need durable, correlated state
- real replay requirements, plus the governance to make replay safe

It is not the right default for every product. The cost of operating this architecture is justified only when the event workload is truly central.
```yaml
event_heavy_platform:
  routing:
    event_bus: true
    schema_controls: true
  consumers:
    operational_handlers: true
    projection_handlers: true
  workflows:
    orchestration_engine: true
  safety:
    dlq: true
    replay_controls: true
  observability:
    tracing: true
    lag_metrics: true
```
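In code, routing with schema controls amounts to a publish path that refuses unregistered or malformed events before they reach the bus. This is a minimal in-memory sketch: `SchemaRegistry`, `publish`, and the envelope fields are assumed names for illustration, and a real platform would use a managed schema registry and event bus.

```python
import json
import uuid
from datetime import datetime, timezone

class SchemaRegistry:
    """Hypothetical in-memory registry; a real platform would back this
    with a managed schema registry that has review and versioning controls."""

    def __init__(self):
        self._schemas = {}  # (event_type, version) -> required field names

    def register(self, event_type, version, required_fields):
        self._schemas[(event_type, version)] = set(required_fields)

    def validate(self, event_type, version, payload):
        required = self._schemas.get((event_type, version))
        if required is None:
            raise ValueError(f"unregistered schema: {event_type} v{version}")
        missing = required - payload.keys()
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")

def publish(bus, registry, event_type, version, payload):
    """Validate against the registry, then envelope and publish to the bus."""
    registry.validate(event_type, version, payload)
    envelope = {
        "id": str(uuid.uuid4()),
        "type": event_type,
        "version": version,
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "replayed": False,  # set to True only by the controlled replay path
        "payload": payload,
    }
    bus.append(json.dumps(envelope))  # stand-in for a real bus client
    return envelope
```

The design choice to validate at the point of publication, rather than in each consumer, is what makes schema evolution reviewable in one place.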
Event-heavy platforms become fragile when teams treat replay as simple republishing. A good architecture decides:

- which events may be replayed at all, and from which point in the stream
- which consumer classes may receive replayed events
- how replayed events are marked so operational side effects are suppressed
- who authorizes a replay and how it is observed while it runs

The anti-pattern is sophisticated event flow with casual event governance.
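Those replay decisions can be encoded as an explicit policy evaluated per consumer class rather than per topic. The class names and flags below mirror this section's configuration but are otherwise illustrative assumptions, not a specific product's API.

```python
# Replay is evaluated per consumer class, not per topic subscription.
REPLAY_POLICY = {
    # Projections may be replayed whenever the read model is rebuildable.
    "projection": lambda c: c.get("rebuildable", False),
    # Operational consumers need both idempotency and an explicit approval.
    "operational": lambda c: c.get("idempotent", False)
                             and c.get("replay_approved", False),
    # Workflow starters never receive raw replays in this sketch.
    "workflow": lambda c: False,
}

def replay_allowed(consumer):
    """Decide whether replayed events may be delivered to this consumer."""
    rule = REPLAY_POLICY.get(consumer["class"])
    if rule is None:
        return False  # unknown consumer classes are excluded by default
    return rule(consumer)
```

Defaulting unknown classes to "no replay" keeps the governance failure mode conservative: a misregistered consumer is skipped, not flooded.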
An event-heavy platform is healthier when handlers are separated by job, not only by topic subscription. In practice, three consumer classes should usually stay distinct:

- operational consumers, which mutate business state and need strict replay controls
- projection consumers, which build read models and should be rebuildable and side-effect-free
- workflow consumers, which start or update process state and need correlation and deduplication
Mixing those roles into one consumer often creates hidden coupling. A replay that is safe for a projection may be unsafe for an operational side effect. A workflow starter may need correlation and deduplication controls that a reporting consumer does not. Treating them as different architectural responsibilities makes the platform easier to reason about and easier to recover.
```yaml
consumer_classes:
  operational:
    mutates_business_state: true
    replay_requires_controls: true
  projection:
    rebuildable: true
    side_effect_free: preferred
  workflow:
    starts_or_updates_process_state: true
    correlation_required: true
```
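The separation above can be kept honest in code by giving each class its own handler with its own safety mechanism. Everything here (the class names, the envelope fields, the idempotency scheme) is an illustrative sketch under assumed names, not a prescribed implementation.

```python
class OperationalConsumer:
    """Mutates business state; an idempotency key ensures retries and
    approved replays do not apply the same mutation twice."""

    def __init__(self):
        self._applied = set()
        self.balances = {}

    def handle(self, event):
        key = event["id"]  # envelope id doubles as the idempotency key here
        if key in self._applied:
            return
        self._applied.add(key)
        p = event["payload"]
        self.balances[p["account"]] = self.balances.get(p["account"], 0) + p["amount"]


class ProjectionConsumer:
    """Side-effect-free; the read model can be dropped and rebuilt by replay."""

    def __init__(self):
        self.read_model = {}

    def handle(self, event):
        p = event["payload"]
        self.read_model[p["account"]] = p["amount"]


class WorkflowStarter:
    """Starts process state exactly once per correlation id, so duplicate
    deliveries or replays cannot fan out into duplicate workflows."""

    def __init__(self):
        self.started = []
        self._seen = set()

    def handle(self, event):
        cid = event["payload"]["correlation_id"]
        if cid in self._seen:
            return
        self._seen.add(cid)
        self.started.append(cid)
```

Because each handler carries its own safety mechanism, a replay that is routine for the projection cannot silently double-charge through the operational path or start a second workflow.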
This architecture needs stronger runbooks and controls around:

- schema changes and contract review
- consumer lag and backpressure
- dead-letter queues and quarantined events
- replay authorization and execution
- trace propagation across producers, consumers, and workflows

If the team cannot observe and contain those behaviors, the platform may be technically elegant but operationally weak.
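One of those controls, dead-lettering with quarantine, can be sketched as a retry wrapper around any handler. The function name, attempt limit, and DLQ record shape are illustrative assumptions, not a specific service's API.

```python
def handle_with_quarantine(handler, event, dlq, max_attempts=3):
    """Retry a handler a bounded number of times; after repeated failure,
    quarantine the event with error context so an operator can decide
    whether it is safe to replay."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            handler(event)
            return True
        except Exception as exc:  # broad by design: any failure quarantines
            last_error = repr(exc)
    dlq.append({"event": event, "error": last_error, "attempts": max_attempts})
    return False
```

Capturing the error context alongside the event is what turns a DLQ from a graveyard into a runbook input: the operator can see why the event failed before deciding on replay.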
This shape assumes more than managed compute. It assumes that the organization can own contracts, review schemas, monitor lag, isolate failures, and decide when replay is safe. Without those disciplines, the platform tends to become event-rich but decision-poor: lots of topics, lots of consumers, and very little confidence in what can be changed safely.
That is why the right review question is not only “can we build this?” It is “can we operate this for the next year without losing trust in the event model?”
A team wants to move from a simple API-plus-queue design to an event-heavy platform because “events are more scalable.” They do not yet have schema governance, trace propagation, or replay procedures. What should the review challenge first?
The stronger answer is operational readiness and contract discipline. Event-heavy serverless is powerful, but without schema controls, replay safety, and lag visibility, the team is adding distributed complexity faster than it can govern it.