A realistic lesson on the operational, technical, and organizational costs introduced when boundaries become distributed rather than remaining in-process.
Distributed boundaries introduce costs that many teams underweight when they first adopt microservices. Once a boundary crosses process or network lines, the architecture must handle latency, partial failure, eventual consistency, contract governance, telemetry, and a larger operational surface. None of those costs automatically make distribution wrong. They do mean that distribution should be justified as carefully as any other expensive design decision.
This is why it is useful to think of distributed architecture as an architectural cost center. The question is not “Can we split this out?” The question is “What autonomy or resilience benefit do we gain that exceeds the new coordination tax?”
flowchart LR
A["In-process boundary"] --> B["Distributed boundary"]
B --> C["Network latency"]
B --> D["Partial failure"]
B --> E["Eventual consistency"]
B --> F["Contract governance"]
B --> G["Observability overhead"]
B --> H["More operational surface"]
Inside one process, calls are usually cheap and failure modes are narrow. Across services, every interaction can time out, retry, return partial results, or fail in ways that are hard to reproduce locally. This changes design discipline:
Teams often underestimate this shift because the call site can still look simple in code.
A local transaction in a monolith can keep several changes consistent with relatively low complexity. Across distributed services, that same workflow often becomes a coordination problem. Teams need to decide:
This is one reason some boundaries should remain in-process longer. If a workflow still wants one strong transaction model, forcing it across services can create more complexity than value.
Once boundaries become distributed, interfaces become products in their own right. Teams need:
Without that discipline, the system regains coupling in a more expensive form. Instead of shared code and schemas, the coupling now lives in breaking integrations and emergency release coordination.
Each new service usually means:
That is why teams with weak platform support or low operational maturity often struggle after early microservice adoption. The boundary may be conceptually right, but the organization still has to run it reliably.
One simple architecture note can expose whether the trade-off is being taken seriously.
1candidate_service: recommendations
2benefits:
3 - independent scaling
4 - separate experimentation cadence
5costs:
6 - new async event contracts
7 - extra observability and on-call surface
8 - increased stale-data tolerance in UI
9decision: worthwhile because benefits are persistent and team can operate the cost
A team wants to extract a service because “it is cleaner” but cannot yet explain the expected release, scaling, reliability, or ownership benefit. It also lacks contract tests and trace propagation. What is the strongest review outcome?
The stronger answer is to reject or delay the extraction. The proposed benefit is still aesthetic, while the operational and integration costs are concrete. A boundary should not become distributed until the team can justify why the new cost is worth paying.