Fan-out/fan-in workflows parallelize work by splitting one job into many independent tasks, then collecting their results into one aggregated outcome. In serverless systems, this is a natural fit for image processing, record enrichment, document analysis, and bulk notification pipelines because short-lived functions can process slices of work concurrently without requiring a large permanently running worker fleet.
The design challenge is not parallelism itself. It is deciding how much parallelism the dependencies can tolerate, what counts as “done,” and how partial failure should be handled. Fan-out/fan-in is a performance pattern, but it is also a coordination pattern, and the aggregation rules matter as much as the worker code.
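The shape of the pattern can be sketched in a few lines. Here is a minimal TypeScript version, where plain in-process async tasks stand in for serverless invocations; `processItem` and `fanOutFanIn` are illustrative names, not a real API:

```typescript
// Hypothetical stand-in for one worker: in a serverless system this would
// be a separate function invocation rather than an in-process call.
async function processItem(item: number): Promise<string> {
  return `item-${item}:done`;
}

// Fan out all tasks concurrently, then fan the settled results back in
// to a single aggregated outcome.
async function fanOutFanIn(items: number[]): Promise<{ ok: string[]; errors: number }> {
  const settled = await Promise.allSettled(items.map(processItem));
  const ok = settled
    .filter((s): s is PromiseFulfilledResult<string> => s.status === "fulfilled")
    .map((s) => s.value);
  return { ok, errors: settled.length - ok.length };
}
```

`Promise.allSettled` (rather than `Promise.all`) matters here: a single failed child does not reject the whole batch, which is what makes partial-result aggregation possible at fan-in.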
```mermaid
flowchart LR
A["Start job"] --> B["Split into N tasks"]
B --> C["Worker 1"]
B --> D["Worker 2"]
B --> E["Worker N"]
C --> F["Result store"]
D --> F
E --> F
F --> G["Aggregator"]
G --> H["Final status"]
```
What to notice: every worker writes into a shared result store rather than replying directly, and the aggregator alone decides when the job is "done" and what status to report.
This pattern is strongest when the tasks are independent slices of one job and downstream systems can absorb bursts of parallel traffic.
It is weaker when every step depends tightly on the previous one or when downstream systems cannot absorb bursty parallelism.
Teams often spend too much time on the fan-out side and too little time on the fan-in side. The important questions are: what counts as "done," how long the aggregator should wait for stragglers, whether a partial result is acceptable to ship, and what the parallelism costs.
These are product and operational questions, not just programming questions.
```yaml
workflow:
  name: batch-image-analysis
  fan_out:
    chunk_size: 25
    max_parallel_tasks: 20
  fan_in:
    required_completion_ratio: 0.95
    timeout_seconds: 300
    on_timeout: mark-partial
```
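The `required_completion_ratio` setting implies a quorum rule at fan-in. A minimal sketch of how an aggregator might evaluate it, assuming it is handed completed and total counts; `quorumMet` is a hypothetical helper, not part of any workflow engine:

```typescript
// Hypothetical quorum check mirroring required_completion_ratio above:
// a batch counts as done once enough children have completed.
function quorumMet(completed: number, total: number, requiredRatio: number): boolean {
  if (total === 0) return true; // an empty batch has nothing left to wait for
  return completed / total >= requiredRatio;
}
```

With `required_completion_ratio: 0.95`, a 500-task batch can close once 475 tasks have completed instead of waiting on the last stragglers.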
```typescript
type ChildResult = {
  itemId: string;
  status: "complete" | "failed" | "timed_out";
  output?: string;
};

export function summarizeBatch(results: ChildResult[]) {
  // Anything that is not complete, whether failed or timed out,
  // counts against the batch.
  const complete = results.filter((r) => r.status === "complete").length;
  const failed = results.filter((r) => r.status !== "complete").length;

  return {
    complete,
    failed,
    // The aggregation rule: any shortfall downgrades the batch to "partial".
    overallStatus: failed === 0 ? "complete" : "partial",
  };
}
```
What this demonstrates: the aggregator reduces many child results to a handful of counts, and a single rule (zero failures) decides whether the batch reports "complete" or "partial." That rule is the completion contract, and it is a deliberate choice, not a given.
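Note that `ChildResult` distinguishes timed-out tasks, but the summary folds them into `failed`. A sketch of one alternative that counts stragglers separately so they can be retried or reported on their own; `summarizeWithStragglers` is a hypothetical extension, restated here so the snippet is self-contained:

```typescript
type ChildResult = {
  itemId: string;
  status: "complete" | "failed" | "timed_out";
  output?: string;
};

// Hypothetical variant: report timed-out stragglers separately from
// hard failures, so the aggregator can retry them rather than just
// downgrading the whole batch.
function summarizeWithStragglers(results: ChildResult[]) {
  const complete = results.filter((r) => r.status === "complete").length;
  const timedOut = results.filter((r) => r.status === "timed_out").length;
  const failed = results.length - complete - timedOut;
  return {
    complete,
    failed,
    timedOut,
    overallStatus: failed + timedOut === 0 ? "complete" : "partial",
  };
}
```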
Serverless makes fan-out easy enough that teams sometimes forget cost and downstream capacity. Launching 10,000 parallel function invocations may reduce completion time, but it can also overwhelm databases and third-party APIs downstream, exhaust account-level concurrency limits, and multiply per-invocation cost with little product benefit.
That is why bounded parallelism usually beats unlimited parallelism. The strongest designs separate “the total number of items” from “the maximum concurrency the system may use right now.”
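Bounded parallelism can be sketched as a small pool of runners that claim items until none remain. A minimal TypeScript version with illustrative names, assuming in-process async workers:

```typescript
// Hypothetical sketch of bounded parallelism: process every item, but
// never run more than maxConcurrency workers at once. The total item
// count and the allowed concurrency are deliberately separate inputs.
async function mapBounded<T, R>(
  items: T[],
  maxConcurrency: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner loops, claiming one item at a time until none remain.
  // Claiming is safe because JS is single-threaded between awaits.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }

  // Start at most maxConcurrency runners.
  const runners = Array.from(
    { length: Math.min(maxConcurrency, items.length) },
    () => runner()
  );
  await Promise.all(runners);
  return results;
}
```

The same shape appears in the earlier config as `max_parallel_tasks`: the batch may contain thousands of items, but only a fixed number are in flight at any moment.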
A report-generation workflow splits one customer request into 500 analysis tasks. The team assumes the job is complete only when all 500 finish successfully. In practice, a few slow tasks frequently block the entire result for too long. What should be revisited first?
The stronger answer is the completion contract. The team should decide whether partial completion, quorum, or separate handling for stragglers would satisfy the product better than a rigid all-or-nothing fan-in rule. The problem may not be worker performance. It may be an aggregation rule that is harsher than the business actually needs.