Saga Pattern for Distributed Transactions in Clojure

Learn how sagas coordinate local transactions across Clojure microservices, when orchestration beats choreography, and why compensation is not the same as perfect rollback.

Saga: A distributed workflow made of local transactions, where each successful step may later need a compensating action if a later step fails.

Sagas are useful because microservices rarely share one transaction manager or one database. A checkout flow may reserve inventory, create an order, authorize payment, and send a confirmation event, but each step sits behind a different service boundary. A two-phase commit across all of them is usually too expensive, too fragile, or simply unavailable. A saga accepts that reality and coordinates progress through explicit forward steps plus explicit recovery behavior.

Compensation Is Not Magic Rollback

The most important thing to understand about sagas is that compensation is not the same as rewinding time. If a service has already sent an email, triggered a warehouse job, or exposed data to another system, a later compensation cannot pretend the event never happened.

Compensation should instead:

  • move the business process back to an acceptable state
  • publish clear follow-up events when recovery matters to other services
  • preserve auditability of what happened

That is why the business meaning of a compensation action matters more than the fact that one exists.
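
As a minimal sketch of that idea, the compensation below offsets an order instead of erasing it. The order shape (:status and :history keys) and the function name cancel-order are assumptions made for illustration, not part of any real service.

```clojure
;; Hypothetical order shape: {:id ... :status ... :history [...]}.
(defn cancel-order
  "Compensates an order by appending an offsetting event.
  Earlier history entries are kept so the audit trail survives."
  [order reason]
  (-> order
      (assoc :status :cancelled)
      (update :history conj {:event :order-cancelled :reason reason})))
```

The order ends in an acceptable business state (:cancelled), and the original :order-created entry is still there for anyone who needs to reconstruct what happened.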

Orchestration and Choreography Solve Different Coordination Problems

  • orchestration uses a central coordinator that decides which step runs next
  • choreography relies on services reacting to events without one central conductor

Orchestration is often easier to reason about when the workflow has strict sequencing, multiple compensation paths, or strong reporting requirements. Choreography can work well when the domain is already event-driven and each participant can react independently without creating hidden loops.

In practice, teams often underestimate the observability cost of choreography. The control flow looks loosely coupled in code but can become hard to understand during incidents.
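
A small in-memory sketch can make the choreography side concrete. The event types, the handlers map, and dispatch-all are all hypothetical names; a real system would use a message broker rather than a local queue.

```clojure
;; Each "service" is a handler that reacts to one event type and
;; returns the follow-up events it publishes. Names are illustrative.
(def handlers
  {:order-placed       (fn [e] [{:type :inventory-reserved :order-id (:order-id e)}])
   :inventory-reserved (fn [e] [{:type :payment-authorized :order-id (:order-id e)}])
   :payment-authorized (fn [_] [])})

(defn dispatch-all
  "Processes events first-in, first-out until none remain.
  Returns the event types in the order they were handled."
  [initial-event]
  (loop [queue (conj clojure.lang.PersistentQueue/EMPTY initial-event)
         seen  []]
    (if-let [event (peek queue)]
      (let [handler (get handlers (:type event) (constantly []))
            emitted (handler event)]
        (recur (into (pop queue) emitted) (conj seen (:type event))))
      seen)))

(dispatch-all {:type :order-placed :order-id 1})
;; => [:order-placed :inventory-reserved :payment-authorized]
```

No single function states the sequence. It only becomes visible in the trace that dispatch-all returns, which is the observability cost in miniature.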

The Diagram Below Shows Forward Progress and Compensation

[Figure: Saga flow showing forward steps, failure, and compensating actions]

The visual highlights the real shape of a saga: local success moves the workflow forward, but a later failure causes compensations to run in reverse business order for the steps that already committed.
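
The reverse ordering can be sketched as a pure function. The name compensation-order is hypothetical, and the sketch assumes the failed step itself committed nothing and so needs no compensation.

```clojure
(defn compensation-order
  "Given step names in forward order and the name of the step that failed,
  returns the already-committed steps in the order they should be compensated."
  [step-names failed-step]
  (->> step-names
       (take-while #(not= % failed-step))
       reverse
       vec))

(compensation-order [:reserve-inventory :create-order :authorize-payment]
                    :authorize-payment)
;; => [:create-order :reserve-inventory]
```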

Model Each Step as Data

In Clojure, one clean approach is to represent the workflow as data. Each step can name its action and compensation, while the runner keeps track of completed work.

(ns myapp.order-saga)

;; Assumes the step functions referenced below (reserve-inventory!,
;; release-inventory!, and so on) are already defined. Each :run function
;; takes the saga context and returns a map with an :ok? key.
(def saga-steps
  [{:name :reserve-inventory
    :run reserve-inventory!
    :compensate release-inventory!}
   {:name :create-order
    :run create-order!
    :compensate cancel-order!}
   {:name :authorize-payment
    :run authorize-payment!
    :compensate void-payment!}])

;; Runs steps in order and stops at the first failure. The result always
;; includes the steps that completed, so the caller can compensate them.
(defn run-saga [ctx steps]
  (loop [remaining steps
         completed []]
    (if-let [{:keys [run] :as step} (first remaining)]
      (let [result (run ctx)]
        (if (:ok? result)
          (recur (rest remaining) (conj completed step))
          {:status :failed
           :failed-step (:name step)
           :completed completed}))
      {:status :completed
       :completed completed})))

;; Runs the compensation for each completed step, most recent first.
(defn compensate! [ctx completed]
  (doseq [{:keys [compensate]} (reverse completed)]
    (compensate ctx)))

The code is simple on purpose. The harder engineering questions are elsewhere:

  • how to persist progress between retries or restarts
  • how to make each step idempotent
  • how to resume after partial failure
  • how to report workflow state for operators and customer support
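
The first and third questions can be sketched with a runner that records step status and skips finished steps on a rerun. Everything here is an assumption for illustration: saga-store is an in-memory atom standing in for a durable table, and run-resumable-saga, record-step!, and step-done? are hypothetical names.

```clojure
;; In-memory stand-in for durable storage. A real system would write
;; step status to a database so it survives a process restart.
(defonce saga-store (atom {}))

(defn record-step! [saga-id step-name status]
  (swap! saga-store assoc-in [saga-id :steps step-name] status))

(defn step-done? [saga-id step-name]
  (= :done (get-in @saga-store [saga-id :steps step-name])))

(defn run-resumable-saga
  "Runs steps in order for saga-id, skipping any step already recorded
  as :done. Each :run function takes ctx and returns a map with :ok?."
  [saga-id ctx steps]
  (loop [remaining steps]
    (if-let [{step-name :name run :run} (first remaining)]
      (cond
        (step-done? saga-id step-name)
        (recur (rest remaining))

        (:ok? (run ctx))
        (do (record-step! saga-id step-name :done)
            (recur (rest remaining)))

        :else
        (do (record-step! saga-id step-name :failed)
            {:status :failed :failed-step step-name}))
      {:status :completed})))
```

Calling run-resumable-saga again with the same saga-id resumes at the first step not marked :done. Note that a crash between a step's side effect and record-step! would still rerun that step, which is one reason each step also needs to be idempotent.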

Idempotency and Ordering Are Core Design Requirements

Saga steps often run in environments where the same command or event may be delivered more than once. If a payment void, reservation release, or order cancellation cannot tolerate duplicates, recovery becomes risky.

Design for:

  • idempotent command handlers
  • stable workflow identifiers
  • stored step status
  • explicit timeout and retry policy per step

The question is not whether duplication can happen. The question is whether the workflow survives it cleanly.
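
One way to sketch an idempotent command handler is to key stored results by a stable identifier. The names idempotent and processed are hypothetical, and the in-memory atom plus locking only protects a single process; a real service would more likely rely on a unique constraint in its own database.

```clojure
;; Hypothetical in-memory record of results, keyed by idempotency key.
(def processed (atom {}))

(defn idempotent
  "Wraps handler so that each idempotency key runs the handler at most
  once in this process. A duplicate delivery returns the stored result."
  [handler]
  (fn [idem-key command]
    ;; The lock makes check-then-run atomic, at the cost of
    ;; serializing all wrapped handlers. Acceptable for a sketch.
    (locking processed
      (if (contains? @processed idem-key)
        (get @processed idem-key)
        (let [result (handler command)]
          (swap! processed assoc idem-key result)
          result)))))
```

A key built from the workflow identifier and the step name, such as "saga-42/void-payment", stays the same across redeliveries, so the second void returns the first result instead of touching the payment again.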

Choreography Needs Stronger Event Discipline

Choreography can look elegant until many services start reacting to each other in ways the team cannot visualize. Event contracts, retry semantics, dead-letter handling, and observability become mandatory.

Use choreography carefully when:

  • services already publish durable business events
  • the domain benefits from looser coupling
  • each participant can act independently

Prefer orchestration when:

  • the flow has strict sequencing
  • compensation logic is complex
  • operators need one obvious place to inspect progress

Common Failure Modes

Treating Compensation as Full Undo

Some actions have side effects that can only be offset, not erased. Design the business process around that truth.

Retrying Without Idempotency

If a repeated command can double-charge, over-release inventory, or emit duplicate state transitions, the workflow is not operationally safe.
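
A bounded retry wrapper shows why the two concerns go together. The name with-retry is hypothetical, and the sketch leaves out backoff and timeouts to stay short.

```clojure
(defn with-retry
  "Wraps a step function so it is called up to max-attempts times,
  stopping at the first result whose :ok? is truthy. The returned map
  is the last result plus an :attempts count."
  [max-attempts step]
  (fn [ctx]
    (loop [attempt 1]
      (let [result (step ctx)]
        (if (or (:ok? result) (>= attempt max-attempts))
          (assoc result :attempts attempt)
          (recur (inc attempt)))))))
```

Every attempt receives the same ctx, so a stable workflow identifier carried inside it reaches the handler each time. Without an idempotent handler on the other side, each of those attempts could repeat the side effect.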

Letting Workflow State Live Only in Logs

Operators need durable state, not just scattered log lines, to understand where a saga stopped and what has already happened.

Practical Heuristics

Use sagas when a business workflow crosses service boundaries and needs explicit recovery semantics. Prefer local transactions per service, persist workflow state, design compensation honestly, and choose orchestration when the team needs clearer control flow than choreography can provide. A good saga is not one that “never fails.” It is one that fails in a way the business and operators can still understand.

Revised on Thursday, April 23, 2026