1. The Problem: Distributed Transactions in Microservices

In a monolithic application, you can wrap a complex operation in a single database transaction: either all steps commit or they all roll back. In a microservices architecture, each service has its own database. There is no single transaction that spans Order Service, Inventory Service, and Payment Service.

The traditional solution is Two-Phase Commit (2PC), but it requires a distributed coordinator, makes every participant hold its locks from the prepare phase until the coordinator's decision arrives, and leaves participants blocked if the coordinator goes down mid-commit. In microservices, 2PC is widely considered an anti-pattern because it couples services tightly and severely limits availability.

2. What is a Saga?

A Saga is a sequence of local transactions. Each step performs a local database transaction within a single service and publishes an event or message to trigger the next step. If any step fails, the Saga executes compensating transactions in reverse order to undo the effects of the previously completed steps.

Sagas achieve eventual consistency rather than the strong (ACID) consistency of a database transaction. During the Saga execution, other services may observe intermediate states — for example, stock is reserved but payment has not yet been charged. Good Saga design accounts for these transient states.

3. E-Commerce Order Saga Example

  ORDER SAGA — HAPPY PATH:
  ┌─────────────────────┐
  │  1. Create Order    │  Order Service: INSERT order (status=PENDING)
  │     ↓ OrderCreated  │
  ├─────────────────────┤
  │  2. Reserve Stock   │  Inventory Service: reserve items (stock -= N)
  │     ↓ StockReserved │
  ├─────────────────────┤
  │  3. Charge Payment  │  Payment Service: charge credit card
  │     ↓ PaymentCharged│
  ├─────────────────────┤
  │  4. Ship Order      │  Shipping Service: create shipment
  │     ↓ OrderShipped  │
  └─────────────────────┘
  Order Service: UPDATE order status=COMPLETED ✓

  ORDER SAGA — FAILURE AT STEP 3 (Payment fails):
  Steps 1 and 2 have committed locally.
  Compensating transactions (reverse order):
  ← Compensate Step 2: Release Stock  (stock += N)
  ← Compensate Step 1: Cancel Order   (status=CANCELLED)
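
The failure path above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production design: the dictionaries stand in for each service's own database, and names such as `run_order_saga`, `new_state`, and the `card_ok` flag are assumptions made for the example.

```python
class PaymentDeclined(Exception):
    """Raised by the hypothetical payment step when the card is rejected."""


def new_state():
    # In-memory stand-ins for each service's own database (illustrative only).
    return {"orders": {}, "stock": {"widget": 10}, "shipments": [], "trace": []}


def create_order(state, order):
    state["orders"][order["id"]] = "PENDING"


def cancel_order(state, order):
    state["orders"][order["id"]] = "CANCELLED"


def reserve_stock(state, order):
    state["stock"][order["sku"]] -= order["qty"]


def release_stock(state, order):
    state["stock"][order["sku"]] += order["qty"]


def charge_payment(state, order):
    if not order["card_ok"]:
        raise PaymentDeclined("card declined")


def refund_payment(state, order):
    pass  # nothing to track for payments in this sketch


def create_shipment(state, order):
    state["shipments"].append(order["id"])


def cancel_shipment(state, order):
    state["shipments"].remove(order["id"])


# (step name, local transaction, compensating transaction), in execution order.
ORDER_SAGA = [
    ("Create Order", create_order, cancel_order),
    ("Reserve Stock", reserve_stock, release_stock),
    ("Charge Payment", charge_payment, refund_payment),
    ("Ship Order", create_shipment, cancel_shipment),
]


def run_order_saga(state, order):
    """Run each step; on failure, undo the completed steps in reverse order."""
    completed = []
    for name, action, compensate in ORDER_SAGA:
        try:
            action(state, order)
        except Exception as exc:
            state["trace"].append(f"{name}: failed ({exc})")
            for done_name, undo in reversed(completed):
                undo(state, order)
                state["trace"].append(f"{done_name}: compensated")
            return False
        completed.append((name, compensate))
        state["trace"].append(f"{name}: ok")
    state["orders"][order["id"]] = "COMPLETED"
    return True
```

Calling `run_order_saga(new_state(), {"id": "o1", "sku": "widget", "qty": 2, "card_ok": False})` fails at the payment step, restores the stock, and leaves the order in the CANCELLED state. Only steps that actually committed are compensated, which is why the failed payment step itself is not undone.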

4. Choreography-Based Saga

In a choreography-based Saga, there is no central coordinator. Each service listens for events and decides what to do next. Services communicate via a message bus (e.g. Kafka, RabbitMQ).

  Choreography: event flow (pseudocode)

  // Order Service
  on HTTP POST /orders:
      insert order (status=PENDING)
      publish "OrderCreated" event to Kafka

  // Inventory Service (listens for OrderCreated)
  on event "OrderCreated":
      try reserve_stock(order_id, items):
          publish "StockReserved" event
      catch InsufficientStock:
          publish "StockReservationFailed" event   // triggers compensation

  // Payment Service (listens for StockReserved)
  on event "StockReserved":
      try charge_payment(order_id, amount):
          publish "PaymentCharged" event
      catch PaymentFailed:
          publish "PaymentFailed" event            // triggers compensation

  // Inventory Service (listens for PaymentFailed, compensates)
  on event "PaymentFailed":
      release_stock(order_id)                      // compensating transaction
      publish "StockReleased" event

  // Order Service (listens for StockReleased, compensates)
  on event "StockReleased":
      update order status = CANCELLED
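
To make the event flow concrete, here is a small runnable Python sketch. It replaces Kafka with a synchronous in-memory `EventBus`, which is purely an assumption for illustration, as are the helper names `wire_services` and `place_order` and the `card_ok` switch. The handler that cancels the order on "StockReservationFailed" is also an addition that the pseudocode does not spell out.

```python
from collections import defaultdict, deque


class EventBus:
    """Tiny synchronous stand-in for a message broker (illustrative only)."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.queue = deque()
        self.history = []  # event types in delivery order

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self.queue.append((event_type, payload))

    def run(self):
        # Deliver queued events until no handler publishes anything new.
        while self.queue:
            event_type, payload = self.queue.popleft()
            self.history.append(event_type)
            for handler in self.handlers[event_type]:
                handler(payload)


def wire_services(bus, stock, orders, card_ok):
    """Register each service's handlers; no service calls another directly."""

    def inventory_on_order_created(order):
        if stock.get(order["sku"], 0) >= order["qty"]:
            stock[order["sku"]] -= order["qty"]
            bus.publish("StockReserved", order)
        else:
            bus.publish("StockReservationFailed", order)

    def payment_on_stock_reserved(order):
        bus.publish("PaymentCharged" if card_ok else "PaymentFailed", order)

    def inventory_on_payment_failed(order):
        stock[order["sku"]] += order["qty"]  # compensating transaction
        bus.publish("StockReleased", order)

    def order_on_cancel(order):
        orders[order["id"]] = "CANCELLED"

    bus.subscribe("OrderCreated", inventory_on_order_created)
    bus.subscribe("StockReserved", payment_on_stock_reserved)
    bus.subscribe("PaymentFailed", inventory_on_payment_failed)
    bus.subscribe("StockReleased", order_on_cancel)
    # Assumption: the Order Service also cancels when no stock was reserved.
    bus.subscribe("StockReservationFailed", order_on_cancel)


def place_order(bus, orders, order):
    orders[order["id"]] = "PENDING"
    bus.publish("OrderCreated", order)
    bus.run()
```

With `card_ok=False`, `bus.history` records OrderCreated, StockReserved, PaymentFailed, StockReleased, and the order ends up CANCELLED. A real broker delivers asynchronously and may redeliver, so production handlers would also need to be idempotent.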

Choreography Advantage: No Single Point of Failure

Choreography has no orchestrator process that can fail and halt the entire workflow, although the message bus itself must still be highly available. Services are loosely coupled and can be deployed independently. The downside is visibility: to understand a Saga's state, you must correlate events across multiple topics or queues. Distributed tracing (Jaeger, Zipkin) becomes essential.

5. Orchestration-Based Saga

In an orchestration-based Saga, a central Saga Orchestrator directs each service step. The orchestrator knows the entire workflow and handles failure scenarios centrally.

  ORCHESTRATION FLOW:

  Saga Orchestrator
       │
       ├──▶ Order Service:     "CreateOrder"     ──▶  ack
       │
       ├──▶ Inventory Service: "ReserveStock"    ──▶  ack (or FAIL)
       │         FAIL ──▶ Orchestrator triggers compensation:
       │                  Order Service: "CancelOrder"
       │
       ├──▶ Payment Service:   "ChargePayment"   ──▶  ack (or FAIL)
       │         FAIL ──▶ Orchestrator triggers compensation:
       │                  Inventory Service: "ReleaseStock"
       │                  Order Service:     "CancelOrder"
       │
       └──▶ Shipping Service:  "ShipOrder"       ──▶  ack
                 DONE: Orchestrator marks Saga complete ✓
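
The flow above can be approximated with a small Python class. This is a sketch under stated assumptions: `send` is a hypothetical transport callback that returns True for an ack and False for FAIL, the state names are invented for the example, and the "RefundPayment" and "CancelShipment" commands are borrowed from the compensation list in section 7 rather than from the diagram.

```python
class SagaOrchestrator:
    """Sends one command at a time and keeps the Saga's state in one place."""

    # (command, compensating command) in execution order.
    STEPS = [
        ("CreateOrder", "CancelOrder"),
        ("ReserveStock", "ReleaseStock"),
        ("ChargePayment", "RefundPayment"),
        ("ShipOrder", "CancelShipment"),
    ]

    def __init__(self, send):
        # send(command, order_id) -> bool is an assumed transport interface.
        self.send = send
        self.state = "NOT_STARTED"
        self.sent = []  # every command issued, in order

    def _dispatch(self, command, order_id):
        self.sent.append(command)
        return self.send(command, order_id)

    def run(self, order_id):
        self.state = "RUNNING"
        compensations = []
        for command, compensation in self.STEPS:
            if self._dispatch(command, order_id):
                compensations.append(compensation)
                continue
            # A step failed: undo the acknowledged steps in reverse order.
            self.state = "COMPENSATING"
            for undo in reversed(compensations):
                self._dispatch(undo, order_id)
            self.state = "COMPENSATED"
            return self.state
        self.state = "COMPLETED"
        return self.state
```

For example, `SagaOrchestrator(lambda cmd, oid: cmd != "ChargePayment").run("o1")` issues CreateOrder, ReserveStock, ChargePayment, then ReleaseStock and CancelOrder, matching the payment-failure branch of the diagram. Because `state` and `sent` live in the orchestrator, the Saga's progress can be inspected in one place; a real orchestrator would persist them so it can resume after a crash.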

6. Choreography vs Orchestration Comparison

  Aspect                      Choreography                  Orchestration
  --------------------------  ----------------------------  -----------------------------------------
  Coordination                Decentralised (events)        Centralised (orchestrator)
  Coupling                    Loose (services know events)  Tighter (orchestrator knows all services)
  Visibility                  Hard (trace across topics)    Easy (orchestrator has full state)
  Single point of failure     No                            Yes (orchestrator must be HA)
  Complexity with many steps  High (event spaghetti)        Manageable (centralised logic)
  Best tools                  Kafka, RabbitMQ, EventBridge  Temporal, Step Functions, Axon
  Debugging                   Hard (distributed logs)       Easy (orchestrator state machine)

7. Compensating Transactions

A compensating transaction is not a technical rollback — it is a business operation that reverses the effects of a previously committed local transaction. Every Saga step should have a defined compensating transaction designed upfront:

  • Reserve Stock → compensate: Release Stock
  • Charge Payment → compensate: Refund Payment
  • Create Shipment → compensate: Cancel Shipment
  • Send Confirmation Email → compensate: Send Cancellation Email (not a true undo, but the closest business equivalent)
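
To illustrate the difference between a rollback and a business-level compensation, the sketch below models a refund as a new ledger entry rather than the deletion of the original charge. The `PaymentLedger` class and its methods are hypothetical and exist only for this example; the refund is also written so that repeating it has no further effect, which is one reasonable choice when compensations may be retried.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentLedger:
    """Append-only ledger: nothing is ever deleted, only offset."""

    entries: list = field(default_factory=list)  # (kind, order_id, amount_cents)

    def _total(self, kind, order_id):
        return sum(a for k, oid, a in self.entries if k == kind and oid == order_id)

    def charge(self, order_id, amount_cents):
        self.entries.append(("CHARGE", order_id, amount_cents))

    def refund(self, order_id):
        # Compensation: add an offsetting entry for whatever is still outstanding.
        outstanding = self._total("CHARGE", order_id) - self._total("REFUND", order_id)
        if outstanding > 0:
            self.entries.append(("REFUND", order_id, outstanding))

    def balance(self, order_id):
        return self._total("CHARGE", order_id) - self._total("REFUND", order_id)
```

After `charge("o1", 2500)` followed by `refund("o1")`, the balance is zero but both entries remain in the ledger, so the history shows that the charge happened and was later reversed.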

Note that some operations cannot be compensated meaningfully (e.g. you cannot "unsend" an email). In these cases, design the Saga so that irreversible operations occur last, after all reversible steps have succeeded.
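
One way to enforce this ordering rule is a simple design-time check over the step list. The sketch below is only an illustration: the `Step` type, its `compensable` flag, and the function name are assumptions, and real Saga frameworks may express this differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    compensable: bool  # False for irreversible operations such as sending an email


def irreversible_steps_out_of_order(steps):
    """Return names of irreversible steps that run before a compensable step."""
    offenders = []
    for index, step in enumerate(steps):
        later_steps = steps[index + 1:]
        if not step.compensable and any(s.compensable for s in later_steps):
            offenders.append(step.name)
    return offenders
```

A Saga that sends the confirmation email before charging payment would be flagged, while one that sends the email as its final step would return an empty list.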

Sagas Do Not Provide Isolation

Unlike database ACID transactions, Sagas do not provide isolation. Other transactions can read intermediate states. For example, a concurrent read might see an order with reserved stock but no payment. Mitigate this with countermeasures such as a semantic lock: give records explicit PENDING/RESERVED/CONFIRMED states so that partial progress is visible but safe, and have other transactions check that state before acting on a record. Never expose PENDING records to end users as if they were complete.
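
The state-based approach can be sketched as a small state machine. The example below is a simplified assumption: it uses PENDING, CONFIRMED, and CANCELLED for an order (RESERVED is omitted for brevity), and the names `transition` and `completed_orders` are invented for illustration.

```python
from enum import Enum


class OrderStatus(Enum):
    PENDING = "PENDING"      # Saga still in progress
    CONFIRMED = "CONFIRMED"  # every step succeeded
    CANCELLED = "CANCELLED"  # Saga was compensated


# Only an in-progress order may change state; final states are terminal.
ALLOWED = {
    OrderStatus.PENDING: {OrderStatus.CONFIRMED, OrderStatus.CANCELLED},
    OrderStatus.CONFIRMED: set(),
    OrderStatus.CANCELLED: set(),
}


def transition(current, target):
    """Return the new status, or raise if the change is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


def completed_orders(orders):
    """Ids of orders that are safe to present to end users as complete."""
    return [oid for oid, status in orders.items() if status is OrderStatus.CONFIRMED]
```

Readers that care about finished orders call something like `completed_orders`, so an order that is still PENDING is never presented as complete, and a late event cannot flip a CANCELLED order back to CONFIRMED.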

8. Saga vs 2PC

  Property              Saga                                        2PC (Two-Phase Commit)
  --------------------  ------------------------------------------  -------------------------------------
  Consistency model     Eventual consistency                        Strong consistency (ACID)
  Isolation             None (intermediate states visible)          Full isolation during transaction
  Availability          High (local transactions)                   Low (coordinator is SPOF)
  Performance           High (no distributed locks)                 Low (locking blocks all participants)
  Rollback              Compensating transactions (business logic)  True atomic rollback
  Use in microservices  Recommended                                 Anti-pattern

Frequently Asked Questions — Saga Pattern