Saga Orchestration And Choreography for Microservices

This story looks at how microservices can exchange data in an event-driven architecture and how the saga pattern helps.

Distributed Local Transaction

Imagine that you are building an e-commerce store where customers have a credit limit. The application must ensure that a new order will not exceed the customer’s credit limit. Because orders and customers live in different databases owned by different services (each service has its own database), the application cannot simply use a local ACID transaction.

A naive implementation typically uses chained HTTP calls between services. That’s fine if you’re working on a quick demo, but it becomes fragile once a step in the chain fails or times out.

If the payment fails, we need to roll back the stock reservation and the order creation.

Concretely, if the inventory system managed to reserve some stock but the payment system timed out for whatever reason, we cannot say that the payment has failed. If we treat the timeout as a failure, we roll back the stock reservation and cancel the order even though the payment may actually have gone through.
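The timeout problem can be made concrete with a small in-memory sketch. The class and function names here (PaymentService, InventoryService, place_order_naively) are hypothetical stand-ins for real services called over HTTP, and the payment service is deliberately written so that the charge succeeds but the caller only sees a timeout.

```python
class PaymentTimeout(Exception):
    """Raised when the caller gives up waiting; the real outcome is unknown."""


class PaymentService:
    def __init__(self):
        self.charged = []  # order ids that were actually charged

    def charge(self, order_id):
        # The charge is applied on the payment side...
        self.charged.append(order_id)
        # ...but the response never reaches the caller in time.
        raise PaymentTimeout(order_id)


class InventoryService:
    def __init__(self):
        self.reserved = set()

    def reserve(self, order_id):
        self.reserved.add(order_id)

    def release(self, order_id):
        self.reserved.discard(order_id)


def place_order_naively(order_id, inventory, payment):
    """Chained calls that treat a timeout as a definite failure."""
    inventory.reserve(order_id)
    try:
        payment.charge(order_id)
    except PaymentTimeout:
        inventory.release(order_id)  # roll back the stock reservation
        return "CANCELLED"
    return "CONFIRMED"


inventory = InventoryService()
payment = PaymentService()
status = place_order_naively("order-1", inventory, payment)
print(status, inventory.reserved, payment.charged)
# CANCELLED set() ['order-1']
```

The order is cancelled and the stock released, yet the customer has been charged, which is exactly the inconsistency described above.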

1- No explicit inter-service communication.
2- Global transaction as a series of local ACID transactions.
3- Transaction is always in a defined state.
4- Transaction state is not managed.
5- Eventually consistent, but consistent nonetheless.
6- Reactive.

A saga is a sequence of local transactions. Each service in a saga performs its own transaction and publishes an event. The other services listen to that event and perform the next local transaction.

1- Orchestration-Based Saga.
2- Choreography-Based Saga.

Orchestration-Based Saga:

In this approach, there is a Saga orchestrator that manages all the transactions and directs the participant services to execute local transactions based on events. The orchestrator is a coarse-grained service that exists only to facilitate the Saga, and it can also be thought of as a Saga Manager.
It is responsible for coordinating the global transaction flow: communicating with the services involved in the transaction and orchestrating any necessary compensating actions.
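As a rough illustration, an orchestrator can be sketched as an object that runs each participant's local transaction in order and, if one fails, runs the compensations of the completed steps in reverse. This is a minimal in-process sketch, not a production design: the Step and SagaOrchestrator names are hypothetical, and plain function calls stand in for the commands an orchestrator would send to remote services.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[dict], None]        # local transaction in a participant
    compensation: Callable[[dict], None]  # undoes the action if a later step fails


@dataclass
class SagaOrchestrator:
    steps: List[Step]
    log: List[str] = field(default_factory=list)

    def run(self, ctx: dict) -> str:
        completed: List[Step] = []
        for step in self.steps:
            try:
                step.action(ctx)
            except Exception as exc:
                self.log.append(f"{step.name} failed: {exc}")
                # Compensate the completed steps in reverse order.
                for done in reversed(completed):
                    done.compensation(ctx)
                    self.log.append(f"{done.name} compensated")
                return "ROLLED_BACK"
            completed.append(step)
            self.log.append(f"{step.name} done")
        return "COMPLETED"


# Hypothetical participants; the payment step always fails in this example.
def create_order(ctx): ctx["order"] = "CREATED"
def cancel_order(ctx): ctx["order"] = "CANCELLED"
def reserve_stock(ctx): ctx["stock"] = "RESERVED"
def release_stock(ctx): ctx["stock"] = "RELEASED"
def charge_payment(ctx): raise RuntimeError("card declined")
def refund_payment(ctx): ctx["payment"] = "REFUNDED"


saga = SagaOrchestrator([
    Step("create_order", create_order, cancel_order),
    Step("reserve_stock", reserve_stock, release_stock),
    Step("charge_payment", charge_payment, refund_payment),
])
ctx = {}
result = saga.run(ctx)
print(result, ctx)
# ROLLED_BACK {'order': 'CANCELLED', 'stock': 'RELEASED'}
```

The log kept by the orchestrator gives a single place to see which step failed and which compensations ran, which is one reason people choose this style.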

1- Lack of visibility across workflows. The company needs to know how many workflow instances are running, how many run per hour (or, at high scale, per minute), how many have failed and why, and whether any exception scenarios are being missed. This problem is prevalent whether you implement orchestration or choreography.

2- Ambiguous error handling. If one service raises an event and the other cannot process it, the message can be lost in no man’s land. Each party then assumes the other will take care of the exception, and in the end no one does. These are the two most prevalent problem areas.

3- Scaling. Typical workflow engines are implemented on top of a database, which makes scaling up a problem when many workflows run at once. For example, if a company like Amazon implements workflows on microservices, thousands and thousands of orders are placed and they have to be processed very fast. Hence, scaling up is a major challenge with traditional workflow engines.
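The visibility questions in the first point (how many workflows ran in a given window, how many failed and why) could be answered by recording every workflow run in one place. The sketch below is only an in-memory illustration; WorkflowRun and WorkflowTracker are hypothetical names, and a real system would persist these records or export them as metrics.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class WorkflowRun:
    workflow_id: str
    finished_at: datetime
    failure_reason: Optional[str] = None  # None means the run succeeded


@dataclass
class WorkflowTracker:
    runs: List[WorkflowRun] = field(default_factory=list)

    def record(self, run: WorkflowRun) -> None:
        self.runs.append(run)

    def runs_since(self, since: datetime) -> int:
        """Number of workflow runs that finished at or after `since`."""
        return sum(1 for r in self.runs if r.finished_at >= since)

    def failures_by_reason(self) -> Counter:
        """Count of failed runs grouped by their failure reason."""
        return Counter(r.failure_reason for r in self.runs if r.failure_reason)


now = datetime(2024, 1, 1, 12, 0)
tracker = WorkflowTracker()
tracker.record(WorkflowRun("o1", now - timedelta(minutes=5)))
tracker.record(WorkflowRun("o2", now - timedelta(minutes=10), "payment timeout"))
tracker.record(WorkflowRun("o3", now - timedelta(hours=2), "payment timeout"))

runs_last_hour = tracker.runs_since(now - timedelta(hours=1))
failures = tracker.failures_by_reason()
print(runs_last_hour, dict(failures))
# 2 {'payment timeout': 2}
```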

Choreography-Based Saga:

In this approach, there is no central orchestrator. Each service participating in the Saga performs its transaction and publishes events. The other services act upon those events and perform their own transactions. They may or may not publish further events, depending on the situation.
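A choreography for the credit-limit example from the beginning of this story might look roughly like the sketch below. It assumes an in-memory EventBus as a stand-in for a real message broker, and the service classes and event names (OrderCreated, CreditReserved, CreditLimitExceeded) are chosen for illustration only.

```python
from collections import defaultdict, deque


class EventBus:
    """In-memory stand-in for a message broker (an assumption of this sketch)."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.queue = deque()

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self.queue.append((event_type, payload))

    def run(self):
        # Deliver events until no service publishes anything further.
        while self.queue:
            event_type, payload = self.queue.popleft()
            for handler in self.handlers[event_type]:
                handler(payload)


class OrderService:
    def __init__(self, bus):
        self.bus = bus
        self.orders = {}
        bus.subscribe("CreditReserved", self.on_credit_reserved)
        bus.subscribe("CreditLimitExceeded", self.on_credit_limit_exceeded)

    def create_order(self, order_id, customer_id, total):
        self.orders[order_id] = "PENDING"  # local transaction
        self.bus.publish(
            "OrderCreated",
            {"order_id": order_id, "customer_id": customer_id, "total": total},
        )

    def on_credit_reserved(self, event):
        self.orders[event["order_id"]] = "APPROVED"

    def on_credit_limit_exceeded(self, event):
        self.orders[event["order_id"]] = "REJECTED"  # compensating action


class CustomerService:
    def __init__(self, bus, credit_limits):
        self.bus = bus
        self.available = dict(credit_limits)
        bus.subscribe("OrderCreated", self.on_order_created)

    def on_order_created(self, event):
        customer_id, total = event["customer_id"], event["total"]
        if self.available.get(customer_id, 0) >= total:
            self.available[customer_id] -= total  # local transaction
            self.bus.publish("CreditReserved", {"order_id": event["order_id"]})
        else:
            self.bus.publish("CreditLimitExceeded", {"order_id": event["order_id"]})


bus = EventBus()
orders = OrderService(bus)
customers = CustomerService(bus, {"alice": 100})
orders.create_order("o1", "alice", 80)
orders.create_order("o2", "alice", 50)
bus.run()
print(orders.orders)
# {'o1': 'APPROVED', 'o2': 'REJECTED'}
```

Neither service calls the other directly; each one only reacts to events, so the overall flow exists only implicitly in the subscriptions.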

1- You have multiple point-to-point communication
2- Each one will be 100% dependent on the other one
3- Failure handling can be complex to implement
4- For example, if service “D” fails and service “B” fails as a result, the overall flow will not produce a working end result
5- It is incredibly difficult to debug the entire process. If any service fails you will not know where to look for the failure
6- Testing will be a challenge because you will need all of the services running at the same time to be able to test. Due to all the dependencies, you will have a heavy dependency structure.

The main benefit of the Saga Pattern is that it helps maintain data consistency across multiple services without tight coupling. This is an extremely important aspect of microservices architecture.

However, the main disadvantage of the Saga Pattern is the apparent complexity from a programming point of view. Also, developers are not as well accustomed to writing Sagas as traditional transactions. The other challenge is that compensating transactions also have to be designed to make Sagas work.

In my opinion, Sagas can help solve certain challenges and scenarios. They should be adopted or explored if the need arises. However, I would love to hear whether others have also used the Saga Pattern and how the experience was. What frameworks (if any) did you use?

A combined example of both models at the same time

As a software engineer, continually improving and staying up to date is what matters most to me.