Why should I use event sourcing?

We already have some blog posts here on when you could use event sourcing with PHP and how you could combine it with Doctrine ORM, but we have not yet talked about why you should do it in the first place. Today, we will be talking about the benefits of event sourcing!

No Data Loss

In a traditional CRUD ORM approach, you save only the current state in the database. This means that every update or delete action overwrites or removes data. Over the lifetime of a software project, there are tons of these update and delete actions, which means a lot of data is lost - forever. Well, maybe not forever if you have excellent logging, because then you could try to search through all the logs for the data you need. Trust me, no one wants to do that kind of work. So the conclusion is: for all practical purposes, the data is lost.

With event sourcing, every data change is saved as an event in the append-only event store. What does that mean? It means that we track every state change and not just the end state in the database, and therefore we don't lose any data - ever! As a bonus, you can easily add meta information to these events. This gives us more context for each state change, which further improves our understanding of why these changes were made.
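To make the difference concrete, here is a minimal sketch in plain PHP. It does not use any library: `InMemoryEventStore`, the event names, and the `reason` meta key are hypothetical names chosen for illustration only.

```php
<?php

declare(strict_types=1);

// Hypothetical in-memory event store, for illustration only.
final class InMemoryEventStore
{
    /** @var list<array{name: string, payload: array<string, mixed>, meta: array<string, mixed>}> */
    private array $events = [];

    /** Events can only be appended; there is no update or delete method. */
    public function append(string $name, array $payload, array $meta = []): void
    {
        $this->events[] = ['name' => $name, 'payload' => $payload, 'meta' => $meta];
    }

    /** @return list<array{name: string, payload: array<string, mixed>, meta: array<string, mixed>}> */
    public function all(): array
    {
        return $this->events;
    }
}

// CRUD: the second write overwrites the first, so the old address is gone.
$customerRow = ['email' => 'old@example.com'];
$customerRow['email'] = 'new@example.com';

// Event sourcing: both facts are kept, together with the reason for the change.
$store = new InMemoryEventStore();
$store->append('CustomerRegistered', ['email' => 'old@example.com']);
$store->append('CustomerEmailChanged', ['email' => 'new@example.com'], ['reason' => 'support request']);

// The current state is derived by going through the events in order.
$currentEmail = null;
foreach ($store->all() as $event) {
    $currentEmail = $event['payload']['email'] ?? $currentEmail;
}
```

Both approaches end up with the same current email address, but only the event store can still tell us what the address was before and why it changed.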

Reporting & Analytics

Since event sourcing enables us to save events that are facts and reflect business decisions, we are building a great data pool. With this large and rich dataset, we can provide meaningful insights and features. We can replay the saved events for one-off analyses as well as for permanent reporting use cases.

Due to the events, we have a great understanding of what the user is doing in our system and in which timeframe they did it. If we take the example of a shop system, we could investigate how long it takes for the user to put an item into the cart. Another useful piece of information would be if the user goes into checkout but decides at the last second to remove an item from the cart. This information is interesting and valuable for businesses, which can use it to decide on, for example, restructuring product landing pages or the checkout flow.

An example of how meta information could help is marketing campaigns. Be it classical discounts or the more modern variant through affiliate partners and influencer campaigns, this extra information, such as where the user came from, can be saved as additional meta information in the event. This can help analyze how these campaigns are performing and how the user behaves in these situations. And the best part: we don't need an external service for that, since we have all the data in place and the tools at hand to create our reports. For our affiliate example, we could create a message decorator that adds the affiliate ID to the message.

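use Patchlevel\EventSourcing\Message\Message;
use Patchlevel\EventSourcing\Repository\MessageDecorator\MessageDecorator;
use Symfony\Component\HttpFoundation\RequestStack;

// UserVisitedForTheFirstTime and AffiliateIdHeader are application classes,
// assumed here to live in the same namespace as this decorator.
final readonly class AffiliateIdDecorator implements MessageDecorator
{
    public function __construct(private RequestStack $requestStack)
    {}

    public function __invoke(Message $message): Message
    {
        // Only the first-visit event should carry the affiliate ID.
        if ($message->event()::class !== UserVisitedForTheFirstTime::class) {
            return $message;
        }

        // Read the affiliate ID from the query string, if there is a request at all.
        $affiliateId = $this->requestStack->getMainRequest()?->query->get("affiliate_id");

        if ($affiliateId === null) {
            return $message;
        }

        // Messages are immutable, so withHeader() returns a new instance.
        return $message->withHeader(new AffiliateIdHeader($affiliateId));
    }
}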

Speaking of marketing campaigns, there is often the urge not only to take a one-off look at them but also to create recurring reports on them. Whether it's for the affiliate partner, the marketing team, or even the management team, each of these groups may require a dedicated report. And most of the time, you only know what you want after it has already happened. But with event sourcing, this is no problem at all! You can create dedicated projections for each use case and replay all events, as if this reporting feature had been there from the very beginning of the shop!
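As a rough sketch of what such a replay could look like, the following plain PHP example builds an affiliate report from events that were recorded before the report existed. Everything here is an assumption made for illustration: the `AffiliateRevenueProjection` class, the `OrderPlaced` event, and the flat array shape of the events (in a real setup the affiliate ID would come from a message header, as in the decorator above).

```php
<?php

declare(strict_types=1);

// Hypothetical event log as it might look after the shop has been running for a while.
$events = [
    ['name' => 'UserVisitedForTheFirstTime', 'userId' => 'u1', 'affiliateId' => 'partner-a'],
    ['name' => 'UserVisitedForTheFirstTime', 'userId' => 'u2', 'affiliateId' => 'partner-b'],
    ['name' => 'OrderPlaced', 'userId' => 'u1', 'totalCents' => 4990],
    ['name' => 'UserVisitedForTheFirstTime', 'userId' => 'u3', 'affiliateId' => 'partner-a'],
    ['name' => 'OrderPlaced', 'userId' => 'u3', 'totalCents' => 1500],
];

// A report that was invented after all of these events had already happened.
final class AffiliateRevenueProjection
{
    /** @var array<string, string> user ID => affiliate ID */
    private array $affiliateByUser = [];

    /** @var array<string, array{visitors: int, revenueCents: int}> */
    private array $report = [];

    public function handle(array $event): void
    {
        switch ($event['name']) {
            case 'UserVisitedForTheFirstTime':
                $this->affiliateByUser[$event['userId']] = $event['affiliateId'];
                $this->report[$event['affiliateId']] ??= ['visitors' => 0, 'revenueCents' => 0];
                $this->report[$event['affiliateId']]['visitors']++;
                break;

            case 'OrderPlaced':
                // Attribute the order to the affiliate that brought the user in.
                $affiliateId = $this->affiliateByUser[$event['userId']] ?? null;
                if ($affiliateId !== null) {
                    $this->report[$affiliateId]['revenueCents'] += $event['totalCents'];
                }
                break;
        }
    }

    /** @return array<string, array{visitors: int, revenueCents: int}> */
    public function report(): array
    {
        return $this->report;
    }
}

// Replaying means feeding every stored event, in order, into the new projection.
$projection = new AffiliateRevenueProjection();
foreach ($events as $event) {
    $projection->handle($event);
}

$report = $projection->report();
```

The projection only needs to know how to react to single events. Whether it receives them live or as a replay of the whole history makes no difference to its code.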

These reports can even be fed with real-time data. This is often not possible with a classical CRUD ORM approach due to the lack of designated tables for these reports. As a result, we would need to query nearly everything in the database to create meaningful reports, which leads to high database load and long computation times. Most of the time, this is done through a CLI command on demand or on a regular basis with a cron job. However, creating these reports often leads to huge SQL queries and a lot of code that is written purely for that one use case.

With event sourcing, there is no need to create a cron job or similar process to accumulate all the data, as we have a dedicated projection for each use case. These projections are then super fast to query, enabling on-demand, data-rich reports! This helps businesses quickly obtain high-quality information about the system and user behavior, which they can use to improve their operations. In our case, the subscription engine, running in a separate process, will manage all our projections to ensure they are up to date.
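The idea behind keeping a projection up to date can be sketched in a few lines: the read model remembers the position of the last event it has processed and only handles what came after it. This is a simplified illustration in plain PHP and not how the subscription engine is implemented; `CartAdditionsPerDay`, `catchUp`, and the event shape are hypothetical.

```php
<?php

declare(strict_types=1);

// Hypothetical read model that remembers how far it has read in the event log.
final class CartAdditionsPerDay
{
    private int $position = 0;

    /** @var array<string, int> day => number of items added to carts */
    private array $countByDay = [];

    /**
     * Processes only the events that were appended since the last call.
     *
     * @param list<array{name: string, day: string}> $eventLog
     *
     * @return int number of events processed in this run
     */
    public function catchUp(array $eventLog): int
    {
        $processed = 0;

        for ($i = $this->position; $i < count($eventLog); $i++) {
            $event = $eventLog[$i];

            if ($event['name'] === 'ItemAddedToCart') {
                $this->countByDay[$event['day']] = ($this->countByDay[$event['day']] ?? 0) + 1;
            }

            $processed++;
        }

        // Remember the new position so the next run starts where this one ended.
        $this->position = count($eventLog);

        return $processed;
    }

    /** Reading the report is a simple lookup, no aggregation query needed. */
    public function countFor(string $day): int
    {
        return $this->countByDay[$day] ?? 0;
    }
}

$log = [
    ['name' => 'ItemAddedToCart', 'day' => '2024-05-01'],
    ['name' => 'ItemAddedToCart', 'day' => '2024-05-01'],
];

$readModel = new CartAdditionsPerDay();
$firstRun = $readModel->catchUp($log);

// New events arrive; the next run handles only these two.
$log[] = ['name' => 'ItemRemovedFromCart', 'day' => '2024-05-01'];
$log[] = ['name' => 'ItemAddedToCart', 'day' => '2024-05-02'];
$secondRun = $readModel->catchUp($log);
```

Because the expensive work happens once per event when it arrives, the report itself never has to scan the whole database.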

Improved Software Architecture & Scalability

When using event sourcing, our application will leverage an event-based architecture. With this, we can easily slice our application into bounded contexts and decouple the core business logic from the framework and external services. This leads to a cleaner software project, which enables us to refactor parts of the application more easily. While this is also possible with the classical CRUD ORM approach, it isn't as straightforward as with an event-based architecture. The use of events for communication between different parts of our application allows us to split our monolithic application into smaller applications with ease.

Splitting a larger application into smaller parts brings significant improvements in scalability. I'm not a fan of assuming the project will become the next Amazon or Netflix in a week and attempting to scale the application to that level. However, with an event-driven architecture, it is significantly simpler to split the application, and having this option at hand is always beneficial, especially when it doesn't require extra resources.

With event sourcing, you can create highly efficient asynchronous processes with dedicated projections for every use case. These projections can be very specific and the resulting read model is highly specialized, even utilizing the right database for the job. This opens up a wide range of possibilities in terms of features, performance optimizations, and scalability options. For example, you might use Elasticsearch for search or MongoDB for geofencing. If you want to switch to another technology for these cases, you can easily create a new projection using the new technology.

Testing & Bug Analysis

We all know that bugs can and will happen. When they do, we need to know what happened and why. Most of the time, we only know what happened, thanks to error tracking tools like Sentry. The trickier part is often determining why the bug occurred and finding the root cause of the issue. Here, we have a significant advantage when using event sourcing. In the event store, we can see what happened before the incident disrupted our process, and we can even replay the events in our development setup to reproduce the bug. This is not easily possible with the classical approach, as mentioned above. Most data there is lost, or you may be lucky to have log entries with all the data and context available, but this is rarely the case.

The next topic is automated testing to prevent bugs from happening again. Setting up these tests is easier and more intuitive since we can fully embrace a best practice when writing tests: "Arrange, Act, Assert", also known as "Given, When, Then". We already have the events that led to the bug at hand, so we know what to arrange. We know what the user did, so we know how to act. And we know the expected output that we can assert. We are currently working on a testing utility to make this setup even easier for our users.
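Here is a minimal sketch of such a test in plain PHP, without a test framework or the testing utility mentioned above. The `Cart` aggregate, its methods, and the event names are hypothetical and only serve to show the "Given, When, Then" shape.

```php
<?php

declare(strict_types=1);

// Hypothetical aggregate that is rebuilt from past events and records new ones.
final class Cart
{
    /** @var list<string> */
    private array $items = [];

    /** @var list<array{name: string, sku: string}> */
    private array $recorded = [];

    /** Rebuilds the cart from its history without recording anything new. */
    public static function fromEvents(array $events): self
    {
        $cart = new self();
        foreach ($events as $event) {
            $cart->apply($event);
        }

        return $cart;
    }

    public function removeItem(string $sku): void
    {
        if (!in_array($sku, $this->items, true)) {
            throw new DomainException("Item $sku is not in the cart");
        }

        $event = ['name' => 'ItemRemovedFromCart', 'sku' => $sku];
        $this->apply($event);
        $this->recorded[] = $event;
    }

    /** Returns the newly recorded events and clears the internal list. */
    public function releaseEvents(): array
    {
        $events = $this->recorded;
        $this->recorded = [];

        return $events;
    }

    private function apply(array $event): void
    {
        if ($event['name'] === 'ItemAddedToCart') {
            $this->items[] = $event['sku'];
        } elseif ($event['name'] === 'ItemRemovedFromCart') {
            // Remove one occurrence of the item.
            $index = array_search($event['sku'], $this->items, true);
            if ($index !== false) {
                array_splice($this->items, $index, 1);
            }
        }
    }
}

// Given: the events that were in the event store when the bug occurred.
$cart = Cart::fromEvents([
    ['name' => 'ItemAddedToCart', 'sku' => 'sku-1'],
    ['name' => 'ItemAddedToCart', 'sku' => 'sku-2'],
]);

// When: the action the user performed.
$cart->removeItem('sku-1');

// Then: the events we expect to be recorded.
$recorded = $cart->releaseEvents();
```

The "Given" part is just a list of events, so a production incident can be turned into a regression test by copying the relevant events from the event store.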

Auditing

The last advantage we are discussing here today is auditing. This is a topic that has come up in so many projects throughout my career - it's crazy. Every time, we struggled with it, because it is extremely hard to build a robust auditing system with a traditional CRUD ORM approach. There are ways to build such systems, but they come with huge costs, both technical and financial.

With Doctrine, for example, we can use the lifecycle events to hook into the saving process before it really happens and update what should be saved. This works if you just want to add information to the entity itself, but as soon as you try to update another entity, like a dedicated log table, it can quickly get messy and slow. In fact, you should not do it, since you would need to call EntityManager::flush() in the listener, which is highly discouraged by the Doctrine maintainers.

And with event sourcing? It's free! Our event store is our auditing. Maybe we need to add some more meta information to some or all events, for example, who triggered them, but as mentioned earlier, this is easily possible with a message decorator.
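To illustrate, here is a small sketch in plain PHP of how an audit trail could be read straight from stored messages. The array shape, the header names `triggeredBy` and `recordedOn`, and the `auditTrail` function are assumptions made for this example and not part of any library.

```php
<?php

declare(strict_types=1);

// Hypothetical stored messages: the event plus the headers added by decorators.
$messages = [
    [
        'aggregateId' => 'product-1',
        'event' => 'ProductPriceChanged',
        'payload' => ['priceCents' => 1999],
        'headers' => ['triggeredBy' => 'alice', 'recordedOn' => '2024-05-01T09:00:00+00:00'],
    ],
    [
        'aggregateId' => 'product-2',
        'event' => 'ProductPriceChanged',
        'payload' => ['priceCents' => 500],
        'headers' => ['triggeredBy' => 'bob', 'recordedOn' => '2024-05-01T10:30:00+00:00'],
    ],
    [
        'aggregateId' => 'product-1',
        'event' => 'ProductArchived',
        'payload' => [],
        // No "triggeredBy" header, for example because a cron job triggered it.
        'headers' => ['recordedOn' => '2024-05-02T08:15:00+00:00'],
    ],
];

/**
 * Builds a human-readable audit trail for one aggregate.
 *
 * @return list<string>
 */
function auditTrail(array $messages, string $aggregateId): array
{
    $lines = [];

    foreach ($messages as $message) {
        if ($message['aggregateId'] !== $aggregateId) {
            continue;
        }

        $lines[] = sprintf(
            '%s %s by %s %s',
            $message['headers']['recordedOn'],
            $message['event'],
            $message['headers']['triggeredBy'] ?? 'unknown',
            json_encode($message['payload'], JSON_THROW_ON_ERROR),
        );
    }

    return $lines;
}

$trail = auditTrail($messages, 'product-1');
```

There is no separate audit table to keep in sync here. The trail is only a different way of reading the events that the application stores anyway.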

Conclusion

Most of these advantages are based on the fact that we have much more data at hand, because we store every change instead of only the current state. These days, data is king. Most of the big businesses, if not all, are thriving due to the massive data they possess. This data enables businesses to make better decisions, whether it's creating new products, understanding why customers back out and don't buy, or figuring out what they need to change to reach their goals. With event sourcing, you can provide business value in the future because you have the data that you would normally not have. In the end, this is the biggest advantage you have when using event sourcing, and it comes for free.
