We already have blog posts here on when you could
use event sourcing with PHP and how you
could combine it with Doctrine ORM, but we have not yet talked about why you
should do it in the first place. Today, we will be talking about the benefits of event sourcing!
No Data Loss
In a traditional CRUD ORM approach, you save only the current state in the database. This means that every update or
delete action results in losing data. Over the lifetime of a software project, there are tons of these update and
delete actions, which means a lot of data is lost - forever. Well, maybe not forever: if you have excellent logging,
you could try to dig the data you need out of the logs. But trust me, no one wants to do that kind of work. So
the conclusion stands: the data is lost forever.
With event sourcing, every data change is saved as an event in an append-only event store. What does that mean? It
means that we track every state change, not just the end state, and therefore we never lose any
data! As a bonus, we can easily attach meta information to these events, giving us
more context for each state change and further improving our understanding of why these changes were made.
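To make this concrete, here is a minimal sketch of such an event, assuming the patchlevel/event-sourcing library; the event name and properties are hypothetical:

```php
use Patchlevel\EventSourcing\Attribute\Event;

// Hypothetical event: instead of overwriting a row in a "cart" table,
// each change is stored as its own immutable fact in the
// append-only event store.
#[Event('shop.item_removed_from_cart')]
final readonly class ItemRemovedFromCart
{
    public function __construct(
        public string $cartId,
        public string $itemId,
    ) {
    }
}
```

A later change never touches this record; a new event is simply appended after it.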
Reporting & Analytics
Since event sourcing enables us to save events that are facts and reflect business decisions, we are building a great
data pool. With this large and rich dataset, we can provide meaningful insights and features. We can replay the saved
events for one-off analyses as well as for permanent reporting use cases.
Thanks to the events, we have a great understanding of what the user is doing in our system and when they are doing
it. Taking a shop system as an example, we could investigate how long it takes a user to put an item into the
cart. Another useful piece of information would be whether the user goes into checkout but decides at the last second to
remove an item from the cart. This information is interesting and valuable for businesses, which can use it to decide
on, for example, restructuring product landing pages or the checkout flow.
An example of how meta information could help is marketing campaigns. Be it classical discounts or the more modern
variant through affiliate partners and influencer campaigns, this extra information, such as where the user came from,
can be saved as additional meta information in the event. This can help analyze how these campaigns are performing and
how the user behaves in these situations. And the best part: we don't need an external service for that, since we have
all the data in place and the tools at hand to create our reports. For our affiliate example, we could create
a message decorator that adds the affiliate ID to the
message.
use Patchlevel\EventSourcing\Message\Message;
use Patchlevel\EventSourcing\Repository\MessageDecorator\MessageDecorator;
use Symfony\Component\HttpFoundation\RequestStack;

final readonly class AffiliateIdDecorator implements MessageDecorator
{
    public function __construct(private RequestStack $requestStack)
    {
    }

    public function __invoke(Message $message): Message
    {
        if ($message->event()::class !== UserVisitedForTheFirstTime::class) {
            return $message;
        }

        $affiliateId = $this->requestStack->getMainRequest()?->query->get('affiliate_id');

        if ($affiliateId === null) {
            return $message;
        }

        return $message->withHeader(new AffiliateIdHeader($affiliateId));
    }
}
Speaking of marketing campaigns, there is often the urge not only to temporarily review them but also to create reports
on them. Whether it's for the affiliate partner, the marketing team, or even the management team, each of these groups
may require a dedicated report. And most of the time, you only know what you want after it has already happened. But
with event sourcing, this is no problem at all! You can create dedicated projections for each use case and replay all
events, as if this reporting feature had been there from the very beginning of the shop!
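As a sketch of what such a dedicated projection could look like, again assuming patchlevel/event-sourcing (the attribute-based projector API shown here may differ between library versions, and the table, header, and event names are made up; the SQL assumes PostgreSQL):

```php
use Doctrine\DBAL\Connection;
use Patchlevel\EventSourcing\Attribute\Projector;
use Patchlevel\EventSourcing\Attribute\Setup;
use Patchlevel\EventSourcing\Attribute\Subscribe;
use Patchlevel\EventSourcing\Attribute\Teardown;
use Patchlevel\EventSourcing\Message\Message;

#[Projector('affiliate_report')]
final readonly class AffiliateReportProjector
{
    public function __construct(private Connection $connection)
    {
    }

    #[Setup]
    public function create(): void
    {
        // dedicated read model table, created when the projection boots
        $this->connection->executeStatement(
            'CREATE TABLE IF NOT EXISTS affiliate_report (affiliate_id VARCHAR(255) PRIMARY KEY, visits INT NOT NULL)',
        );
    }

    #[Teardown]
    public function drop(): void
    {
        $this->connection->executeStatement('DROP TABLE IF EXISTS affiliate_report');
    }

    #[Subscribe(UserVisitedForTheFirstTime::class)]
    public function onFirstVisit(Message $message): void
    {
        // read the header added by the decorator above; the exact
        // header-access method depends on the library version
        $affiliateId = $message->header(AffiliateIdHeader::class)->affiliateId;

        $this->connection->executeStatement(
            'INSERT INTO affiliate_report (affiliate_id, visits) VALUES (?, 1)
             ON CONFLICT (affiliate_id) DO UPDATE SET visits = affiliate_report.visits + 1',
            [$affiliateId],
        );
    }
}
```

Because the projection is fed from the event store, dropping it and replaying all events rebuilds the report from the very first visit - even if the projection was written long after those visits happened.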
These reports can even be fed with real-time data. This is often not possible with a classical CRUD ORM approach due to
the lack of designated tables for these reports. As a result, we would need to query nearly everything in the database
to create meaningful reports, which leads to high database load and long computation times. Most of the time, this is
done through a CLI command on demand or on a regular basis with a cron job. However, creating these reports often leads
to huge SQL queries and a lot of code that is written purely for that one use case.
With event sourcing, there is no need to create a cron job or similar process to accumulate all the data, as we have a
dedicated projection for each use case. These projections are then super fast to query, enabling on-demand, data-rich
reports! This helps businesses quickly obtain high-quality information about the system and user behavior, which they
can use to improve their operations. In our
case, the subscription engine, running in a separate
process, will manage all our projections to ensure they are up to date.
Improved Software Architecture & Scalability
When using event sourcing, our application will leverage an event-based architecture. With this, we can easily slice our
application into bounded contexts and decouple the core business logic from the framework and external services. This
leads to a cleaner software project, which enables us to refactor parts of the application more easily. While this is
also possible with the classical CRUD ORM approach, it isn't as straightforward as with an event-based architecture. The
use of events for communication between different parts of our application allows us to split our monolithic application
into smaller applications with ease.
Splitting a larger application into smaller parts brings significant improvements in scalability. I'm not a fan of
assuming the project will become the next Amazon or Netflix in a week and attempting to scale the application to that
level. However, with an event-driven architecture, it is significantly simpler to split the application, and having this
option at hand is always beneficial, especially when it doesn't require extra resources.
With event sourcing, you can create highly efficient asynchronous processes with dedicated projections for every use
case. These projections can be very specific and the resulting read model is highly specialized, even utilizing the
right database for the job. This opens up a wide range of possibilities in terms of features, performance optimizations,
and scalability options. For example, you might use Elasticsearch for search or MongoDB for geofencing. If you want to
switch to another technology for these cases, you can easily create a new projection using the new technology.
Testing & Bug Analysis
We all know that bugs can and will happen. When they do, we need to know what happened and why. Most of the time, we
only know what happened due to error tracking tools like Sentry. The trickier part is often determining why the bug
occurred and finding the root cause of the issue. Here, event sourcing gives us a significant advantage: we can see in
the event store what happened before the incident disrupted our process, and we can even replay the events in our
development setup to reproduce the bug. As mentioned above, this is not easily possible with the classical approach.
Most of the data there is lost - or, if you are lucky, there are log entries with all the data and context available,
but that is rarely the case.
The next topic is automatic testing to prevent bugs from happening again. Setting up these tests is easier and more
intuitive since we can fully embrace a best practice when writing tests: "Arrange, Act, Assert," or sometimes called
"Given, When, Then". We already have the needed events from the bug at hand and we know what the user did. So, we know
how to act, and we know the expected output that we can assert. We are currently working on a testing utility to make
this setup even easier for our users.
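Until then, a plain PHPUnit test can already follow that pattern. This is only a sketch: the Cart aggregate, its events, the exception, and the createFromEvents factory are all hypothetical and stand in for whatever your aggregate actually exposes to rebuild itself from a list of events:

```php
use PHPUnit\Framework\TestCase;

final class CartTest extends TestCase
{
    public function testCheckoutFailsForEmptyCart(): void
    {
        // Given: the events we found in the event store before the bug
        $cart = Cart::createFromEvents([
            new ItemAddedToCart('cart-1', 'item-42'),
            new ItemRemovedFromCart('cart-1', 'item-42'),
        ]);

        // Then: the outcome we expect instead of the bug
        $this->expectException(CartIsEmpty::class);

        // When: the user action that triggered the bug
        $cart->checkout();
    }
}
```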
Auditing
The last advantage we are discussing today is auditing. This topic has come up in so many projects
throughout my career - it's crazy. Every time, we struggled with it, because it is extremely hard to build a robust
auditing system on top of a traditional CRUD ORM approach. It can be done, but it comes with huge
costs, both technical and, of course, financial.
With Doctrine, for example, we can use lifecycle events to hook into the saving process before it actually happens and
adjust what gets saved. This works if you just want to add information to the entity itself, but as soon as you try
to update another entity, like a dedicated log table, it can quickly get messy and slow. In fact, you should not do it,
since you would need to call EntityManager::flush()
inside the listener, which is strongly discouraged by the Doctrine
maintainers.
And with event sourcing? It's free! Our event store is our auditing. Maybe we need to add some more meta information
to some or all events, for example, who triggered them, but as mentioned earlier, this is easily possible with
a message decorator.
Conclusion
Most of these advantages come down to the fact that we have much more data at hand, simply because of what we
store in the database. These days, data is king. Most of the big businesses, if not all, thrive on
the massive amounts of data they possess. This data enables businesses to make better decisions, whether it's creating new
products, understanding why customers back out and don't buy, or figuring out what they need to change to reach their
goals. With event sourcing, you can provide business value in the future
because you have data that you would normally not have. In the end, this is the biggest advantage you have when
using event sourcing, and it comes for free.