This article will look at the terminology of event-driven architecture and reactive programming, and the problems they are trying to solve. In the next article in this two-part series, we will provide a real-life example architecture, putting the concepts into practice. Sneak preview: you can find the demo application here.
Anyone who’s done some basic application architecting on the web in the last five years has heard the terms “event-driven architecture” and “reactive programming”. While the terms may sound simple, they can be ambiguous, and swim about in a soup of service terminology like Kafka, SNS, SQS, messaging queues, messaging brokers, and the like—all of which can quickly become overwhelming.
In a world without event-driven architecture, we end up with bugs everywhere! If we put all our logic handlers into monolithic functions, we end up with spaghetti code full of if/then statements that take hours to debug, iterate on, and reason about.
If we fire all our events without any kind of chaining, we end up with race conditions where everything happens at once, and we face bugs like initiating payments before the sales tax is calculated, moving the robot arm before the gantry is in position, or enabling a button click before the form is persisted to state.
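As a minimal sketch of what chaining buys us, the payment example might look like the following. The `Order` shape, the function names, and the 8% rate are assumptions made up for illustration, not part of any real payment API.

```typescript
// Hypothetical order shape, used only for this illustration.
interface Order {
  subtotalCents: number;
  taxCents?: number;
  paid?: boolean;
}

// Simulates an asynchronous tax lookup, such as a call to a tax service.
function calculateSalesTax(order: Order, rate: number): Promise<Order> {
  return new Promise((resolve) => {
    setTimeout(() => {
      resolve({ ...order, taxCents: Math.round(order.subtotalCents * rate) });
    }, 10);
  });
}

// The payment step refuses to run until the tax step has completed.
async function initiatePayment(order: Order): Promise<Order> {
  if (order.taxCents === undefined) {
    throw new Error("Cannot charge an order before sales tax is calculated");
  }
  return { ...order, paid: true };
}

// Chained: each step waits for the previous one instead of firing at once.
async function checkout(order: Order): Promise<Order> {
  const taxed = await calculateSalesTax(order, 0.08); // assumed tax rate
  return initiatePayment(taxed);
}
```

Calling `initiatePayment` directly on an untaxed order rejects, which is the "payment before sales tax" bug surfaced as an explicit error rather than a silent race.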
Event-driven architecture lets us practice good code hygiene by breaking up our handlers into small pieces of logic that can be composed together for larger, net effects. They’re easier to reason about, easier to test, and easier to iterate on.
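To make "small pieces of logic that can be composed" concrete, here is one possible sketch. The `Cart` shape, the discount rule, and the `compose` helper are all hypothetical; the point is only that each handler is small enough to test on its own.

```typescript
// A handler takes the current state and returns the next state.
type Handler<S> = (state: S) => S;

// Compose small handlers, left to right, into one larger handler.
function compose<S>(...handlers: Handler<S>[]): Handler<S> {
  return (state) => handlers.reduce((acc, handler) => handler(acc), state);
}

// Hypothetical cart state for the example.
interface Cart {
  items: number[];
  subtotal: number;
  discount: number;
  total: number;
}

const sumItems: Handler<Cart> = (cart) => ({
  ...cart,
  subtotal: cart.items.reduce((a, b) => a + b, 0),
});

// Assumed rule: orders of 100 or more get 10 off.
const applyDiscount: Handler<Cart> = (cart) => ({
  ...cart,
  discount: cart.subtotal >= 100 ? 10 : 0,
});

const computeTotal: Handler<Cart> = (cart) => ({
  ...cart,
  total: cart.subtotal - cart.discount,
});

// The net effect of a "cart changed" event is the composition of the pieces.
const onCartChanged = compose(sumItems, applyDiscount, computeTotal);
```

Each of the three handlers can be unit tested in isolation, and reordering or replacing one does not require touching the others.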
Event-driven architecture is a way of structuring an application so that it reacts to asynchronous inputs. The application responds to a user event by processing changes in data or mutating state, and those changes can themselves trigger further events. It is a simple idea, though often expressed in non-simple terms like “data streams” and “reactivity”.
Structuring an application to best utilize this “act on data change” model is not so simple. It starts with choosing eventable sources like Postgres database events, event streaming services, and message-queue services like Kafka or AWS SQS.
Modern applications embrace event-driven architectures in which a small component handles all the listening and then acts on those events, triggering behavior in parts of the application that otherwise don’t need to listen for data changes.
Gluing the services together is what event-driven architecture is all about: making sure the right components are listening to the right events and executing in a predictable manner. Exactly how that data interchange occurs is a matter of reactive programming.
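A rough sketch of that "one small listener, many reactors" idea follows. The `Dispatcher` class and the event names are assumptions for illustration; in a real system the `dispatch` call would be wired to a database trigger, a queue consumer, or a similar source.

```typescript
// Hypothetical event shapes for the example.
type AppEvent =
  | { type: "row.inserted"; table: string; id: number }
  | { type: "row.deleted"; table: string; id: number };

type Listener = (event: AppEvent) => void;

class Dispatcher {
  private listeners = new Map<AppEvent["type"], Listener[]>();

  // Other parts of the application register interest in an event type.
  on(type: AppEvent["type"], listener: Listener): void {
    const existing = this.listeners.get(type) ?? [];
    this.listeners.set(type, [...existing, listener]);
  }

  // Called by the one component that actually listens to the event source.
  dispatch(event: AppEvent): void {
    for (const listener of this.listeners.get(event.type) ?? []) {
      listener(event);
    }
  }
}

const dispatcher = new Dispatcher();
const auditLog: string[] = [];

// This handler never polls or listens for data changes itself.
dispatcher.on("row.inserted", (event) => {
  auditLog.push(`inserted ${event.table}#${event.id}`);
});

dispatcher.dispatch({ type: "row.inserted", table: "orders", id: 42 });
dispatcher.dispatch({ type: "row.deleted", table: "orders", id: 7 });
```

Only the `row.inserted` handler is registered here, so the second dispatch reaches no listener.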
Up until a few years ago, coding (or programming) to handle changing data was full of challenges. Programming for this paradigm is known as reactive programming. Solutions typically fall into one of two connection schemes with the server. In the first, the client asks the server for new data every few seconds (polling). In the second, the client holds a persistent connection (a WebSocket) with the server, over which the latest data becomes available immediately via message events.
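The polling scheme can be sketched roughly as below. `startPolling` and its parameters are hypothetical names, `fetchLatest` stands in for whatever HTTP request your service makes, and comparing JSON strings is a deliberately simplistic way to detect a change.

```typescript
// Polls a source at a fixed interval and calls onChange only when the
// value differs from the previous poll. Returns a function that stops polling.
function startPolling<T>(
  fetchLatest: () => Promise<T>, // hypothetical stand-in for an HTTP request
  onChange: (value: T) => void,
  intervalMs: number
): () => void {
  let lastSeen: string | undefined;
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const tick = async (): Promise<void> => {
    if (stopped) return;
    try {
      const value = await fetchLatest();
      const serialized = JSON.stringify(value); // naive change detection
      if (serialized !== lastSeen) {
        lastSeen = serialized;
        onChange(value);
      }
    } catch {
      // In this sketch a failed poll (or a throwing onChange) is skipped;
      // the next tick simply tries again.
    } finally {
      if (!stopped) timer = setTimeout(tick, intervalMs);
    }
  };

  timer = setTimeout(tick, 0);
  return () => {
    stopped = true;
    clearTimeout(timer);
  };
}
```

Scheduling the next tick only after the current one finishes avoids overlapping requests when the server is slower than the interval.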
Across these two schemes, three reactive patterns emerge. These patterns are push, pull, and push-pull.
Pushing involves the server sending data to the event handler, which can occur via a WebSocket message event or through a discrete handler like a webhook. Pulling is exclusive to polling, where the client repeatedly asks for changes in data.
Push-pull is a hybrid of these two. The server pushes just enough information for our service to respond by looking up (pulling) what changed. This lets us optimize expensive data operations while still getting the freshest content possible. Where WebSockets or webhook handlers are not available, this gets adapted into a “little pull, big pull” pattern: in the absence of a real-time connection, thin requests over the wire check for data changes, and a larger request follows only once the existence of new data has been confirmed.
At the time of writing (2021), most service providers have not adopted WebSockets, as they require a good deal of custom code for each individual instance. Service providers solve for the widest cross-section of users, and WebSockets are not yet a good fit for that goal. As the technology and tooling stabilize, we can expect this to improve.
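One way to picture the “little pull, big pull” variant is the sketch below. The `VersionedSource` interface and its version counter are assumptions; a real service might expose an ETag, a timestamp, or a checksum instead.

```typescript
// Hypothetical source exposing a cheap check and an expensive fetch.
interface VersionedSource<T> {
  getVersion(): Promise<number>; // "little pull": a few bytes over the wire
  getData(): Promise<T>; // "big pull": the full payload
}

// Remembers the last version we saw between calls.
interface PullState {
  lastVersion?: number;
}

// Performs the big pull only when the little pull reports a new version.
// Resolves to undefined when nothing has changed.
async function pullIfChanged<T>(
  source: VersionedSource<T>,
  state: PullState
): Promise<T | undefined> {
  const version = await source.getVersion();
  if (version === state.lastVersion) return undefined;
  state.lastVersion = version;
  return source.getData();
}
```

In a true push-pull setup, the version check would be replaced by a pushed notification (a webhook or WebSocket message), and only the final `getData` lookup would remain a pull.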
For the remainder of this article, we’ll be looking at solutions limited to the polling scheme of reactive programming.
Polling, like any active listening to the server for changes, leads to a waterfall of dependencies in your application code that are all waiting for the new data to arrive. And, in many cases, one service is polling another service that, in turn, is polling yet another service, leading to excessive network requests and extensive boilerplate to handle network failures and event timeouts.
If you add to this waterfall behavior the isolated nature of event handler code (individual functions operating on discrete inputs and outputs), it can be very difficult to debug the flow of data through an application and to handle errors predictably or gracefully.
TL;DR: this is not an entirely solved problem. WebSocket approaches can be bug-prone and difficult to scale. Polling can lead to race conditions, be expensive, and impact performance. Modern entrants like the HTTP/2 spec, where small bits of data are streamed from the server to the client, can solve some of these issues, but aren’t widely supported.
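To give a feel for the boilerplate in question, here is a minimal sketch of a timeout wrapper and a retry loop. The helper names and the fixed-delay retry policy are assumptions; production code often adds backoff, jitter, and cancellation on top of this.

```typescript
// Rejects if `promise` does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms} ms`)),
      ms
    );
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// Retries a request with a per-attempt timeout and a fixed delay between
// attempts. Rethrows the last error once all attempts are used up.
async function fetchWithRetry<T>(
  request: () => Promise<T>, // hypothetical stand-in for a network call
  attempts: number,
  timeoutMs: number,
  delayMs: number
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await withTimeout(request(), timeoutMs);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Every service in a polling chain tends to need some version of this, which is part of why the waterfall gets expensive to maintain.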
So, given the finicky nature of reactive programming, how do we move forward with building apps with event-driven architectures?
Event-driven architecture is what we want to build, and reactive programming is how we build it.
Achieving the best event-driven architecture with a maintainable, cost-effective, and performant code base requires reconciling what kind of data you are sending, how often you expect it to change, and which services you have available.
For example, if the data isn’t changing frequently, as with the roster of an NBA team, a polling architecture is a valid pattern, and the client can customize the polling interval. If the data is real time, like the scores of an NBA game, you need something that can react to real-time events and may very well need a solution that can implement WebSockets.
We’ve managed to make the case that event-driven programming is good but also difficult to architect, potentially expensive to program for, and not a fully solved problem—so why do it?
In two words, user experience.
Customers expect real-time experiences. Users today expect the smooth, uninterrupted flow of the now-ubiquitous “single-page application”, not the white flashes between page loads of yesteryear.
Think about yourself as a customer: when was the last time you refreshed a checkout page to see if the transaction went through? In fact, you’d be sure that you had lost your transaction if you tried refreshing the page (and, quite possibly, you would have).
There is also the performance perspective. If a system monitoring active tickets in a major enterprise had to be refreshed to look for new content, you’d need to hire someone whose job was to refresh the page.
When you execute correctly, you can also reduce the number of bugs present in the user flow. Keeping the browser session current gives you the most options, because user state is maintained across actions on the website, from filled-out form data to selected items. You can work with the data the user has submitted without asking them to fill out the form again or saving that data to an intermediary database.
However, if you can’t rely on the browser-session lifecycle to fetch and update your local state, you’ll need a different mechanism to hook into for updating the UI, notifying your user of changes, and performing the tasks your application is designed to perform. This is what event-driven architecture is all about.
Event-driven architecture is an important tool to add to your toolbelt as a developer. Whether you are building on the serverless stack or the low code stack, event-driven architecture leads to more robust applications that are easier to reason about, debug, and maintain.
In the next part of this mini-series, we’ll walk through building a demo application, looking at the specific features that make event-driven architecture approachable and simple to adopt.