Time Machines Should Support LINQ
Let's talk about CQRS, Time Machines, events and their value for the business.
Yesterday I had to do a quick integrity test of the events recorded in one of our projects at Lokad since its deployment.
How do you normally test this kind of thing in your project?
At Lokad, we just build a time machine, go back in time and run LINQ queries against all events.
Literally.
var machine = new TimeMachine(server, token);
var totalForecasts = machine
    .EventsFromTheBeginningOfTime()
    .OfEventType<IForecastsDowloaded>()
    .Sum(f => f.Data.Values);
Of course, you can do all sorts of more complex things: plotting, grouping, aggregating...
var accountsRegisteredByDate = machine
    .EventsFromTheBeginningOfTime()
    .OfEventType<IAccountRegisteredEvent>()
    .GroupBy(e => e.Recorded.Date)
    .Select(g => new
    {
        X = g.Key,
        Y = g.Count()
    });
As you can guess, the business value of persisting such domain events is enormous. Not only can you go back in time and investigate what happened, when and why (this is where event stream analysis comes in), you can also define new types of views and have them generated instantly.
With LINQ, writing such a query and getting the results takes almost no time, which gives you exploratory insight into your business on demand. All this brings a much better understanding of how your company and its environment change over time than data about "today" alone.
The alternative (in the classical architecture) is to spend one development iteration augmenting the solution to start capturing some events in a relational table, and then to wait until enough evidence has been captured for the report to be of any use. This takes time, one of the scarcest resources, and one that often translates directly into "competitive advantage".
However, this is not the end. The ability to travel back in time actually gives us a better chance of anticipating the future as well.
The simplest approach is to project the event stream onto a 2D graph and then use forecasting tools to continue the trends into the future. This gives us decision support for planning our actions, as well as for executing them, in this rapidly changing world. Sun Tzu would be proud.
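To make "continue the trends into the future" concrete, here is a minimal sketch that fits a straight line to a series of daily counts (such as the Y values of accountsRegisteredByDate) and extrapolates it. Ordinary least squares is only a stand-in for a real forecasting tool, and the names TrendSketch, FitLine and Extrapolate are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TrendSketch
{
    // Fits y = intercept + slope * x by ordinary least squares,
    // where x is simply the index of the day (0, 1, 2, ...).
    public static (double Slope, double Intercept) FitLine(IReadOnlyList<double> ys)
    {
        if (ys.Count < 2)
            throw new ArgumentException("Need at least two points.", nameof(ys));

        int n = ys.Count;
        double meanX = (n - 1) / 2.0;
        double meanY = ys.Average();
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++)
        {
            sxy += (i - meanX) * (ys[i] - meanY);
            sxx += (i - meanX) * (i - meanX);
        }
        double slope = sxy / sxx;
        return (slope, meanY - slope * meanX);
    }

    // Continues the fitted line stepsAhead days past the last observation.
    public static double Extrapolate(IReadOnlyList<double> ys, int stepsAhead)
    {
        var (slope, intercept) = FitLine(ys);
        return intercept + slope * (ys.Count - 1 + stepsAhead);
    }

    public static void Main()
    {
        // Hypothetical daily counts, not real data.
        var daily = new double[] { 3, 5, 7, 9, 11 };
        Console.WriteLine(Extrapolate(daily, 1));
    }
}
```

A straight line is the crudest possible model; it is here only to show the shape of the pipeline from event stream to series to projection.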

More on injecting this type of business intelligence in software.
To make things even more interesting, you can use some sort of fuzzy logic to project events into a multi-dimensional space along the timeline (including information about the visibility and effect horizon of an event) and then use these projections as optional inputs to the forecasting and analysis models. Sometimes interesting results show up here.
You can use Excel, custom models or anything else. Alternatively, you can upload subsets of such events directly to the Lokad Forecasting API, where events and their projections are handled automatically in order to improve the overall forecasts.
How is this implemented?
The implementation is rather simple and follows the usual CQRS pattern.

In a green-field CQRS application, you have domain events already. In brown-field applications, the most important step towards this kind of approach is to start actually capturing the intent and context of actions at the topmost level (usually the user interface). Afterwards you just work your way down the architecture, letting the domain start publishing events about everything that happens, as it happens.
These events can be used to build and update views, trigger notifications, integrate with other departments or enterprises, etc.
One of the event subscribers is "SaveEvents : ConsumerOf
Just make sure that you use some sort of forward-compatible serialization method here, as your event schema is going to evolve over the life cycle of the application and the company.
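As an illustration of what forward compatibility means in practice, the sketch below reads an old event with a newer contract and a new event with an older one. It uses System.Text.Json purely as a convenient stand-in (the article does not say which serializer is used), and the AccountRegisteredV1 and AccountRegisteredV2 contracts are hypothetical.

```csharp
using System;
using System.Text.Json;

// Hypothetical event contract, version 1.
public class AccountRegisteredV1
{
    public string AccountId { get; set; } = "";
}

// Hypothetical version 2 of the same contract: one field was added later.
public class AccountRegisteredV2
{
    public string AccountId { get; set; } = "";

    // The initializer supplies a value for events stored before the field existed.
    public string Referrer { get; set; } = "unknown";
}

public static class SerializationSketch
{
    // Old data, new reader: the missing field keeps its default.
    public static AccountRegisteredV2 ReadOldWithNew(string accountId)
    {
        string oldJson = JsonSerializer.Serialize(new AccountRegisteredV1 { AccountId = accountId });
        return JsonSerializer.Deserialize<AccountRegisteredV2>(oldJson);
    }

    // New data, old reader: the unknown field is skipped by default.
    public static AccountRegisteredV1 ReadNewWithOld(string accountId, string referrer)
    {
        string newJson = JsonSerializer.Serialize(
            new AccountRegisteredV2 { AccountId = accountId, Referrer = referrer });
        return JsonSerializer.Deserialize<AccountRegisteredV1>(newJson);
    }

    public static void Main()
    {
        Console.WriteLine(ReadOldWithNew("a-1").Referrer);
        Console.WriteLine(ReadNewWithOld("a-2", "newsletter").AccountId);
    }
}
```

Whatever format you pick, the property to look for is the same: adding a field must not break readers that were written before or after the change.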
You can stream domain events to multiple data centers around the world, where they get denormalized into the views, yielding really fast read performance (although for small-to-medium cloud solutions, writing to a single Azure Table Storage should be enough).
In push-style replication you send messages to all subscribers via the service bus on top of Azure Queues; in pull-style you just run a query: "give me all events that happened since the last event X". Since the events are all we need to build and update every view, replication really can be that simple (note that different consumers can do different things with events - it's up to them to decide).
Events could also, for example, be replicated to the local machine and kept in memory for faster querying and analysis. This brings us all the way back to LINQ and time machines.
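The pull-style query can be sketched with an in-memory store in which every event gets an increasing sequence number and each consumer remembers the last number it has seen. This is an illustration only: InMemoryEventStore, StoredEvent and Replica are hypothetical names, and a real store would sit behind a network call and durable storage.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class StoredEvent
{
    public long Sequence { get; }
    public string Name { get; }

    public StoredEvent(long sequence, string name)
    {
        Sequence = sequence;
        Name = name;
    }
}

public sealed class InMemoryEventStore
{
    private readonly List<StoredEvent> _events = new List<StoredEvent>();

    // Appends an event and assigns it the next sequence number (1, 2, 3, ...).
    public StoredEvent Append(string name)
    {
        var stored = new StoredEvent(_events.Count + 1, name);
        _events.Add(stored);
        return stored;
    }

    // "Give me all events that happened since the last event X."
    public IReadOnlyList<StoredEvent> EventsSince(long lastSeenSequence) =>
        _events.Where(e => e.Sequence > lastSeenSequence).ToList();
}

public sealed class Replica
{
    public long LastSeen { get; private set; }

    // The consumer decides what to do with events; this one keeps a list of names.
    public List<string> View { get; } = new List<string>();

    // Pulls everything newer than LastSeen and returns how many events arrived.
    public int PullFrom(InMemoryEventStore source)
    {
        var batch = source.EventsSince(LastSeen);
        foreach (var e in batch)
        {
            View.Add(e.Name);
            LastSeen = e.Sequence;
        }
        return batch.Count;
    }
}

public static class ReplicationSketch
{
    public static void Main()
    {
        var store = new InMemoryEventStore();
        var replica = new Replica();

        store.Append("AccountRegistered");
        store.Append("ForecastsDownloaded");
        Console.WriteLine(replica.PullFrom(store)); // first pull sees both events

        store.Append("AccountRegistered");
        Console.WriteLine(replica.PullFrom(store)); // second pull sees only the new one
    }
}
```

Because the replica tracks only a single number, it can be stopped and restarted at any time and will catch up on the next pull.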
If we attach performance statistics to the domain events, we can investigate the behavior of the system within the context of the processes that took place, and answer questions like:
- What's the average item retrieval speed from MySQL databases? How often do we encounter timeouts and deadlocks?
- How many seconds does it take to sync 100k products from SQL Azure database in the same data-center?
- What's the average upload speed to Lokad Forecasting API for datasets larger than 10k series, after that API upgrade in the last iteration?
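Questions of this kind turn into ordinary LINQ queries once each event carries its own timing. The sketch below assumes a hypothetical TimedEvent shape with Operation, Duration and TimedOut properties; the real events would expose whatever statistics you chose to attach, and the sample values are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of an event that carries performance statistics.
public sealed class TimedEvent
{
    public string Operation { get; set; } = "";
    public TimeSpan Duration { get; set; }
    public bool TimedOut { get; set; }
}

public sealed class OperationStats
{
    public string Operation { get; set; } = "";
    public double AverageMilliseconds { get; set; }
    public double TimeoutRate { get; set; }
}

public static class PerformanceSketch
{
    // Average duration and share of timeouts, per operation.
    public static List<OperationStats> Summarize(IEnumerable<TimedEvent> events) =>
        events
            .GroupBy(e => e.Operation)
            .Select(g => new OperationStats
            {
                Operation = g.Key,
                AverageMilliseconds = g.Average(e => e.Duration.TotalMilliseconds),
                TimeoutRate = g.Count(e => e.TimedOut) / (double)g.Count()
            })
            .OrderBy(s => s.Operation)
            .ToList();

    public static void Main()
    {
        // Made-up sample events, for illustration only.
        var events = new List<TimedEvent>
        {
            new TimedEvent { Operation = "mysql-read", Duration = TimeSpan.FromMilliseconds(40) },
            new TimedEvent { Operation = "mysql-read", Duration = TimeSpan.FromMilliseconds(60), TimedOut = true },
            new TimedEvent { Operation = "api-upload", Duration = TimeSpan.FromMilliseconds(900) }
        };

        foreach (var s in Summarize(events))
            Console.WriteLine($"{s.Operation}: {s.AverageMilliseconds} ms, timeout rate {s.TimeoutRate}");
    }
}
```

Filtering by a date range or a dataset size before grouping would cover the "after that API upgrade" and "larger than 10k series" variants of the questions.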
See Importance of Tooling and statistics in CQRS world.
There will probably be more articles like this as we explore CQRS, DDD and all the benefits they bring when employed in cloud computing solutions along the lines of the xLim 4 architecture. You can subscribe to this Journal to stay tuned for updates.
What do you think? Do you have any questions? I'd love to hear your thoughts about this article!
Thursday, June 3, 2010 at 14:01