DevOps and Event-Driven Design

My third week with SkuVault was 10 hours long. It involved both infrastructure concerns and software design.

First, I had to secure our new devops server:

  • encrypt the TCP traffic that carries logs and performance stats;
  • serve web dashboards via HTTPS;
  • add authentication to the web UI.

Afterwards, we started planning the evolution of the software design at SkuVault.

TLS for Heka

Fortunately, encryption is easy to set up with Heka. It works the same way on Windows and on Linux.

All Heka communications now go through TLS connections, using self-signed client and server certificates. That secures both log messages and performance stats.

HTTPS

SkuVault already had a wildcard certificate for their domain. Securing the web UI was just a matter of copying the certificate chain and keys to the devops PC, then telling Nginx to accept only HTTPS connections.

We also redirect all insecure HTTP connections to HTTPS.

Along with that, I added basic HTTP authentication to the devops server. It is good enough, since we encrypt all traffic anyway.

Event-Driven Design

During the week we started planning design improvements with Slav. We had the following requirements in mind:

  1. Better scalability to sustain business growth.
  2. Higher availability.
  3. Simplify existing code.
  4. Make all changes in small steps (no big rewrites).
  5. Leverage strong points of Windows Azure.

At the moment, SkuVault consists of multiple modules that represent bounded contexts in the domain and run as individual Azure worker roles. The modules are event-sourced, and their interchange contracts are shaped by CQRS principles. Each module has its own private event store and a set of projected views, which are publicly accessible.

CQRS at SkuVault

We can make this design more stable and decoupled by changing how modules integrate: instead of sending commands and querying view state, they would integrate via events and crafted APIs.

This shift would mean that:

  • We hide the internal state of the servers from the outside world, making it easier to evolve them and change the implementation.
  • We can replace the majority of the commands in the existing design with synchronous API calls. This would also simplify client code.
  • APIs would act as a natural Anti-Corruption Layer. Besides, we need to start introducing public APIs to the system anyway.
  • Events are less fragile than commands (if crafted properly), and they work well in pub/sub integration.

Ideally, we would introduce all these changes in small steps. The longest iteration would be 2 weeks.

One way to start this process is to introduce a small piece of software, responsible for gathering events from all the private event stores and publishing them to all interested subscribers.

This software will need to be simple, fast and highly available.
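
To make the idea more concrete, here is a minimal in-memory sketch of such a publisher in Go. It is not the SkuVault implementation, only an illustration under assumptions: the `Event` fields, the `Publisher` type, the module names and the per-module version checkpoint are all hypothetical, and versions are assumed to start at 1. A real version would read from the private event stores and deliver over the network.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical envelope for a fact recorded by one module.
type Event struct {
	Module  string // bounded context that produced the event
	Version int    // position in that module's private event store (assumed to start at 1)
	Type    string // event name, e.g. "ItemAdded" (made up for this sketch)
	Payload []byte
}

// Subscriber is a callback invoked for every published event.
type Subscriber func(Event)

// Publisher fans events out to all subscribers. It remembers the last
// version forwarded per module, so events that are read twice from a
// store are not delivered twice.
type Publisher struct {
	mu          sync.Mutex
	subscribers []Subscriber
	lastSeen    map[string]int
}

func NewPublisher() *Publisher {
	return &Publisher{lastSeen: make(map[string]int)}
}

// Subscribe registers a subscriber for all future events.
func (p *Publisher) Subscribe(s Subscriber) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.subscribers = append(p.subscribers, s)
}

// Publish forwards every event in the batch that is newer than the
// module's checkpoint and returns how many events were delivered.
// Subscribers run while the lock is held, so they must not call back
// into the Publisher.
func (p *Publisher) Publish(batch []Event) int {
	p.mu.Lock()
	defer p.mu.Unlock()
	delivered := 0
	for _, e := range batch {
		if e.Version <= p.lastSeen[e.Module] {
			continue // already forwarded earlier
		}
		for _, s := range p.subscribers {
			s(e)
		}
		p.lastSeen[e.Module] = e.Version
		delivered++
	}
	return delivered
}

func main() {
	p := NewPublisher()
	p.Subscribe(func(e Event) {
		fmt.Printf("%s #%d %s\n", e.Module, e.Version, e.Type)
	})
	p.Publish([]Event{
		{Module: "inventory", Version: 1, Type: "ItemAdded"},
		{Module: "orders", Version: 1, Type: "OrderPlaced"},
	})
}
```

The sketch keeps the checkpoint in memory, so it does nothing for availability; that part depends on which storage option backs the real thing.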

At the end of the week I had 3 possible implementation options for this piece in mind:

  1. FoundationDB + Golang event storage.
  2. Implementation on top of Azure Blob Storage (essentially, the next version of the Lokad.CQRS event store).
  3. Thin layer on top of Azure Table Storage.

Each option has its own set of trade-offs. We need to pick the one that best fits the SkuVault project. That is what next week is going to be about.