Retrospective of 2015 Q1


First quarter of 2015 is behind. We managed to achieve some impressive results at SkuVault. This will allow us to offer new features and a better product to customers in the second quarter of 2015.

Last week Catalyst 2015 took place, where SkuVault was a bronze sponsor and had an exhibition booth.

This conference was a big milestone for us: we had an opportunity to demo a completely new feature there - Interactive WavePicking.

It was hard to implement this feature in SkuVault v1 design (based on CQRS/ES architecture), so a different approach to design and development was required. We aimed at two goals simultaneously:

  • come up with a new software design, tailored for SkuVault project to use in further development and constructed to serve customers better in the upcoming years;

  • implement a new feature using that design.

Implementation of the feature was an indicator of success or failure of the new design. If development of the feature works out, then the new design is good enough to move forward with. If it doesn’t, then we fail early and save the effort of a major rewrite and a cost of a big failure.

I think, we did well. SkuVault team demonstrated WavePicking at Catalyst 2015, and soon this feature will be available to all customers. Benefits of the new design will come forth in the next quarters of 2015.

Interactive WavePicking in SkuVault

Major thanks go to:

  • Slav Ivanyuk - for managing the process and doing all super-helpful code reviews
  • Chris Witt - for picking up ReactJS/Flux and doing all the work in UIv2
  • Jason Henson - for helping with the demo setup, testing and deployment

Feature design was driven by SkuVault experts, who had to do a very complex job of identifying the feature essence and figuring out which parts would need to make it into MVP: Andy Eastes, Slav Ivanyuk and Danny Shaw.

In order to prepare path for Interactive WavePicking we had to:

  • Design and implement a new infrastructure for the next UI version and backend with APIs.
  • Provide a way to integrate v2 with v1 in various environments: production, where we have more than 20GBs of events coming at high throughput, QA with auto-deployable backend and demo with in-memory backend hosted inside IIS process.
  • Figure out an approach for gradual migration of all existing features from v1 to v2.
  • Figure out a way to improve testability, scalability in new design, while making it easier to develop and learn.

So here is what we achieved so far.

New Infrastructure

If we forget about integration with v1 code, the new infrastructure is quite simple.

Backend is a collection of .NET modules with an API (JSON over HTTP, provided by NancyFx). Each module can subscribe to events and publish events (async, batched). Modules behaviour can be captured in use cases, which are then applied to verify the correctness of the implementation.

Middleware to pass events between modules and also from v1 to v2 is MessageVault (available on github). It is a simple Kafka-inspired event bus, which uses Windows Azure to maintain a highly-available transaction log. Reads are served by Windows Azure storage (via a .NET client library), while writes are handled by a cluster of worker roles (master election is done via Azure blob locks).

SkuVault Middleware

Front-end in v2 is an absolute pleasure to work with. Previous version is a single project based on ASP.NET MVC with AngularJS and Lokad.CQRS client libraries. New version is a collection of single-page web applications (one web app per major feature), which are statically compiled into JS and CSS bundles. These apps are stateless (web server simply serves the content) and get all data directly from backend API.

Internally we chose to use ReactJS with Fluxible, since that was the stack we had arrived at during HappyPancake. This software design is backed up by the work done at Facebook and Instagram (it is always good to stand on the shoulders of giants).

Chris Witt jumped right into the development process and tackled the UI side of feature development since then. He did a very good job.

For example, the UI below allows warehouse managers to create picklists from pending sales. They can apply a dozen of filters to these sales, pick individual sales or batches, reorder sales in the picklist.

New WavePicking Session

With ReactJS, we decomposed a relatively complex UI into domain-specific reusable components; FLUX architecture pattern provided a consistent way to capture event-driven UI and client-server interactions in the code. It was easy for multiple developers to work in the resulting codebase, both to continue each other’s work and to develop UI elements in parallel.

Software Design

Feature decoupling is the most important aspect in our new design. Even though wave picking is quite complex to implement (that’s why it is rarely handled properly in warehouse management), in SkuVault this complexity is isolated from the rest of the system.

WavePicking backend API is a separate event-driven module, which can be tested, deployed and scaled independently from the rest of the backend (it can also run in-process with all the rest, for the demo, development and on-premises deployments).

Similarly, Wave-picking UI is a separate web application composed from reusable ReactJS components, Flux actions and stores. This application benefits from shared elements (e.g. styles, login/logoff, UI components, build process), but it can be developed and deployed separately.

This gives us the benefit of controlled system evolution: we can take existing features and transition them to our new design one by one, minimising the risks and avoiding the total rewrite. In parallel, we can add completely new features.

Splitting the system into modules and UI features with well-defined boundaries also simplifies team management and resource allocation. It is easier to manage development of several isolated features than to coordinate development within a single tightly coupled product. It is also simpler to scale the development process, too.

Explicit separation between the UI and backend with API was an important design decision with a long-lasting impact on existing product. First, it allows us to divide (and conquer) development into two distinct contexts with very different specifics and challenges:

  • backend and API development focuses on the core domain, scalability and making it very easy for UI folks to build various front-end features. That is pure .NET with low-level optimisations for performance and scalability.
  • UI is going to be THE primary consumer of API, but not the only one: mobile clients and partners will be using it as well. It cares more about the User Experience, feedback, rapid development iterations and pure HTML/JS/CSS development (for the Web).

Another interesting side-effect of the design is that we get cheap UI deployments and foundation for A/B testing. One can simply copy feature UI files to a new directory on a web server, getting a different version deployment. If you point different users to different versions of a feature, you get the ability to do gradual roll-outs, per-user customizations and the grounds for A/B testing.

On the backend side, development process is enhanced by event-driven use-cases introduced to specify and verify API behavior scenarios. These use-cases improve upon existing specifications at SkuVault, making them less fragile and focused more on the public contract rather than internal implementation.

SkuVault Use-cases in C#

Unlike traditional unit tests, API verification with use-cases can pinpoint the problem in case of failure. As a result, this process saves development time and lends itself to “Getting Things Done” mentality.

SkuVault use-case verification

Use-cases additionally grant us other benefits:

  • API documentation can be generated automatically. It will always stay up-to-date and its quality will be better than what libraries like ServiceStack and NancyFx can auto-generate provide out-of-the-box.
  • Use-cases align very well with the development process, making it easier to manage. Especially well it works with Domain-Driven Design, which SkuVault already employs.
  • Sensible stress-tests can be auto-generated out of the use-cases. They allow running the system through all the scenarios from the specifications, but repeated 1000 times or more. This capability does not replace custom stress-testing scripts, but it comes for free. Build server can run stress tests on each commit, watching for performance regressions and correlating them with changes in the code.

Development Process

I think, we managed to reduce development friction in v2. UI features in v2 are incrementally recompiled on-the-fly whenever a file changes (thanks to the webpack). We also leverage webpack dev server to handle hot reload (when compilation happens in memory and changes are pushed to the browser). This speeds up web development dramatically.

UIv2 features no longer require ASP.NET MVC, so it is not bound to Visual Studio (or to Microsoft Windows itself). Developers are free to choose an environment that fits their needs.

Tools like WebPack, ESLint, ES transpilers work from the command line and are supported by all modern IDEs.

Editing UIv2 in SkuVault in Emacs

We also observed that ReactJS simplifies UI development, especially for user interfaces with complex interactions, and when compared to the MV* designs (MVC, MVVM, MVP). That increases productivity and lowers development risks.

To make UI development more productive, we use LESS preprocessor for styles (also managed by the webpack), lodash for functional helpers in JS and superagent for AJAX calls.

User interactions are all captured in vanilla JavaScript at the moment (later we might enable some ES6 features, which can be transpiled down to ES5 by webpack).

JavaScript can be tricky, to make development more reliable we run a linter (ESLint) with a rather strict set of rules. It forces all code to be written consistently, avoiding code smells and bad practices. These rules are enforced by the build server.

Real-time statistics and logs are still evolving at SkuVault. Although there are a few glitches (e.g. Hekad integration is less than perfect on Windows), we are consistently improving the experience. At the moment we have more than 80GB of searchable logs handled by ElasticSearch and visualized by Kibana. Carbon and Graphite take care of capturing and reporting dozens of stats from various cloud services running on Azure (starting from RAM/CPU consumption and down to a specific web request latency).

MessageVault dashboard

Statistics and logs aren’t a mission-critical piece of infrastructure, however they are extremely helpful in understanding software behaviour under production loads. SkuVault needs to scale a lot to serve new customers better, and this toolkit provides real-time insight for that (aside from helping us to debug any potential issues).

Instrumenting the existing code to write to the distributed log or to report a new stat metric is easy. API v2 comes with these capabilities from the start, since we are planning to go for aggressive scalability targets with it.

Learning process and complexity had to be factored in the new design as well. The simpler it is for developers to understand the design and become productive in it, the simpler it will be for the company to find new talent and to grow.

Here is the list of technologies which v2 aims to discard (for good):

  • ASP.NET MVC and all web development in .NET
  • Angular.JS with jQuery
  • ServiceStack API
  • Lokad.CQRS

Instead we introduce:

  • ReactJS/Flux
  • NancyFx

So far knowledge transfer for UIv2 development had been rather smooth within the company (thanks to the talented developers of SkuVault).

Long-term Impact

It is always nice to consider possible long-term benefits that could come either cheap or for free. With the new design we potentially get:

  • Well-used and tested API, which could support various clients.
  • Ability to reuse the experience and product knowledge of our web developers to build native clients for the modern mobile platforms: iOS and Android (thanks to React Native).
  • Path for scaling out the system (API is scaled by modules and then partitioned by tenants; UI is stateless and can scale infinitely).
  • Support for on-premises deployments of SkuVault, along with geo-affinity around the world.
  • Ability to deploy system to different clouds to provide higher availability guarantees to our customers.

What's Next?

For SkuVault, scaling and stability is the primary focus for Q2 2015. We are going to take existing features one by one and migrate them to the new design, while improving test coverage and performance.

New event-driven design gains solid APIs in UI (ReactJS) and on the backend (event-driven design), which reduces coupling and fragility, leading to a simpler and smaller codebase.

In order to achieve these goals, continuous integration and build process will also have to be enhanced. We are interested in fast builds and development feedback.

Fast builds in v2 are already a part of the design: UIv2 doesn't have .NET in the pipeline at all (plus webpack provides continuous builds), while backend gets faster builds due to the smaller solution footprint.

Build process itself could get feedback from:

  • Static analysis: ESLint and Flow in JavaScript.
  • Unit-Tests: testing a single component.
  • Integration tests: use-case verification, derived sanity checks and tests for interactions between the components.
  • Automatic UI/UX verification via scripted user interactions (e.g. WebDriver)
  • Performance tests: scenarios derived from use-cases and custom ones.
  • Codebase size and complexity tracking.

Ideally, developers would get this feedback within 5-10 minutes after a commit - while they can fix issues most efficiently.

As more features migrate from v1 to v2, we will pull more developers into the new environment. The knowledge transfer facilitation has already started, it will have to be managed more explicitly in Q2.

All in all, 2015 Q2 is going to be a very interesting time at SkuVault :)

Many thanks to Slav, Andy and Ksenia for reviews.

- by .