© Rinat Abdullin

One Example of Using Message-Driven Design at Lokad

Lokad Salescast is an inventory optimisation platform for retail, capable of dealing with big datasets. It takes inventory and sales information and does some number crunching. The reports it produces tell you when to reorder your products (and how much) in order to serve forecasted demand and avoid overstocking.

One of the objectives of Salescast is to be available and affordable for small customers. Hence we introduced the "Express Plan", which is free for small customers but comes without any support.

Making software free is easy. Making software usable without support is much harder. So Lokad developers had to create complicated heuristics to help customers deal with problems. TSV parsing is one of the problematic areas.

Even though the main scenario for big data transfer at Lokad is "upload TSV-formatted text files to FTP", there are multiple things that can go wrong with this simple setup. No matter how precise the technical documentation is, people can always miss seemingly unimportant things that are critical for computers. Here are some examples:

  • text encoding of files;
  • culture-specific format of dates;
  • culture-specific format of numbers;
  • optional columns in invalid format;
  • required columns missing;
  • missing files;
  • non-standard separators.
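To make the last item concrete, here is a minimal sketch of one such heuristic: guessing the column separator from the header line. The class name `SeparatorHeuristic`, the candidate list and the tie-breaking rule are all assumptions made for illustration; this is not the actual Salescast code.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of Salescast.
public static class SeparatorHeuristic
{
    // Assumed candidates, in order of preference (tab is the documented format).
    static readonly char[] Candidates = { '\t', ';', ',', '|' };

    // Returns the candidate that occurs most often in the header line.
    // Ties go to the earlier candidate; with no candidate present, tab is returned.
    public static char GuessSeparator(string headerLine)
    {
        if (headerLine == null) throw new ArgumentNullException("headerLine");

        var best = '\t';
        var bestCount = 0;
        foreach (var candidate in Candidates)
        {
            var count = headerLine.Count(c => c == candidate);
            if (count > bestCount)
            {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }
}
```

A real heuristic would probably look at more than one line and handle quoted values, but even this toy version shows why such logic quickly accumulates special cases.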

Still, we try to provide the best out-of-the-box experience even with improperly formatted data. This requires a lot of smart TSV analysis in code. Here is what the output of one analysis process looks like (latest log entries at the top):

[Screenshot: Lokad Salescast Events]

Message-driven design patterns help to develop and maintain such logic. Its public contract in code might look like a simple function (with a complicated heuristic inside):

static IMessage[] AnalyseInput(SomeInput input) { .. }

Here, messages are strongly-typed classes that describe the results returned by that function (unlike in event sourcing, they are not used for persistence). For example:

public class UsedNonstandardExtension : ITsvFolderScanMessage
{
    public readonly string Extension;

    public UsedNonstandardExtension(string extension)
    {
        Extension = extension;
    }

    public virtual AdapterTweet ToHumanReadableTweet()
    {
        return new AdapterTweet
            {
                Severity = AdapterTweetSeverity.Hint,
                Tweet = String.Format("Salescast found Lokad TSV files using"
                  + " non-standard extension {0}.", Extension),
            };
    }
}
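To show how a function with such a contract might produce its messages, here is a simplified, self-contained sketch. Everything in it is hypothetical: the names `IScanMessage`, `NonstandardExtensionUsed`, `CompressionDetected`, `RequiredFileMissing` and `FolderScanSketch`, as well as the rules themselves (".tsv" as the standard extension, ".gzip" or ".gz" as the compression marker, "Lokad_Items" as a required file). The real heuristics are more involved.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical message contract for this sketch.
public interface IScanMessage { }

public sealed class NonstandardExtensionUsed : IScanMessage
{
    public readonly string Extension;
    public NonstandardExtensionUsed(string extension) { Extension = extension; }
}

public sealed class CompressionDetected : IScanMessage { }

public sealed class RequiredFileMissing : IScanMessage
{
    public readonly string Name;
    public RequiredFileMissing(string name) { Name = name; }
}

public static class FolderScanSketch
{
    // A pure function: file names in, strongly-typed messages out.
    public static IScanMessage[] Analyse(IEnumerable<string> fileNames)
    {
        var messages = new List<IScanMessage>();
        var compressed = false;
        var foundItems = false;

        foreach (var fileName in fileNames)
        {
            var name = fileName;
            // Assumed rule: a trailing .gzip or .gz marks a compressed file.
            if (name.EndsWith(".gzip", StringComparison.OrdinalIgnoreCase) ||
                name.EndsWith(".gz", StringComparison.OrdinalIgnoreCase))
            {
                compressed = true;
                name = Path.GetFileNameWithoutExtension(name);
            }

            // Assumed rule: anything other than .tsv is non-standard.
            var extension = Path.GetExtension(name).TrimStart('.');
            if (!extension.Equals("tsv", StringComparison.OrdinalIgnoreCase))
                messages.Add(new NonstandardExtensionUsed(extension));

            var baseName = Path.GetFileNameWithoutExtension(name);
            if (baseName.Equals("Lokad_Items", StringComparison.OrdinalIgnoreCase))
                foundItems = true;
        }

        if (compressed) messages.Add(new CompressionDetected());
        if (!foundItems) messages.Add(new RequiredFileMissing("Lokad_Items"));
        return messages.ToArray();
    }
}
```

Because the function has no side effects and returns plain data, a caller can log the messages, show them to the user, or assert on them in a test without touching the file system.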

The function returns one or more event messages. Various input scenarios can be unit-tested using the given-when-expect approach, where we express a test case as:

  • given certain inputs;
  • when we invoke function;
  • expect certain outcomes and assert them (e.g. verify that we get expected messages).

Or in code:

public sealed class given_compressed_files_in_txt_format 
    : tsv_folder_analysis_fixture
{
    public given_compressed_files_in_txt_format()
    {
        // setup all expectations in constructor, using helper methods
        // from the base class
        given_files(
            "Lokad_Items.txt.gzip",
            "Lokad_Orders.TXT.gzip"
            );
    }

    [Test]
    public void expect_detection_with_extension_warning_and_compression_hint()
    {
        // assert expectations, using helper methods from the base class
        expect(
            new TsvFolderScanMessages.UsedNonstandardExtension("TXT"),
            new TsvFolderScanMessages.CompressedFilesDetected(),
            new TsvFolderScanMessages.StorageDetectionSucceeded(
                TsvInputFile.Item("Lokad_Items.txt.gzip").WithGzip(),
                TsvInputFile.Order("Lokad_Orders.TXT.gzip").WithGzip()
                ));
    }
}

This is an example of a single test scenario. There could be many others for a single function, reflecting the complexity of the heuristics in it.

Each of these test scenarios shares the same "when" method and the same helpers to set up "given" and "expect", so these are pushed to the base fixture class, which can be as simple as:

public abstract class tsv_folder_analysis_fixture
{
    readonly List<string> _folder = new List<string>();
    ITsvFolderScanMessage[] _messages = new ITsvFolderScanMessage[0];

    protected void given_files(params string[] files)
    {
        _folder.AddRange(files);
    }

    [TestFixtureSetUp]
    public void when_run_analysis()
    {
        // this is our "When" method. It will be executed once per scenario.
        _messages = TsvFolderScan.RunTestable(_folder);
    }

    static string TweetToString(ITsvFolderScanMessage message)
    {
        var tweet = message.ToHumanReadableTweet();
        var builder = new StringBuilder();
        builder.AppendFormat("{0} {1}", tweet.Tweet, tweet.Severity);
        if (!string.IsNullOrEmpty(tweet.OptionalDetails))
        {
            builder.AppendLine().Append(tweet.OptionalDetails);
        }
        return builder.ToString();
    }

    protected void expect(params ITsvFolderScanMessage[] msg)
    {
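        // Compare by human-readable text, ignoring order. ToArray(selector) is not
        // a LINQ method; it is presumably a project-level extension equivalent
        // to Select(selector).ToArray().
        CollectionAssert.AreEquivalent(
            msg.ToArray(TweetToString),
            _messages.ToArray(TweetToString));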
    }
}

If you look closely, you'll notice a strong resemblance to specification testing for event sourcing. This is intentional. We already know that such tests based on event messages are non-fragile as long as the events are designed properly.

This additional design effort pays for itself quickly when we deal with complicated heuristics. It makes the development process incremental and iterative, without fear of breaking any existing logic. Step by step, one can walk around the world.

In essence, we jump through all the hoops of expressing behaviours via messages just to:

  • express diverse outcomes of a single function;
  • provide a simple functional contract for this function;
  • make this function easily testable in isolation;
  • ensure that tests are easily maintainable and atomic.

Downstream code (code which uses components like this one) might need to transform a bunch of event messages into a value object before further use, but that is a rather straightforward operation.
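As an illustration of that last step, here is a minimal sketch that folds a set of messages into a single value object. The types `MessageSeverity`, `SketchMessage`, `AnalysisSummary` and `AnalysisSummaryBuilder` are invented for this example. Only the `Hint` severity appears in the code above, so the `Warning` and `Error` levels are assumptions, as is the rule that any error blocks further processing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical severity levels; only Hint is taken from the example above.
public enum MessageSeverity { Hint, Warning, Error }

// Hypothetical stand-in for a strongly-typed analysis message.
public sealed class SketchMessage
{
    public readonly MessageSeverity Severity;
    public readonly string Text;

    public SketchMessage(MessageSeverity severity, string text)
    {
        Severity = severity;
        Text = text;
    }
}

// Immutable value object consumed by downstream code.
public sealed class AnalysisSummary
{
    public readonly bool CanProceed;
    public readonly string[] Problems;

    public AnalysisSummary(bool canProceed, string[] problems)
    {
        CanProceed = canProceed;
        Problems = problems;
    }
}

public static class AnalysisSummaryBuilder
{
    public static AnalysisSummary Summarise(IEnumerable<SketchMessage> messages)
    {
        var list = messages.ToList();

        // Everything above a hint is reported as a problem.
        var problems = list
            .Where(m => m.Severity != MessageSeverity.Hint)
            .Select(m => m.Text)
            .ToArray();

        // Assumed rule: a single error is enough to stop processing.
        var canProceed = list.All(m => m.Severity != MessageSeverity.Error);

        return new AnalysisSummary(canProceed, problems);
    }
}
```

The fold itself is also a pure function, so it can be tested in the same given-when-expect style as the analysis that produces the messages.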

Interested in diving deeper into Lokad development approaches? We are looking for developers in Paris and Ufa. You can also learn more by subscribing to the BeingTheWorst podcast, which explains how development is done at Lokad.

Published: August 06, 2013.
