Tuesday, October 12, 2010

Teach Visual Studio Your Own Language - Easy!

In this article we'll see how simple it is to extend Visual Studio with a custom language. This language will have a real syntax that is evaluated and transformed into C# code on the fly, saving a few lines of repetitive code!

In my case, I was able to get rid of 83 files in a project (command and event definitions) by replacing them with a single dll and a T4 file (native to Visual Studio 2010) with a DSL.

83 is just the number of message contracts I had in my CQRS theory project before getting bored with repetitive typing. Your mileage may vary.

Interested? That's what we'll talk about today (with some downloadables).

This article continues the CQRS series of the xLim 4 body of knowledge. Still, the techniques and concepts described below can be applied to other platforms and architectures where you have a lot of repetitive code to type.

Getting Started

In many message-based systems there are quite a lot of repetitive bits of code to be written: message contracts used for serialization and deserialization. If we go further down the road of Domain-Driven Design and distributed solutions, the number of these contracts will increase. With event sourcing added to the mix, it's easy to hit 100 message contracts in a simple project.

At the same time, each contract is a simple class that follows a rather boring pattern:

[ProtoContract]
public sealed class CreateProject :  ICommand
{
  [ProtoMember(1)] public readonly Guid ProjectId;
  [ProtoMember(2)] public readonly string Name;
  [ProtoMember(3)] public readonly int Rank;
  [ProtoMember(4)] public readonly SecurityRequest Request;
  private CreateProject () {}
  public CreateProject (Guid projectId, string name, int rank, SecurityRequest request)
  {
    ProjectId = projectId;
    Name = name;
    Rank = rank;
    Request = request;
  }
}

In this case we are using ProtoBuf serialization (for its superior speed, size and cross-platform capabilities). The serialization format will obviously differ from project to project: some prefer JSON, others DataContracts, and so on.

Whenever there is a command in CQRS world, events will follow:

[ProtoContract]
public sealed class ProjectCreated : IEvent
{
  [ProtoMember(1)] public readonly Guid ProjectId;
  [ProtoMember(2)] public readonly string Name;
  [ProtoMember(3)] public readonly int Rank;
  [ProtoMember(4)] public readonly SecurityDetails Security;
  private ProjectCreated () {}
  public ProjectCreated (Guid projectId, string name, int rank, SecurityDetails security)
  {
    ProjectId = projectId;
    Name = name;
    Rank = rank;
    Security = security;
  }
}

And these are just a few classes out of many. They look alike and follow the same pattern, yet are hard to express more concisely in C#.

To deal with the repetitive code, we just need to create a new language (already done) and tell Visual Studio's T4 template processing to use this custom DSL. This lets us shrink the code above to something like the following, where a trailing ? marks a command and a trailing ! marks an event, matching the two classes above:

// common fragments
let projectId = Guid ProjectId;
let name = string Name;
let security = SecurityDetails Security;
let auth = SecurityRequest Request;

// projects
CreateProject? (projectId, name, int Rank, auth)
ProjectCreated! (projectId, name, int Rank, security)
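To make the expansion concrete, here is a minimal sketch of how `let` fragments could be substituted into a contract's argument list. The FragmentSketch class and its Expand method are hypothetical names made up for this illustration; the sample project works on a parsed syntax tree rather than raw strings, so its actual code may look quite different.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only: these names are not part of the sample project.
public static class FragmentSketch
{
    // Replaces every argument that names a known fragment with the fragment's
    // definition. Inline members such as "int Rank" pass through unchanged.
    public static string[] Expand(IDictionary<string, string> fragments, params string[] arguments)
    {
        var members = new List<string>();
        foreach (var argument in arguments)
        {
            string definition;
            members.Add(fragments.TryGetValue(argument, out definition) ? definition : argument);
        }
        return members.ToArray();
    }
}
```

With the fragments defined above, expanding (projectId, name, int Rank, auth) yields the four members of CreateProject: Guid ProjectId, string Name, int Rank and SecurityRequest Request.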

There is a reference implementation I put on Google Code. It demonstrates the technique. You should be able to grab the latest download, open the Sample project in Visual Studio 2010 (2008 might work as well, although I'm not sure) and see the magic in action.

This project includes all the source code and some basic introduction on the front page.

How Does It Work?

The custom DSL syntax was created in a few steps.

Well, in reality it took more than a few steps, plus some failures with different technologies, but these are the steps that worked (and the ones that count).

First, we define our grammar in ANTLR syntax. ANTLR (ANother Tool for Language Recognition) is a Java-based framework that helps generate parsers, lexers and AST transformers in multiple languages. The grammar for our language starts like this (only a snippet is included below; see the project for the full listing):

program 
  :  declaration+
  ;

declaration
  : type_declaration
  | frag_declaration
  ;

frag_declaration
  :  'let' ID '=' member ';' -> ^(FragmentEntry ID member)
  ;

ANTLRWorks is a friendly IDE for working with ANTLR grammars. It helps to build and debug everything.

CQRS DSL with ANTLR

Once there is a grammar, we can ask ANTLR to generate parser and lexer code in C#. This code will be a complicated mess, but we don't care, since it will not even be referenced by our production code (it is only used by Visual Studio at coding time). I've included the beginning of the long lexer class below, so you can get the idea:

public class MessageContractsLexer : Lexer
{
  public const int T__28 = 28;
  public const int FRAGS = 15;
  public const int T__27 = 27;
  public const int T__26 = 26;
  public const int BlockToken = 8;
  public const int T__25 = 25;
  public const int T__24 = 24;
  public const int Modifier = 14;
  public const int T__23 = 23;
  public const int T__22 = 22;

 // a lot of code skipped for sanity and brevity.

Then I had to write a simple AST walker class that traverses the syntax tree and builds our own document model out of it. This document model is passed to a writer class that is actually responsible for generating code. An implementation for ProtoBuf is included. You can easily add your own favorite flavour of POCO and serialization by creating a class that implements the interface below (and just does a few WriteLine-s to the text writer):

public interface IGenerateCode
{
  void Generate(Context context, IndentedTextWriter writer);
}
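As a rough sketch of what such a writer could look like, the example below emits plain immutable classes without any serialization attributes. Only IGenerateCode and IndentedTextWriter (from System.CodeDom.Compiler) come from the text above. The Context, Contract and Member shapes and the WritePlainClasses name are assumptions invented for this example, so the document model in the sample project will differ.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Assumed document model: the real Context in the sample project may look different.
public sealed class Member
{
    public readonly string Type;
    public readonly string Name;
    public Member(string type, string name) { Type = type; Name = name; }
}

public sealed class Contract
{
    public readonly string Name;
    public readonly IList<Member> Members;
    public Contract(string name, params Member[] members) { Name = name; Members = members; }
}

public sealed class Context
{
    public readonly IList<Contract> Contracts = new List<Contract>();
}

// The interface as shown in the article.
public interface IGenerateCode
{
    void Generate(Context context, IndentedTextWriter writer);
}

// Hypothetical writer: one sealed class per contract, readonly fields plus a constructor.
public sealed class WritePlainClasses : IGenerateCode
{
    public void Generate(Context context, IndentedTextWriter writer)
    {
        foreach (var contract in context.Contracts)
        {
            writer.WriteLine("public sealed class {0}", contract.Name);
            writer.WriteLine("{");
            writer.Indent++;
            foreach (var m in contract.Members)
                writer.WriteLine("public readonly {0} {1};", m.Type, m.Name);

            // Constructor parameters reuse the field names in camelCase.
            var args = string.Join(", ", contract.Members.Select(m => m.Type + " " + Camel(m.Name)));
            writer.WriteLine("public {0} ({1})", contract.Name, args);
            writer.WriteLine("{");
            writer.Indent++;
            foreach (var m in contract.Members)
                writer.WriteLine("{0} = {1};", m.Name, Camel(m.Name));
            writer.Indent--;
            writer.WriteLine("}");
            writer.Indent--;
            writer.WriteLine("}");
        }
    }

    static string Camel(string name)
    {
        return char.ToLowerInvariant(name[0]) + name.Substring(1);
    }
}
```

Supporting another serializer would then mostly mean writing the matching attributes in front of the class and field lines, while the walker and the DSL stay untouched.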

A nice touch is to merge all generation classes into a single DLL that can be redistributed with the original project (already done and included in the sample).

Once we have the library, it can be used in a T4 template to generate the code on the fly. We just need to add a new TT file to our project:

Adding T4 template

Then we make sure that "CustomTool" is set to "TextTemplatingFileGenerator":

Afterwards the template is filled with some support code and the actual DSL (note how we use the MSBuild SolutionDir macro to reference the code generation library from the template):

<#@ template language="C#" #>
<#@ assembly name="$(SolutionDir)\Library\MessageContracts.dll" #>
<#@ import namespace="MessageContracts" #>
using System;
using ProtoBuf;

namespace Sample 
{
  <# var generator = new WriteForProtoBufNet(); 
  var dsl = @"
// generic contracts
SecurityDetails(Guid UserId, string UserName, string Rule)
SecurityRequest(Guid UserId)

// common fragments
let projectId = Guid ProjectId;
let name = string Name;
let security = SecurityDetails Security;
let auth = SecurityRequest Request;

// projects
CreateProject? (projectId, name, int Rank, auth)
ProjectCreated! (projectId, name, int Rank, security)

RenameProject? (projectId, name, auth)
ProjectRenamed! (projectId, name, security)

DeleteProject? (projectId, auth)
ProjectDeleted! (projectId, security)
  ";    #>
  <#= GeneratorUtil.Build(dsl, generator) #>
}

This 32 LOC file generates events and commands spanning 123 LOC and containing 6 files (real-world scenarios usually have more than 50 contract classes, so the LOC "savings" are larger). Whenever we save the file (Ctrl+S), Visual Studio regenerates the corresponding CS file.

Generation will either succeed or fail with a syntax error (shown in the Error List). If generation succeeds, the freshly generated code is immediately available to IntelliSense within ReSharper or AutoTest.NET. If you have solution-wide analysis turned on in ReSharper, it will immediately show any breaking changes that you might have introduced.

Resharper immediately picks the DSL changes

Although this is not a solution for every case of repetitive code, the approach might be useful if you have more than 100 repetitive commands and events in your CQRS project.

Please keep in mind that this project (project home) is just a sample reference demonstrating the techniques (although I'm already using it as is). You are welcome to reuse it, take it apart, and basically do whatever you want. Hopefully the project and this article will give you some ideas about simplifying repetitive code in your systems (generating event and command contract classes is just the tip of the iceberg).

This is not the last post in the series on xLim and CQRS (since there is too much to learn), so you are welcome to subscribe in order to stay tuned. Comments and feedback are even more welcome!

So what do you think? Did the sample even work out on your machine?

PS: Next article in the series is called Cloud CQRS Lifehacks From Lokad - Part 2.

Reader Comments (3)

I have experience doing almost the same thing in C++, but in C# it is much simpler.

October 12, 2010 | Unregistered Commenter Slava

It took me a while to figure out that MessageContracts.dll is simply Lokad.CodeDsl.dll renamed to MessageContracts.dll. Also the method WriteForProtoBufNet has been changed to TemplatedGenerator. I also had to change the assembly directive to these two:

<#@ assembly name="$(SolutionDir)..\lib\Lokad.CodeDsl\Lokad.CodeDsl.dll" #>
<#@ assembly name="$(SolutionDir)..\lib\Lokad.CodeDsl\Antlr3.Runtime.dll" #>

I can't see how to get the template to point to the ANTLR assembly as retrieved by NuGet so I just keep the CodeDsl, ANTLR, and ProtoBuf stuff in lib\Lokad.CodeDsl which I'll have to maintain by hand, but no big deal.

Also, had to change the import directives to:

<#@ import namespace="MessageContracts" #>
<#@ import namespace="Lokad.CodeDsl" #>

Not a big fan of having a namespace called MessageContracts inside a VS project called Lokad.CodeDsl, but can live with it. Without the second import statement, the GeneratorUtil class is not found (presumably it used to be in the MessageContracts namespace).

And with those changes in place, I can edit my .tt file and hit ^S and a lot of boilerplate code is generated for me. Very nice! This is my first real use of T4 and it won't be my last. And CQRS is proving to be fun (productive) so thanks for that!

July 13, 2012 | Unregistered Commenter Glenn Doten

Glenn, I recommend having a look at the latest DSL approach in Lokad.CQRS (using a stand-alone console running in the background as opposed to T4). I found it to be simpler and more appealing. Besides, that version of the DSL has more features.

July 13, 2012 | Registered Commenter Rinat Abdullin
