Software Design Blog

Journey of Rinat Abdullin

CouchDB in the Cloud - Persisting From .NET Code

This research article on CouchDB continues the topics of the xLim and Cloud Computing in .NET series.

In the previous article we’ve talked about deploying CouchDB servers in the cloud and locally. Now it is time to outline a simple .NET adapter for communicating with these servers.

In order to talk to CouchDB we need a few pieces:

  • JSON serializer - ISerializer;
  • Service for getting and posting REST commands - IRestClient;
  • Actual implementation of CouchDB client - CouchServer and CouchDatabase.

These pieces might fall together in an extremely simple (yet extensible) fashion:

[Figure: Simplified CouchDB wrapper for .NET]

Let’s go with the simplest implementation that will let us execute code like this:

[Test]
public void RoundTrip_Class()
{
    // connect to the database
    var database = CouchFactory.ConnectToDatabase(DBUrl);

    // create a new document object
    var user = new SampleUser
    {
        Name = "user@unknown.com",
        Tags = new[] { "User", "Can have attachments"}
    };
    // save document, specifying the identifier
    var revision = database.SaveDocument(user, u => u.Name);
    Assert.AreEqual(user.Name, revision.Id);

    // retrieve the document
    var retrievedUser = database.GetDocument<SampleUser>(user.Name);
    Assert.AreEqual(user.Name, retrievedUser.Name);

    // double check nested objects
    CollectionAssert.AreEqual(user.Tags, retrievedUser.Tags);

    // check out the attachment round-trip
    var info = database.AddAttachment(revision, "my.txt", WriteFile("my.txt"));
    Assert.AreEqual(info.Ok, true);
    database.GetAttachment(info, "my.txt", ContentsShouldBeEqual("my.txt"));
}

JSON Serializer for CouchDB

This is what JSON (JavaScript Object Notation) looks like. It is yet another human-readable and platform-independent format for exchanging serialized data over the wire:

{
     "firstName": "John",
     "lastName": "Smith",
     "address": {
         "streetAddress": "21 2nd Street",
         "city": "New York",
         "state": "NY",
         "postalCode": 10021
     },
     "phoneNumbers": [ "212 555-1234", "646 555-4567" ]
 }

There are quite a few JSON serializers for .NET available out there.

Each JSON serializer has its own advantages (the primary one being that it is already implemented) and disadvantages for the purpose of integrating with CouchDB.

If I were to implement a proper .NET wrapper for CouchDB from the start, I would definitely go with my own serializer tailored to CouchDB needs (using the JayRock architecture as a guideline but making it simpler and more strongly typed). However, in this research article we’ll stick with the DataContractJsonSerializer from Windows Communication Foundation.

This serializer has the following major disadvantages that should be kept in mind:

  • It does not handle anonymous types. This takes away all the fun of working with schema-less databases.
  • It requires classes to be decorated with DataContractAttribute and their members with DataMemberAttribute. This leaves no room for custom mapping routines.

Let me explain why attribute-based type declaration creates too much development friction for a database as flexible as CouchDB.

Just compare the two snippets below. The first one declares an object for the CouchDB API in the WCF-compatible (and C#-styled) way:

[DataContract]
public sealed class DatabaseInfo
{
    [DataMember(Name = "db_name")]
    public string Name { get; set; }
    [DataMember(Name = "doc_count")]
    public int DocCount { get; set; }
    [DataMember(Name = "doc_del_count")]
    public int DocDelCount { get; set; }
    [DataMember(Name = "update_seq")]
    public int UpdateSeq { get; set; }
    [DataMember(Name = "compact_running")]
    public bool IsCompacting { get; set; }
    [DataMember(Name = "disk_size")]
    public int DiskSize { get; set; }
}

The second snippet declares the same object in F# style (F# projects are first-class citizens in Visual Studio now):

type db_information = 
  {
    db_name : string;
    doc_count : int;
    doc_del_count : int;
    update_seq : int;
    compact_running : bool;
    disk_size : int
  }

In my opinion, the second type of declaration feels somewhat closer to the ideas behind CouchDB. More than that, the F# compiler automatically implements CompareTo, Equals and GetHashCode for these .NET objects. However, using this kind of object declaration would require either extending an existing JSON serializer or writing a new one.

Given all that, DataContractJsonSerializer from WCF is still perfect for this specific research article.

RestClient

The simplest implementation of a RestClient class is a wrapper around the WebRequest class that could look like this:

public sealed class RestClient : IRestClient
{
    public readonly Uri Url;

    public RestClient(Uri url)
    {
        Url = url;
    }

    // signatures only; the body of the first overload is shown below
    public Restponse Do(string query, string method, Action<Stream> content);
    public Restponse Do(string query, string method);
}

The two Do overloads are almost the same; the second one just does not create a stream for writing content. The first overload looks like this:

public Restponse Do(string query, string method, Action<Stream> content)
{
    try
    {
        var request = (HttpWebRequest) WebRequest.Create(new Uri(Url, query));
        request.Method = method;
        request.ContentType = "application/json";

        // let the caller write the request body
        using (var writer = request.GetRequestStream())
        {
            content(writer);
        }

        return new Restponse((HttpWebResponse) request.GetResponse());
    }
    catch (WebException e)
    {
        // no HTTP response at all (e.g. connection failure): nothing to wrap
        if (e.Response == null)
            throw;

        // error status codes surface as exceptions; wrap the response so that
        // callers can inspect StatusCode (till we get to rewrite web requests
        // to live without exceptions)
        return new Restponse((HttpWebResponse) e.Response);
    }
}

where Restponse is a simple wrapper class used to ensure that the code stays decoupled from the HttpWebResponse class and that the developer does not forget to dispose of the response:

public sealed class Restponse : IDisposable
{
    public readonly int StatusCode;
    readonly HttpWebResponse _response;

    public Restponse(HttpWebResponse response)
    {
        _response = response;
        StatusCode = (int)_response.StatusCode;
    }    

    public void RunReader(Action<Stream> read)
    {
        using (var s = _response.GetResponseStream())
        {
            read(s);
        }
    }
    public T RunReader<T>(Func<Stream,T> read)
    {
        using (var s = _response.GetResponseStream())
        {
            return read(s);
        }
    }    
    public void Dispose()
    {
        ((IDisposable)_response).Dispose();
    }
}

By the way, as mentioned in the schematics before, RestClient is a good candidate for hiding behind the IRestClient interface, which lets us squeeze a wrapper in between.

This way you’ll be able to add an exception handling and reliability layer to all requests. This layer could be implemented using IoC-injectable Action Policies that intercept communication failures (e.g. network breakdowns) and retry the last request a few times before giving up.
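
The retry idea could be sketched roughly as a small helper like the one below. This is only an illustration under assumptions: the RetryPolicy class, its parameters and the isTransient predicate are hypothetical names, not part of the code in this article or of any Action Policies library.

```csharp
using System;
using System.Threading;

// Hypothetical helper (an assumption, not the original code):
// re-runs an action when it fails with a transient exception.
public static class RetryPolicy
{
    public static T Execute<T>(Func<T> action, int maxAttempts, TimeSpan delay,
        Func<Exception, bool> isTransient)
    {
        if (action == null) throw new ArgumentNullException("action");
        if (isTransient == null) throw new ArgumentNullException("isTransient");
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception ex)
            {
                // give up when attempts are exhausted or the failure is not transient
                if (attempt >= maxAttempts || !isTransient(ex))
                    throw;
            }

            // wait a little before the next attempt
            Thread.Sleep(delay);
        }
    }
}
```

A wrapper implementing IRestClient could route every Do call through such a helper, for example by treating a WebException without a response as transient and letting everything else propagate.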

CouchServer

Now we have two simple pieces needed to talk to CouchDB. Let’s actually try doing this. Our knowledge about the server-level CouchDB API will be encapsulated in CouchServer and CouchDatabase classes. CouchServer starts like this:

public sealed class CouchServer
{
    readonly ISerializer _serializer;
    readonly IRestClient _client;    

    public CouchServer(IRestClient client, ISerializer serializer)
    {
        _serializer = serializer;
        _client = client;
    }

As we remember, ISerializer is a simple interface abstracting away DataContractJsonSerializer.
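
The article does not list the members of ISerializer, so the following is only a minimal sketch of what the interface and its WCF-based implementation might look like. The two method signatures are an assumption inferred from how _serializer is called in the snippets of this article; DataContractJsonSerializer, WriteObject and ReadObject are the real WCF API.

```csharp
using System.IO;
using System.Runtime.Serialization.Json;

// Assumed shape of the interface (inferred from the call sites in this article)
public interface ISerializer
{
    void Serialize<T>(Stream stream, T item);
    T Deserialize<T>(Stream stream);
}

// Thin wrapper that delegates to DataContractJsonSerializer from WCF
public sealed class WCFJsonSerializerWrapper : ISerializer
{
    public void Serialize<T>(Stream stream, T item)
    {
        // writes the item as JSON into the stream
        new DataContractJsonSerializer(typeof(T)).WriteObject(stream, item);
    }

    public T Deserialize<T>(Stream stream)
    {
        // reads JSON from the stream and materializes an instance of T
        return (T) new DataContractJsonSerializer(typeof(T)).ReadObject(stream);
    }
}
```

Creating a new DataContractJsonSerializer per call keeps the sketch short; a real implementation might cache one serializer instance per type.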

In an environment without Inversion of Control, CouchServer could be created with a factory method:

public static CouchServer ConnectToServer(string url)
{
  var rest = new RestClient(new Uri(url));
  return new CouchServer(rest, new WCFJsonSerializerWrapper());
}

By the way, that’s a good approach for creating developer-friendly libraries. We implement all functionality as separated and decoupled components (as in component-driven development) ready to be consumed by whatever IoC Container the developer likes. Extensibility and flexibility come in there natively.

Should the project be too simple for wiring up a proper IoC infrastructure, a developer-friendly library will have some static factory class to use. This class would serve as a logical entry point into the library and would host hard-coded creation of components (using recommended configurations and settings).

Let’s get back to our code. Here is how the CouchServer class continues in C#:

// GET /_all_dbs returns a JSON array of database names
public string[] GetDatabaseNames()
{
    using (var restponse = _client.Do("_all_dbs", "GET"))
    {
        return restponse.RunReader(r => _serializer.Deserialize<string[]>(r));
    }
}

// PUT /{name}/ creates a database; 201 means "created"
public void CreateDatabase(string name)
{
    using (var restponse = _client.Do(name + "/", "PUT"))
    {
        if (restponse.StatusCode != 201)
            throw CouchException.From(restponse);
    }
}

// GET / returns general information about the server
public ServerInfo GetInformation()
{
    using (var restponse = _client.Do("", "GET"))
    {
        if (restponse.StatusCode != 200)
            throw CouchException.From(restponse);

        return restponse.RunReader(r => _serializer.Deserialize<ServerInfo>(r));
    }
}

// DELETE /{name}/ drops a database
public void DeleteDatabase(string name)
{
    using (var restponse = _client.Do(name + "/", "DELETE"))
    {
        if (restponse.StatusCode != 200)
            throw CouchException.From(restponse);
    }
}

As you can see, we are merely executing commands according to the CouchDB REST API documentation here.

A software architect who has dealt with user input before would definitely notice one really important thing missing here: input validation (CouchDB has quite a few naming rules).

If needed, such validation could be easily added with the help of Lokad Rules. Any other framework that allows defining concise, readable and composable rules in .NET would, obviously, work as well.
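
As an illustration of the kind of check that is missing, here is a minimal sketch that validates a database name without any rules framework. The CouchNames class and its method names are hypothetical. The rule itself follows the CouchDB documentation: a database name must start with a lowercase letter and may contain only lowercase letters, digits and the characters _ $ ( ) + - /.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical validation helper (an assumption, not the original code)
public static class CouchNames
{
    // starts with a-z, followed by a-z, 0-9 or any of _ $ ( ) + - /
    // (\z anchors at the very end, so a trailing newline is rejected)
    static readonly Regex ValidDatabaseName =
        new Regex(@"^[a-z][a-z0-9_$()+/-]*\z", RegexOptions.CultureInvariant);

    public static bool IsValidDatabaseName(string name)
    {
        return !string.IsNullOrEmpty(name) && ValidDatabaseName.IsMatch(name);
    }

    public static void EnsureValidDatabaseName(string name)
    {
        if (!IsValidDatabaseName(name))
            throw new ArgumentException("Invalid CouchDB database name: " + name, "name");
    }
}
```

Methods like CreateDatabase and DeleteDatabase could call EnsureValidDatabaseName before issuing the request, so that bad input fails fast instead of producing a server-side error.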

When we need to start working with a specific database, we just use this method to create another friendly CouchDB wrapper class, CouchDatabase, passing it the existing serializer and a RestClient instance that points at the database URL:

public CouchDatabase ConnectToDatabase(string databaseName)
{
    // assumes that IRestClient exposes the base Url, as RestClient does
    var dbUrl = new Uri(_client.Url, databaseName + "/");
    var client = new RestClient(dbUrl);
    return new CouchDatabase(client, _serializer);
}

The CouchDatabase class is similar to CouchServer but encapsulates database-level routines. Here’s how it might look in C#:

public sealed class CouchDatabase
{
  readonly IRestClient _client;
  readonly ISerializer _serializer;

  public CouchDatabase(IRestClient restClient, ISerializer serializer)
  {
    _client = restClient;
    _serializer = serializer;
  }

  // PUT {id} with the JSON-serialized document as the request body
  public DocumentInfo SaveDocument<T>(T item, Func<T, string> id)
  {
    using (var r = _client.Do(id(item), "PUT",
      stream => _serializer.Serialize(stream, item)))
    {
      return r.RunReader(stream => _serializer.Deserialize<DocumentInfo>(stream));
    }
  }

  // PUT {id}/{name}?rev={revision} with the attachment contents as the body
  public DocumentInfo AddAttachment(DocumentInfo info, string name,
    Action<Stream> data)
  {
    var query = string.Format("{0}/{1}?rev={2}",
      info.Id, name, info.Revision);
    using (var r = _client.Do(query, "PUT", data))
    {
      return r.RunReader(stream => _serializer.Deserialize<DocumentInfo>(stream));
    }
  }

  // GET {id}/{name}?rev={revision}; the caller reads the attachment stream
  public void GetAttachment(DocumentInfo info, string name,
    Action<Stream> reader)
  {
    var query = string.Format("{0}/{1}?rev={2}", info.Id, name, info.Revision);
    using (var r = _client.Do(query, "GET"))
    {
      r.RunReader(reader);
    }
  }

  // GET {id} and deserialize the document into T
  public T GetDocument<T>(string id)
  {
    using (var r = _client.Do(id, "GET"))
    {
      return r.RunReader(stream => _serializer.Deserialize<T>(stream));
    }
  }

  // GET on the database URL itself returns the database information
  public DatabaseInfo GetInfo()
  {
    using (var r = _client.Do("", "GET"))
    {
      return r.RunReader(stream => _serializer.Deserialize<DatabaseInfo>(stream));
    }
  }
}

OK, this should be enough to give you an idea of how we can interact with CouchDB server instances from .NET.

By the way, an interesting part of this simple approach is that it could be reused in other REST adapters as well. For example, one could keep all the infrastructure intact but add different wrapper implementations in order to talk to some other RESTful Cloud APIs (e.g. Rackspace Cloud or Windows Azure) directly.

The source code behind the idea is, at the moment, based on a prototype for a single usage scenario. Thus, it is not ready to become any kind of publicly shared library or framework, since any functionality in a shared library should be backed up by real usage scenarios coming from at least two different projects. Otherwise we might get inefficient code and bad development practices affecting everybody using the framework.

There might be another research article on using a simple .NET layer (hosted within the Mono runtime and implemented as an IHttpHandler) to provide cost-effective and scalable server-side functionality in the cloud with the same RESTful API (suitable for a specific set of scenarios). Stay tuned for updates on the topic.
