CouchDB - Document Persistence Powered by CQRS
I've been asked for my opinion on CouchDB (MongoDB) multiple times. Here's my current analysis and opinion.
Like many .NET developers, I've tried working with document-oriented databases. In my case the introduction started with CouchDB. There was some research and development, including the NCouch project.
CouchDB is a database famous for its distributed nature, great read performance and the schema-less nature of its documents (CouchDB overview and introduction for .NET).
These nice features come right out of the box when you deploy CouchDB. And they are available to you because CouchDB partially implements the principles of Command-Query Responsibility Segregation. These benefits are inherent to CQRS:
- Free schema
- High read performance
- Replication
Indeed:
- Free schema in CouchDB - persistence ignorance in CQRS.
- Multiple view engines in CouchDB - multiple subscribed event denormalizers in CQRS.
- High read performance of CouchDB views - high read performance against denormalized query tables.
- Auto replication in CouchDB - Publish/Subscribe in CQRS.
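The denormalizer side of this analogy can be sketched in a few lines of Python. This is only an illustration, not CouchDB's view engine or any particular CQRS framework: the event shape, the `ByAuthorView` and `TagCountView` classes and the `publish` helper are all made-up names. Each subscriber receives the same published event and maintains its own query-friendly table, much as several CouchDB views index the same documents.

```python
from collections import defaultdict

# Hypothetical events: plain dicts with a "type" field and a payload.
events = [
    {"type": "DocumentCreated", "id": "doc-1", "author": "alice", "tags": ["cqrs"]},
    {"type": "DocumentCreated", "id": "doc-2", "author": "bob", "tags": ["couchdb", "cqrs"]},
    {"type": "DocumentCreated", "id": "doc-3", "author": "alice", "tags": ["couchdb"]},
]


class ByAuthorView:
    """Denormalizer that keeps a lookup table: author -> document ids."""

    def __init__(self):
        self.rows = defaultdict(list)

    def handle(self, event):
        if event["type"] == "DocumentCreated":
            self.rows[event["author"]].append(event["id"])


class TagCountView:
    """Denormalizer that keeps a running count of documents per tag."""

    def __init__(self):
        self.rows = defaultdict(int)

    def handle(self, event):
        if event["type"] == "DocumentCreated":
            for tag in event["tags"]:
                self.rows[tag] += 1


def publish(event, subscribers):
    """Deliver one event to every subscribed denormalizer."""
    for subscriber in subscribers:
        subscriber.handle(event)


by_author = ByAuthorView()
tag_counts = TagCountView()
for event in events:
    publish(event, [by_author, tag_counts])

# Reads are plain lookups against the precomputed tables.
alice_docs = by_author.rows["alice"]
cqrs_count = tag_counts.rows["cqrs"]
```

Queries never touch the events themselves; they only read the precomputed tables, which is where the read performance comes from in both cases.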
From this standpoint CouchDB is an extremely nice base for building lightweight and high-performance document-oriented web applications, since you get all the performance features packed nicely with the REST API, JSON and the ability to write view engines in multiple languages.
CouchDB might work for you even if you are not working on the web. If your solution is heavily into the domain of documents and unstructured data, then CQRS has a good chance of fitting nicely there.
However, if your domain deals with business processes, where changes could be more important than just the data, then it is better to implement a CQRS-based solution yourself. You'd get performance and scalability benefits similar to those of CouchDB.
Let's take a moment to think about the logical differences between the data (state) and changes.
State is a version of a document, a static snapshot or an immutable representation of something. A change is a step in the process that describes "how did we get from version 1 to version 2" or "what happened between the lines". As such, a change captures much more contextual information about the process than the mere difference between the two versions.
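A small Python sketch may make the distinction concrete. The order document, the `DiscountApplied` change and its fields are invented for illustration. The diff between two versions tells us only that a number moved; the change record also says why it moved and who approved it.

```python
# Two versions (states) of the same hypothetical order document.
v1 = {"order_id": 42, "status": "open", "total": 100}
v2 = {"order_id": 42, "status": "open", "total": 90}


def diff(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    keys = old.keys() | new.keys()
    return {k: (old.get(k), new.get(k)) for k in sorted(keys) if old.get(k) != new.get(k)}


# All the two states can tell us: the total went from 100 to 90.
state_delta = diff(v1, v2)

# The change itself carries the business context that the diff has lost.
change = {
    "type": "DiscountApplied",
    "order_id": 42,
    "amount": 10,
    "reason": "loyalty discount",
    "approved_by": "sales-manager",
}


def apply_change(state, change):
    """Produce the next state from the previous state and one change."""
    if change["type"] == "DiscountApplied":
        return {**state, "total": state["total"] - change["amount"]}
    return state


next_state = apply_change(v1, change)
```

Applying the change to version 1 yields version 2, but going the other way is not possible: the reason and the approver cannot be recovered from the two snapshots alone.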
In business, contextual information is king. It is so important that changes are often treated as first-class citizens. Sometimes you can meet them under the names of events, commands or workflows. We even have a persistence pattern that is based on events rather than on state - Event Sourcing.
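For readers who have not met Event Sourcing, here is a minimal sketch in Python, assuming a toy bank account domain; the event names and the `apply_event` and `replay` helpers are made up for this example. The events are the only thing persisted, and the current state is derived by replaying them in order.

```python
from functools import reduce


def apply_event(state, event):
    """Fold a single event into the state and return the new state."""
    kind = event["type"]
    if kind == "AccountOpened":
        return {"owner": event["owner"], "balance": 0}
    if kind == "MoneyDeposited":
        return {**state, "balance": state["balance"] + event["amount"]}
    if kind == "MoneyWithdrawn":
        return {**state, "balance": state["balance"] - event["amount"]}
    return state  # unknown events are ignored in this sketch


def replay(events):
    """Rebuild state from scratch by applying the events in order."""
    return reduce(apply_event, events, {})


# The persisted history: an append-only list of events.
history = [
    {"type": "AccountOpened", "owner": "alice"},
    {"type": "MoneyDeposited", "amount": 100},
    {"type": "MoneyWithdrawn", "amount": 30},
]

current = replay(history)        # state after all events
earlier = replay(history[:2])    # state as it was before the withdrawal
```

Because the history is kept, any earlier state can be rebuilt by replaying a prefix of it, which a store that keeps only the latest snapshot cannot offer.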
Given all that, my impression of CouchDB is: CouchDB is a young document-oriented database that leans towards web applications and is based on CQRS principles. The latter gives it nice performance, distribution and scalability.
Unfortunately, as a .NET developer focused on distributed enterprise applications targeting business scenarios, I don't have a lot of use cases for such an interesting tool.
This article concludes my study of CouchDB, for now. It also represents an implementation study for the CQRS research, which is a part of the xLim 4 body of knowledge.
Thursday, April 8, 2010 at 16:09
Reader Comments (6)
Rinat, what is your opinion on storing the event history of a CQRS model in a document-oriented DB, for example CouchDB?
Slava, it will complicate the solution, and I don't see benefits that would be worth it.
Do you have a specific use case in mind?
Hi Rinat,
Thanks for your explanation (via Gbuzz). Change monitoring seems extremely powerful in many use cases ("dashboarding"-type reporting, for example, might be much more natural).
Another thing that I'm sensing, as a novice weighing my options, is the degree of complexity in the document set and the types of relational changes (including metadata) that I need to be able to track.
To explain why I care, let me describe what I think I need so far.
I could do this the way I know, with an RDBMS, but would like to learn something that offers some advantages if it's appropriate.
I have a large set of completed XML forms that I would like to keep read-only but make indexable and widely accessible for read-only user analysis in ways that are yet to be determined. The metadata is encoded in these documents via an XML schema definition and its associated version. The future will include new surveys coming in that vary slightly in structure as a few questions are modified, added or deleted (say, at most one version change every six months). The existing data will not be modified at all, though, and the old schema versions will remain available to maintain the contextual data.
So for individual documents it seems like I want write once, read many times. I'd like to be able to grow the database and add XML versions. The data on how users access and summarize these things would be interesting but is totally beyond the scope of the project. So it's almost more of a digital document archive that I need, with the ability to add relevant indexes and mappings to aggregate data from several versions of a single survey. I suppose some users might also want to combine across surveys and versions of a survey, though I haven't been asked to create a system capable of that type of meta-analysis, which would require more global relationships to be described.
I thought an XML database or a document database like CouchDB might be great, rather than an RDBMS with XML, since the users hate the concept of normalizing a survey and very few relationships seem necessary for these single-sitting survey result documents; and the accessibility increase I'm hoping to get by allowing intranet web-based submissions and querying seems pretty awesome.
Any advice, critiques, or comments would be appreciated. Thanks again.
Hi John,
Yes, document-oriented databases seem to be a perfect fit for scenarios that deal with ever-changing forms or surveys that are written only once.
Basically, for each survey schema you just need a UI MVC representation (i.e. view layout and validation schemas), and the system would be able to manage lots of various surveys without too much of an increase in complexity. CouchDB views would act as an indexing and flexible querying solution here (within a version, across versions or even across types). That's what CouchDB was designed for and that's what it is good at.
Persisting the schema itself in the database would create implicit versioning and collaboration logic around it.
The scenario should work nicely as long as the data stays read-only and has no business context or workflows around it. If it does, using a document-oriented database might hurt.
Best regards,
Rinat
As you have probably already read, Ayende thinks that his RavenDB is perfect for event storage. What makes CouchDB not ideal for this type of usage? http://ayende.com/Blog/archive/2010/05/30/raven-amp-event-sourcing.aspx
RavenDB is a bit better than CouchDB due to its .NET extensibility. Yet I believe the actual concept of document databases just does not fit event storage, at least in the CQRS approach. I wouldn't use either of these for event storage.