CouchDB in the Cloud - Cheap and Flexible Persistence For .NET
This post about CouchDB continues the research topics of the xLim and Cloud Computing in .NET series.
We'll walk through one possible approach to using CouchDB from .NET applications in a cloud scenario.
What is CouchDB?
CouchDB is a database engine designed for persisting documents without a predefined schema. It exposes an HTTP REST API and communicates with JSON-serialized messages. CouchDB claims to be designed for scalability (this includes a multi-master replication model).
Simply put, you talk to CouchDB server with documents like this:
{
"Subject":"I like Plankton",
"Author":"Rusty",
"PostedDate":"2006-08-15T17:30:12-04:00",
"Tags":["plankton", "baseball", "decisions"],
"Body":"I decided today that I don't like baseball. I like plankton."
}
Documents do not have a predefined schema, so this is a valid one as well:
{
"Author":"Rusty",
"About": "Former baseball fan."
}
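To make the "HTTP REST API with JSON messages" idea concrete, here is a minimal Python sketch that builds (but does not send) the request for storing the second document. CouchDB stores a document with a PUT to /database/document-id; the database name "blog" and the id "rusty-profile" are hypothetical names chosen for illustration.

```python
import json
from urllib.request import Request


def build_put(base_url, database, doc_id, document):
    """Build an HTTP PUT request that would store `document` under `doc_id`."""
    body = json.dumps(document).encode("utf-8")
    return Request(
        url=f"{base_url}/{database}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# "blog" and "rusty-profile" are illustrative names, not part of the article.
request = build_put(
    "http://localhost:5984",
    "blog",
    "rusty-profile",
    {"Author": "Rusty", "About": "Former baseball fan."},
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would actually send it, which requires a running server and an existing database.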
NB: although documents are unstructured, they always have a unique identifier and a revision number associated with them (CouchDB features a Multi-Version Concurrency Control model).
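The role of the revision number is easier to see in a toy example. The sketch below is an in-memory Python model of the idea, not CouchDB's implementation: the class name TinyStore is hypothetical, and revisions are plain integers here, whereas CouchDB uses its own revision strings. An update is accepted only if it carries the revision currently stored.

```python
import uuid


class Conflict(Exception):
    """Raised when an update carries a stale revision."""


class TinyStore:
    """In-memory toy store illustrating id + revision checks (not CouchDB code)."""

    def __init__(self):
        self._docs = {}

    def save(self, doc):
        # Every document gets an identifier, generated if the caller gave none.
        doc_id = doc.get("_id") or uuid.uuid4().hex
        current = self._docs.get(doc_id)
        # An update must name the revision it was based on.
        if current is not None and doc.get("_rev") != current["_rev"]:
            raise Conflict(doc_id)
        next_rev = 1 if current is None else current["_rev"] + 1
        stored = dict(doc, _id=doc_id, _rev=next_rev)
        self._docs[doc_id] = stored
        return dict(stored)


store = TinyStore()
first = store.save({"Author": "Rusty"})
second = store.save(dict(first, About="Former baseball fan."))
print(first["_rev"], second["_rev"])
```

Saving `first` again after `second` has been stored raises Conflict, because `first` still carries the older revision.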
In order to run queries against all this unstructured data, CouchDB uses View Servers that build and update structured views expressed as MapReduce queries (yes, that's the scalability opportunity here). These queries play the role that SQL plays in the relational world, yet they can be written in various languages (JavaScript, Python and Ruby are supported at the moment).
CouchDB is a top-level open-source project of the Apache Software Foundation.
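As a rough illustration of what a MapReduce view computes, here is a plain Python simulation over the two sample documents from above. It does not use the real view server protocol; the function names map_tags, reduce_count and run_view are hypothetical. The map step emits a (tag, 1) pair per tag, and the reduce step sums the values for each key.

```python
from collections import defaultdict


def map_tags(doc):
    """Map step: emit (tag, 1) per tag; documents without tags emit nothing."""
    for tag in doc.get("Tags", []):
        yield tag, 1


def reduce_count(values):
    """Reduce step: collapse all values emitted for one key into a count."""
    return sum(values)


def run_view(docs, map_fn, reduce_fn):
    """Group the mapped pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


docs = [
    {"Subject": "I like Plankton", "Author": "Rusty",
     "Tags": ["plankton", "baseball", "decisions"]},
    {"Author": "Rusty", "About": "Former baseball fan."},
]
print(run_view(docs, map_tags, reduce_count))
```

Note how the second document, which has no Tags field, simply contributes nothing to the view. This is how schema-less documents and structured views coexist.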
CouchDB is written in Erlang and is capable of running on POSIX systems. Theoretically it could even run on Windows, but the experience is far from being smooth. Yet, it is extremely easy to get yourself a cheap DB instance running in the cloud (as we'll see later in the article).
Primary disadvantages of the project are:
- CouchDB is a young project and it has not gone through heavy stress usage;
- .NET adapters are not present.
As always, using this technology in a scenario where it does not fit well will turn the technology itself into one big disadvantage.
How to run CouchDB in the Cloud?
At the moment there is no public CouchDB hosting available. One future option worth mentioning is Couch.IO, which will offer 10 GB databases for 30 USD per month.
However, let's see how we can get ourselves a virtual CouchDB server at a fraction of this cost.
We'll use a Rackspace Cloud virtual machine slice in this scenario. You'll find the sequence for setting up a temporary development machine below.
Important: this configuration is not designed for production scenarios. You would at least need to adjust the firewall settings, change the ports, assign proper users and configure couchdb for auto-start.
Create new virtual cloud server:
- Properties: 256 MB - $0.015 per hour - 10 GB
- OS: Ubuntu 9.04 (Jaunty)

Wait until the confirmation email arrives (a couple of minutes) and launch Putty (or any other SSH client) using the IP address from the email. A security warning will show up; just hit "No" to accept the key temporarily. The username and password are provided in the email.
Tip: a right-click acts as "paste" in Putty. This works even when the system expects password input.
After a successful logon you should see something like:

Refresh the package lists by typing the following command and hitting enter:
apt-get update
Install everything that is required for building and running the latest CouchDB:
aptitude -y install build-essential
apt-get -y install libmozjs-dev libicu-dev libcurl4-openssl-dev erlang
Get the latest release and compile it:
wget http://mirrors.24-7-solutions.net/pub/apache/couchdb/0.9.0/apache-couchdb-0.9.0.tar.gz
tar zxvf apache-couchdb-0.9.0.tar.gz && cd apache-couchdb-0.9.0
./configure && make && sudo make install
NB: You can install CouchDB 0.8 by simply executing apt-get install couchdb, but it will lack some of the 0.9 features.
Allow the instance to listen on the public IP:
nano /usr/local/etc/couchdb/local.ini
In this window you will need to change the httpd section to look like the snippet below and save:
[httpd]
port = 5984
bind_address = 0.0.0.0
Start CouchDB by typing:
couchdb -b
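If the server does not become reachable from outside, the [httpd] edit is the first thing to double-check. The Python sketch below reads the port and bind address from INI text; it assumes local.ini parses as a standard INI file and uses an inline sample instead of the real file. The helper name httpd_settings is hypothetical.

```python
import configparser


def httpd_settings(ini_text):
    """Return (port, bind_address) from the [httpd] section of INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["httpd"]
    return section.getint("port"), section.get("bind_address")


# Inline sample mirroring the snippet above; the real file lives at
# /usr/local/etc/couchdb/local.ini on the server.
sample = """
[httpd]
port = 5984
bind_address = 0.0.0.0
"""
print(httpd_settings(sample))
```

A bind address of 0.0.0.0 means the server listens on all network interfaces, which is why the firewall warning earlier in this article matters.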
Verify in the command prompt that the instance is running locally (it should return a short JSON welcome message from CouchDB):
curl http://localhost:5984
Verify from your local machine that CouchDB is accessible and ready by opening this URL in your browser:
http://[IP of CouchDB Server here]:5984/_utils/
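The same check can be scripted. This is a minimal sketch using only the Python standard library; the helper names are hypothetical, and the sample response body reflects what a 0.9.0 server is expected to return at its root URL, so treat it as illustrative. Only check_server touches the network, and it is not called here.

```python
import json
from urllib.request import urlopen


def server_url(host, port=5984, path="/"):
    """Build the URL of a CouchDB endpoint."""
    return f"http://{host}:{port}{path}"


def parse_welcome(body):
    """Extract the greeting and version from the root response."""
    info = json.loads(body)
    return info.get("couchdb"), info.get("version")


def check_server(host, port=5984, timeout=5):
    """Fetch the root URL and parse it (needs a reachable server)."""
    with urlopen(server_url(host, port), timeout=timeout) as response:
        return parse_welcome(response.read().decode("utf-8"))


# Illustrative response body; 203.0.113.10 is a documentation-only address.
print(parse_welcome('{"couchdb":"Welcome","version":"0.9.0"}'))
print(server_url("203.0.113.10", path="/_utils/"))
```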
How to install CouchDB on Windows?
Working with a real and cheap server in the cloud might be interesting. Yet it could be impractical in certain scenarios (e.g. when working offline or testing complex deployment scenarios). For that we need to learn how to install CouchDB on a local machine.
Although it is possible to run CouchDB directly on Windows, the solution is too bulky and unreliable. We can simply use an Ubuntu virtual machine instead, sticking to the routine above.
Unfortunately, Windows 7 Virtual PC is extremely limited (its primary goal is to support legacy applications by virtualizing Windows XP). So in order to create a new VM we'll use Sun VirtualBox.
The process looks like this:
- Download the latest Ubuntu server distribution.
- Create a new VM instance for Ubuntu, named "Ubuntu".
- Set up NAT networking for this instance and create a sufficiently large HD (2 GB should be enough).

- Mount the downloaded Ubuntu ISO in the virtual DVD drive and go through the bare server installation.
Log on to the instance and install an SSH server:
sudo apt-get install openssh-server
Shut down the instance and configure virtual SSH forwarding so that we can use Putty to talk to the OS locally (we are assuming that the VM instance name is 'Ubuntu'). These commands have to be executed from the VirtualBox directory. Note that each command must be entered on a single line, with no line break between "Ubuntu" and "VBoxInternal...".
VBoxManage setextradata "Ubuntu" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/Protocol" TCP
VBoxManage setextradata "Ubuntu" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/GuestPort" 22
VBoxManage setextradata "Ubuntu" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/HostPort" 2222
This binding will redirect all calls from localhost:2222 to the virtual Ubuntu:22.
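Each forwarded port needs the same three settings (protocol, guest port, host port) under a rule name of your choice. The Python sketch below generates the command lines for any such rule; the helper name forwarding_commands is hypothetical, and it only prints the commands rather than running them.

```python
def forwarding_commands(vm_name, rule, guest_port, host_port,
                        protocol="TCP", device="pcnet"):
    """Return the three VBoxManage command lines for one port-forwarding rule."""
    base = f"VBoxInternal/Devices/{device}/0/LUN#0/Config/{rule}"
    settings = [
        ("Protocol", protocol),
        ("GuestPort", guest_port),
        ("HostPort", host_port),
    ]
    return [
        f'VBoxManage setextradata "{vm_name}" "{base}/{key}" {value}'
        for key, value in settings
    ]


# Reproduces the SSH rule: host port 2222 is forwarded to guest port 22.
for line in forwarding_commands("Ubuntu", "guestssh", 22, 2222):
    print(line)
```

Generating the lines this way avoids copy-paste mistakes when a rule name or port changes. The device segment ("pcnet") is taken from the commands in this article and depends on the virtual network adapter type.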
Configure CouchDB forwarding, so that we can talk to the database as if it were installed on the localhost:
VBoxManage setextradata "Ubuntu" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestcdb/Protocol" TCP
VBoxManage setextradata "Ubuntu" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestcdb/GuestPort" 5984
VBoxManage setextradata "Ubuntu" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestcdb/HostPort" 5984
Create a shortcut to the GUI-less interface of VirtualBox. Launching it will start our server-in-a-box:
VBoxHeadless.exe --startvm Ubuntu --vrdp=off
Connect Putty to localhost:2222 (port number we've configured for SSH forwarding) and perform the install routine from the previous section, starting from:
apt-get update
Now you should be able to talk to your CouchDB (don't forget to start it) using the following address:
http://localhost:5984/
You can open the Futon interface in the browser with:
http://localhost:5984/_utils/
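If neither address responds, it helps to find out whether the forwarded ports accept connections at all. Here is a minimal sketch with Python's standard socket module; the helper name port_is_open is hypothetical, and the ports checked are the two forwarded in this article.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# 2222 is the forwarded SSH port, 5984 the forwarded CouchDB port.
for port in (2222, 5984):
    print(port, port_is_open("localhost", port))
```

A closed port 2222 suggests the VM is not running or the forwarding was not applied. An open 2222 with a closed 5984 usually means CouchDB itself has not been started or is not listening on the expected address.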

That should be it. Now we have a local CouchDB server that can be used as a development sandbox. We could also replicate this deployment in any cloud environment that supports an Infrastructure as a Service scenario with VMs.
This database server does not have a predefined schema, is easy and cheap to deploy (once you know the drill) and has been designed with scalability in mind. All this makes CouchDB an interesting technology that might fit well into the xLim set of design principles.
In the next article we'll talk about using .NET to communicate with CouchDB in a strongly-typed manner. You can subscribe to this journal to stay tuned for any updates.
Tuesday, July 21, 2009 at 5:44
Reader Comments (7)
An excellent article, thanks for taking the time to research and write it!
great article. I have a .NET adapter that uses JSON.NET if u want one.
@Pete, thanks for the offer.
After reading through JSON.NET codebase I felt it was too bloated. I ended up using JayRock and WCFJson serializers for the research. I'll talk more about that in the next article in the series.
Small misspelling:
OS: Ubuntu 9.04 (jantu)
WBR, Oleg.
Thanks, Oleg!
Nice article, I've been curious about hosting. I have a couch DB with 2.8 million records (HHS physician data) and it is very fast; apx 10x faster than oracle.