Friday, May 21, 2010

MSMQ Azure vs. Scalable Queue with Concurrency Locks

Not all queues are created equal.

While talking about the development friction points around Azure Queues, I mentioned using some types of locks. These locks are needed to perform message de-duplication, which guarantees that a message such as "ChargeCustomerCommand" is processed once and only once.

We don't want to charge a customer twice, do we? So in order to guarantee that this never (ever) happens in our Azure application, we need to implement message de-duplication and locking.

But why don't we want to use some of these smart locking approaches with Azure Queues? Here are a few reasons.

First of all, this adds some complexity to the solution. Just take a look at the simplest implementation, which uses cloud storage (Azure Blob) to keep messages larger than 8KB, while relying on an RDBMS (SQL Azure) to guarantee message de-duplication.

[Diagram: Scalable Azure Queue with Concurrency Locks]
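
To give a feel for just the blob-offloading half of that design, here is a minimal Python sketch. It is not the implementation from the diagram: `InMemoryBlobStore`, `to_queue_message` and `from_queue_message` are hypothetical names, the blob store is a plain dictionary standing in for Azure Blob storage, payloads are assumed to be UTF-8 text, and the de-duplication part is left out entirely.

```python
import json
import uuid

MAX_QUEUE_BODY = 8 * 1024  # the 8KB queue message limit mentioned above


class InMemoryBlobStore:
    """Hypothetical stand-in for Azure Blob storage."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        name = uuid.uuid4().hex
        self._blobs[name] = data
        return name

    def get(self, name: str) -> bytes:
        return self._blobs[name]


def to_queue_message(payload: bytes, blobs: InMemoryBlobStore) -> bytes:
    """Return the bytes to enqueue: the payload inline, or a reference to a blob."""
    inline = json.dumps({"kind": "inline", "body": payload.decode("utf-8")})
    if len(inline.encode("utf-8")) <= MAX_QUEUE_BODY:
        return inline.encode("utf-8")
    # Too large for the queue: park the payload in the blob store instead.
    blob_name = blobs.put(payload)
    return json.dumps({"kind": "blob", "ref": blob_name}).encode("utf-8")


def from_queue_message(message: bytes, blobs: InMemoryBlobStore) -> bytes:
    """Resolve a queue message back into the original payload."""
    envelope = json.loads(message.decode("utf-8"))
    if envelope["kind"] == "inline":
        return envelope["body"].encode("utf-8")
    return blobs.get(envelope["ref"])
```

Even this toy version already has two storage systems to keep consistent, and nothing in it ever deletes a blob.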

The implementation could be even more complex (e.g., using the RDBMS only for the concurrency locks, while keeping the actual message state in cloud storage). Yet does your solution really need this level of complexity and potential scalability? How many transactions per second do you need to process?

Jonathan Oliver has written a lot about concurrency locks, idempotency patterns and message de-duplication in cloud environments.

But that's all complex stuff. Complex things in your solution tend to drive development costs up sharply, and you probably don't want that.

Wouldn't it be simpler just to:

[Diagram: Simple message-based solution in Windows Azure]

Second, even if we are OK with the development tax of concurrency locking against Azure Queues, this does not save us from answering the following questions:

  • How do we handle transactions and roll back on failures?
  • How exactly do we persist messages?
  • How do we handle visibility timeouts and message reprocessing in case of failure?
  • How do we roll back our message if it consists of two parts: the message header within the Azure Queue and the actual serialized message within Blob storage?
  • How do we redirect or publish large messages? Do we download and create new copies of every message, or do we try to be smart by copying only the Azure Queue messages that reference serialized data in Azure Blob?
  • How do we keep track of these serialized data chunks and clean them up when there are no references left? Do we implement our own garbage collector service for the messages?
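
To make the visibility timeout question concrete, here is a toy in-memory queue in Python. It is only a sketch of the general idea (a message that is read becomes invisible for a while and reappears unless it is deleted); `VisibilityQueue` and its methods are hypothetical names rather than the Azure SDK, and time is passed in explicitly to keep the example deterministic.

```python
import itertools


class VisibilityQueue:
    """Toy queue with a visibility timeout (hypothetical, not the Azure SDK)."""

    def __init__(self, visibility_timeout: float):
        self.visibility_timeout = visibility_timeout
        self._ids = itertools.count(1)
        self._messages = {}  # message id -> (body, time at which it is visible again)

    def put(self, body: str, now: float) -> int:
        message_id = next(self._ids)
        self._messages[message_id] = (body, now)
        return message_id

    def get(self, now: float):
        """Return (id, body) of a visible message and hide it, or None."""
        for message_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[message_id] = (body, now + self.visibility_timeout)
                return message_id, body
        return None

    def delete(self, message_id: int) -> None:
        """Acknowledge a message so that it is never delivered again."""
        self._messages.pop(message_id, None)


queue = VisibilityQueue(visibility_timeout=30.0)
queue.put("ChargeCustomerCommand", now=0.0)
first = queue.get(now=1.0)    # worker A reads the message, then crashes
hidden = queue.get(now=10.0)  # nothing visible while the timeout is running
again = queue.get(now=31.0)   # the same message is handed to worker B
queue.delete(again[0])        # worker B finishes and acknowledges it
```

If worker A had in fact finished the charge but crashed just before calling `delete`, worker B would charge the customer a second time. That gap is exactly what the de-duplication locks have to cover.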

In the middleware world such questions are normally answered by the middleware server, and not the developer. One of these established middleware servers is MSMQ. So why not let it handle the job? This will significantly simplify initial development for any message-based cloud solution, reducing complexity, development and maintenance costs.

The few cloud applications that outgrow the performance offered by an MSMQ server can move towards implementing purely scalable queues on top of Windows Azure, or refactor to avoid such bottlenecks. NServiceBus users have mentioned multi-server systems that process millions of messages per hour, so this is achievable (NSB uses MSMQ as a messaging transport).

The third reason is pure performance. The throughput of Azure Queues is not that big at the moment. People have been reporting something like 500 messages per minute in total. That's roughly 8 messages per second, and that is in the case where we don't have any blobs to read or locks to manage.

Azure Queues, as a tool, have quite a few uses as well, especially given their pricing model and implicit scalability. Yet, in my opinion, they can't replace a messaging transport with MSMQ behavior, especially in the case of xLim and CQRS architecture approaches.

Would your Windows Azure solution benefit more from the plain Azure Queue or MSMQ Azure?


Reader Comments (2)

Very interesting article. It's easy to assume that Azure Queues will behave like MSMQ - I did!

May 21, 2010 | Sean Kearon

I definitely agree that having something like "Azure MSMQ" would make things much, much easier in many respects. In the meantime, we programmers have applications to build right now. If we wait upon the good graces of Microsoft and keep hoping against hope that maybe someday they'll introduce Azure MSMQ, we won't get anything done right now.

At the same time, most of the message de-duplication code could be written into a library without much difficulty and then used over and over again. This would keep your application-specific code clean and focused on processing messages. Is it difficult or complex? No. Does it require a little bit of thinking? Yes.

So if we need message de-duplication and we happen to be using Azure Queues we've essentially got three good options:
1. Make the message truly idempotent so you can process it multiple times if necessary.
2. Write application-specific compensating code when it is discovered that a message has been handled several times.
3. Write to a central storage that supports locks, e.g. SQL Azure.

On the Azure platform, I'm going to guess that people are gravitating towards SQL Azure because it's familiar. This being the case, we can allow our worker nodes to process each message and write the results to a database table. In the event of a duplicate message, we'll get an optimistic concurrency exception after the other worker node commits, thus releasing the lock on the particular rows in question and allowing the thread on our worker node to proceed.
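
A rough Python sketch of this third option follows. It is a simplification under stated assumptions: `sqlite3` stands in for SQL Azure, the table and function names (`processed`, `charges`, `handle_charge`) are made up for illustration, and a primary-key violation on the message id plays the role of the optimistic concurrency exception described above.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database (a stand-in for SQL Azure)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE charges (customer TEXT, amount INTEGER)")
    return db


def handle_charge(db: sqlite3.Connection, message_id: str,
                  customer: str, amount: int) -> bool:
    """Apply the charge once; return False if the message was already handled."""
    try:
        with db:  # one transaction: commit on success, roll back on error
            db.execute("INSERT INTO processed (message_id) VALUES (?)",
                       (message_id,))
            db.execute("INSERT INTO charges (customer, amount) VALUES (?, ?)",
                       (customer, amount))
        return True
    except sqlite3.IntegrityError:
        # Someone already recorded this message id, so this is a duplicate.
        return False


db = open_db()
handle_charge(db, "msg-1", "alice", 100)  # first delivery: the charge is recorded
handle_charge(db, "msg-1", "alice", 100)  # duplicate delivery: rejected, no second charge
```

Because the de-duplication record and the result are written in the same transaction, either both are stored or neither is.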

As you said, this is a bit tricky when invoking a 3rd-party web service as part of the work being done by the worker. There are a few techniques around that too, but these apply for both cloud-based queues as well as traditional message queues on dedicated hardware:
(see bottom of the article: http://jonathan-oliver.blogspot.com/2010/04/idempotency-patterns.html)

May 24, 2010 | Jonathan Oliver