Software Design Blog

Journey of Rinat Abdullin

Cloud Bursting Scenarios for Small Companies

Almost every software developer dreams of starting a MicroISV company that keeps bringing in money even while they sleep. Let’s see how cloud computing can make it easier to succeed.

This article continues the Cloud Computing series.

These days it is easier than ever to start a one-man company in the IT industry. A lot of people are doing exactly that, and MicroISV and Micropreneur have become hot topics on self-development blogs.

One of the reasons is that you no longer need to own physical hardware or other resources in order to provide software and services. Almost everything can be virtualized, and hosting can simply be rented. Quite flexible and convenient, isn’t it?

As it turns out, cloud computing can add even more flexibility and market maturity to small companies at a fraction of the cost. That’s because one of the major driving forces behind the cloud computing hype is the ability to buy hardware capacity and resources on demand, without paying for them upfront.

Let’s have a look at the first business scenario: pushing CPU-intensive computation into the cloud, and the technology available to implement it today.

Note: all figures, names and numbers are purely imaginary. They are used just to show the logic behind cloud bursting and the scale of savings it can bring.

Let’s imagine a small, young, one-man company providing software as a service: for example, IdealDotNET, which specializes in custom code quality reports and optimization recommendations for .NET development projects, based on proprietary algorithms. These algorithms run against assemblies uploaded by the customers. The first 10 assemblies are analyzed for free.

We’ll put aside the financial feasibility of this endeavor and concentrate on the technological implementation. It could be as simple as this:

ISV company without cloud computing capabilities

As we all know, running complex introspection rules against an assembly is a rather CPU-intensive process, especially if those rules include extensive checks. So this system will be able to handle only a limited number of uploads per hour. Let’s assume this setup can process 30 .NET assemblies per hour on a deployment that costs 100 EUR per month (the entire system lives on one semi-dedicated server).

As long as the number of uploads does not exceed 30 per hour, this setup provides a solution that just works for its money. Here’s how the usage statistics might look:

Sample upload statistics for an ISV company without cloud computing capabilities

Everything will go just fine until IdealDotNET starts a marketing campaign or simply gets mentioned by Scott Hanselman on his blog (he once tweeted about the power to take down a small site by linking to it).

When this happens, we get into a nasty situation: the number of upload attempts per hour exceeds our capacity:

Small ISV company is not capable of handling the Slashdot effect

If the system has been designed to handle such a situation, it will not crash under the load. Instead, the excess visitors will simply be turned away with something like “We are sorry, but the upload limit has been reached; please come back later”.
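For illustration, here is a minimal sketch of such a capacity check in Python, using a one-hour sliding window. The limit of 30 comes from our example; everything else here is made up:

```python
import time
from collections import deque

HOURLY_UPLOAD_LIMIT = 30   # capacity of our single server, from above
accepted = deque()         # timestamps of uploads accepted in the last hour

def try_accept_upload():
    """Accept the upload if we are under capacity, otherwise turn it away."""
    now = time.time()
    while accepted and now - accepted[0] > 3600:
        accepted.popleft()                 # forget uploads older than an hour
    if len(accepted) >= HOURLY_UPLOAD_LIMIT:
        return False                       # "upload limit reached, come back later"
    accepted.append(now)
    return True
```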

This approach means lost profit, because some of those visitors might have liked the service and become customers. Worse, some of the existing customers may hit the limit, get discouraged and turn to the competitors.

What could we do?

One option is to rent more servers from the start, so that IdealDotNET has spare capacity before any usage spike happens. Let’s say the company gets two more servers from day one, boosting the upload limit to 100 per hour. Then the Slashdot effect would look much better for us:

Handling Slashdot effect by buying servers before it happens

That’s an easy solution, yet it comes with two problems:

  • Additional servers come at a cost (for example, an extra 200 EUR per month), and they will still sit idle most of the time. If IdealDotNET has existed for a mere 2 months, the total (simplified) cost of handling such a spike is already 400 EUR.
  • Buying exactly the right processing capacity from the start is impossible. Even with the best estimate you’ll either have excess capacity or miss some spikes.

That’s where the technological opportunities of cloud computing come to the rescue. With some tweaking, IdealDotNET’s infrastructure could be modified to use only one server most of the time, while gracefully scaling up to handle the spikes.

ISV company enhanced with the capability to burst into the cloud

We’ll talk about the implementation details later; let’s focus on the economic benefits first. With this configuration our processing limits start looking more interesting: they elastically adapt to meet the demand:

Sample usage statistics for an ISV company capable of bursting into the cloud

If IdealDotNET used Amazon EC2 Medium CPU virtual machines to handle these spikes (priced at 0.20 USD per hour each), the total cost of handling this spike would be around 40 EUR. That looks better than 400 EUR, doesn’t it? Plus we get the flexibility to scale elastically to 1000 uploads per hour (or more) as needed.
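To make that arithmetic concrete (the exact shape of the spike in the chart is imaginary, so the machine count and duration below are purely illustrative assumptions):

```python
# Purely illustrative numbers for one spike: 10 extra machines for 20 hours.
machines = 10
hours = 20
usd_per_instance_hour = 0.20              # the EC2 Medium CPU rate from above
spike_cost_usd = machines * hours * usd_per_instance_hour
print(spike_cost_usd)                     # 40.0 -- the order of magnitude above
```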

This scenario is called Cloud Bursting or Bursting into the cloud.

Note: in theory IdealDotNET could get a better deal from Mosso (their cloud machines start at 0.015 USD per hour), but it does not look like they have all the technological pieces in place to implement a cloud bursting scenario efficiently.

How hard is it to implement a system that can elastically handle as much load as needed? The answer: not extremely hard, if you shift your way of thinking a little bit towards elastic capacity.

We start by adding a Manager component to our system (it could be a separate process or simply a loop in a background thread). It continuously watches the number of uploads pending processing. Should it become obvious that our processing capacity cannot keep up with the demand, it issues requests to the cloud computing API to deploy new virtual machines (Amazon EC2 already has such a REST-based API available, while Windows Azure claims it will release one later).
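Here is a minimal sketch of such a Manager loop in Python, talking to EC2 through the boto library. The AMI id, the throughput figures and the queue URLs are made-up placeholders, and pending_uploads() is left as a stub:

```python
import time
import boto  # Python wrapper around the EC2 REST API mentioned above

WORKER_AMI = 'ami-00000000'   # hypothetical id of a preconfigured worker image
PER_WORKER_RATE = 10          # assumed uploads one cloud worker clears per hour
LOCAL_LIMIT = 30              # what the in-house server handles on its own

conn = boto.connect_ec2()     # reads AWS credentials from the environment
cloud_workers = []            # ids of the instances this Manager has started

def pending_uploads():
    """Stub: report how many uploads are waiting in the task queue."""
    raise NotImplementedError

while True:
    overflow = max(0, pending_uploads() - LOCAL_LIMIT)
    needed = (overflow + PER_WORKER_RATE - 1) // PER_WORKER_RATE
    while len(cloud_workers) < needed:
        # Boot one more preconfigured worker; the queue endpoints travel
        # along in the user data (more on that below).
        reservation = conn.run_instances(
            WORKER_AMI,
            instance_type='c1.medium',
            user_data='tasks=https://api.idealdotnet.example/tasks\n'
                      'results=https://api.idealdotnet.example/results')
        cloud_workers.append(reservation.instances[0].id)
    time.sleep(60)            # re-check the backlog every minute
```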

Each virtual machine usually comes preconfigured with some worker role. This preconfiguration can be done by:

  • deploying an application or scripts to the cloud (if the cloud service is a Platform as a Service provider like Google App Engine or Windows Azure);
  • uploading a preconfigured VM image (if the cloud is an Infrastructure as a Service provider like Amazon EC2).

When such a worker boots up, it only needs the addresses of the task and result queues in order to start processing jobs. These parameters are usually passed as arguments along with the deployment calls.
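On EC2, for example, a freshly booted worker can read those arguments back from the instance metadata service. A sketch (the key=value layout matches the Manager sketch above and is purely our own convention, not anything EC2 mandates):

```python
import urllib.request

# An EC2 instance can fetch the user data it was launched with from the
# well-known metadata address.
raw = urllib.request.urlopen(
    'http://169.254.169.254/latest/user-data').read().decode()
params = dict(line.split('=', 1) for line in raw.splitlines() if '=' in line)

task_queue_url = params['tasks']      # where to pull assemblies from
result_queue_url = params['results']  # where to push finished reports
```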

In order to let the cloud workers access these queues securely, we need to expose them via some encrypted service API, implemented in whichever communication framework you prefer.
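One possible flavor, assuming a shared secret baked into the worker image: call the queues over HTTPS and sign every request with an HMAC. The endpoint, header names and secret below are invented for illustration:

```python
import hashlib
import hmac
import time
import urllib.request

SHARED_SECRET = b'per-deployment-secret'   # baked into the worker VM image

def signed_request(url, body=b''):
    """Call a queue endpoint over HTTPS, attaching an HMAC that proves
    the caller holds our shared secret."""
    timestamp = str(int(time.time()))
    signature = hmac.new(SHARED_SECRET,
                         timestamp.encode() + body,
                         hashlib.sha256).hexdigest()
    request = urllib.request.Request(url, data=body or None)
    request.add_header('X-Timestamp', timestamp)  # lets the server reject replays
    request.add_header('X-Signature', signature)
    return urllib.request.urlopen(request)

# e.g. signed_request('https://api.idealdotnet.example/tasks/next')
```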

When the Manager detects that we no longer need some of that processing power from the cloud, it gracefully shuts the surplus workers down, saving energy for society and money for IdealDotNET.
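In the Manager sketch above this is just the mirror branch of the scale-up loop (it would sit right before the sleep):

```python
# When demand drops, release the surplus instances so we stop
# paying for idle capacity.
if len(cloud_workers) > needed:
    conn.terminate_instances(cloud_workers[needed:])
    del cloud_workers[needed:]
```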

Note: obviously, there are a few more technological challenges to deal with, such as reliable cloud management, remote exception handling and monitoring. However, these problems have no black holes in their domain space (at least from my perspective) and thus are almost guaranteed to resolve to a finite, well-defined amount of work that can be planned into project management schedules. That work can be further reduced by applying established frameworks and development principles.

Given all these advantages of cloud computing, why wouldn’t IdealDotNET move everything to the cloud? There are several reasons:

  • Cloud computing solutions are generally more expensive for long-term, steady consumption (compared to the existing offerings of hosters).
  • It would be a more complex solution to implement (from the delivery and maintenance standpoints).
  • It comes with a certain lock-in cost. Once your entire infrastructure lives in a single cloud, it is not that easy to move it to a different one (especially since we don’t have a well-established cloud computing market yet). It is safer to keep primary resources closer to you, while using the cloud only to handle the spikes.

In fact, to reduce the dependency on a single cloud provider, IdealDotNET could use several cloud computing providers, picking whichever suits the situation best.

Doesn’t this example of cloud bursting create an itch to use such a great resource to solve some business problems? There are more examples to follow in the Cloud Computing series.

You can subscribe to the updates if you are interested.

Related links: