Monthly Archives: November 2010

Cloud Core Principles – Elasticity is NOT #Cloud Computing Response

Ok, I know that this is dangerous.  Randy is a very smart guy and he has a lot more experience on the public cloud side than I probably ever will.  But I do feel compelled to respond to his recent “Elasticity is NOT #Cloud Computing … Just Ask Google” post.

On many of the key points – such as elasticity being a side-effect of how Amazon and Google built their infrastructure – I totally agree.  We have defined cloud computing in our business in a similar way to how most patients define their conditions – by the symptoms (runny nose, fever, headache) and not the underlying causes (caught the flu because I didn’t get the vaccine…). Sure, the result of the infrastructure that Amazon built is that it is elastic, can be automatically provisioned by users, scales out, etc.  But the reasons they have this type of infrastructure are based on their underlying drivers – the need to scale massively, at a very low cost, while achieving high performance.

Here is the diagram from Randy’s post.  I put it here so I can discuss it, and then provide my own take below.

My big challenge with this is how Randy characterizes the middle tier.  Sure, Amazon and Google needed unprecedented scale, efficiency and speed to do what they have done.  How they achieved this is through the tactics, tools and methods exposed in the middle tier.  The cause and the results are the same – scale because I need to.  Efficient because it has to be.  These are the requirements.  The middle layer here is not the results – it is the method chosen to achieve them.  You could successfully argue that achieving their level of scale with different contents in the grey boxes would not be possible – and I would not disagree.  Few need to scale to 10,000+ servers per admin today.

However, I believe that what makes an infrastructure a “cloud” is far more about the top and bottom layers than about the middle.  The middle, especially the first row above, impacts the characteristics of the cloud – not its definition.  Different types of automation and infrastructure will change the cost model (negatively impacting efficiency).  I can achieve an environment that is fully automated from bare metal up, uses classic enterprise tools (BMC) on branded (IBM) heterogeneous infrastructure (within reason), and is built with the underlying constraints of assumed failure, distribution, self-service and some level of over-built environment.  And this 2nd grey row is the key – without these core principles I agree that what you might have is a fairly uninteresting model of automated VM provisioning.  Too often, as Randy points out, this is the case.  But if you do build to these row 2 principles…?

Below I have switched the middle tier around to put the core principles as the hands that guide the methods and tools used to achieve the intended outcome (and the side effects).

The core difference between Amazon and an enterprise IaaS private cloud is now the grey “methods/tools” row.  Again, I might use a very different set of tools here than Amazon (e.g. BMC, et al).  This enterprise private cloud model may not be as cost-efficient as Amazon’s, or as scalable as Google’s, but it can still be a cloud if it meets the requirements, core principles and side effects components.  In addition, the enterprise methods/tools have other constraints that Amazon and Google don’t have at such a high priority – internal governance and risk issues, the fact that I might have regulated data, or perhaps that I already have a very large investment in the processes, tools and infrastructure needed to run my systems.

Whatever my concerns as an enterprise, the fact that I chose a different road to reach a similar (though perhaps less lofty) destination does not mean I have not achieved an environment that can rightly be called a cloud.  Randy’s approach of dev/ops and homogeneous commodity hardware might be more efficient at scale, but it simply does not follow that an “internal infrastructure cloud” is, by default, not a cloud.


If You’ve Never Used A Cloud, Can You Call Yourself An Expert?

A recurring challenge I have with a lot of enterprise vendor “cloud” solutions I get briefed on is that they seem to be designed and built without any real understanding of how and why customers are actually using the cloud today.  I suspect in most cases that this results from the fact that the people building these solutions have NEVER EVER used Amazon, Rackspace, or any other mainstream public cloud offering.

Chris Hoff points out his suspicion of this scenario in his frank assessment of the recently released FedRAMP documentation.

I’m unclear if the folks responsible for some of this document have ever used cloud based services, frankly.

When you gather together a group of product managers, architects, developers and self-styled strategists who have never used a public cloud, and ask them to design a cloud solution, more often than not their offering will not be a cloud solution (or any other kind of solution that customers want).  It’s not that these people are lacking in intelligence.  Rather, they lack the context provided through experience.  Oh, and many large enterprise vendors suck at the very basics of the “customer development process.”  So not only will their solution not be cloudy, it will be released to the market without them knowing this basic piece of information.

SQL In the Cloud

Despite the NoSQL hype, traditional relational databases are not going away any time soon. In fact, based on continued market evolution and development, SQL is very much alive and doing well.

I won’t debate the technical merits of SQL vs. NoSQL here, even if I were qualified to do so. Both approaches have their supporters, and both types of technologies can be used to build scalable applications. The simple fact is that a lot of people are still choosing to use MySQL, PostgreSQL, SQL Server and even Oracle to build their SaaS/Web/Social Media applications.

When choosing a SQL option for your cloud-based solution, there are typically three approaches as outlined below. One note – this analysis applies to “mass market clouds” and not the enterprise clouds from folks like AT&T, Savvis, Unisys and others. At that level you often can get standard enterprise databases as a managed service.

  1. Install and Manage – in this “traditional” model the developer or sysadmin selects their DBMS, creates instances in their cloud, installs it, and is then responsible for all administration tasks (backups, clustering, snapshots, tuning, and recovering from a disaster).  Evidence suggests that this is still the leading model, though that could soon change.  This model provides the highest level of control and flexibility, but often puts a significant burden on developers who must (typically unwillingly) become DBAs with little training or experience.
  2. Use a Cloud-Managed DBaaS Instance – in this model the cloud provider offers a DBMS service that developers just use. All physical administration tasks (backup, recovery, log management, etc.) are performed by the cloud provider and the developer just needs to worry about structural tuning issues (indices, tables, query optimization, etc). Generally your choice of database is MySQL, MySQL, and MySQL – though a small number of clouds provide SQL Server support. Amazon RDS and SQL Azure are the two best known in this category.
  3. Use an External Cloud-Agnostic DBaaS Solution – this is very much like the cloud-based DBaaS, but has a value of cloud-independence – at least in theory. In the long run you might expect to be able to use an independent DBaaS to provide multi-cloud availability and continuous operations in the event of a cloud failure. FathomDB and Xeround are two such options.
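From the application's point of view, the difference between model 1 and models 2/3 is mostly operational: the SQL and the connection logic are the same, and only the endpoint (and who administers it) changes.  A minimal Python sketch, with purely illustrative hostnames (neither endpoint below is real):

```python
def mysql_dsn(host, db, user, port=3306):
    """Build a MySQL connection string (driver-agnostic sketch)."""
    return f"mysql://{user}@{host}:{port}/{db}"

# Model 1: self-managed MySQL on a cloud VM you provisioned and must
# back up, tune, and recover yourself (hypothetical private IP).
self_managed = mysql_dsn("10.0.1.15", "appdb", "app")

# Model 2: a provider-managed instance -- an RDS-style endpoint handed to
# you by the cloud (hypothetical endpoint name). Same SQL, same driver;
# only the operational responsibility shifts to the provider.
managed = mysql_dsn("mydb.abc123.us-east-1.rds.amazonaws.com", "appdb", "app")
```

The point of the sketch: switching between the first two models is largely a configuration change for the application, which is part of why DBaaS adoption has been so easy for developers.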

Here’s a chart summarizing some of the characteristics of each model:

In my discussions with most of the RDBMSaaS vendors I have found that user acceptance and adoption is very high.  When I spoke with Joyent a couple of months ago I was told that “nearly all” of their customers who spend over $500/month with them use their DBaaS solution.  And while Amazon won’t give out such specifics, I have heard from them (both corporate and field people) that adoption is “very robust and growing.”  The exception is FathomDB, which launched at DEMO2010.  They seem to not have gained much traction, but I don’t get the sense they are being very aggressive.  When I spoke with one of their founders I learned they were working on a whole new underlying DBMS engine that would not even be compatible with MySQL.  In any event, they have only a few hundred databases at this point.  Xeround is still in private beta.

The initial DBaaS value proposition of reducing the effort and cost of administration is worth something, but in some cases it might be seen to be a nice-to-have vs. a need-to-have. Inevitably, the DBaaS solutions on the market will need to go beyond this to performance, scaling and other capabilities that will be very compelling for sites that are experiencing (or expect to experience) high volumes.

Amazon RDS, for instance, just added the ability to provision read replicas for applications with a high read/write ratio. Joyent has had something similar to this since last year when they integrated Zeus Traffic Manager to automatically detect and route query strings to read replicas (your application doesn’t need to change for this to work).
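The pattern behind both the RDS read replicas and the Zeus-style routing is read/write splitting: send read-only statements to replicas and everything else to the primary.  A naive sketch of that routing logic, assuming round-robin replica selection and illustrative hostnames (a real router must also handle transactions, replication lag, and statements that only look read-only):

```python
import itertools

class ReadWriteRouter:
    """Route read-only SQL to read replicas, everything else to the
    primary. A sketch of the read/write-splitting pattern -- not the
    actual Zeus or RDS implementation."""

    READ_PREFIXES = ("select", "show", "describe")

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # naive round-robin

    def route(self, sql):
        # Classify by the first keyword of the statement.
        first_word = sql.lstrip().split(None, 1)[0].lower()
        if first_word in self.READ_PREFIXES:
            return next(self._replicas)
        return self.primary

router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
```

Doing this in a proxy layer is what lets the application stay unchanged, which is the property Joyent was advertising.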

Xeround has created an entirely new scale-out option with an interesting approach that alleviates many of the trade-offs of the CAP Theorem.  And ScaleBase is soon launching a “database load balancer” that automatically partitions and scales your database on top of any SQL database (at least eventually – MySQL will be first, of course, but plans include PostgreSQL, SQL Server and possibly even Oracle).  My friends at Akiban are also innovating in the MySQL performance space for cloud/SaaS applications.
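At the core of any such “database load balancer” is a partitioning function that maps a key to a shard deterministically, so every query for a given customer lands on the same database.  A hash-partitioning sketch under stated assumptions (the shard names are hypothetical, and real products like ScaleBase also handle cross-shard queries, joins, and rebalancing):

```python
import hashlib

def shard_for(key, shard_count):
    """Map a partition key to a stable shard index via a hash.
    Deterministic: the same key always lands on the same shard."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count

# Illustrative shard endpoints (hypothetical names).
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2"]

def connection_for(customer_id):
    """Pick the MySQL endpoint holding this customer's rows."""
    return SHARDS[shard_for(customer_id, len(SHARDS))]
```

Note the design trade-off: simple modulo hashing reshuffles most keys when the shard count changes, which is why production systems lean on consistent hashing or directory-based mappings instead.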

Bottom line, SQL-based DBaaS solutions are starting to address many (though not all) of the leading reasons why developers are flocking to NoSQL solutions.

All of this leads me to the following conclusions – I’m interested if you agree or disagree:

  • Cloud-based DBaaS options will continue to grow in importance and will eventually become the dominant model. Cloud vendors will have to invest in solutions that enable horizontal scaling and self-healing architectures to address the needs of their bigger customers. While most clouds today do not offer an RDS-equivalent, my conversations with cloud providers suggest that may soon change.
  • Cloud-Independent DBaaS options will grow but will be a niche as most users will opt for the default database provided by their cloud provider.
  • The D-I-Y model of installing/managing your own database will eventually also become a niche market where very high scaling, specialized functionality or absolute control are the requirements. For the vast majority of applications, RDBMSaaS solutions will be both easier to use and easier to scale than traditional install/manage solutions.

At some point in the future I intend to dive more into the different RDBMSaaS solutions and compare them at a feature/function level. If I’ve missed any – let me know (I’ll update this post too).

Other Cloud DBMS Posts:

Amazon RDS vs. SQL Azure: The birth of the DBMS Utility

Amazon Adds Consistency to SimpleDB

Databases and Cloud Computing Roundup
