Although Software Design is further down my enterprise considerations list, when Gojko Adzoc’s post on lessons he has learned developing in Amazon’s AWS environment, I knew I had to pass it along.  The post describes new challenges for developers who have previously worked in a purpose-built, directly controlled, infrastructure environment.  These challenges range from server reliability to storage speed.  After articulating the challenges, Adzic offers advice on “How to keep your sanity”:

“It took me a while to understand that just deploying the same old applications in the way I was used to isn’t going to work that well on the cloud. To get the most out of cloud deployments, applications have to be designed up-front for massive networks and running on cheap unstable web boxes. But I think that is actually a good thing. Designing to work around those constraints makes applications much better – faster, easier to scale, cheaper to operate. Asynchronous persistence can significantly improve performance but I never thought about that before deploying to the cloud and running into IO issues. Data partitioning and replication make applications scale better and work faster. Sections of the system that can work even if they can’t see other sections help provide a better service to customers. This also makes the systems easier to deploy, because you can do one section at a time.

To conclude, there are three key ideas to keep in mind:

    • Partition, partition, partition: avoid funnels or single points of failure. Remember that all you have is a bunch of cheap web servers with poor IO. This will prevent bottlenecks and scoring an own-goal by designing a denial of service attack in the system yourself.
    • Plan on resources not being there for short periods of time. Break the system apart into pieces that work together, but can keep working in isolation at least for several minutes. This will help make the system resilient to networking issues and help with deployment.
    • Plan on any machine going down at any time. Build in mechanisms for automated recovery and reconfiguration of the cluster. We accept failure in hardware as a fact of life – that’s why people buy database servers with redundant disks and power supplies, and buy them in pairs. Designing applications for cloud deployment simply makes us accept this as a fact with software as well.”

In the post, Adzic maintains that he is a cloud computing advocate, his goal of the post, and the presentation it came from, was to “expose some of the things that you won’t necessarily find in marketing materials.”

Read Adzic’s post.  Remember the 4th Enduring Aspect of Cloud Computing.

Posted by brenda michelson at 5:28 pm in 100-days, Cloud Watch, networks, performance & reliability, readiness, software architecture, storage | Permalink | Comments(0)
| Trackback URL

Positive cloud adoption metrics from reddit:

“As most of you know, we moved reddit to EC2 back in May of 2009. Our experience there has been excellent so far. Since we moved to EC2, the number of unique users has gone up 50%, and pageviews are up more than 100%. To support this growth, we have added 30% more ram and 50% more CPU, yet because of Amazon’s constant price reductions, we are actually paying less per month now than when we started.”

The reddit blog post was in response to opinions that reddit’s site had slowed since the move to Amazon.  The post continues with a “nerd alert” section on the volume-based cause of the slowdown, and described the necessary changes to reddit’s database and caching architecture.

I won’t replicate the description here, but suffice it to say, scale doesn’t guarantee performance.

Posted by brenda michelson at 4:09 pm in adoption, Cloud Watch, elasticity & scale, performance & reliability, use cases | Permalink | Comments(1)
| Trackback URL