GigaOm Pro Blog provides a list of likely trends, mostly in the infrastructure that powers cloud computing. They don’t have a lot to say about software, which I believe will also be a critical turning point over the next few years for both public and private clouds.
Cloud computing is about buying just the amount of data center resources that you need, and having the ability to change your mind about that quickly. Any IaaS cloud provider worthy of the name will let you spin up a new virtual machine any time you want to add capacity to your data center. The hard part is getting your application to take advantage of that extra capacity. (Despite what I wrote here, this applies to private clouds, too.) Most off-the-shelf applications don’t go twice as fast if you have twice as many servers – in fact, most off-the-shelf applications are designed to run on just one server (or virtual machine). Even the typical n-tier application is constructed from a handful of special-purpose systems – e.g. web servers, database servers, application servers, file servers. Although the web layer in this architecture can often benefit from just adding more servers, the database layer typically doesn’t. If you are building something more specialized than the typical “web server that displays stuff from a database”, you probably have to invent a way to distribute your work across many machines, and then you have to build admin or automation tools to support adding and removing resources. Now you are hiring developers who are good at building distributed applications, and spending lots of time (and perhaps a limited supply of start-up capital or project budget) building out a robust platform that your real project will run on.
In Rework, the founders of 37signals suggest that you should not worry about the scalability of your application, because once you start making money, you can always buy a more powerful machine. I believe that their point was that instead of worrying about a hypothetical scaling problem, you should get some customers and generate some revenue, after which you will have some money to throw at the problem. I agree wholeheartedly with that point. But more and more of us are in environments where we know that if we cannot support a large number of users, or a large data set, or provide fast response times, the project will fail. And we sometimes realize that even if we buy the biggest server that Dell or HP makes, it isn’t going to be enough.
So what we need is a good cloud platform for developing distributed applications that can go faster when we add more hardware, but runs fine with a small amount of hardware. It should be something that doesn’t require super-human distributed computing development skills, that lets the developer focus on his application rather than the plumbing, and that an administrator can configure for whatever scale the occasion demands (e.g. seasonal load spikes).
Got a solution like this? I’d love to hear about it. If not, I might have to go build it – feature requests welcome 😉
This story suggests that a senator was able to coerce Amazon Web Services into dropping a paying customer (Wikileaks), without any legal due process.
Amazon denies this, and claims that the decision was their own, because the customer violated the AWS terms of service.
From Amazon’s message, it seems like Wikileaks is in violation of the TOS:
It’s clear that WikiLeaks doesn’t own or otherwise control all the rights to this classified content. Further, it is not credible that the extraordinary volume of 250,000 classified documents that WikiLeaks is publishing could have been carefully redacted in such a way as to ensure that they weren’t putting innocent people in jeopardy. Human rights organizations have in fact written to WikiLeaks asking them to exercise caution and not release the names or identities of human rights defenders who might be persecuted by their governments.
The trouble is, Wikileaks has not actually been charged with or convicted of a crime, and the potential long-term effects of the leak are debatable, so these terms of service are fairly subjective. The whole idea of Wikileaks is, of course, controversial. The problem is that Amazon’s action feels like an activist action – like the Wikileaks service was suspended because Amazon doesn’t like them.
For those of us building a business based on the AWS Infrastructure as a Service model, this is very scary. The one thing that we can count on with our own servers is that they don’t judge our content. We don’t have to convince our disk drives that our cause is just, and our moral compass is intact. Whether AWS responded to pressure from Joe Lieberman or acted on their own doesn’t really matter. The message is that if AWS thinks you are up to something fishy, their policy is to drop your service first, and ask questions later.
When I am looking at a business plan, I worry about technology risk, market risk, and execution risk. Normally, I would think that building a business on top of a cloud platform like AWS would dramatically reduce the execution risk, but now I have to worry about the risk that Amazon won’t like what I am doing. Amazon has been leading the cloud computing charge, and has been saying every step of the way “don’t worry, you can trust us”. I think banning Wikileaks was a giant step backwards in their credibility as a utility partner.
Based on my recent work with Double-Take Software’s Cloud business, I guess I am now officially a cloud computing entrepreneur. I am looking for my next project, and just decided to go open source with this. Here are ten ideas off the top of my head – they definitely need some refinement, and some of them will probably not pan out, but feel free to steal them, offer improvements, or how about this – contact me to collaborate on one.
- Modify an existing open source cloud platform into a drop-in solution for hosting companies, with a global ecommerce site and management interface. Hosting companies can get into cloud easily; customers get broad geographic coverage with a single interface. (I know at least one company is already claiming this, but it is a big, big market).
- Take one (or more) existing open source web applications or frameworks (like MediaWiki, Django, Sugar CRM, etc.), optimize the deployment for a highly scalable distributed system (i.e. load-balanced front-end web farm, distributed / scalable db back-end, memcached, etc.), and make it available at a very low-cost for entry users, with the price scaling up with usage.
- Acquire an enterprise-class (or academic usage) simulation / analysis solution, and modify it to use map-reduce. Deliver the results of massive calculations in a few minutes (or seconds), and only bill for the usage. Commoditize massive calculations at a price smaller users can afford.
- Build a management platform to transparently migrate virtualized workloads from a private cloud up to a larger public cloud provider, and back down. (This can be expanded to cross-cloud, or cross-region migrations.)
- Create an encrypting, deduplicating network transport protocol and file system that minimizes the bandwidth and storage required to keep workloads synchronized between private and public clouds. (Useful for #4.)
- Use one of the open-source cloud platforms to build an Amazon compatible cloud in places where Amazon doesn’t have a data center. Amazon is expanding rapidly, but it is a big world, and most countries have regulations restricting businesses from hosting data outside the country. Europe is an especially fertile market for this.
- Build a GUI macro editor with building blocks that include Amazon (or another, or all) cloud resources, ecommerce, and maybe some social and / or mobile features. Let customers build new cloud-based web applications by drag & drop.
- Create a web app that lets a product manager or a sales person enter the typical problems their customers face, the features of their product that solve those problems, and the ultimate benefit. Let their customer walk through a wizard, checking the boxes to describe their needs, and automatically generate a beautifully designed, customized proposal, based on their requirements. The whole thing is based on standardized templates, but it feels totally custom for both the vendor and the customers.
- Build a Linux-based file server appliance for SMB, with HSM and version archiving to the cloud. It basically has storage that never ends, and the most current / relevant files are always local.
- Add virtual machine recovery and remote access capabilities to an existing laptop backup solution. When your laptop blows up, you are happy to know that all your files are backed-up on-line. Wouldn’t you be even more excited if you could boot up a virtual machine of your laptop on-line, and finish the task you were already late on?
Information Technology has had a great impact on business productivity during my lifetime. My father ran his business with a manual typewriter. This was during the time when the IBM Selectric was popular, and personal computers were available, but not very good. Having a manual typewriter at that time was quirky, but it wasn’t that out-of-place in a small business where they just needed something simple, cheap, and functional.
My children (his granddaughters) do their elementary school homework using desktop publishing software and a color printer. They can definitely do a lot more than my dad could on his trusty old Smith-Corona.
During the first wave of personal computers in the workforce, lots of business people wondered if they would actually help the business be more productive, or if they would just be a distraction. After all, my dad seemed just fine with his old typewriter. Today, that question has been pretty well answered. We all know that modern businesses depend on their computers, and that if we took them away, we would need twice as many people, and maybe even then couldn’t get the work done with the same speed or level of quality. (Imagine modern phone bills being hand-typed!)
So we now think of IT as a potential for competitive advantage. Banks compete as much on their on-line banking system’s features as on the hours that their branches are open. Better IT systems can cut cost, improve speed and quality of service, and make your business more attractive to customers. But this isn’t true of all IT operations. Some of them are necessary evils, like backup and disaster recovery (DR).
If you design your backup solution perfectly, or you build an elaborate DR plan, it does nothing. It doesn’t get you any more customers, it doesn’t cut your cost, it doesn’t let you raise your prices. From a business perspective, an investment in DR doesn’t matter. Until it does. If you have a significant failure, and need to recover systems and data, DR matters. In fact, the general consensus is that large businesses that have a significant outage lose customers and reputation (i.e. future customers), and small businesses that are down for several days may never come back.
So the right way to think about DR is not by how much you will gain by doing it right, but by how much you will lose if you screw it up. That’s right – if you get it right, you get nothing, but if you get it wrong, you potentially lose everything. The question is – how much time, energy, and money do we want to invest into getting good at this subset of information technology? Wouldn’t we rather put more focus on the areas that actually can create value for the business? For areas where we cannot make a difference, we should think about outsourcing.
This is where cloud computing earns a lot of its value. Cloud is not better because it can offer servers at a lower cost; cloud is better because cloud vendors have an incentive to be better, and cheaper, and more reliable, and more secure than their competitors. They earn their living getting it right (at least the best ones do). Non-IT businesses have to look at things like DR, where doing better provides no business improvement, but doing worse hurts the business, and ask if they can compete with companies that specialize in the IT basics – power, cooling, network, and racked / stacked CPU and storage.
If you don’t think you can do better than the cloud vendor, and you suspect that you could (on a bad day) do worse, shouldn’t you just find a vendor you can trust, and pay them to do the stuff that doesn’t matter?
Maybe it is because of the business I am in, but I believe that cloud computing and disaster recovery are made for each other. There are lots of reasons for this, but the main one comes down to peak vs. average utilization.
There are a couple of ways to build a disaster recovery (DR) solution, but to avoid long outages, you need another data center off-site, stocked with enough equipment to get your critical systems back on-line. This is where the peak vs. average problem shows up. You have to provision that data center for the peak load during recovery – that is, you need servers, storage, and networks big enough to run your production workload. But you only get the benefit of average utilization – and for the DR center, average is mostly idle time.
As the graphic illustrates, most of the time, your DR data center is sitting idle, but it has to be ready to carry the production burden for all your critical systems at any time. You have to pay for peak utilization, but you only get the benefit of average.
Cloud computing lets us break that model – you can have huge peak capacity standing by, but only pay for what you consume. In a DR setting, you can pay for small amounts of utilization (e.g. just the data storage costs) most of the time, and then bring more resources on-line when you need them for recovery.
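To put rough numbers on the peak vs. average argument, here is a minimal sketch. Every figure in it (20 servers, $0.50 per server-hour, $1.00 per hour for storage, 72 hours of recovery per year) is a made-up assumption for illustration, not real pricing from any vendor, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def dedicated_dr_cost(servers, cost_per_server_hour):
    """Annual cost of a standby site sized for the full production load.

    The servers are paid for every hour of the year, idle or not.
    """
    return servers * cost_per_server_hour * HOURS_PER_YEAR


def cloud_dr_cost(servers, cost_per_server_hour, recovery_hours,
                  storage_cost_per_hour):
    """Annual cost when storage runs all year but servers run only during recovery."""
    storage = storage_cost_per_hour * HOURS_PER_YEAR
    compute = servers * cost_per_server_hour * recovery_hours
    return storage + compute


# Hypothetical inputs: same server count and hourly rate in both models.
dedicated = dedicated_dr_cost(servers=20, cost_per_server_hour=0.50)
cloud = cloud_dr_cost(servers=20, cost_per_server_hour=0.50,
                      recovery_hours=72, storage_cost_per_hour=1.00)

print(f"Dedicated DR site: ${dedicated:,.0f} per year")
print(f"Cloud-based DR:    ${cloud:,.0f} per year")
```

The sketch deliberately leaves out things like bandwidth, licensing, and staff time, so treat the gap between the two numbers as a picture of the idle-capacity effect rather than a real quote.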
I will be doing a joint webinar with Double-Take Software and Amazon Web Services on April 21. It is free, and you can register online. We will talk about how to build a good DR solution using the cloud that provides exactly these kinds of benefits.
Amazon’s CTO, Werner Vogels (@werner) just tweeted that @rightscale has launched over 1,000,000 EC2 instances. At first, I thought he meant they had 1,000,000 instances running simultaneously. That would be very impressive, but not surprising, given Amazon’s attitude about scale.
On second-thought, I decided that this probably just means that RightScale had started that many instances, and who knows how long they ran. Probably many (most?) of them were started and stopped several times in the normal course of development, or to react to scaling needs, and so were counted multiple times.
Even if the truth is more of the latter, and less of the former, I think this is still a nice bit of credibility for Amazon Web Services. Lots of folks think AWS is nothing special – after all, we can all run virtual machines in our own data centers, and probably with more features than AWS offers. I think that AWS is all about operating virtual machines (and the associated storage and networking infrastructure) very efficiently, very reliably, at very large scale. And this milestone (for RightScale) points directly to that large scale operation: how many of us have ever provisioned 1,000,000 virtual machines in our data centers – for any amount of time?
I know that this is only one small metric, and not the only decision criterion, but when you need a dozen virtual machines, do you feel more confident starting them in a rack where you have run a dozen other VMs, or on an infrastructure where VMs have been started over 1M times by their partners?
via Twitter, @brennels asks: “Is Cloud Computing a Commodity? or will it be?”
I think it is a good question, because I think the commodity idea is at the heart of the cloud computing idea. In “The Big Switch”, Nicholas Carr draws a parallel between the history of the electric utility industry and the computer industry.
Electricity from a utility is, perhaps, the ultimate commodity. Each unit of consumption is completely undifferentiated, to the point that few people can even identify where or even how the electricity in their home or office is generated. This wasn’t always true. One hundred years ago, 95% of electricity was generated by custom-built systems, owned, managed, and maintained by technicians employed by the factory. The industrial age was defined by the power of machines in factories, and the start of that age had every factory generating its own source of energy.
Today, we are in the information age, and it is defined by the power of our information systems. We are used to thinking of these systems as the humming, blinking boxes, racked and stacked in air-conditioned rooms. But that is like thinking of your house as the land it is built on. Processors and memory and disks and cabling are just the substrate, the surface upon which information systems are built. The real systems are software applications. When a CFO wants a new accounting system, the vendor typically says “start with a server with X capacity, and Y operating system”, and then makes configuration changes to the accounting software to accommodate the business.
In a factory, that is like a machine demanding a certain type and amount of electricity. Today, a factory owner doesn’t have his PT (power technology) department build up a properly sized generator. He just has an electrician put the proper type of outlet in the wall, and expects to see his monthly electricity bill go up.
Cloud computing has this same promise, that we can just buy as much undifferentiated computing power as we need, without building it ourselves. Along the progression from dedicated on-site generators to ubiquitous electric utility, though, were a few stages. Thomas Edison invented the idea of the electric utility, and launched the first one. But his idea was a bunch of small providers, serving the local community, and his companies supplied the equipment to those providers. Large companies bought his equipment and served their own office campuses.
Today, we have an integrated grid of power generators and consumers. Adding more capacity to the grid is like pouring more water into a barrel – it all looks the same to the consumer. Amazon is starting to make cloud computing look like a commodity, because I can just consume more storage and more CPU cycles, and they all look about the same. But someday, I won’t care if my CPU cycles come from Microsoft, or Amazon, or Google, or Rackspace, because they will all look alike. Maybe some trader will buy up cheap storage-hours in bulk from a local vendor, and sell them on the grid at market price.
Is Cloud Computing a Commodity? It is getting close – I can order capacity on demand, and not worry about how it got built, or even too much about where it lives. Someday, when I don’t care whose name is on the building where my capacity lives, then it will be a real commodity.
I have been to several cloud conferences over the last few months, and I hear a lot about private clouds. The general idea seems to follow this basic logic:
- Cloud computing offers a lot of promise in efficiency, scalability, and flexibility.
- As an enterprise-sized business, we are very risk averse, and public cloud offerings scare us, so. . .
- Let’s do the same thing, but make it private!
The trouble with this is that almost all of the flexibility, scalability, and efficiency in cloud offerings comes from sharing the load with other customers. For example, any business has to build for peak capacity, but can only easily monetize the average utilization. A public cloud vendor can share the cost of the excess capacity across all the customers, lowering the cost of that inefficiency for everyone. But a private cloud must, by definition, build (and pay for) peak capacity, and bear the cost of that inefficiency alone.
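A small sketch of why sharing helps, assuming (and this is the key assumption) that customers do not all peak at the same time. The demand figures are invented for illustration and the function names are hypothetical.

```python
def peak(load):
    """Capacity needed to cover the busiest period of a load profile."""
    return max(load)


def pooled_load(loads):
    """Combined demand per period when several customers share one infrastructure."""
    return [sum(period) for period in zip(*loads)]


# Hypothetical demand (in servers) over four periods for three customers
# whose busy periods fall at different times.
customers = [
    [2, 2, 10, 2],
    [10, 2, 2, 2],
    [2, 10, 2, 2],
]

# Each customer building privately must cover its own peak.
private_capacity = sum(peak(c) for c in customers)

# A shared provider only has to cover the peak of the combined load.
shared_capacity = peak(pooled_load(customers))

print("Total capacity if each builds privately:", private_capacity)
print("Capacity needed by a shared provider:   ", shared_capacity)
```

If every customer peaked in the same period, the two numbers would be identical, so the saving depends entirely on how well the peaks are spread out.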
I suppose that the tools and technologies that enable cloud computing can be applied to private data centers to improve operational efficiency, and make the IT staff more responsive to business unit requests, but it seems too much to call this a private cloud.
In fact, this is the exact opposite of how Amazon Web Services came into being. The story is that they built standardized platforms to support their internal requirements, then realized that others could benefit from these standardized capabilities. I am all for making your internal operations be more flexible, but using these technologies internally and calling it a private cloud is like drinking alone and calling it a private party.