Probably more than anything we’ve seen in IT since the invention of timesharing or the introduction of the PC, cloud computing represents a paradigm shift in the delivery architecture of information services. This overview presents some of the key topics discussed during the ACM Cloud Computing and Virtualization CTO Roundtables of 2008. While not intended to replace the in-depth roundtable discussions, the overview summarizes the fundamental issues generally agreed upon by the panels and should help readers to assess the applicability of cloud computing to their application areas.—Mache Creeger, Moderator, CTO Roundtable Series
Google, Yahoo, Amazon, and others have built large, purpose-built architectures to support their applications and taught the rest of the world how to do massively scalable architectures to support compute, storage, and application services.
Cloud computing is about moving services, computation and/or data—for cost and business advantage—off-site to an internal or external, location-transparent, centralized facility or contractor. By making data available in the cloud, it can be more easily and ubiquitously accessed, often at much lower cost, increasing its value by enabling opportunities for enhanced collaboration, integration, and analysis on a shared common platform.
Cloud computing can be divided into three areas:
Enabling technologies. The following precursor technologies enabled cloud computing as it exists today:
In the past, developing an application service required a large CapEx (capital expense) to build infrastructure for peak service demand before deployment. The risk of a service’s success combined with the operational requirement of a large CapEx investment severely restricted funding. Cloud computing addresses this problem by allowing expenses to track closely with resource use, thus following income rather than having to purchase for peak capacity before income is realized. Running application services on a cloud platform accomplishes this in three fundamental ways:
It moves CapEx to OpEx (operational expense), closely correlating expenses with resource use.
It allows service owners to eliminate significant system-administration head count by avoiding the need for internally purchased servers.
It smooths the path to service scaling by not requiring the CapEx-intensive architectural changes needed to scale up service capacity in the event of service success.
Because the cost of deploying new services is much lower and expenses track real usage, businesses can develop and deploy more services without fear of writing off huge capital investments for dedicated infrastructure that may never be needed. While start-ups are more focused on cost, enterprises are equally focused on flexibility to make required service changes and achieve maximum agility. Some Silicon Valley start-ups are able to go completely without infrastructure, instead using outside services for e-mail, Internet, phone, and source control. This allows the start-up to focus all of its resources on its core differentiating efforts.
Large-scale multitenancy achieves significant economic advantage. Sharing the resources and purchasing power of very large-scale multitenant data centers provides an economic advantage. As an example, a major engineering services company’s current internal cost to provide a gigabyte of managed storage is $3.75 per month, while Amazon charges 10 to 15 cents per month. Initially, ISP charges at the company were $3,500 per megabyte per month. After examining the cost structure of companies such as YouTube, the engineering services company assumed that YouTube’s costs were in the teens. Taking advantage of network peering arrangements and consolidating the company’s interfaces to a place close to the ISP’s POP (point-of-presence) have brought costs down to YouTube levels.
Broad use of virtualization has also significantly reduced the company’s data-center CapEx. Prior to virtualization, server utilization was between 2 and 3 percent and total data-center floor space was around 35,000 square feet. With virtualization in widespread use, server utilization is up to 80 percent and server consolidation has shrunk the square footage of the data center to 1,000 square feet.
As more cloud service vendors become available, computing and storage will become a true commodity with fine-grained pricing models, complete with arbitrage opportunities, similar to other commodities such as natural gas and electric power. Under a cloud model, pricing is based on direct storage use and/or the number of CPU cycles expended. It frees service owners from coarser-grained pricing models based on the commitment of whole servers or storage units.
Transforming high fixed-capital costs to low variable expenses. Setting up an internal cloud within a company provides an efficient service platform while placing a limit on internal capital expenditures for IT infrastructure. An external cloud service provider can supply overflow service capacity when demand increases beyond internal capacity.
Previously, companies had to operate servers for projects even though they might never be invoked—such as servicing warranties. A cost of $800 to $1,000 per month is not unreasonable to have a server idle on the data-center floor. By moving these projects to an external IaaS vendor, those functions can be placed in the cloud and the service run only when required, at pennies per CPU hour. In this way, companies can transform what was a high fixed cost into a very low variable one.
Flexibility. For large enterprises, the ease of deploying a full service set without having to set up base infrastructure to support it can be even more attractive than cost savings. Bechtel must set up new engineering centers with very little notice worldwide. Using internal cloud IT resources, Bechtel can now set up these centers to be fully functional within 30 days.
Smoother scalability path. For application architectures that easily scale with added hardware and infrastructure resources, cloud computing allows for many single services to scale over a wide demand range. Animoto’s service started with 50 instances on Amazon. Because of its popularity, it was able to meet soaring demand and scale to 3,500 instances within three days.
It is not a given, however, that all application architectures scale that easily. Databases are a good example of hard-to-scale applications—hence, the widespread use of programs such as Amazon’s SimpleDB.
Self-service IT infrastructure. Cloud-computing service models are often self-service, even in internal models. Previously, you had to partner with IT to develop your application, provide an execution platform, and run it. Now, much like Amazon, IT departments define use policies for automated platform and infrastructure services with line-of-business-owners developing applications on their own to meet those requirements.
Severely reduced disaster recovery cost. Most SMBs (small- to medium-size businesses) make no investment in DR (disaster recovery). By enabling VMs (virtual machines) to be sent to the cloud for access only when needed, virtualization becomes a cost-effective DR mechanism. Typical DR costs are 2N (twice the cost of the infrastructure). With a cloud-based model, true DR is available for 1.05N, a significant savings. Additionally, because external cloud service providers replicate their data, even the loss of one or two data centers will not result in lost data.
Common application platform enables third parties to add value. While telcos are moving to cloud platforms for cost effectiveness, they also see opportunities resulting from a common application platform. By allowing third parties to use their platforms, telcos can deploy services that either extend the telco’s services or operate independently.
Increased automation. Amazon sees automation as a significant benefit of a cloud services model. Moving into the cloud requires a much higher level of automation because moving off-premises eliminates on-call system administrators.
Release from ABI and operating-system dependencies and restrictions. Amazon also sees cloud computing as a way of releasing data centers from the need to support the ABI (application binary interface) and operating-system requirements of key applications. With EC2, Amazon provides five popular VMs to choose from: three flavors of Linux, OpenSolaris, and Windows Server. Its only concern is effectively running the VMs; it does not have to be involved with the VM’s internal operations.
MapReduce enables new services. Although not the most cost-efficient way of providing data-warehouse functionality, MapReduce’s use of a large parallel-processing resource has enabled a number of companies to provide cloud-based data-warehousing services. This frees customers from having to invest in large specialty hardware purchases for small service requirements. MapReduce is expected to enable additional service types that were once limited to dedicated hardware.
How you deal with the distance between computation and data depends heavily on application requirements. If you need to minimize expensive bandwidth, then you should find a way to keep the two in proximity. In cases where bandwidth is expensive and the distances cannot be shortened, it may make sense to download an extract of the data to work on it locally. Longer term, it would be best if developers could write the application in such a way that it would dynamically adjust its data-access mechanisms in response to the operational context (bandwidth cost, bandwidth latency, security, legal data location requirements, etc.).
A common concern about using an external cloud service provider is that it will make data less secure. Because of the wide quality range of corporate IT security, trusting information assets to a recognized cloud service provider could very easily increase the security of those assets. Given that many corporate data centers struggle to fund, architect, and staff a complete security architecture, and that cloud service providers provide IT infrastructure as their primary business and competence, clouds could possibly increase security for the majority of their users. Moreover, 75 to 80 percent of intellectual property breaches are a result of attacks made inside the company, which would not impact a decision to use clouds one way or the other.
LOVE IT, HATE IT? LET US KNOW
© 2009 ACM 1542-7730/09/0600 $10.00.
Originally published in Queue vol. 7, no. 5—
see this item in the ACM Digital Library
Ivan Beschastnikh, Patty Wang, Yuriy Brun, Michael D, Ernst - Debugging Distributed Systems
Challenges and options for validation and debugging
Sachin Date - Should You Upload or Ship Big Data to the Cloud?
The accepted wisdom does not always hold true.
George Neville-Neil - Time is an Illusion.
Lunchtime doubly so. - Ford Prefect to Arthur Dent in "The Hitchhiker's Guide to the Galaxy", by Douglas Adams
Andrew Brook - Evolution and Practice: Low-latency Distributed Applications in Finance
The finance industry has unique demands for low-latency distributed systems.