This article gives a basic introduction to cloud computing, intended for students. It takes you through the basics (the how, the why, the who) and ends with recommendations for further reading and further hacking.
What is Cloud Computing?
Until recently, computing meant a program that ran on a desktop or laptop computer on your desk, or a server in your lab. Or, using the internet, you could use a program that was running on a server somewhere else in the world. But it was always a specific piece of hardware in a specific location that was running the program.
In the context of cloud computing, cloud refers to the internet. Cloud computing, then, means that the computing is happening somewhere in the cloud. You don't know where the computing is happening. Most of the time, you can't know where it is happening (since it can keep moving around). And the most important factor is that you don't care.
Some service provider is providing you with virtual computers, or virtual disks, or virtual file-systems, or virtual databases, or even higher level constructs (to be described later), and guaranteeing that they will take care of everything related to the virtual hardware that you got - you just need to upload your program and run it.
To understand this better, consider milk. In the old days, everybody had a cow. And they squirted milk out of the cow. And then made butter, buttermilk, paneer, paneer pakodas, and ras-malai from it. But more recently, businesses have sprung up who will deliver you milk in plastic packets at your door, or even butter, paneer pakodas and ras-malai. This model has proved to be so convenient to people (especially those who hate the smell of cow-dung) that very few people now have cows. (If you're a student, and have a cow in your hostel, please let us know - we'd love to hear from you!)
The cow in this example is like the hardware. And milk is the product. Having your own cow is equivalent to traditional computing using your own labs. Getting milk delivered to your door is cloud computing.
How is Cloud Computing Implemented?
Cloud computing largely depends upon virtualization technology. Virtualization refers to the technique in which all the capabilities of a piece of hardware are faithfully reproduced in a software program. So, for example, a virtual machine has a virtual CPU, virtual memory and a virtual disk. The virtual CPU might emulate, for example, an Intel x86 chip, and then it is able to take an executable file consisting of x86 instructions and execute them all. This virtual machine thus behaves just like a real machine - you can install an Operating System on it, you can boot into the OS, and then install other programs into the OS that you just installed. You can reboot the machine, power it off, and power it on again, just like real machines.
However, the virtual machine is just an executable program that is stored on some large real machine somewhere. When the executable is run, it behaves like a machine. When the program is shut down, it saves the entire state of the machine (including contents of the virtual RAM, contents of the virtual disk, contents of the CPU registers, etc.) to a file on the real disk of the real machine.
The most interesting thing is that the program and the data of the virtual machine can be copied to another real machine, and when the program is run there, it will behave exactly like the first virtual machine, and continue executing from exactly where it left off. Thus, a key feature of virtual machines is that they can be moved from one real machine to another, and in fact, from one geographic location to another, with very little effort. Advanced virtualization techniques allow this kind of virtual machine migration to be done without requiring the virtual machines to be shut down, i.e. the virtual machine can move from one location to another while the programs inside the virtual machine continue to run uninterrupted.
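To make the "save the state, copy it, resume elsewhere" idea concrete, here is a minimal sketch in Python. It is a toy, not how VMware or Xen actually work: the `ToyVM` class, its two-instruction "CPU" and the JSON snapshot format are all made up for illustration. The point is only that once the entire machine state is plain data, it can be written out, moved to a different host, and picked up exactly where it left off.

```python
import json


class ToyVM:
    """A toy 'virtual machine': one register, a program counter, and a program."""

    def __init__(self, program):
        self.program = program        # list of [op, arg] instructions
        self.registers = {"acc": 0}   # the virtual CPU's only register
        self.pc = 0                   # program counter: index of the next instruction

    def step(self):
        """Execute one instruction; return False once the program has finished."""
        if self.pc >= len(self.program):
            return False
        op, arg = self.program[self.pc]
        if op == "add":
            self.registers["acc"] += arg
        elif op == "mul":
            self.registers["acc"] *= arg
        else:
            raise ValueError(f"unknown instruction: {op}")
        self.pc += 1
        return True

    def snapshot(self):
        """Serialize the whole machine state (stands in for the file on the real disk)."""
        return json.dumps(
            {"program": self.program, "registers": self.registers, "pc": self.pc}
        )

    @classmethod
    def restore(cls, saved):
        """Rebuild a machine from a snapshot, ready to continue from the saved point."""
        state = json.loads(saved)
        vm = cls(state["program"])
        vm.registers = state["registers"]
        vm.pc = state["pc"]
        return vm


program = [["add", 2], ["mul", 10], ["add", 1], ["mul", 3]]

# Run the first half of the program on "host A", then save the state.
vm_on_host_a = ToyVM(program)
vm_on_host_a.step()
vm_on_host_a.step()
saved_state = vm_on_host_a.snapshot()

# Copy the saved state to "host B" and resume from where it left off.
vm_on_host_b = ToyVM.restore(saved_state)
while vm_on_host_b.step():
    pass

print(vm_on_host_b.registers["acc"])  # 63, same as running it all on one host
```

Real virtual machines save far more state (RAM, disk, device state), and live migration copies it while the machine keeps running, but the principle is the same.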
VMware is the leader in building virtual machine software, and Xen is the most important open source alternative to it. There are of course many other smaller players in this market.
Types of Cloud Computing - IaaS, PaaS and SaaS
There are different kinds of cloud computing, but before we understand that we need to understand what computing is. Computing really can be broken up into these pieces:
- CPU + Memory
- Disk (File-System, Database)
- Operating System (Linux, Windows, Solaris)
- Software Development Environment (Visual Studio, Java+Eclipse, Ruby on Rails, Python)
- The actual programs/applications that people use (Documents, Spreadsheets, Sales Management Software, Customer Relationship Management Software, Accounting Packages, etc)
Each of the things mentioned above can be 'virtualized' and put in the cloud independently. Thus, Amazon EC2 gives you CPU+memory in the cloud. Amazon EBS gives a disk in the cloud, and S3 gives a file-system in the cloud. Microsoft Azure gives a Visual Studio development environment in the cloud, so that apps developed using Visual Studio can be run 'in the cloud' without you having to worry about the hardware that they run on. Similarly, Google AppEngine gives a Java or Python environment in the cloud where you can run your Java/Python apps, and Google takes care of the hardware. Finally, Google Docs is an example of software in the cloud: you directly create documents, presentations and spreadsheets via your web browser. Microsoft too has Office 365, which is its SaaS offering and includes SaaSified versions of Word, Excel, etc. There is no hardware or software to install.
Depending upon what is being virtualized, we get three types of Cloud Computing:
- IaaS or Infrastructure as a Service: these are various services where the hardware is being virtualized. Virtual machines (i.e. CPU + Memory), virtual disks (e.g. Amazon EBS), virtual file-systems (e.g. Amazon S3), virtual databases (e.g. Google BigTable, Amazon SimpleDB, SQL Azure) are all examples of infrastructure. Basically, these are services that are looking to replace all the hardware infrastructure that sits in your server rooms and labs.
- PaaS or Platform as a Service: these are various services where the software development platform (i.e. programming language, runtime environment, etc.) is being virtualized. Google AppEngine (Java/Python), Microsoft Azure (.NET/Visual Studio) are examples of PaaS. In the cow & milk example, PaaS would be equivalent to getting paneer or khoya delivered to your home. You can use this to cook your own delicious items.
- SaaS or Software as a Service: these are various services that have decided to skip the hardware and software engineers altogether and directly approach the end-user with software that s/he wants to use. In IaaS you can install your own OS and software and use it. In PaaS you can write programs in that platform and run them. In SaaS you need to do nothing. There is ready-made software that you can directly start using. Like SalesForce - software used by sales agents. In the cow & milk example, SaaS is equivalent to home-delivery of cooked food (paneer makhanwala and ras-malai).
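One way to keep the three types straight is to ask, for each one, which layers the provider looks after and which are left to you. The sketch below encodes that as a small Python table. The four layer names are a simplification chosen for this example (they roughly follow the pieces of computing listed earlier), and real services do not always fit so neatly.

```python
# Layers of the stack, lowest first. The names are a simplification for this sketch.
LAYERS = ["hardware", "operating system", "development platform", "application"]

# The highest layer that the provider takes care of in each model.
PROVIDER_MANAGES_UP_TO = {
    "own lab": None,                  # traditional computing: you manage everything
    "IaaS": "hardware",               # you install the OS and everything above it
    "PaaS": "development platform",   # you only write and upload the application
    "SaaS": "application",            # you just use the software
}


def split_responsibilities(model):
    """Return which layers the provider handles and which are left to you."""
    top = PROVIDER_MANAGES_UP_TO[model]
    cut = 0 if top is None else LAYERS.index(top) + 1
    return {"provider": LAYERS[:cut], "you": LAYERS[cut:]}


for model in PROVIDER_MANAGES_UP_TO:
    print(model, split_responsibilities(model))
```

In cow terms: the further down the table you go, the less of the cow you have to deal with yourself.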
Advantages of Cloud Computing
There are a number of advantages Cloud Computing has over the old way of doing things:
- Convenience: Cloud Computing is easy. Not having to deal with real machines, disk failures, electricity failures, etc. is a huge benefit. Anyone who has had to deal with cleaning cow-dung, going to the vet for treating cow diseases, and complaining neighbours will appreciate the huge convenience of milk packets over having a cow.
- Cost: There are two different cost advantages to cloud computing. Sometimes it is cheaper than the physical alternative. At other times, the advantage comes from the fact that you have to pay small installments every month instead of a large chunk of money when you're buying the infrastructure.
  - Cheaper: Usually cloud computing turns out to be cheaper. This is mainly because cloud computing providers are able to share their infrastructure across a large number of customers, giving them economies of scale, and higher utilization. You can't buy just half of a physical server, but the typical IaaS provider sells low-end virtual machines which are roughly equivalent to 1/10th of a server.
  - Pay-as-you-go: Imagine you're a startup. Buying a server will cost you $4000. And that's Rs. 2L that you don't really have right now. By contrast, buying compute cycles on Amazon EC2 might cost you $100 per month - which is much more manageable. And at times when you're not really using the server, you shut it off, and don't pay for it. If during a busy month, you need two servers, you get a second server for just one month, and then delete it at the end of the month. Much better than having to buy an entire second server that will be useless after the first month.
- Easy scalability: If you're a growing company, and the demand for computing suddenly increases (for example, if your website is mentioned in TechCrunch and you suddenly get 10,000 new customers), it is very difficult to suddenly scale up your physical infrastructure. That would involve buying new servers, migrating programs, files, and databases. And a whole bunch of other setup. By contrast, IaaS providers provide these services at the click of a button. PaaS and SaaS providers take care of scaling completely, in a manner transparent to you, and you don't even need to think about it.
- Location Independence: A cloud computing service can be used from wherever you are, whereas most physical infrastructure ties you down to one place.
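The pay-as-you-go arithmetic from the list above can be worked out in a few lines of Python. The $4000 and $100 figures are the ones used in the startup example. The month-by-month demand is invented for illustration, and the sketch deliberately ignores things like electricity, admin time, bandwidth charges and resale value, so treat it as a back-of-the-envelope comparison only.

```python
import math

SERVER_PRICE_USD = 4000    # one-time purchase price, from the example above
CLOUD_MONTHLY_USD = 100    # cost per server per month, from the example above


def owned_cost(peak_servers):
    """Owning: you must buy enough servers for your busiest month, up front."""
    return peak_servers * SERVER_PRICE_USD


def cloud_cost(servers_per_month):
    """Cloud: you pay only for the server-months you actually use."""
    return sum(servers_per_month) * CLOUD_MONTHLY_USD


def break_even_months():
    """Months of non-stop use after which renting one server costs as much as buying it."""
    return math.ceil(SERVER_PRICE_USD / CLOUD_MONTHLY_USD)


# A made-up year: one server most months, two in a busy month, none in a quiet one.
demand = [1, 1, 1, 2, 1, 1, 0, 1, 1, 1, 1, 1]

print(owned_cost(max(demand)))  # 8000
print(cloud_cost(demand))       # 1200
print(break_even_months())      # 40
```

With these numbers, renting only catches up with buying after more than three years of continuous use, which is a long time in the life of a startup.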
There are a bunch of other advantages, but this should be enough to keep you happy for now. Check out our further reading section if you want more on this.
Important platforms and players in Cloud Computing
In infrastructure as a service, the clear leader is Amazon. EC2 is its service that gives you virtual machines in the cloud on which you can install whatever operating system you want. Amazon has a bunch of other infrastructure as a service offerings, including EBS (virtual hard disks in the cloud), S3 (a simple service that allows you to store and retrieve files), SimpleDB (a non-relational database in the cloud), and Amazon Relational Database Service, or RDS (did you guess that this is a relational database in the cloud?). It has further offerings in the form of messaging, queuing services, caching services, content delivery services, monitoring services, load balancing services, and ecommerce/payments/billing services. But we'll leave the discussion about those for another day.
The most well-known instance of Platform as a Service is Google's AppEngine, which allows programs to be written in Java or Python (or in fact any language that runs on the JVM, like Scala, Clojure, JRuby), with Google BigTable as the corresponding database offering. SalesForce, which revolutionized cloud computing by showing that software-as-a-service can actually make lots of money, is a force to be reckoned with in the PaaS space because of its Force.com platform. This requires programming in Apex (a Java-like programming language). SalesForce has also bought Heroku (a Ruby+Rails PaaS provider), so expect more Ruby/Rails here.
Software as a Service has become so common that it is impossible to make a listing. Pretty much any software that you can think of has a SaaS alternative these days.
How should students get ready for Cloud Computing?
As a student who is soon going to be dumped in the big-bad world, what are the best ways of picking up real-world cloud computing skills? Here are some suggestions:
- Google AppEngine: This is totally free, and you can use Java (so no need to learn a new language). Just go to the Getting Started with Java on Google AppEngine page, and follow the instructions there. In no time, you will have set up your first AppEngine app. After this, build another app - in some area that you find interesting. Maybe something to do with cricket scores. Or Bollywood movie ratings. Or motorbikes. And build an interesting website, fully in AppEngine. If you're lucky, your site will go viral and Google will take care of the scaling, and you won't have to look for a job!
  - If you are OK with learning a new language, I would highly recommend building AppEngine apps in Python.
- Amazon EC2: Amazon gives free access to a basic EC2 machine in the cloud for one year. That gives you a full machine of your own, where you are the root user. Get one of these and then use web-based tutorials to do interesting things with this machine - like experimenting with different operating systems (various flavors of Linux), installing a firewall, installing a web-server, and other more advanced stuff.
- If you're into Microsoft technologies, try out Windows Azure free for 3 months (25 hours of compute time). The tools that you'll need to develop the app (Visual Studio, and a local Azure simulator for testing your app) are available for free to students who register for the DreamSpark program. Or, I would suggest forming a student group in your college, getting in touch with Microsoft, and convincing them to offer free Azure for students of your college. I would suggest trying to convince Aditee Rele.
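Whichever platform you pick from the suggestions above, your first project will probably be a tiny web app. As a rough idea of how small that can be, here is a minimal sketch of a Python web app written against WSGI, the standard Python interface between web servers and applications, using only the standard library. This is not AppEngine's or Azure's own API: each platform has its own setup and deployment steps, so follow its getting-started guide for those. The page text and the `call` helper are made up for this example.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A WSGI application: receives a description of the request, returns the response body."""
    path = environ.get("PATH_INFO", "/")
    if path == "/":
        status, body = "200 OK", "Hello from the cloud!\n"
    else:
        status, body = "404 Not Found", "No such page\n"
    start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
    return [body.encode("utf-8")]


def call(path):
    """Call the app in-process, without starting a server (handy for quick checks)."""
    environ = {}
    setup_testing_defaults(environ)   # fill in the standard WSGI request fields
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")


print(call("/"))
print(call("/scores"))
```

To try it in a browser on your own machine, you can serve `app` with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` and open http://localhost:8000/. From there, replacing the hello message with cricket scores or movie ratings is up to you.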
This article just scratches the surface. There are lots of interesting areas to do further research. These could include:
- Understanding the strengths and weaknesses of IaaS vs PaaS vs SaaS
- Things to worry about when migrating an in-house app to the cloud
- How to set up auto-scaling of your cloud app (because auto-scaling is really not so auto)
- Failure-proofing your app: how to ensure that you survive even if your cloud computing provider has a failure
- Data Portability and Vendor Lock-In
- How to estimate the costs of your cloud computing infrastructure (since pricing is a nightmare in the cloud)
- How to choose a cloud computing provider
- Worrying about security and local laws
- Private clouds
- Cross-platform cloud computing frameworks. These are libraries that allow you to write apps in a way that the same app can run on different cloud computing services.
- Mobile App Development + Cloud Computing = A match made in heaven
- Disadvantages of cloud computing
Here are some starting points for your further reading:
- More on The Basics
- Latest Trends in Cloud Computing (as of June 2011)
- Current State of Affairs in the Cloud - a presentation on what's hot in Cloud Computing made by Chirag Jog, CTO of Clogeny at the IndicThreads conference on Cloud Computing - Pune, 2011
- The Cloud Ecosystem - a presentation on how various cloud services work with each other, by Manjusha Madabushi, CEO of Talentica Software, also at the same IndicThreads conference.
- Cloud Security: Threats and Mitigations - A broad guide on how to think about security in the cloud. This is still a developing field
- iCloud - Hype or Tipping Point? - An overview of Apple's iCloud Offering, analysis of competitive landscape and implications. Read, because you should always watch Apple carefully.
Questions, Comments, Feedback welcome
If you have any questions, doubts, suggestions, please leave a comment below. Note: this is intended to be a simple introduction for students - there are some places where we have over-simplified things - and this is intentional. Those interested in more details are encouraged to check out the links in the further reading section, which will give a more nuanced understanding of the topics.