Comprehending the different levels of the cloud

When explaining the cloud to my relatives, I typically describe it as "people running computers on your behalf that you can reach over the internet". That's enough to get the point across that the cloud is just someone else managing compute resources for you in some remote location. But for software engineers, it is a bit more nuanced. It's so nuanced, in fact, that I personally have a hard time figuring out the abstraction levels of the whole thing sometimes. To resolve this, I am writing this blog post as my attempt to explain, to myself and to other engineers who need some code running somewhere and reachable over the internet, what the various levels of the cloud are.

To start, there is no cloud. The absolute lowest level of serving a resource over the internet is just the proverbial computer under your desk. You provide the internet connection, the electricity, and you manage the computer from top to bottom, hardware to software.

The next level is having a colocation facility host your machine. In that instance you provide the machine and maintain it, but the colo gives you electricity and internet. In some instances you can push this further and have the colo provide the machine as well, but you still administer the software and ask the colo to do hardware work for you, e.g. swap out a hard drive.

Where I think the cloud really starts is with virtual machines (VMs). This puts you into the world of infrastructure-as-a-service (IaaS). Basically VMs are OSs running on top of OSs; in VM parlance, a guest OS running on top of a host OS. The benefit of this is that a host OS can run multiple guest OSs, which allows a cloud provider to shuffle VMs around for optimal resource utilization. It's this ability to shuffle VMs between physical machines that you don't control that makes VMs the bottom rung of the cloud. In terms of maintenance burden, at the VM level you still manage all the software but none of the hardware.

The next level up is containers like Docker. This is somewhat like VMs, but instead of necessarily administering the entire OS, you are running on top of a container engine. Containers are lighter-weight than VMs, so you can fit more of them on a machine, which promotes running various services as individual containers that you connect to remotely from other containers. Basically containers create a diff of the filesystem so that you ship around less stuff compared to a VM, which needs to cart around the entire OS. And because only the diff of the filesystem is stored, you can simply reapply it on a new OS image regularly, letting someone else manage the base OS image so that you don't have to worry about the lower-level sysadmin details of keeping the system up-to-date (which is why people are so excited about containers vs. VMs: plenty of control without all the nitty-gritty maintenance burden, and cheaper to run). This is still IaaS, although some are trying to make container-as-a-service (CaaS) happen (Mean Girls reference).
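To make the "diff of the filesystem" idea concrete, here is a toy sketch in Python. It is only a mental model and not how Docker or any real container engine is implemented; the file paths, version strings, and the build_container helper are all made up for illustration. It models a filesystem as a dictionary and overlays your small diff on top of a base image that someone else maintains.

```python
from collections import ChainMap

# Hypothetical base OS image, maintained by someone else.
base_image_v1 = {
    "/bin/sh": "shell 1.0",
    "/lib/libssl.so": "openssl 1.0 (unpatched)",
}

# Your container's diff: only the files you add on top of the base.
app_layer = {
    "/app/server.py": "print('hello')",
    "/etc/app.conf": "port=8080",
}


def build_container(diff, base):
    """Overlay `diff` on `base`; lookups check the diff first, then the base."""
    return ChainMap(diff, base)


container = build_container(app_layer, base_image_v1)

# A patched base image arrives. You reapply the same small diff on top
# of it instead of patching the OS yourself.
base_image_v2 = {**base_image_v1, "/lib/libssl.so": "openssl 1.1 (patched)"}
rebuilt = build_container(app_layer, base_image_v2)
```

The thing to notice is that app_layer never changes between the two builds: the only thing you ship and maintain is the diff, while the OS fix comes from whoever maintains the base image.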

Lastly is platform-as-a-service (PaaS), which is the top of the cloud abstraction hierarchy. This is where Heroku and App Engine live. If you take the library/framework analogy from software (a library is something that your code calls into while a framework is something that calls your code), then IaaS is a library for the cloud and PaaS is a framework for the cloud; PaaS providers basically ask you to give them code to run on your behalf and then they handle all the other details for you. This does mean you have to play by the rules and restrictions of your PaaS provider, but it also means you don't need nearly as much support staff for your application because you don't ever touch the OS image, load balancing, etc. (although IaaS providers like Google are starting to have services like Container Engine to automatically spin up and manage containers using Kubernetes).
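The library/framework analogy can be sketched in a few lines of Python. Everything here is hypothetical: ToyInfrastructure and ToyPlatform are stand-ins I made up for illustration, not any real provider's SDK. The point is only who calls whom.

```python
class ToyInfrastructure:
    """Hypothetical stand-in for an IaaS API ("library" style)."""

    def __init__(self):
        self.log = []

    def start_machine(self, name):
        self.log.append(f"started {name}")
        return name

    def install(self, machine, package):
        self.log.append(f"installed {package} on {machine}")


def deploy_on_iaas(infra):
    # Your code is in charge: it decides what to call and in what order.
    machine = infra.start_machine("web-1")
    infra.install(machine, "python")
    infra.install(machine, "my-app")
    return machine


class ToyPlatform:
    """Hypothetical stand-in for a PaaS ("framework" style)."""

    def __init__(self):
        self.handler = None

    def register(self, handler):
        self.handler = handler

    def simulate_request(self, path):
        # The platform is in charge: it decides when your code runs.
        return self.handler(path)


def my_handler(path):
    return f"hello from {path}"


infra = ToyInfrastructure()
deploy_on_iaas(infra)

platform = ToyPlatform()
platform.register(my_handler)
response = platform.simulate_request("/home")
```

In the IaaS-style half, deploy_on_iaas drives every step. In the PaaS-style half, all you write is my_handler, and the platform owns everything around it.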

Some would also toss in software-as-a-service (SaaS) as part of the cloud. This is when a company gives you an API over the internet and then handles everything else for you. But since I'm focusing this blog post on the various ways of hosting your code and you can use SaaS from any level of the cloud abstraction hierarchy, I consider it a tangential topic.

So that's the cloud at its various levels as best as I can decipher it as someone not in the cloud game anymore (disclaimer: I used to be on the App Engine team at Google). PaaS provides the easiest solution with the least flexibility (if you happen to even need that flexibility); you write your code and your PaaS provider gives you a storage solution, load balancing, etc. Containers will very likely give you the flexibility you need if PaaS doesn't work for you, and they allow for some level of abstraction; you can spin up a container for a datastore, a container for your app front-end, and then either you or your IaaS provider will handle load balancing. And if you need absolute control over the OS while not wanting to manage hardware, VMs are the base abstraction of the cloud and what you will need to use.
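As a rough recap, the levels walked through in this post can be written down as a small lookup table in Python. This is just my reading of the levels expressed as data, not an official taxonomy; the layer names and level keys are labels I picked for illustration, and real offerings blur these lines (for containers, for instance, load balancing may be handled by you or by your IaaS provider).

```python
# Layers of responsibility, from lowest to highest.
LAYERS = ["power_and_network", "hardware", "base_os", "load_balancing", "app_code"]

# For each level, the layers that someone else handles for you.
HANDLED_FOR_YOU = {
    "desk": [],
    "colo": ["power_and_network"],
    "vm": ["power_and_network", "hardware"],
    "container": ["power_and_network", "hardware", "base_os"],
    "paas": ["power_and_network", "hardware", "base_os", "load_balancing"],
}


def you_manage(level):
    """Return the layers that remain your responsibility at a given level."""
    handled = HANDLED_FOR_YOU[level]
    return [layer for layer in LAYERS if layer not in handled]


for level in HANDLED_FOR_YOU:
    print(f"{level}: {', '.join(you_manage(level))}")
```

Reading the output top to bottom, the list of things you manage shrinks at each level until only your application code is left at the PaaS level.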