Containers, Virtual Machines and Docker 101
What is a container image? Why do we use Docker, and how does it differ from Virtual Machines? What about other container systems? Here's our take!
We've been diligently dogging our DevOps team, asking them random questions (that sometimes were even about containers!) and trying to be as helpful and unintrusive as possible! Yep, we sure made a good team!
And while we were doing our research into Kubernetes, we also gleaned a few things about containers, images, Docker, VMs and other nifty stuff.
Besides, if we want to explain Kubernetes from a technical perspective, there's no better place to start than with its building blocks – VMs and Docker!
With that out of the way, let's get started – starting with the basics!
What is a Virtual Machine?
So, virtualization – what do we need it for? Well, to answer that, let's look at a typical computer.
We have our hardware, which comprises the CPU, RAM, GPU, storage, NIC and more, but it would all be inoperable without an operating system (OS)! The OS acts as an intermediary between your hardware and your applications!
I personally use Windows 10 and have my eye on the upcoming Windows 11, which looks sleek as hell! But we digress.
Now what happens if you want to use a Linux-based OS for your development, or base your web app on one? Do you buy a new computer or server? You can, of course, but there's a better way! Let's virtualize your hardware into a Virtual Machine!
Virtual Machines (aka VMs) are no different from your tangible computer, phone, laptop, or server; a VM has its own CPU, storage, memory, and access to the internet.
VMs are typically software-based versions of a computer stored in a file, which is typically called an image. A VM image, in turn, is a sort of blueprint: a set of instructions on how exactly to assemble the software and achieve a desired configuration.
Disclaimer: there are two types of Virtual Machines: "Process VMs" and "System VMs".
A Process Virtual Machine enables a single process to run as an app on a host OS, acting as a platform that emulates the required environment. An example is the Java Virtual Machine (JVM), which allows Java applications to run on any operating system as if they were native.
A System Virtual Machine is a fully virtualized version of a physical machine, borrowing CPU, storage, memory, etc. from its host. This is the type we'll be talking about in this and later articles.
To create a System VM you need to use a Hypervisor.
Hypervisor? Types of Hypervisors
Now, to create a Virtual Machine you need a Hypervisor. In essence, a Hypervisor is a VMM – a Virtual Machine Monitor. There are two types of Hypervisors: type 1, also known as "bare-metal" or "native", and type 2, known as "hosted".
Type 1 "Native" – This hypervisor is installed like any other OS, directly on top of the "bare metal" of your hardware, and has direct, unencumbered access to all its resources.
Type 2 "Hosted" – This hypervisor is installed as a program on your "host OS" and functions more or less like an application, creating VMs and "borrowing" resources from its host; any OS running in a VM created this way is called a "guest OS".
The most glaring difference between the two is that a type 1 hypervisor is a lightweight operating system of its own, whereas type 2 runs as a software layer on top of a host OS, which introduces latency.
Most big enterprises that use VMs in their infrastructure build it primarily with type 1 hypervisors, to avoid this latency (the "extra layer" of communication that goes through the host OS); this is especially important when setting up any cloud-like service.
Type 2 hypervisors, however, are generally used by developers or everyday users to emulate other environments or to test their software, where latency is not that important.
Some really great type 1 enterprise-level hypervisors are:
VMware's vSphere – Available in free or commercial versions, VMware's ESXi hypervisor is considered among the best options out there.
Citrix XenServer – An open-source-based toolkit and platform for managing Virtual Machines, among other excellent features!
KVM – The best-known open-source, Linux-based hypervisor, the Kernel-based Virtual Machine is a full virtualization solution for Linux on x86 hardware.
RHEV – Red Hat Enterprise Virtualization is Red Hat's commercial implementation of the KVM hypervisor, building upon and improving the base KVM.
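If you're curious whether your own machine could host a KVM guest, a quick check looks like this (a sketch assuming a Linux host with the usual procfs layout and the lsmod utility):

```shell
# Does the CPU advertise hardware virtualization support?
# (vmx = Intel VT-x, svm = AMD-V)
grep -E -c 'vmx|svm' /proc/cpuinfo || echo "no hardware virtualization"

# Are the KVM kernel modules loaded?
lsmod | grep kvm || echo "KVM modules not loaded"
```

If both checks come back positive, tools like virt-manager can create System VMs accelerated by KVM.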
What is a Container?
"Containers" are executable software bundles in which all the binaries, libraries, and dependencies are packaged alongside the application's code in a standardized format, creating lightweight application images that can run anywhere: desktop, cloud, and anything in between.
Containers achieve this by exploiting three awesome Linux kernel features – "namespaces", "cgroups" and "seccomp-bpf". These let you get VM-like isolation without all the pesky virtualization of hardware or anything else.
In short: "namespaces" allow you to cut an operating system up into different components and create isolated workspaces. "cgroups" fine-tune and control the cut-up pieces by allotting how much CPU or memory they have access to. And "seccomp-bpf" restricts which system calls they can make!
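You can peek at these kernel features on any Linux box without installing anything; a minimal look (Linux-only, standard procfs paths):

```shell
# Every Linux process belongs to a set of namespaces; list the ones
# this shell lives in (mnt, uts, ipc, net, pid, user, cgroup, ...)
ls -l /proc/self/ns

# ...and to a cgroup, which caps its share of CPU, memory, etc.
cat /proc/self/cgroup

# A container runtime combines all three: fresh namespaces for
# isolation, a cgroup for resource limits, and a seccomp profile to
# filter syscalls. For example (needs util-linux, may require root):
#   unshare --uts --fork sh -c 'hostname demo; hostname'
```

The commented-out unshare line changes the hostname inside a new UTS namespace without touching the host's hostname, which is exactly the kind of trick containers are built on.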
All this helps avoid the dreaded "Well, it works on my machine!" when a developer ports their code into a new environment. Packaged together into a container, it will work.
Think of containers as a means to package any process or app you have, regardless of what programming language it's built in, and send it to your team. Build a simple "contact form", package it as a container, and the only thing that matters then is whether it works and produces the proper output.
What is Docker?
And though the idea is decades old, the release of Docker Engine in 2013 exploded the popularity and use of containers, so much so that many people, to this day, still use "Docker" and "containers" interchangeably.
Simply put, Docker is container management software and a developer toolkit, allowing developers to construct the images from which containers are built and deployed.
Dockerfile
A Dockerfile is a straightforward text file where it all starts (every container starts from one). Simply put, it's a list of CLI (command-line interface) instructions that Docker Engine performs in the given order to assemble a Docker image.
This is our “blueprint” and source code for all the binaries, libraries, tools, dependencies etc. that our container requires to function as a self-sufficient application.
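As a sketch, a Dockerfile for a hypothetical Python web app might look like this (the base image choice, file names and the app itself are illustrative assumptions, not a prescription):

```dockerfile
# Start from an official base image pulled from Docker Hub
FROM python:3.11-slim

# Work inside /app within the image
WORKDIR /app

# Copy the dependency list first, so this layer stays cached
# between builds as long as requirements.txt doesn't change
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy in the rest of the application code
COPY . .

# The command the container runs when it starts
CMD ["python", "app.py"]
```

Running `docker build -t myapp:1.0 .` in the same directory turns this file into an image.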
Docker Image
When Docker Engine runs a Docker image, it becomes a container (or several). And much like any blueprint, the same image can be the base for multiple containers, which will share all the commonalities of their stack.
Each instruction in a Dockerfile creates a new read-only "layer" of the image on top of the previous ones; a running container adds its own thin writable layer on top. Layers are cached and saved, so they can be rolled back or reused in other projects.
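You can see these layers for yourself by building an image and inspecting its history (assuming Docker is installed; the myapp:1.0 tag is just an example):

```shell
# Build an image from the Dockerfile in the current directory
docker build -t myapp:1.0 .

# List the layers the image is made of, newest first; each line
# corresponds to one Dockerfile instruction
docker image history myapp:1.0
```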
Docker Container
This is what we ultimately build: our application, fully interactive for users or admins. Remember, Docker images are simply read-only sets of instructions for building a container, and thus are not functional versions of our app by themselves.
Docker Daemon
Here we have the brain of our operations when it comes to Docker. It's not an orchestration tool, but a service running in the background that performs most of our Docker operations; it also keeps track of images and containers, cataloguing them and assigning them proper tags.
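Assuming Docker is installed on a systemd-based Linux host, you can check in on the daemon like this:

```shell
# Ask the daemon to describe itself: version, storage driver,
# and how many images and containers it is tracking
docker info

# The daemon typically runs as a systemd service named "docker"
systemctl status docker
```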
DockerHub
Though you can totally build your Docker images from scratch, or commit them from existing containers you've got hold of, why do that when it's already been done?
DockerHub is a massive repository (over 100,000 container images) of Docker images, sourced from commercial software vendors, open-source projects, and individual devs.
All DockerHub users have access to every public image in the repository, whether to kick-start a containerization project or simply as a reference point when developing containers for your own unique service.
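Pulling and running a public image takes just a couple of commands (assuming a running Docker daemon; nginx is merely an example):

```shell
# Fetch the official nginx image from Docker Hub
docker pull nginx:latest

# Start it in the background, mapping host port 8080 to
# port 80 inside the container
docker run -d --name web -p 8080:80 nginx:latest

# The default nginx page is now served at http://localhost:8080
```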
Docker Containers vs VMs (and Hypervisors)
Another way to understand containers is to juxtapose them directly against VMs. Though both offer increased efficiency and speed for applications, they achieve this in different ways.
Containers offer:
- Functionality for apps regardless of the "host" operating system.
- High portability: everything is packaged together, so it's quite easy to transfer these apps from project to project.
- A higher degree of decoupling.
Virtual Machines offer:
- The benefits of completely isolated, yet fully functional, multiple operating systems.
- The ability to run multiple applications at once on one operating system.
- Support for legacy systems and monolithic software structures.
So, while Virtual Machines virtualize the hardware layer to create a "computer" on which to run something, containers package lightweight instances of an app with all the tools it needs to run, and merely borrow the exact required "power" from your hardware.
Containers provide the same self-isolation (or quarantine) as VMs, but they can also be set up to interact with each other, or even use each other as dependencies, though that is inadvisable.
But the absolute key benefit is their lighter weight and self-sufficiency: a container's boot time is measured in seconds or less, while a VM's is much, much longer.
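If you have Docker installed, you can see this for yourself (alpine is a tiny example image, a few megabytes in size):

```shell
# Time a full container lifecycle: create, start, run a no-op
# command, and tear everything down; on most machines this takes
# well under a second
time docker run --rm alpine:latest /bin/true
```

Booting even a minimal Linux VM for the same no-op would take tens of seconds.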
Virtual Machines are larger, beefier and slower to boot, but they are better suited to running monolithic apps, running multiple apps on the same OS, or emulating older systems to run legacy apps that, for one reason or another, must stay active.
Alternatives to Docker
Even though Docker is the leading containerization solution out there, it is not the only one, and of course it has its own set of drawbacks.
That is why there are other options out there; it's also one of the reasons why Kubernetes, for instance, defined its own Container Runtime Interface instead of depending on Docker directly, hence the deprecation of Docker support (the dockershim) as of version 1.20.
Linux VServer
As we've mentioned before, Linux kernel features are arguably what made containerization what it is, and you can achieve similar functionality by simply using Linux-VServer, one of the oldest OS-level virtualization options out there, dating back to the early 2000s.
LXD (pronounced – "lex-dee")
An interesting attempt by Canonical Ltd. (the company behind Ubuntu) to combine Virtual Machine hypervisors and containers. Released in 2014, it looks at containers through an operating-system lens: think bare-metal hypervisor… but for containers!
Kubernetes's Container Runtime Interface – a relatively new addition to the containerization environment. Though Kubernetes was released in 2014, it was long used primarily as an orchestration tool for the likes of Docker; the CRI lets it plug in alternative runtimes, such as containerd and CRI-O.
Obviously, we cannot forget the absolute market giant Microsoft, which created its own container system, Windows Containers, back in 2016 as a feature of Windows Server.
And that wraps it up. Mostly. Not really: this was more of a nudge, opening the door into the wonderful world of cloud-native and DevOps when it comes to containers and virtual machines.
Next time, we’ll see how all this works with our Orchestration tool of choice – Kubernetes, and how you can deploy your own application, set it up and run it!
But before you go, tell us which you prefer: Virtual Machines or Containers? Or maybe you have some other way of managing your microservice structures? Tell us in the comments below!
Stay classy business and tech nerds!