Virtualisation & Containers — VMs, Hypervisors & Docker

A physical server running at 5% CPU utilisation is mostly idle capacity. Consolidate ten virtual machines onto it and that idle capacity does useful work. Virtualisation is the technology that makes this possible, and it underpins every major public cloud and most CI infrastructure; containers apply a related isolation idea at the operating-system level.

What Virtualisation Is#

Virtualisation creates an abstraction layer between software and hardware. Instead of running directly on physical resources, an operating system runs inside a virtual machine (VM) — a software emulation of a complete computer. The VM believes it has its own CPU, RAM, disk, and network card. The physical hardware may be shared among dozens of such VMs.

Why it matters:

Server consolidation — run many workloads on one machine instead of many lightly loaded machines
Isolation — a compromised VM cannot normally affect other VMs on the same host; doing so requires a hypervisor escape vulnerability
Portability — capture an entire OS + app as an image; move it anywhere
Cloud computing — AWS EC2, Google Compute Engine, Azure VMs are all VMs on bare-metal servers
Testing — spin up a fresh OS in seconds, destroy it when done, leave no trace

The Hypervisor#

The software that creates and manages VMs is the hypervisor (also called a Virtual Machine Monitor, VMM). It sits between the hardware and the guest operating systems, multiplexing physical resources among VMs.

Type 1 — Bare-Metal Hypervisors#

Run directly on the physical hardware, replacing the host OS entirely.

Examples:

VMware ESXi — dominant in enterprise data centres
Microsoft Hyper-V — ships with Windows Server
KVM (Kernel-based Virtual Machine) — built into the Linux kernel; usually classed as Type 1 because it turns the Linux kernel itself into the hypervisor, although the host still runs as a general-purpose OS

Type 1 hypervisors have lower overhead (no host OS in the path) and are used in production cloud environments.

Type 2 — Hosted Hypervisors#

Run as an application on top of a conventional host OS.

Examples:

VirtualBox — open source, cross-platform
VMware Workstation / Fusion
QEMU (without KVM acceleration)

Type 2 is easier to install and use but has more overhead. Used for development, testing, and running a different OS on your laptop.

How a Hypervisor Virtualises CPU, Memory, and I/O#

CPU Virtualisation#

The challenge: guest OS code runs in ring 0 (kernel mode) inside the VM, but ring 0 inside the VM is not really ring 0 on the physical CPU — that's reserved for the hypervisor.

Full virtualisation (binary translation): The hypervisor scans guest kernel instructions and rewrites privileged ones on the fly so that they trap into the hypervisor. This was VMware's original approach. It works with unmodified guest OSes but adds noticeable overhead to kernel-heavy workloads.

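Hardware-assisted virtualisation (VT-x / AMD-V): Intel and AMD added a new CPU execution mode for the hypervisor (VMX root mode on Intel, host mode on AMD), often described informally as "ring -1". The guest runs in a separate, hardware-isolated guest mode with its own rings 0 to 3, so the guest kernel really does run in ring 0 of that mode. Sensitive instructions and events that the hypervisor has chosen to intercept cause a VM exit to the hypervisor, with no software translation needed. This is the dominant approach today and gives near-native CPU performance.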
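
The exit-and-resume cycle can be modelled with a toy interpreter. This is only a sketch under stated assumptions: the instruction names, `EXITING_OPS`, `ToyHypervisor`, and `run_guest` are hypothetical, and a real CPU decides which operations exit from control structures the hypervisor configures.

```python
# Illustrative subset only; which instructions exit depends on the CPU
# and on how the hypervisor has configured interception.
EXITING_OPS = {"cpuid", "out", "hlt"}


class ToyHypervisor:
    def __init__(self):
        self.exit_log = []

    def handle_exit(self, op, operand):
        """Emulate the instruction on the guest's behalf before resuming it."""
        self.exit_log.append(op)
        if op == "cpuid":
            return "virtual CPU model"  # the guest sees what the hypervisor exposes
        if op == "out":
            return f"emulated device received {operand!r}"
        return None  # hlt: the virtual CPU would wait for the next interrupt


def run_guest(program, hypervisor):
    """Run (op, operand) pairs; return how many ran with no hypervisor involvement."""
    native = 0
    for op, operand in program:
        if op in EXITING_OPS:
            hypervisor.handle_exit(op, operand)  # VM exit, handler, VM entry
            if op == "hlt":
                break
        else:
            native += 1  # ordinary instructions run directly on the physical CPU
    return native


program = [("add", None), ("cpuid", None), ("mov", None),
           ("out", 0x3F8), ("add", None), ("hlt", None), ("add", None)]
hypervisor = ToyHypervisor()
native_count = run_guest(program, hypervisor)
```

Most instructions never involve the hypervisor at all, which is where the near-native performance comes from; only the intercepted ones pay for an exit.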

Memory Virtualisation#

Guest OS uses virtual addresses → guest physical addresses → host physical addresses. Two levels of translation.

Shadow page tables: Hypervisor maintains page tables that map guest virtual addresses directly to host physical addresses. Must be kept in sync with guest page tables — expensive.

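Hardware-assisted (EPT / NPT): Intel Extended Page Tables and AMD Nested Page Tables handle both levels in hardware. On a TLB miss the MMU walks the guest page tables and the nested tables itself, so the hypervisor no longer has to intercept guest page-table updates. This cuts memory virtualisation overhead substantially; the drop is sometimes put at roughly 30% to roughly 5%, though the actual gain depends on the workload.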
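
The two translation stages, and the way a shadow page table collapses them, can be sketched with dictionaries standing in for page tables. The table contents and function names below are hypothetical, and real page tables are multi-level structures rather than flat maps.

```python
PAGE_SIZE = 4096  # 4 KiB pages


def translate(address, page_table):
    """Map an address through one table of page number -> frame number."""
    page, offset = divmod(address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError(f"page fault at {address:#x}")
    return page_table[page] * PAGE_SIZE + offset


# Hypothetical contents: the guest OS owns the first table, the hypervisor the second.
guest_page_table = {0: 5, 1: 9}      # guest virtual page  -> guest physical page
nested_page_table = {5: 42, 9: 17}   # guest physical page -> host physical page


def guest_virtual_to_host_physical(guest_virtual):
    guest_physical = translate(guest_virtual, guest_page_table)  # stage 1
    return translate(guest_physical, nested_page_table)          # stage 2


# A shadow page table precomputes guest virtual -> host physical in one map,
# so it has to be rebuilt whenever the guest edits its own table.
shadow_page_table = {
    virtual_page: nested_page_table[physical_page]
    for virtual_page, physical_page in guest_page_table.items()
}

host_address = guest_virtual_to_host_physical(0x1234)
```

With EPT or NPT the hardware performs both `translate` steps itself; with shadow paging the hypervisor maintains the combined map in software.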

I/O Virtualisation#

I/O is the hardest to virtualise. Options:

Full emulation: Hypervisor emulates a standard device (e.g., an Intel e1000 NIC) in software. Guest uses unmodified drivers. Very slow for disk/network.
Para-virtualised drivers (virtio): The guest OS loads hypervisor-aware drivers that exchange buffers with the host through shared-memory queues instead of driving an emulated physical device. The guest kernel itself need not be modified, only given the drivers. Much faster than full emulation. virtio is the standard para-virtualised device interface on Linux/KVM.
SR-IOV (Single Root I/O Virtualisation): Physical device (NIC, GPU) exposes multiple virtual functions directly to VMs. Near-native performance.
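
The shared-memory queue idea behind para-virtualised drivers can be sketched as follows. `ToyVirtqueue` and `host_drain` are hypothetical names, and the model ignores descriptors, memory barriers, and interrupts; it only shows that many buffers can be handed over for a single notification.

```python
from collections import deque


class ToyVirtqueue:
    """Toy stand-in for a queue in memory shared by the guest driver and the host."""

    def __init__(self):
        self.available = deque()  # buffers the guest driver has posted
        self.used = deque()       # results the host has handed back
        self.notifications = 0    # each one models a costly guest-to-host transition

    def post(self, buffer):
        self.available.append(buffer)  # plain memory write, no exit to the hypervisor

    def notify_host(self):
        self.notifications += 1


def host_drain(queue, handle):
    """Host side: process every posted buffer after one notification."""
    while queue.available:
        queue.used.append(handle(queue.available.popleft()))


queue = ToyVirtqueue()
for packet in (b"syn", b"ack", b"data"):
    queue.post(packet)
queue.notify_host()       # one notification covers all three buffers
host_drain(queue, len)    # the handler here just records each buffer's length
```

A fully emulated device, by contrast, typically needs the hypervisor to intercept each access to the device's registers.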

Full Virtualisation vs Para-Virtualisation#

Aspect	Full Virtualisation	Para-Virtualisation
Guest OS modification	None (runs unmodified)	Required (virtio drivers)
Performance	Near-native with HW assist	Faster for I/O-heavy workloads
Compatibility	Any OS	Only OSes with virtio support
Examples	VMware, Hyper-V (for Windows)	KVM with virtio, Xen PV

In practice: CPU/memory are hardware-virtualised (near-native), while I/O uses para-virtual drivers (virtio) for performance.

Containers vs Virtual Machines#

Containers take a fundamentally different approach. Instead of virtualising hardware and running separate OS instances, containers share the host OS kernel and isolate only the user-space environment.

Property	VM	Container
Kernel	Separate per VM	Shared (host kernel)
Isolation	Strong (hardware)	Process-level (namespaces)
Overhead	High (GBs of RAM per VM)	Low (MBs)
Start time	Tens of seconds	Typically under a second
Security boundary	Very strong	Weaker (kernel exploit → all containers)
Use case	Different OSes, strong isolation	Microservices, CI/CD, scale-out

Linux Namespaces — The Foundation of Containers#

Containers are not a distinct kernel feature; they are built by composing existing Linux primitives. Namespaces isolate specific aspects of the system view for a set of processes.

The table lists the seven namespace types that container runtimes commonly use (Linux 5.6 added an eighth, the time namespace):

Namespace	Isolates	Each container sees...
PID	Process IDs	Its own PID 1; cannot see host PIDs
NET	Network interfaces, routing, sockets	Its own `eth0`, IP address, routing table
MNT	Mount points, filesystem tree	Its own `/` (different from host's `/`)
IPC	System V IPC, POSIX message queues	Separate IPC objects
UTS	Hostname and NIS domain name	Its own hostname
USER	User and group IDs	Can be root (UID 0) inside but unprivileged outside
CGROUP	cgroup root directory	Its own cgroup hierarchy view

When Docker creates a container, its runtime makes a call such as clone(CLONE_NEWPID | CLONE_NEWNET | CLONE_NEWNS | ...) to create a new process in fresh namespaces. The container's PID 1 is just a regular process on the host, but inside the container it looks like the init process of a standalone machine.
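
How the namespace flags combine into a single argument can be sketched like this. The flag values are the ones defined in the Linux header `<linux/sched.h>`, copied in so the sketch runs anywhere; `clone_mask` and `current_namespaces` are hypothetical helper names, and the second one only returns data on a Linux system with `/proc` mounted.

```python
import os

# Values of the CLONE_NEW* constants from <linux/sched.h>.
CLONE_FLAGS = {
    "mnt": 0x00020000,     # CLONE_NEWNS
    "cgroup": 0x02000000,  # CLONE_NEWCGROUP
    "uts": 0x04000000,     # CLONE_NEWUTS
    "ipc": 0x08000000,     # CLONE_NEWIPC
    "user": 0x10000000,    # CLONE_NEWUSER
    "pid": 0x20000000,     # CLONE_NEWPID
    "net": 0x40000000,     # CLONE_NEWNET
}


def clone_mask(*namespaces):
    """OR together the flags for the requested namespace types."""
    mask = 0
    for name in namespaces:
        mask |= CLONE_FLAGS[name]
    return mask


def current_namespaces(ns_dir="/proc/self/ns"):
    """Return {type: identifier} for this process, or {} if the directory is unreadable."""
    try:
        return {
            name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))
        }
    except OSError:
        return {}


mask = clone_mask("pid", "net", "mnt")
```

Two processes are in the same namespace of a given type exactly when their identifiers under `/proc/<pid>/ns` match, which is a quick way to check what a container actually shares with the host.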

cgroups — Resource Limits Per Container#

Namespaces provide isolation (what you can see). Control Groups (cgroups) provide resource control (how much you can use).

cgroups allow the kernel to:

Limit CPU usage (e.g., max 2 cores out of 32)
Limit memory usage (e.g., max 512 MB; OOM-kill the container if exceeded)
Limit disk I/O bandwidth (e.g., max 100 MB/s writes)
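Classify network traffic (cgroup v1 net_cls and net_prio) so that tc can shape it; cgroups do not cap network bandwidth on their own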
Account for resource usage per container

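Docker translates --cpus 0.5 --memory 256m into writes to cgroup control files. On a cgroup v2 host these are cpu.max (a quota of 50000 µs per 100000 µs period) and memory.max (268435456 bytes) in the container's cgroup directory.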
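
The arithmetic behind that translation can be sketched as a pure function. It assumes a cgroup v2 host and the default 100 ms CPU period; `cgroup_v2_settings` and `parse_memory` are hypothetical helpers, not Docker code, and nothing is written to the real cgroup filesystem.

```python
def parse_memory(text):
    """Convert a size such as '256m' to bytes, using binary units (1m = 1024 * 1024)."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    suffix = text[-1].lower()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)


def cgroup_v2_settings(cpus, memory, period_us=100_000):
    """Return the file name -> contents pairs a runtime would write for these limits."""
    quota_us = int(cpus * period_us)  # CPU time allowed per period, in microseconds
    return {
        "cpu.max": f"{quota_us} {period_us}",
        "memory.max": str(parse_memory(memory)),
    }


settings = cgroup_v2_settings(cpus=0.5, memory="256m")
```

Half a CPU means the container's processes may run for 50000 µs in every 100000 µs window, whichever cores they are scheduled on.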

Docker Concepts#

Docker packages applications into containers using a layered filesystem.

Key Concepts#

Image — a read-only template. Contains the filesystem layers (base OS, dependencies, app code). Stored in a registry (Docker Hub, ECR, GCR).

Container — a running instance of an image. Each container adds a thin read-write layer on top of the image layers. Containers are ephemeral — stop and remove a container and the write layer is gone.

Layer — each Dockerfile instruction that modifies the filesystem creates a new layer. Layers are content-addressed and shared: if two images use the same Ubuntu base, they share those layers on disk.
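
Content addressing is what makes the sharing automatic: a layer's identity is a hash of its bytes, so identical layers collapse to one stored copy. The sketch below assumes SHA-256 digests and uses placeholder byte strings in place of real layer archives; `LayerStore` is a hypothetical class.

```python
import hashlib


class LayerStore:
    """Toy layer store keyed by content digest."""

    def __init__(self):
        self.blobs = {}

    def add(self, content):
        digest = "sha256:" + hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # identical content is stored once
        return digest


store = LayerStore()
base_layer = b"placeholder for the base OS layer"

# An image is an ordered list of layer digests.
image_a = [store.add(base_layer), store.add(b"placeholder for app A files")]
image_b = [store.add(base_layer), store.add(b"placeholder for app B files")]
```

Two images reference four layers in total, but the store holds only three blobs because the base layer is shared.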

A Minimal Dockerfile#

What each instruction in a typical Dockerfile for a Python application does:

FROM — sets the base image; every image inherits from something (ultimately from scratch)
WORKDIR — all subsequent commands run relative to this path inside the image
COPY requirements.txt . — copy only requirements first to exploit layer caching
RUN pip install — executes a shell command; the result is baked into a new layer
COPY . . — copy app code; separate from requirements so code changes don't bust the pip cache
EXPOSE — documents which port the app uses; you still need -p 8080:8080 to publish it
CMD — default command when a container starts; can be overridden at docker run time
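
Why the order of the two COPY instructions matters can be shown with a simplified model of a build cache. This is an assumption-laden sketch, not Docker's actual cache-key algorithm: each step's key is derived from its parent's key, the instruction text, and a fingerprint of any files it copies, so a change invalidates that step and everything after it. The instruction strings are labels only, with placeholders where the original values are unknown.

```python
import hashlib


def cache_keys(steps):
    """steps: list of (instruction, input_fingerprint); returns one chained key per step."""
    keys, parent = [], ""
    for instruction, fingerprint in steps:
        parent = hashlib.sha256(
            f"{parent}|{instruction}|{fingerprint}".encode()
        ).hexdigest()
        keys.append(parent)
    return keys


def reused_steps(old_build, new_build):
    """Count the leading steps whose cache keys are unchanged between two builds."""
    count = 0
    for old_key, new_key in zip(cache_keys(old_build), cache_keys(new_build)):
        if old_key != new_key:
            break
        count += 1
    return count


def requirements_first(code_version):
    return [("FROM <base image>", ""), ("WORKDIR <path>", ""),
            ("COPY requirements.txt .", "requirements-v1"),
            ("RUN pip install", ""), ("COPY . .", code_version)]


def code_first(code_version):
    return [("FROM <base image>", ""), ("WORKDIR <path>", ""),
            ("COPY . .", code_version), ("RUN pip install", "")]


good_order = reused_steps(requirements_first("code-v1"), requirements_first("code-v2"))
bad_order = reused_steps(code_first("code-v1"), code_first("code-v2"))
```

With requirements copied first, a code-only change reuses four cached steps, including the pip install; with the code copied first, only two steps are reused and the install runs again.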

The Union Filesystem (OverlayFS)#

Linux's OverlayFS makes layers efficient. Multiple read-only layers are stacked, with a single read-write layer on top. Reading a file checks the top layer first, then the lower layers in order. The first write to a file that lives in a lower layer copies it into the top layer (copy-on-write), and later reads and writes use that copy. This is why containers start so quickly: the image layers are already on disk, and only a new, empty top layer has to be created.
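
The lookup order and copy-on-write behaviour can be sketched with dictionaries standing in for layers. `ToyOverlay` is a hypothetical model: it ignores directories, permissions, and deletions (which real OverlayFS records as whiteout entries).

```python
class ToyOverlay:
    """Read-only lower layers plus one private writable upper layer."""

    def __init__(self, *lower_layers):
        self.lower = list(lower_layers)  # topmost read-only layer first
        self.upper = {}                  # starts empty for every new container

    def read(self, path):
        for layer in [self.upper, *self.lower]:
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        self.upper[path] = data  # writes only ever touch the upper layer

    def append(self, path, data):
        if path not in self.upper:
            self.upper[path] = self.read(path)  # copy up from a lower layer first
        self.upper[path] += data


base_layer = {"/etc/motd": "hello"}
app_layer = {"/app/main.py": "print('hi')"}

# Two containers from the same image share the lower layers.
container_1 = ToyOverlay(app_layer, base_layer)
container_2 = ToyOverlay(app_layer, base_layer)
container_1.append("/etc/motd", " world")
```

The change is visible only in the first container; the shared image layers and the second container are untouched, and creating either container cost no more than an empty dictionary.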

When to Use VMs vs Containers#

Situation	Recommendation
Running a different OS (Windows app on Linux host)	VM
Strong security isolation required (multi-tenant, untrusted code)	VM
Microservice deployments on same OS	Container
CI/CD pipelines, ephemeral build environments	Container
Legacy app that requires its own kernel version	VM
High-density deployments (100s of isolated services)	Container
Desktop virtualisation (developer running macOS + Linux)	VM
Kubernetes / container orchestration	Container

In modern infrastructure, VMs and containers are used together: cloud providers offer VMs (EC2 instances), and inside those VMs Kubernetes runs containers. VMs provide the hardware security boundary between customers; containers provide density and fast iteration inside each customer's VM.

Key Takeaways#

Type 1 hypervisors run on bare metal (KVM, ESXi); Type 2 run on a host OS (VirtualBox)
Hardware extensions (Intel VT-x, AMD-V) make CPU virtualisation near-native
Para-virtualised I/O drivers (virtio) are faster than emulating physical devices
Containers share the host kernel; isolation comes from Linux namespaces and cgroups
Namespaces control visibility (PID, network, filesystem); cgroups control resource usage
Docker layers (OverlayFS) + content addressing make container startup almost instant
Use VMs for strong isolation or different OSes; use containers for density and speed
