7 min read
You press the power button and seconds later your cursor moves. What happens in between is the most underrated software in the world—and the foundation every pod, container, and cloud VM relies on.
Key Takeaways
- From power button to kernel, there are four stages: Firmware (UEFI) starts the bootloader, which loads the kernel into RAM; the kernel then takes the CPU into its protected mode (long mode on 64-bit systems) and builds its own world. Up to this point, neither files nor processes exist.
- Virtual memory is the greatest lie in computing: Every program sees its own address space, while the MMU translates in the background to real RAM. This very lie makes container isolation and memory limits in cgroups possible in the first place.
- Scheduler, syscalls, and IPC are the cloud fundamentals nobody explains: The Linux scheduler distributes processes across CPU cores; the Kubernetes scheduler distributes pods across nodes—the same mechanism one level higher.
Power On: Bootloaders and Privilege Rings
An operating system isn’t a single program; it’s a choreographed sequence of ten stages. Anyone debugging containers or planning Kubernetes clusters works with the outcome every day—usually without knowing what’s happening in the engine room.
You press the power button. Power flows to the motherboard, the CPU wakes up and executes the very first instruction at a hard-wired address, the reset vector. At this moment there is no memory management and there are no files—just a single core running firmware. On modern machines that firmware is UEFI; older machines used BIOS.
The firmware wakes just enough hardware to find a disk, then hands off to the bootloader: GRUB on Linux, iBoot on Mac, Windows Boot Manager on Windows. The bootloader's sole job is to load the kernel into RAM and hand over control. From then on, the kernel runs in Ring 0, the mode with full hardware control.
Ring 0 vs. Ring 3 is the most important dividing line on your computer. x86 CPUs have four rings, but in practice only two matter: Ring 0 for the kernel, Ring 3 for everything else. A bug in Ring 0 code can take down the entire machine—the famous CrowdStrike outage in summer 2024 was exactly that: a faulty driver in Ring 0 that blue-screened Windows machines worldwide.
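The ring boundary can even be made visible from user space. The sketch below (Python, assuming an x86-64 Linux machine) maps an executable page containing the privileged `hlt` instruction and runs it in a child process; the CPU refuses to execute it in Ring 3, and the kernel kills the child with a signal:

```python
import ctypes
import mmap
import os

def run_privileged_instruction():
    # hlt (0xF4) is a Ring-0-only instruction; ret (0xC3) follows it.
    buf = mmap.mmap(-1, mmap.PAGESIZE,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    buf.write(b"\xf4\xc3")
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    ctypes.CFUNCTYPE(None)(addr)()   # jump into the page from Ring 3

pid = os.fork()
if pid == 0:
    run_privileged_instruction()     # the CPU faults here...
    os._exit(0)                      # ...so this line is never reached

_, status = os.waitpid(pid, 0)
print(os.WIFSIGNALED(status))        # True: the kernel terminated the child
```

The very same instruction is perfectly legal in Ring 0 driver code, which is exactly why a buggy driver can take the whole machine down with it.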
The Biggest Lie in Computing: Virtual Memory
When a program requests a memory address, that address is almost always a lie. It's a virtual address that a hardware component called the MMU (Memory Management Unit) translates into a real physical address. The translation happens via a data structure called the page table, which the kernel maintains for each process. Memory is handed out in pages, typically 4 kilobytes each, and every process gets its own virtual address space.
This separation is precisely why your browser can’t rummage through the memory of your password manager. Both live in parallel universes that only the kernel can oversee.
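You can watch these parallel universes in action with a short sketch (Python on a Unix system; `fork` clones the address space copy-on-write, and in CPython `id()` happens to be an object's virtual address):

```python
import os

value = [42]
r, w = os.pipe()                 # a kernel channel between the two processes

pid = os.fork()                  # child gets a copy-on-write clone of our memory
if pid == 0:
    value[0] = 99                # this write gives the child its own private page
    os.write(w, str(id(value)).encode())
    os._exit(0)

os.waitpid(pid, 0)
child_addr = int(os.read(r, 64))

print(id(value) == child_addr)   # True: the identical virtual address...
print(value[0])                  # 42: ...but the parent's data is untouched
```

Same virtual address in both processes, two different physical pages underneath: the MMU's lie, demonstrated.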
The MMU caches frequent translations in a tiny buffer called the TLB. When a process accesses a page that isn't in RAM, the MMU triggers a page fault, the kernel loads the page from disk, and the process continues as if nothing happened. This very mechanism underpins container memory limits: when a pod exceeds its cgroup limit, the OOM killer steps in and shoots the process—no warning, and in a typical Kubernetes setup no swap to fall back on.
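Demand paging is observable from user space. On Linux, `/proc/self/stat` counts a process's minor page faults; the sketch below (Python, Linux-specific) reserves 256 anonymous pages and shows that physical memory only gets wired up on first touch:

```python
import mmap

def minor_faults():
    # /proc/self/stat: after the ")" that ends the comm field,
    # the 8th whitespace-separated field is min_flt
    with open("/proc/self/stat") as f:
        rest = f.read().rsplit(")", 1)[1]
    return int(rest.split()[7])

N = 256 * mmap.PAGESIZE
buf = mmap.mmap(-1, N)            # 256 virtual pages; no physical RAM yet

before = minor_faults()
for off in range(0, N, mmap.PAGESIZE):
    buf[off] = 1                  # first touch: page fault, kernel wires a page
after = minor_faults()

print(after - before)             # roughly 256, one fault per touched page
```

The `mmap` call itself is nearly free; the cost is paid lazily, fault by fault, as pages are actually used.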
Drivers, Interrupts, and the First Process
Once the kernel understands memory and the file system, it loads drivers. Drivers translate generic kernel requests into chip-specific commands for GPUs, Wi‑Fi cards, keyboards. They run in Ring 0, which is convenient for performance but perilous for stability. The CrowdStrike incident serves as a stark reminder of why driver reviews must be taken seriously.
With the drivers in place, the kernel activates interrupts. How does an operating system know you just pressed a key? It doesn’t poll in an endless loop. Instead, the keyboard fires an electrical signal that yanks the CPU from its current task and jumps into an interrupt handler inside the kernel. Every mouse movement, every Wi‑Fi packet, every disk response is an interrupt.
Only now does the kernel spawn the first user-space process: PID 1, on modern Linux usually systemd. PID 1 is the progenitor of all other processes—if PID 1 dies, the kernel panics and the machine goes down. From this moment on, everything runs in Ring 3 and must ask the kernel for permission via system calls whenever it wants to touch the file system, the network, or hardware.
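A system call is nothing more than a controlled jump across that boundary. As a small illustration (Python with `ctypes`, assuming a system with a C library), calling `getpid` through libc and through Python's own wrapper lands at the same kernel routine:

```python
import ctypes
import os

libc = ctypes.CDLL(None)        # the C library: thin wrappers around syscalls

# libc's getpid() traps into the kernel (Ring 0), which reads the process ID
# from its own bookkeeping and returns to Ring 3 with the result.
pid_from_libc = libc.getpid()
pid_from_python = os.getpid()   # Python's wrapper over the same system call

print(pid_from_libc, pid_from_python)   # identical
```

Run any program under `strace` and you see this boundary being crossed thousands of times per second.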
Eight-Core Multitasking for Hundreds of Processes
A modern cloud VM may have eight or sixteen vCPUs, but it's juggling hundreds of processes. That's where the scheduler comes in. It decides every few milliseconds which process runs on which core. Since kernel 6.6, the default Linux scheduler is EEVDF (Earliest Eligible Virtual Deadline First), which replaced CFS and aims to give every process its fair share of CPU time.
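You can nudge the scheduler's weighting yourself: `nice` is the classic knob (Unix; a minimal Python sketch):

```python
import os

before = os.nice(0)    # an increment of 0 just reports the current nice value
after = os.nice(5)     # be "nicer": 5 points lower scheduling priority

print(before, after)   # e.g. 0 5 (a higher nice value means a smaller CPU share)
```

A nice value is only a hint to the scheduler; cgroup CPU weights and quotas, the container-world equivalent, are hard guarantees.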
When you run kubectl get pods, every pod you see was placed by the same mechanism one level higher: the Kubernetes scheduler distributes pods across nodes, much like the Linux scheduler distributes processes across cores. Resource requests and limits in K8s are the siblings of nice values and cgroup quotas at the OS level.
Inside a single process there's often more than one path of execution—via threads. Threads share memory but keep their own stacks. That's powerful, yet perilous: if two threads write to the same variable at once without synchronization, race conditions appear. Modern languages try to prevent this—Go with goroutines and channels, Rust with an ownership model and borrow checker that reject data races at compile time.
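The same discipline applies in Python: without a lock, the read-modify-write hidden in `counter += 1` can interleave between threads; with one, the result is deterministic (a minimal sketch):

```python
import threading

counter = 0
lock = threading.Lock()

def work(n):
    global counter
    for _ in range(n):
        with lock:        # serialize the read-modify-write; drop the lock and
            counter += 1  # two threads can overwrite each other's update

threads = [threading.Thread(target=work, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)            # 400000, every single increment accounted for
```

CPython's GIL masks many races, but it makes no promise that `+=` is atomic; the lock does.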
When two entirely separate processes need to talk, IPC—inter-process communication—steps in. The simplest form is the pipe, championed by Doug McIlroy at Bell Labs and added to Unix in 1973, and still unbeaten: cat log.txt | grep ERROR is exactly that. Beyond pipes there are sockets and message queues—the entire microservice architecture is conceptually IPC, just stretched across the network.
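That shell pipeline can be rebuilt by hand: two processes, one kernel pipe between them (Python sketch, assuming `cat` and `grep` are installed):

```python
import subprocess

lines = "boot ok\nERROR: disk full\nshutdown\n"

cat = subprocess.Popen(["cat"], stdin=subprocess.PIPE,
                       stdout=subprocess.PIPE, text=True)
grep = subprocess.Popen(["grep", "ERROR"], stdin=cat.stdout,
                        stdout=subprocess.PIPE, text=True)

cat.stdin.write(lines)    # feed "the file" into cat
cat.stdin.close()
cat.stdout.close()        # close our handle; grep still reads its end

out, _ = grep.communicate()
print(out, end="")        # ERROR: disk full
```

The shell does exactly this wiring for you: it creates the pipe, forks both processes, and connects cat's stdout to grep's stdin.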
What Cloud Engineers Should Take Away
A container isn’t its own operating system. It’s a userspace process on a host kernel, isolated through namespaces and capped in resources via cgroups. That’s why containers boot in milliseconds instead of seconds like a VM—the real OS is already running. Once you grasp this, you debug OOMKilled pods, latency-critical workloads, and sidecar architectures with an entirely different perspective.
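The isolation is directly inspectable. On Linux, `/proc/self/ns` lists every namespace a process belongs to (Python sketch; the inode numbers will differ per system):

```python
import os

ns_dir = "/proc/self/ns"
namespaces = {name: os.readlink(os.path.join(ns_dir, name))
              for name in sorted(os.listdir(ns_dir))}

for name, ident in namespaces.items():
    print(f"{name:10} {ident}")   # e.g. pid        pid:[4026531836]

# Processes showing the same pid:[...] inode share a PID namespace.
# Run this inside a container and on the host: the inode numbers differ.
```

A container runtime does little more than create fresh entries here (plus a cgroup) before starting your process.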
The Fireship video “Every operating system concept in one video” delivers ten stages in eleven minutes—a visual crash course and a great companion. For deeper study, dive into the Linux Kernel Documentation and the classic “Operating Systems: Three Easy Pieces” by Remzi Arpaci-Dusseau, freely available online. If you want to dig into the security implications of Ring-0 drivers, the SecurityToday sister portal analyzes how AI agents find Linux kernel zero-days.
Frequently Asked Questions
Do I need OS knowledge if I only use Kubernetes?
Certainly by the time you see your first OOMKilled pod, your first CPU throttling issue, or your first sidecar logging bug. Kubernetes doesn't remove the operating system—it sits on top of it. You can only set resource limits, liveness probes, and network policies effectively if you understand how cgroups, namespaces, and the Linux scheduler work underneath.
Are containers lighter than VMs because they don’t have an operating system?
Containers do have an operating system—they just share the host’s kernel. What lives inside the container image is userspace: libraries, binaries, configuration. A VM brings its own complete kernel plus hypervisor overhead. That’s why containers boot in milliseconds while VMs take seconds.
Why does Linux kill my container instead of using swap?
Because cgroups enforce hard memory limits. Once a container exceeds its limit, the OOM killer fires a SIGKILL—no warning, no swap attempt. That's intentional: predictable performance across the cluster trumps best-effort survival of individual pods. The fix is to raise the limit, or to profile the workload and shrink its memory footprint.
What’s the difference between kernel space and user space?
Kernel space runs in Ring 0 with full hardware control—drivers, the scheduler, the filesystem module all live here. User space runs in Ring 3 without direct hardware access. Every web server, every database, every container lives in user space. To touch hardware or files, it must go through system calls—the only bridge between the two worlds.
What actually happens during shutdown?
The reverse of startup: PID 1 sends every process a SIGTERM (a polite request to exit). Processes that don't respond get a SIGKILL after a timeout. Then filesystems flush their journals, drivers release hardware, the kernel syncs dirty pages to disk, disables interrupts, and the firmware cuts the power—typically within a few seconds on a normal machine.
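This is exactly why well-behaved services install a SIGTERM handler: finish in-flight work, then exit. A minimal Python sketch, sending the signal to itself:

```python
import os
import signal

shutting_down = False

def on_sigterm(signum, frame):
    global shutting_down
    shutting_down = True          # a real service would stop accepting work,
                                  # flush buffers, close connections, then exit

signal.signal(signal.SIGTERM, on_sigterm)

os.kill(os.getpid(), signal.SIGTERM)   # what PID 1 (or Kubernetes) sends first
print(shutting_down)                   # True: no SIGKILL needed
```

A process that handles SIGTERM promptly never meets the SIGKILL timeout, which in Kubernetes defaults to 30 seconds (`terminationGracePeriodSeconds`).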
Source of title image: Pexels / Manuel Geissinger (px:2881229)