Operating systems — the parts that matter
Processes, threads, memory, files, syscalls. What top is showing you. Build a tiny shell.
Prerequisites
01.5 (a low-level language: C or Rust)
Stack
Linux or macOS (WSL2 ok on Windows)C or Ruststrace (linux) / dtruss (mac)perf (linux) or Instruments (mac)
By the end of this module
- Read top, ps, and free output and explain every column.
- Trace a program's syscalls and explain what each one is doing.
- Build a tiny Unix shell that supports pipes and backgrounding, in under 300 lines.
- Reason about virtual memory, page tables, and TLBs at the diagram level.
- Use mutexes, condvars, and semaphores correctly without copy-pasting from the internet.
There is a particular flavor of confusion that hits programmers who have never opened the kernel curtain. They write a program that gets killed by the OOM killer and don’t know why. They see “load average: 12.5” and don’t know what it counts. They write multithreaded code and ship a deadlock. The fix for all of that is one focused pass through operating systems — not a semester of it, but the parts that show up in every job.
This module covers what you actually need: processes, threads, memory, files, and the syscalls that move data across the kernel boundary. You will build a tiny shell that exercises most of it. You will not implement a scheduler from scratch — you will read enough about scheduling to understand why your laptop sometimes feels slow.
The opinionated take: you don’t need to read OSTEP cover to cover. OSTEP is fantastic and it’s also 800 pages. Pick the right chapters, do the projects, move on.
Set up
You need a real Unix-like environment. On Windows, install WSL2 (Ubuntu) before you continue. On macOS, you have most of what you need; install the GNU coreutils for parity with Linux command names.
# Linux
sudo apt install build-essential strace gdb valgrind linux-tools-common
# macOS
xcode-select --install
brew install gnu-coreutils
Verify your environment:
gcc --version && strace -V 2>/dev/null || dtruss -h | head -1
Make a project:
mkdir tinyos && cd tinyos
mkdir shell traces notes
You will write the shell in C (preferred for OS work) or Rust if you went through 01.5 in Rust. Either is fine.
Read these first
In this order, and only the listed chapters:
- OSTEP — Remzi & Andrea Arpaci-Dusseau. free PDF · 8–10 hours · chapters 4–6 (processes, syscalls, limited direct execution), 12–14 (memory virtualization), 26–32 (concurrency). Skip the persistence section unless you want to. This is the single best free OS textbook.
- Robert Love — Linux Kernel Development. book · skim only · for a real-kernel grounding when OSTEP feels abstract.
- The Linux man pages, sections 2 and 3. online ·
man 2 fork,man 2 execve,man 2 waitare required reading. The man pages are the actual API. - Brendan Gregg — Systems Performance. book · only after the basics. The best applied-OS book in print.
After OSTEP’s process and concurrency chapters you have the conceptual grounding. Stop reading and start writing.
Processes vs threads
A process is the kernel’s unit of isolation: its own virtual address space, its own file descriptor table, its own credentials. A thread is a unit of scheduling within a process: shared memory, shared file descriptors.
The four syscalls every Unix programmer must know:
fork()— duplicate the current process. Returns 0 in the child, the child’s PID in the parent. Surprising on first encounter; obvious by the third use.execve()— replace the current process image with a new program. Almost always called right after fork in the child.wait()/waitpid()— block until a child exits, reap its status. Without this, you leak zombies.exit()— terminate the current process and return a status to the parent.
Drill: write a 30-line C program that forks five children, each prints its PID, parent waits for all five. If you understand this program — the order of the prints, the interleaving, why the parent’s wait loop matters — you understand the basic process model.
// shell/fork_demo.c
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void) {
for (int i = 0; i < 5; i++) {
pid_t p = fork();
if (p == 0) { printf("child %d pid %d\n", i, getpid()); return 0; }
}
while (wait(NULL) > 0) {}
return 0;
}
Run it ten times. Notice the prints come out in different orders. That is the scheduler at work.
Virtual memory, the diagram-level view
Every process sees a flat virtual address space. The MMU (memory management unit) translates virtual addresses to physical addresses through a per-process page table. The TLB (translation lookaside buffer) caches recent translations.
What you must internalize:
- Pages are 4 KiB on most systems. Memory is allocated and tracked in page-sized units.
- Page tables are big. A naive page table for a 64-bit address space is unimaginably huge, so they are multi-level (4 levels on x86-64).
- A page fault is an interrupt. When you access a virtual page that’s not mapped to a physical page (or has been swapped out), the CPU traps to the kernel, which fixes it up.
- The TLB matters. TLB misses are expensive — hundreds of cycles. Cache-friendly code is also TLB-friendly.
You don’t need to implement page tables. You do need to know what top’s VIRT, RES, and SHR mean (virtual size, resident-in-RAM, shared with other processes). When VIRT is huge but RES is small, that’s normal — VIRT counts everything mapped, including memory you’ve never touched.
Syscalls and the kernel boundary
A syscall is a controlled transition from user mode to kernel mode. On x86-64 Linux, the syscall instruction triggers it. Userspace cannot directly access hardware; it must ask the kernel.
The conventional wisdom: syscalls are slow because they cross the boundary, flush some CPU state, and run kernel code. Roughly 100–1000 ns per syscall, depending on what it does. This is why batched I/O (writev, io_uring) and zero-copy paths (sendfile, mmap) exist.
Tracing a real program’s syscalls is one of the most enlightening 10 minutes you can spend:
# Linux
strace -c ls /tmp
# macOS
sudo dtruss -c ls /tmp
The output is the syscall histogram: which calls, how many, time spent. You will be surprised how much of ls is getdents64 and stat.
File descriptors and the unified-IO model
In Unix, almost everything is a file. Sockets, pipes, regular files, terminals, even some device drivers — all accessed through small integers called file descriptors. fd 0 is stdin, 1 is stdout, 2 is stderr.
The four core operations:
read(fd, buf, n)— pull up to n bytes.write(fd, buf, n)— push up to n bytes.close(fd)— release the descriptor.dup2(old, new)— makenewrefer to whateveroldrefers to.
The shell pipe is dup2’s killer app. To wire ls | wc, the shell creates a pipe (returns two fds), forks twice, in the ls child it dup2s the write end onto fd 1, in the wc child it dup2s the read end onto fd 0. That is the entire trick. Once you’ve written it, pipes stop being magic.
The build: a tiny shell
Build a shell that supports:
- Running a single command with arguments.
- Piping commands with
|. - Backgrounding with
&. - Built-in
cd,exit.
The skeleton is roughly 200 lines of C. Build it in shell/tinysh.c over an evening or two.
// shell/tinysh.c — sketch only; you write the rest
int main(void) {
char line[1024];
while (printf("$ "), fflush(stdout), fgets(line, sizeof line, stdin)) {
// 1. tokenize on whitespace
// 2. detect & for backgrounding
// 3. detect | and split into pipeline stages
// 4. handle built-ins (cd, exit) without forking
// 5. fork/exec each stage, dup2 pipe ends, wait or don't
}
}
When yours runs cat README.md | grep todo | wc -l correctly and supports &, you have built a complete (if tiny) shell. That is roughly the same architecture as bash, dash, and zsh. Real shells add aliases, history, glob expansion, job control — all of which are layers on top of this core.
Concurrency primitives
The three you must know cold:
| Primitive | What it does | Use when |
|---|---|---|
| Mutex | Mutual exclusion; one holder at a time | Protect a shared data structure |
| Condvar | Wait until a condition becomes true | Coordinate producers and consumers |
| Semaphore | Counted resource | Limit concurrent access to N |
Anti-patterns to avoid:
- Naked spinlocks in user space. Use a mutex. The kernel knows how to put a thread to sleep.
- Sleeps as synchronization.
sleep(0.1)is not a substitute for a condvar. - Holding a lock across I/O. The latency tail will eat you.
Drill: write a bounded-buffer producer/consumer in C with pthreads. One mutex, two condvars (not_full, not_empty). Run it with three producers and two consumers. If the buffer ever overflows or underflows, your synchronization is wrong.
Scheduling, briefly
Linux uses CFS (Completely Fair Scheduler), which approximates “give every runnable thread a fair share of CPU.” The relevant facts for application programmers:
- Runnable threads compete for CPU. Threads in
Dstate (uninterruptible sleep) often mean blocked I/O — a common cause of “my system is slow.” niceadjusts priority weight. Most apps shouldn’t touch it.- Per-CPU run queues mean cache locality matters; the scheduler tries to keep a thread on the same CPU it last ran on.
You don’t need to implement a scheduler. You do need to read top and htop and explain the load average. (Hint: load average counts runnable + uninterruptible processes, averaged over 1/5/15 minutes. A load of 4 on a 4-core box is “fully busy”; on a 2-core box it’s “queueing up.”)
Going deeper
When you have specific questions, in this order:
- OSTEP — full text. free PDF · the persistence chapters (filesystems, journaling, log-structured) are excellent when you need them.
- Brendan Gregg — Systems Performance. book · the applied volume. Read it after OSTEP, not before.
- The Linux Programming Interface (Kerrisk). book · 1500 pages of the actual Linux API. Reference, not cover-to-cover.
- xv6. book + code · MIT’s teaching kernel. Read the source over a weekend; it’s tiny and complete.
Skip “What is an operating system” 5-minute YouTube videos. They will not help.
Checkpoints
If any one wobbles, redo the corresponding section.
- Walk through what
fork()returns to the parent vs the child. Why is the return value different, and how does that one quirk power the entire process model? - You see
topshowing one process at VIRT 8 GB, RES 200 MB. Explain what that means and whether you should be worried. - A program writes 10 MB to disk. Roughly how many syscalls, and what’s the cost difference between calling
write()once with 10 MB vs ten million times with one byte each? - Sketch the pipe + dup2 dance for
cmd1 | cmd2. Specifically: who creates the pipe, who closes which end, and what does the child fd table look like afterdup2? - You have a producer thread and a consumer thread sharing a queue. Name the primitives you need and why a single mutex without a condvar is the wrong choice.
If you can answer all five from memory, you’ve earned 02.4. Move on to 02.5 (Computer networks, end to end) — where the file descriptor model extends across machines.