Processes & Threads
Every program you run becomes a process — an instance of that program in execution. The OS creates processes, tracks them, and switches between them thousands of times per second.
What is a Process?
A process is more than just code. The OS wraps each running program in a container that holds everything needed to execute it: the program's code, its memory, and the OS metadata that describes it.
Each process gets its own isolated address space — it cannot read or write another process's memory directly. This isolation is fundamental to security and stability.
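To make the isolation concrete, here is a minimal sketch in Python. It assumes a Python interpreter is available at `sys.executable`; the names `counter` and `child_source` are illustrative only. The parent starts a separate process that changes its own `counter`, and the parent's variable is unaffected because the two live in different address spaces.

```python
import os
import subprocess
import sys

counter = 100  # lives in the parent process's address space

# Hypothetical child program: it creates and changes its *own* counter,
# then prints its PID and that counter.
child_source = "import os; counter = 0; counter += 1; print(os.getpid(), counter)"

# Run the child as a separate OS process and capture what it printed.
result = subprocess.run(
    [sys.executable, "-c", child_source],
    capture_output=True, text=True, check=True,
)
child_pid, child_counter = (int(x) for x in result.stdout.split())

print("parent pid:", os.getpid(), "counter:", counter)
print("child pid: ", child_pid, "counter:", child_counter)
```

The only way the child's value reached the parent was by being copied explicitly through the child's standard output; neither process touched the other's memory.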
Process Control Block (PCB)
The OS tracks each process using a Process Control Block (PCB) — a data structure in kernel memory that stores:
| Field | Description |
|---|---|
| Process ID (PID) | Unique identifier |
| State | New / Ready / Running / Waiting / Terminated |
| Program Counter | Address of the next instruction |
| CPU Registers | Saved register values during context switches |
| Memory info | Page table base address, limits |
| Open files | File descriptor table |
| Scheduling info | Priority, CPU time used |
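As a rough illustration, the fields in the table could be modelled as a Python dataclass. This is only a sketch: real kernels define the PCB as a C structure with many more fields, and every name here (`PCB`, `page_table_base`, `cpu_time_used_ms`, and so on) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class State(Enum):
    NEW = "new"
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"
    TERMINATED = "terminated"


@dataclass
class PCB:
    """Toy model of a Process Control Block; field names are illustrative."""
    pid: int                                   # Process ID
    state: State = State.NEW                   # Current process state
    program_counter: int = 0                   # Address of the next instruction
    registers: Dict[str, int] = field(default_factory=dict)   # Saved CPU registers
    page_table_base: int = 0                   # Memory info
    open_files: Dict[int, str] = field(default_factory=dict)  # fd -> file name
    priority: int = 0                          # Scheduling info
    cpu_time_used_ms: int = 0                  # Scheduling info


pcb = PCB(pid=42, open_files={0: "stdin", 1: "stdout", 2: "stderr"})
print(pcb)
```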
Process States
A process moves through these states during its lifetime:
- New — process is being created
- Ready — loaded into memory, waiting for CPU time
- Running — currently executing on a CPU core
- Waiting — blocked on I/O, a lock, or a timer
- Terminated — finished; OS reclaims resources
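The states above can be sketched as a small state machine. The set of allowed transitions below follows the conventional five-state model and is an assumption, since the text lists the states but not the edges between them; `ALLOWED` and `transition` are hypothetical names.

```python
from enum import Enum, auto


class State(Enum):
    NEW = auto()
    READY = auto()
    RUNNING = auto()
    WAITING = auto()
    TERMINATED = auto()


# Assumed transitions (conventional five-state model).
ALLOWED = {
    State.NEW: {State.READY},                                        # admitted
    State.READY: {State.RUNNING},                                    # scheduled
    State.RUNNING: {State.READY, State.WAITING, State.TERMINATED},   # preempted / blocks / exits
    State.WAITING: {State.READY},                                    # I/O, lock, or timer done
    State.TERMINATED: set(),                                         # final state
}


def transition(current: State, target: State) -> State:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target


# One possible lifetime: created, scheduled, blocks on I/O, resumes, exits.
state = State.NEW
for nxt in (State.READY, State.RUNNING, State.WAITING,
            State.READY, State.RUNNING, State.TERMINATED):
    state = transition(state, nxt)
print(state.name)
```

Note that in this model a waiting process never jumps straight back to running: it becomes ready first and waits for the scheduler to pick it.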
Context Switching
When the OS switches from process A to process B, it must:
- Save A's CPU registers and program counter into A's PCB
- Load B's saved registers and program counter from B's PCB
- Switch the memory mapping to B's address space
This is a context switch — pure overhead (no useful work is done). Modern OSes minimise it but can't eliminate it.
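The three steps above can be simulated in a few lines of Python. This is a toy model, not how a kernel does it (real context switches are written in assembly and C); the `CPU` and `PCB` classes and their fields are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PCB:
    pid: int
    program_counter: int = 0
    registers: Dict[str, int] = field(default_factory=dict)
    page_table_base: int = 0


@dataclass
class CPU:
    program_counter: int = 0
    registers: Dict[str, int] = field(default_factory=dict)
    page_table_base: int = 0  # which address space is currently mapped


def context_switch(cpu: CPU, old: PCB, new: PCB) -> None:
    # 1. Save the outgoing process's registers and program counter into its PCB.
    old.program_counter = cpu.program_counter
    old.registers = dict(cpu.registers)
    # 2. Load the incoming process's saved registers and program counter.
    cpu.program_counter = new.program_counter
    cpu.registers = dict(new.registers)
    # 3. Switch the memory mapping to the incoming process's address space.
    cpu.page_table_base = new.page_table_base


a = PCB(pid=1, page_table_base=0x1000)
b = PCB(pid=2, program_counter=200, registers={"r0": 7}, page_table_base=0x2000)

# Process A is running: the CPU holds A's live state.
cpu = CPU(program_counter=104, registers={"r0": 3}, page_table_base=0x1000)

context_switch(cpu, old=a, new=b)
print(a.program_counter, cpu.program_counter)  # 104 200
```

Everything `context_switch` does is bookkeeping: neither process makes progress while it runs, which is why the switch counts as overhead.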
Threads
A thread is a unit of execution within a process. Multiple threads share the same process memory — the heap, globals, and open files — but each thread has its own:
- Program counter
- Stack
- Register state
Creating a thread is much cheaper than forking a process: there is no address space to copy and no full PCB to set up.
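A short Python sketch shows the split between shared and per-thread data. The names `shared_results` and `worker` are illustrative. The list lives on the heap and is visible to every thread, while `local_square` is a local variable on each thread's own stack.

```python
import threading

shared_results = []              # one list on the heap, shared by all threads
results_lock = threading.Lock()  # guards the shared list


def worker(n: int) -> None:
    local_square = n * n         # local variable: private to this thread's stack
    with results_lock:
        shared_results.append(local_square)


threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Threads may finish in any order, so sort before printing.
print(sorted(shared_results))  # [0, 1, 4, 9]
```

No data had to be copied or sent between the threads; each one wrote directly into the same list.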
Processes vs Threads
| | Process | Thread |
|---|---|---|
| Memory | Own address space | Shared with siblings |
| Creation | Expensive (fork/exec) | Cheap (new stack only) |
| Communication | IPC (pipes, sockets) | Direct shared memory |
| Crash isolation | A crash doesn't affect others | One thread crash kills the process |
| Use when | Isolation matters (web server workers) | Parallelism within one task |
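To illustrate the Communication row, here is a sketch of inter-process communication over pipes in Python. The child program in `child_source` and the helper `sum_in_child` are hypothetical. The parent writes numbers into the child's standard input (one pipe) and reads the sum back from its standard output (another pipe).

```python
import subprocess
import sys

# Hypothetical child program: read whitespace-separated integers from stdin,
# print their sum to stdout.
child_source = "import sys; print(sum(int(x) for x in sys.stdin.read().split()))"


def sum_in_child(numbers):
    """Send numbers to a separate process over a pipe and read back the sum."""
    result = subprocess.run(
        [sys.executable, "-c", child_source],
        input=" ".join(str(n) for n in numbers),  # written to the child's stdin pipe
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


total = sum_in_child([1, 2, 3, 4])
print(total)  # 10
```

The data has to be serialised to text, copied through the kernel, and parsed again on the other side. Threads in one process skip all of that because they can read the same objects directly.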
User Threads vs Kernel Threads
- User-space threads (green threads, coroutines): managed by a runtime library, not the OS. Context switches are fast, but user-space threads multiplexed onto a single OS thread can't run in parallel across CPU cores.
- Kernel threads: the OS schedules them directly. True parallelism on multi-core systems; Python's `threading` module uses these (but the GIL limits Python-level parallelism).
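User-space threading can be sketched with plain Python generators and a round-robin scheduler. This is a toy illustration, not how any particular runtime implements green threads; `task` and `run_round_robin` are hypothetical names. Each `yield` is a cooperative switch point, and the OS never sees more than one thread.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: does one step of work, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # switch point: hand control back to the scheduler


def run_round_robin(tasks):
    """Run tasks in turn on the calling OS thread until all have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # resume the task until its next yield
            ready.append(current)  # still has work: back of the ready queue
        except StopIteration:
            pass                   # task finished; drop it


log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Switching here is just a function return and a generator resume, which is why it is fast. It also shows the limit: both tasks interleave on one OS thread, so they never execute at the same instant.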
Key Takeaways
- A process is an isolated execution container — code, memory, and OS metadata
- The PCB is the kernel's record of a process; CPU state is saved to it and restored from it on every context switch
- Threads share memory within a process — fast to create but require synchronisation
- Context switching has overhead; too many processes/threads hurt performance
- Race conditions arise when multiple threads modify shared state without coordination
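The last two points about synchronisation can be sketched in Python with a shared counter. The `Counter` class and `run` helper are illustrative names. An increment such as `value += 1` is a read-modify-write rather than a single atomic step, so without coordination two threads can read the same old value and one update is lost. Wrapping the increment in a `threading.Lock` makes each update happen as a unit.

```python
import threading


class Counter:
    """Shared counter whose increments are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def run(counter, n_threads, n_increments):
    """Increment the counter from several threads and return the final value."""
    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


final = run(Counter(), n_threads=4, n_increments=10_000)
print(final)  # 40000
```

With the lock in place the result is always `n_threads * n_increments`. Removing the lock may still give the right answer on some runs, which is what makes race conditions hard to find by testing alone.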