Discrete math, ranked by what shows up
The slice of math that earns its place in a CS curriculum: counting, induction, recurrence, graphs, modular arithmetic, basic probability.
Prerequisites
high-school math
Stack
a notebookPython 3.12 to verify by simulation
By the end of this module
- Use counting and pigeonhole arguments to bound algorithm behavior.
- Prove a property by induction, on integers and on structures.
- Solve a recurrence with the master theorem without memorizing it.
- Reason about graphs and modular arithmetic well enough to read the original Dijkstra and RSA papers.
- Compute expectations and apply linearity, Markov, and Chebyshev where they earn their keep.
Most discrete math courses are taught by pure mathematicians and read like a parade of definitions. The students who survive learn truth tables and Venn diagrams, then forget all of it the next semester. That is not a productive use of the time.
Here is the truth: there are roughly six topics in discrete math that show up repeatedly in CS work — in algorithm analysis, in cryptography, in distributed systems, in machine learning. Everything else is either niche or downstream of those six. This module covers them in priority order, with simulation in Python to make every claim concrete.
The opinionated take: prove a thing, then simulate it in Python, then check they agree. Pure proof without simulation is how students disengage. Simulation without proof is how they convince themselves of false things. Both together is how you actually learn.
Set up
You won’t write much code, but you’ll write some.
mkdir discrete && cd discrete
uv venv .venv && source .venv/bin/activate
uv pip install numpy matplotlib sympy
mkdir notes sims
Each topic gets a notes file (markdown is fine) and a sims/ script that verifies the claim numerically. The simulations are the difference between believing and knowing.
Read these first
In this order:
- MIT 6.042J — Mathematics for Computer Science. course · 4–6 hours of video · the gold standard. Lehman, Leighton, and Meyer’s notes are free.
- Susanna Epp — Discrete Mathematics with Applications. book · pick this up only when 6.042J’s prose isn’t clicking. Chapters on induction and counting are the strongest.
- 3Blue1Brown — Probability essentials. series · 90 minutes · for visual intuition on expectation and Bayes.
- Knuth, Graham, Patashnik — Concrete Mathematics. book · only after the above. Dense, wonderful, slow.
After 6.042J’s induction and counting lectures you have most of what you need. Stop reading. Start proving.
Counting, the everyday tool
Counting is what you do every time you analyze an algorithm. Permutations, combinations, the pigeonhole principle — these are the primitives.
The four facts you must know cold:
- Sum rule. If A and B are disjoint, the count of A or B is the sum.
- Product rule. Independent choices multiply.
- Permutations of k from n. n! / (n - k)!
- Combinations of k from n. n! / (k! · (n - k)!), often written C(n, k).
The pigeonhole principle: if you place n + 1 items in n bins, some bin has at least 2 items. Trivial-looking, surprisingly powerful. The proof that any hash function on n+1 inputs into n buckets must collide is one line of pigeonhole.
# sims/birthday.py — pigeonhole made concrete via the birthday paradox
import random
def trial(n=23):
days = [random.randint(0, 364) for _ in range(n)]
return len(set(days)) != n
print(sum(trial() for _ in range(10_000)) / 10_000) # roughly 0.507
The simulation produces ~0.507. The closed form gives the same number. Each verifies the other. That is the loop you want to be in.
Induction, the proof tool you’ll actually use
Most working CS proofs are induction. Loop invariants are induction. Recursive correctness is induction. Tree-traversal proofs are structural induction.
The shape of every induction:
1. Base case. Show the claim holds for the smallest input (n = 0 or n = 1).
2. Inductive step. Assume it holds for n; show it then holds for n + 1.
3. Conclude it holds for all n by induction.
The variant you’ll use almost as often is strong induction: assume the claim for all k less than or equal to n, then show it for n + 1. Strong induction is what you use to prove correctness of algorithms that recurse on subproblems of varying sizes (merge sort, fast exponentiation).
Practice prompt: prove that any tree on n nodes has exactly n - 1 edges. Then code a function that builds random trees and verifies the count. Both should agree.
Recurrences and the master theorem
When you analyze a recursive algorithm, you write a recurrence: T(n) = a · T(n / b) + f(n). The master theorem tells you the answer in three cases without you doing any algebra.
| Case | When | T(n) is |
|---|---|---|
| 1 | f(n) grows slower than n^(log_b a) | Θ(n^(log_b a)) |
| 2 | f(n) is roughly n^(log_b a) | Θ(n^(log_b a) · log n) |
| 3 | f(n) grows faster than n^(log_b a) | Θ(f(n)) |
Memorize the shape, not the formal statement. In practice you only need a handful of recurrences:
- T(n) = 2 T(n/2) + O(n) → O(n log n) — merge sort, the master theorem case 2.
- T(n) = 2 T(n/2) + O(1) → O(n) — tree traversal.
- T(n) = T(n/2) + O(1) → O(log n) — binary search.
- T(n) = T(n - 1) + O(n) → O(n^2) — naive recursion that does linear work per level.
Drill yourself: given a recursive Python function, write its recurrence in 30 seconds and apply the master theorem. Most algorithm analysis collapses to this.
Graphs, enough to read papers
You don’t need a separate graph theory course. You need:
- Definitions. Vertices, edges, directed vs undirected, simple, multi-, weighted.
- Degrees. The handshake lemma: sum of degrees is twice the edge count.
- Connectivity. A graph is connected if there’s a path between every pair.
- Trees. Connected acyclic graphs. n nodes, n - 1 edges, unique paths.
- DAGs and topological order. A DAG has at least one topological ordering iff it has no cycles.
- Bipartite graphs. 2-colorable iff no odd cycles.
- Planarity. You can draw it without edges crossing. Rarely matters in practice; common in theory exercises.
That’s about 80% of what you’ll see in algorithm papers. The rest you pick up in context — flow, matchings, spectral — when a specific problem demands them.
Modular arithmetic for hashing and crypto
Modular arithmetic is everywhere a computer touches integers: hashing, RNGs, hash maps, error correction, and especially cryptography.
You need:
- Modular addition and multiplication. (a + b) mod n and (a · b) mod n. Distributes through.
- Modular inverse. a^(-1) mod n exists iff gcd(a, n) = 1. Found via the extended Euclidean algorithm.
- Fermat’s little theorem. If p is prime and a is not divisible by p: a^(p - 1) ≡ 1 (mod p).
- Euler’s theorem. Generalization for non-prime modulus, using φ(n).
- GCD via Euclid. gcd(a, b) = gcd(b, a mod b). The single most elegant algorithm in CS, and it’s three lines.
- Chinese Remainder Theorem. Reconstruct an integer mod n from its residues mod the prime factors of n.
# sims/gcd.py — Euclid in three lines, all you ever need
def gcd(a, b):
return a if b == 0 else gcd(b, a % b)
If you understand the GCD and Fermat’s little theorem, the RSA paper is readable. That is a non-trivial unlock for an afternoon’s work.
Probability, the basics with teeth
You need expectation, linearity of expectation, variance, Markov’s inequality, Chebyshev’s inequality, and Bayes’ rule. That’s the kit.
The single most important fact: linearity of expectation holds even for dependent random variables. If X and Y are random, E[X + Y] = E[X] + E[Y], full stop. This is the fact that makes hashing analysis work, that makes randomized algorithms tractable, that gets you through the analysis of quicksort.
# sims/quicksort_avg.py — average comparisons in quicksort, by simulation
import random
def qs_count(arr):
if len(arr) <= 1: return 0
p = arr[0]
lo = [x for x in arr[1:] if x < p]
hi = [x for x in arr[1:] if x >= p]
return (len(arr) - 1) + qs_count(lo) + qs_count(hi)
n = 1000
trials = 100
avg = sum(qs_count(random.sample(range(n), n)) for _ in range(trials)) / trials
print(avg, "vs theory", 2 * n * (sum(1/i for i in range(1, n+1))))
The simulation lands within 1% of the closed form 2n · H(n). That is what proof + simulation looks like in practice.
Markov’s inequality (for non-negative X): P(X ≥ a) ≤ E[X] / a. Crude but always available. Chebyshev sharpens it when you also know variance: P(|X - μ| ≥ k σ) ≤ 1 / k^2. These two cover most real “tail bound” needs.
What to skip
Honest list of topics that get curricular oxygen but don’t pay for themselves at this stage:
- Most of formal logic beyond truth tables and basic predicate logic.
- Deep set theory beyond the intuitive level.
- Automata, formal languages, and the Chomsky hierarchy in detail. (Worth a one-day skim, no more.)
- Generating functions, unless you’re going into competitive programming or theoretical CS.
Come back to these when a specific need pulls you in.
Going deeper
When you have specific questions, in this order:
- MIT 6.042J — Lehman, Leighton, Meyer notes. free PDF · best free book on the topic.
- Concrete Mathematics (Knuth et al.). book · for sums, generating functions, asymptotics.
- Blitzstein & Hwang — Introduction to Probability. free PDF · the cleanest probability intro for CS students.
- Stanford CS103 lecture notes. page · proofs and discrete structures, well-paced.
Checkpoints
If any one wobbles, redo the corresponding section.
- Prove by induction that 1 + 2 + … + n = n(n+1)/2. Then write a Python check that agrees for n up to 1000.
- Apply the master theorem to T(n) = 3 T(n/4) + n. Which case, and what’s the answer?
- State the pigeonhole principle, then use it to argue that any hash function from 1 million keys into 999,999 buckets must have at least one collision.
- Compute gcd(252, 105) by hand using Euclid. Then find x, y such that 252 x + 105 y = gcd, using the extended algorithm.
- A coin lands heads with probability 0.6. You flip it 100 times. What’s the expected number of heads, and what’s a one-sigma upper bound on how far you can be from that expectation?
If you can answer all five without notes, you’ve earned 02.3. Move on to 02.4 (Operating systems — the parts that matter) — where the abstractions you’ve been programming on top of get pulled apart.