Welcome to the everyday problem Big O was invented to describe. This post is a practical, MySQL-flavoured tour of time complexity. We’ll keep the math light, the pictures clear, and we’ll finish with the exact algorithms your database uses to keep your queries fast (or slow).
What is Big O, really?
Forget seconds. Big O is not a unit of time. It’s a way of describing how the cost of an algorithm grows as the input grows. Specifically: if you double the data, does the work double? Stay the same? Quadruple? That shape is the algorithm’s complexity.
We call the input size n. A few shapes come up over and over in real systems:
- O(1) — constant. The size of
ndoesn’t matter. Accessing an array by index, or a hash lookup. - O(log n) — logarithmic. Doubling
nadds one more step. Binary search, B-tree traversal. - O(n) — linear. Work scales directly with size. A full table scan.
- O(n log n) — linearithmic. A sort over
nitems. - O(n²) — quadratic. For every row, look at every other row. A naive nested-loop join.
- O(2ⁿ) — exponential. One more row doubles the work. The ultimate danger class: combinatorial explosion, travelling salesman, brute-force subset enumeration.
The useful picture is this: as n grows, these curves diverge dramatically. Big O tells you which curve your query is riding.
n — wildly different fates. The curve your query rides is the single biggest predictor of whether it’ll still be fast next year.Little, medium, lots: the numbers that will ruin your week
Big O only matters because, at some size, the curves stop overlapping. Below is the same story with real numbers. We ignore constant factors — those matter in practice, but they don’t change the shape.
| Complexity | n = 100 | n = 10,000 | n = 10,000,000 |
|---|---|---|---|
| O(1) | 1 | 1 | 1 |
| O(log n) | 7 | 14 | 24 |
| O(n) | 100 | 10,000 | 10,000,000 |
| O(n log n) | 700 | 140,000 | ≈ 240,000,000 |
| O(n²) | 10,000 | 100,000,000 | 100,000,000,000,000 |
| O(2ⁿ) | ≈ 1.3 × 1030 | beyond atoms in the universe | unthinkable |
At n = 100 the count already dwarfs the number of atoms in your body. Exponential is not a number you compute with — it’s a number you avoid.
At n = 100 everything is fine — a nested loop over 100 × 100 rows is 10 K operations, basically free. At n = 10 million, the same nested loop is 100 trillion operations. Your query planner’s job is to never let that happen.
Notice something: O(log n) barely moves. Going from 100 rows to 10 million rows takes the count from 7 up to 24 — only 17 more operations for 100,000× more data. That’s the superpower of a B-tree index, and it’s why the first rule of MySQL tuning is “add an index.”
Why this matters specifically for MySQL
MySQL’s optimizer is, at heart, a machine for picking the algorithm with the best Big O for your actual data. Every time you run EXPLAIN, you’re looking at the optimizer’s answer to one question: “Which shape of curve should this query ride?”
Three things decide the answer:
- Which indexes exist — because each index unlocks an O(log n) access path.
- How many rows are expected — because at small
n, a scan can beat an index (the constants win). - How the tables join — because the difference between nested loop and hash join is the difference between O(n × m) and O(n + m).
Everything the optimizer does — index choice, join order, whether to use a temp table, whether to sort with filesort — is cost math. And cost math is just Big O with constants attached.
MySQL’s workhorse algorithm: the B+tree search
InnoDB stores every table as a B+tree on the primary key, and every secondary index as another B+tree. A B+tree of a billion rows is only about four levels deep. That’s the magic of O(log n) in action.
Here’s the search in motion. A query arrives (“find row with id = 42”), and the tree is walked from the root, one level per read, until it lands on the leaf page holding the row. With one billion rows, it’s still only four page reads.
This is why the single most impactful thing you can do for a slow query is usually “add an index on the column you’re filtering by.” You’re not making the query faster by a constant factor — you’re changing its complexity class from O(n) to O(log n). At 10 million rows, that’s a 400,000× difference in pages touched.
If you want to see this in motion with real MySQL semantics — clustered vs secondary index, covering vs non-covering — there’s an interactive version in the myflames project under docs/teach/index/btree.html.
The other shapes, as MySQL operators
O(n) — the full table scan
No usable index? MySQL reads every row. Double the rows, double the work. That’s type: ALL in EXPLAIN, and it’s the reason “this query was fine last year” is such a common support ticket.
EXPLAIN SELECT * FROM orders WHERE note LIKE '%refund%';
-- type: ALL
-- rows: 10,000,000
-- Extra: Using where
A LIKE '%something%' can’t use an index prefix — the wildcard at the front breaks the B-tree’s ordering. So the optimizer falls back to a scan: O(n). The fix is either FULLTEXT, a functional index, or rewriting the query so the wildcard is at the end.
O(n × m) — the nested loop join
For every row in the outer table, look up matching rows in the inner table. If the inner side has an index, each lookup is O(log m) and the whole join is O(n log m) — tolerable. If it doesn’t, each lookup is O(m) and the join is O(n × m) — catastrophic.
This is why the introduction of hash join in MySQL 8.0.18 was such a big deal. It gave the optimizer an O(n + m) option for cases where neither side had a useful index, turning a previously unshippable query into a merely slow one.
O(n log n) — the sort
When MySQL needs to return rows in a specific order and can’t get them pre-sorted from an index, it falls back to filesort. That step adds O(n log n) on top of whatever reading cost you already paid. For a million-row result set, that’s tens of millions of extra comparisons before a single byte goes back to the client.
This is where a sort-ordered covering index earns its keep. The classic “2-second query becomes 20 ms” speedup is actually two separate complexity wins stacked on top of each other:
1. “Sort-ordered” — the sort disappears entirely. A B-tree index is physically stored in key order, and InnoDB links the leaf pages together as a sorted list. If your ORDER BY col_x matches the index’s leading column, MySQL just walks the leaves in that already-sorted order — no comparison step, no temp buffer, nothing. The O(n log n) sort doesn’t get faster; it stops happening. Add LIMIT 20 and MySQL reads exactly 20 leaf entries and quits.
2. “Covering” — the per-row PK hops disappear. A secondary index’s leaves only contain the indexed columns plus the primary key. If your SELECT needs a column that isn’t in the index, MySQL has to do a second B-tree walk into the clustered tree for every matching row — an extra O(log n) per row, or O(k log n) total. A covering index has every column you asked for right there on the leaf, so that whole step vanishes.
Stacked, the complexity moves look like this:
| Step | Without index | With sort-ordered covering index |
|---|---|---|
| Row access | O(n) scan, or O(k log n) with PK hops | O(log n + k) — seek + walk |
ORDER BY sort |
O(n log n) | eliminated (0 ops) |
LIMIT k |
applied after the sort | applied during the walk |
That last row is the one that matters most for paginated queries: without the index, MySQL must sort all n rows even if you only want the top 20, because it can’t know which rows are the top 20 until it’s sorted them. With the index, LIMIT 20 stops the walk after 20 leaves. That is where the 100× speedup actually comes from.
O(1) — the dream (sort of)
Pure O(1) is rare in MySQL — even a primary-key lookup is O(log n) because it still walks the clustered tree. But amortized O(1) shows up in the InnoDB adaptive hash index, which builds an in-memory hash on top of frequently-accessed B-tree pages. When it hits, you skip the log-n tree walk entirely.
O(2ⁿ) — where exponential hides in MySQL
You almost never execute an exponential algorithm in MySQL — the optimizer works hard to keep it off your plate. But there’s one place where exponential work really does happen: query planning. Picking the best join order for N tables means weighing up to N! different orderings, which is in the same danger class as 2ⁿ. For 12 tables that’s nearly half a billion orderings to evaluate before the query starts running.
MySQL tames this with branch-and-bound pruning, on by default (optimizer_prune_level = 1). As the planner tries out partial join orders, it drops any branch whose cost already exceeds the cheapest complete plan found so far — there’s no point finishing a plan that’s already losing. That turns the theoretical N! down to thousands or millions of orderings actually costed. For most queries, it’s done in milliseconds and you never notice.
You do notice it on large joins (say, 15–20 tables) where the tables look similar enough that cost estimates can’t tell the good branches from the bad ones. Pruning loses its bite, and real planning time appears in the Optimizing / statistics / preparing stages (visible in SHOW PROCESSLIST) before a single row is touched. The exponential monster you avoided at runtime was waiting for you at plan time.
The EXPLAIN mindset: reading for complexity
Once you carry Big O around in your head, EXPLAIN reads differently. You stop asking “is this query fast?” and start asking “which complexity class is each row on?” Rough mapping:
EXPLAIN clue |
Complexity | Feeling |
|---|---|---|
type: const / eq_ref |
O(1) – O(log n) | great |
type: ref / range |
O(log n + k) | good |
type: index |
O(n) over index | depends on n |
type: ALL |
O(n) | risky at scale |
Using filesort |
+ O(n log n) | OK if n is small |
Using join buffer (Block Nested Loop) |
≈ O(n × m) | scary |
Using join buffer (hash join) |
O(n + m) | good |
Tools like myflames visualize this directly — the flame graph width is proportional to row count, so an O(n²) operator looks wider and hotter than an O(log n) one. The visual is just Big O wearing colour.
Takeaways
- Big O describes the shape of a query’s cost, not the number of seconds. Shapes diverge brutally at large
n. - Adding an index is usually a complexity change (O(n) → O(log n)), not just a speed tweak.
- Joins without a usable inner-side index are O(n × m). Hash join is O(n + m). That’s often the difference between “works” and “doesn’t.”
- Sorting is O(n log n). Pre-sorted indexes make sorts disappear entirely.
EXPLAINis a complexity diagnosis, not a speed check. Read it that way.
The next time a query is slow, skip the “what changed?” reflex and ask “what shape is it riding?” first. Ninety percent of the time, the answer — and the fix — are in the curve.
The floor is yours.
0 comments · Moderated · civil & on-topicFirst comment appears here once approved. Questions, corrections, and counterpoints welcome — just no self-promotion.