Where Seniors Earn Their Pay
Every post in this series has, at some point, used the phrase "undefined behaviour" and moved on. This is the post where we stop moving on. Why does C even have a concept like UB? Why are compilers allowed to be so aggressive about it? And how do practitioners catch it before it ships, given that it's designed to be invisible at compile time? Ten parts of pointers lead here.
What UB actually is
The C standard has three flavours of "not fully specified" behaviour, and people mix them up all the time. They are, in order of badness:
Implementation-defined behaviour is when the standard says "this is up to the compiler, but the compiler must document its choice." The size of int on your platform, for instance. Your code behaves predictably on any given compiler; you just have to know which one.
Unspecified behaviour is when the standard says "this could be any of these specific choices, and we don't guarantee which." The order of evaluation of function arguments, for instance. The program's output might vary between compilations, but it's still bounded.
Undefined behaviour is the third category, and it's different in kind. When your program invokes UB, the C standard places no constraints at all on what happens next. The program might produce the expected output, produce garbage, crash, delete files, send an email, or reformat your disk. The standard defines undefined behaviour as behaviour for which it "imposes no requirements", and it means that literally. This isn't a joke; compilers take the standard at its word.
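To make the first two categories concrete, here is a small sketch in C. The helper names (record, add, evaluate_args, int_size_in_bytes) are invented for this illustration and don't come from earlier parts. The order in which the two record() calls run is unspecified, so the log may read 1, 2 or 2, 1, but the result is bounded either way. The size of int is implementation-defined: fixed and documented for a given target.

```c
#include <stddef.h>

/* Hypothetical helpers: they log the order in which the arguments
   of add() are evaluated. */
static int call_log[2];
static size_t call_count;

static int record(int id) {
    call_log[call_count++] = id;
    return id;
}

static int add(int a, int b) { return a + b; }

/* Unspecified behaviour: C does not say whether record(1) or record(2)
   runs first. The two calls cannot interleave, so this is not UB, and
   the sum is 3 in either order. */
int evaluate_args(void) {
    call_count = 0;
    return add(record(1), record(2));
}

/* Implementation-defined behaviour: the value depends on the platform,
   but the compiler documents it and it is stable for that target. */
size_t int_size_in_bytes(void) { return sizeof(int); }
```

Calling evaluate_args() always returns 3; only the contents of call_log can differ between compilers or optimisation levels. Nothing comparable can be said about a program that invokes UB.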
The list of things that are UB in C is long. Dereferencing a null pointer. Reading an uninitialised variable. Signed integer overflow. Writing past the end of an array. Modifying a string literal. Using a dangling pointer. Violating the strict aliasing rule (from Part 8). Writing through a const cast-away when the original was const (Part 7). Use-after-free (Parts 5 and 10). Every single one of these is, in the standard's eyes, equally undefined.
Why compilers exploit UB
Here is the part that surprises people the most. When the standard says "behaviour is undefined," compilers don't treat that as "the programmer might have done something strange, let's be cautious." They treat it as a promise from the programmer that this situation never occurs, and they use that promise to optimise.
Consider this function:
    int foo(int* p) {
        int x = *p;       // implies p is not null
        if (p == NULL)    // ... so this branch is dead.
            return -1;
        return x;
    }
A reasonable programmer reads this and thinks "the null check comes too late, but it's still safe." A reasonable optimising compiler reads this and thinks something different. It sees *p in the function's first statement. Dereferencing null is UB. The programmer implicitly promised p is non-null by dereferencing it. Therefore the p == NULL check that follows can never be true. Therefore that branch is dead code. Delete it:
    // What the compiler may effectively emit:
    int foo(int* p) {
        return *p;    // null check gone
    }
If the caller passes null, the program crashes or does something worse. The safety check you added was silently removed by the compiler, because you'd already invoked UB by the time it ran. The compiler is correct according to the standard. Your mental model of the code is wrong.
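The repair is a matter of ordering. A minimal sketch, assuming the same contract as foo() (return -1 for a null argument); the name foo_checked is made up here:

```c
#include <stddef.h>

/* Same job as foo(), with the check moved ahead of the dereference.
   foo_checked is a hypothetical name used only for this sketch. */
int foo_checked(const int *p) {
    if (p == NULL)    /* nothing has been dereferenced yet, so this check stays live */
        return -1;
    return *p;        /* reached only when p is non-null */
}
```

Because no dereference precedes the test, the compiler has no promise to lean on and has to keep the branch.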
This sounds cruel. It isn't, actually. The logic is: if the compiler had to defensively assume every possible UB might happen, it couldn't optimise anything. Every array access would need bounds checks. Every pointer deref would need null checks. Every signed add would need overflow checks. The cost would be enormous, and C's whole selling point is "the abstraction is thin, you pay for what you use." UB exploitation is the price of that thinness.
The deal is: the standard gives the compiler strong permission to assume you wrote correct code. You, the programmer, promise not to break those assumptions. If you break them, the resulting code is not a bug in the compiler, it's a bug in your program that the compiler helpfully amplified.
Figure: Four real examples of UB exploitation, simplified but representative. In each, the source code looks defensive. After optimisation, the defensive part is gone, because the compiler concluded it was unreachable given the other operations in the function. The bug you were trying to guard against becomes the bug you can't guard against.
A tour of the classics (from this series)
Every major UB we've met in the previous ten parts traces back to the same root cause: the programmer made a claim the compiler believed, and the claim was false. Here's the whirlwind retrospective.
Dereferencing a null or dangling pointer (Parts 2, 5). Writing *p is an implicit promise that p points at a live object. Null fails that promise; a pointer to freed memory fails that promise; a pointer to a stack frame that already returned fails that promise. The compiler is allowed to assume all of these don't happen.
Use-after-free (Parts 5, 10). Once free(p) or delete p has run, the memory is no longer yours. The bits might still be there; reading them is UB. The allocator may have reused them for something else in the meantime. Smart pointers (Part 10) exist precisely to make this category of UB structurally harder to write.
Buffer overrun (Parts 3, 4). Accessing arr[n] when the array has fewer than n+1 elements is UB. The bytes you land on aren't "off the end, but readable"; they're UB-land. This is one of the most common sources of security vulnerabilities ever; read about stack smashing or heap corruption for the full story.
Writing through cast-away const (Part 7). If the underlying object was declared const, casting the const away and writing through the resulting pointer is UB. The compiler may have put the object in read-only memory, inlined its value, or skipped reads entirely.
Strict aliasing violation (Part 8). Accessing an object of one type through a pointer of an unrelated type is UB. The compiler assumes pointers of unrelated types don't refer to the same memory, and optimises on that assumption.
Wrong-type recovery from void* (Part 9). Casting a void* to the wrong pointer type and dereferencing it is UB; in practice you usually get garbage or a crash. The access is typically a strict aliasing violation too, and sometimes a misaligned one, layering one UB on another.
Signed integer overflow. One we haven't dwelled on, but it's worth naming because it's the UB programmers hit by surprise the most. INT_MAX + 1 is UB in C and C++. Unsigned overflow is defined (it wraps); signed overflow is not. Compilers may (and do) optimise loops, comparisons, and even security checks based on "signed overflow can't happen."
Uninitialised reads. Reading the value of a local variable before writing to it is UB. The variable's bits are whatever the stack slot happened to contain. Some compilers zero-init as a safety net; the standard doesn't require it.
Data races (modern C and C++). Two threads accessing the same variable without synchronisation, where at least one is writing, is UB. Not "reads and writes get interleaved in some unpredictable order"; full UB. The program may produce any output, crash, or miscompile.
Nine categories, one pattern. Every line of C you write carries an implicit set of promises. Keep them and the compiler works for you. Break them and the compiler works on you.
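Some of these promises have a mechanical way to keep them. Strict aliasing is a good example: when you need the raw bits of an object as another type, copying the bytes with memcpy avoids the forbidden access altogether. The sketch below assumes float is a 32-bit IEEE 754 type, which holds on mainstream platforms but is not guaranteed by the C standard; float_bits is a name invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size (checked here) and
   float uses the IEEE 754 single-precision format (not checked). */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Returns the object representation of f as an integer. */
uint32_t float_bits(float f) {
    uint32_t bits;
    /* The UB version would be: bits = *(uint32_t *)&f;
       That reads a float object through an unrelated lvalue type. */
    memcpy(&bits, &f, sizeof bits);   /* copies bytes; no aliasing violation */
    return bits;
}
```

Optimising compilers generally turn a fixed-size memcpy like this into a single register move, so the well-defined version rarely costs anything.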
Why UB is so hard to catch
The feature that makes UB dangerous is that it's not required to fail. The classic horror story goes like this: you write some code with UB. It happens to produce the right output on your machine, at -O0, on your compiler's current version. It passes your tests. It ships. Two months later, someone upgrades the compiler, or enables -O2, or runs it on ARM, or links with LTO, and it mysteriously breaks.
The reason UB is allowed to "work" is the same reason it's allowed to "fail": the standard places no constraints, which includes no requirement to produce observable failure. A well-behaved compiler might let your UB limp along for years. A new optimisation pass, later, notices that your program implied a certain constraint and rewrites the code accordingly. Nothing about the source changed; a dormant bug woke up.
This is why UB-related bugs are called "heisenbugs." They disappear when observed, return when you look away. They show up in release builds but not in debug. They surface after compiler upgrades. They manifest on machines you don't have. They make rational people question their sanity.
The tools we're about to discuss exist for exactly this reason: you cannot trust UB to announce itself. You have to go looking for it.
How practitioners actually catch UB
There is no single tool that catches all UB; that would amount to solving the halting problem. What exists instead is a collection of overlapping tools, each of which catches a specific class of bug with high reliability. Using them in combination is how a mature codebase stays honest.
AddressSanitizer (ASan)
Enabled with -fsanitize=address on GCC and Clang. The single most important tool in the list. ASan instruments every memory access in your binary and maintains a shadow map of which bytes are valid. When your code reads or writes a byte it shouldn't, ASan halts the program and prints a stack trace showing: where the bad access happened, where the memory was allocated, and (for use-after-free) where it was freed.
Covers use-after-free, double-free, heap, stack, and global buffer overflows, use-after-scope, use-after-return (which needs an extra runtime option on some toolchain versions), and more. Runs your program roughly 2x slower and with more memory, which is fine for test runs. This is the tool that turns Part 5's three villains (null, wild, dangling) from "mystery segfault" into "here's the exact line and the exact allocation site."
UndefinedBehaviorSanitizer (UBSan)
-fsanitize=undefined. The complement to ASan. Where ASan catches memory-access UB, UBSan catches arithmetic and language-level UB: signed overflow, shifts by a negative amount or by at least the operand's width, division by zero, misaligned pointer loads, invalid enum values, null dereferences, out-of-bounds indexing of arrays whose bounds are known at compile time, reaching code marked unreachable, and about twenty other flavours.
UBSan runs at near-native speed, so some projects even ship it in production (paying the runtime cost in exchange for crash-loud-early behaviour). Combining ASan and UBSan (they compose: -fsanitize=address,undefined) catches the overwhelming majority of UB your program can hit at runtime.
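UBSan reports these at runtime; the source-level fix is usually a guard placed before the operation. Here is a hedged sketch for two of the flavours listed above, division and shifts. The function names are hypothetical, and the shift guard assumes unsigned has no padding bits, which is true on mainstream platforms.

```c
#include <limits.h>
#include <stdbool.h>

/* Divides a by b, refusing the two inputs that are UB for int:
   division by zero, and INT_MIN / -1 (the quotient does not fit). */
bool checked_div(int a, int b, int *out) {
    if (b == 0)
        return false;
    if (a == INT_MIN && b == -1)
        return false;
    *out = a / b;
    return true;
}

/* Shifts x left by n, refusing shift counts that reach the type's width. */
bool checked_shl(unsigned x, unsigned n, unsigned *out) {
    if (n >= sizeof(unsigned) * CHAR_BIT)
        return false;
    *out = x << n;
    return true;
}
```

Both helpers report failure through the return value and leave *out untouched, so the caller decides what a rejected operation should mean.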
ThreadSanitizer (TSan)
-fsanitize=thread. Not compatible with ASan (they can't both run at once), but equally important if you write multi-threaded code. TSan detects data races and deadlocks. It logs every memory access and every synchronisation operation, and reports when two threads access the same address without proper ordering. Expect 5x to 15x slowdown, which is painful but manageable for CI runs.
Data races are the UB category that's hardest to find by inspection or testing, because they depend on timing. TSan turns them from "occasional production glitch" into "deterministic CI failure."
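Once TSan points at a race, the fix is usually a lock or an atomic. As a minimal sketch, assuming a C11 compiler with <stdatomic.h>, here is a counter that several threads could bump without a data race. The names hit_count, record_hit, and hits_so_far are invented for the example.

```c
#include <stdatomic.h>

/* A static atomic object starts out zero-initialised. */
static atomic_int hit_count;

/* Safe to call from several threads at once: the increment is a single
   atomic read-modify-write, so no update is lost and there is no race. */
void record_hit(void) {
    atomic_fetch_add_explicit(&hit_count, 1, memory_order_relaxed);
}

/* Reads the current count atomically. */
int hits_so_far(void) {
    return atomic_load_explicit(&hit_count, memory_order_relaxed);
}
```

Relaxed ordering is enough here because the counter doesn't guard any other data. If a reader used the count to decide that some other memory is ready, it would need acquire/release ordering or a mutex instead. With a plain int and hit_count++, the same code would be a data race, which is exactly what TSan reports.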
MemorySanitizer (MSan)
-fsanitize=memory, Clang-only. Catches uninitialised memory reads: not "reading memory you shouldn't have access to," which ASan catches, but "reading memory that was never written." Useful for code that does low-level work with buffers where the difference matters. Requires all dependencies to be built with MSan too, which makes it more annoying to deploy than ASan or UBSan.
Valgrind (and its memcheck tool)
The other-other option. Predates the sanitizers and works differently: instead of compile-time instrumentation, Valgrind runs your unmodified binary on a synthetic CPU and instruments every instruction as it goes. Its memcheck tool detects most of the heap bugs ASan does (use-after-free, double-free, heap overruns), plus reads of uninitialised memory, which ASan doesn't catch at all; it is weaker than ASan on stack and global buffer overruns. The trade-off is speed: typically tens of times slower than native. If your program is small enough that this doesn't matter, Valgrind is still excellent. For most modern code, ASan + UBSan is what people reach for first.
Static analysis
All of the above run at test time and need the bug to actually execute. Static analysers look at the source and reason about possible executions without running anything. Clang Static Analyzer, cppcheck, PVS-Studio, and Coverity are common. They're less precise than runtime tools (more false positives, some false negatives) but they catch bugs on code paths that your tests never exercise. Good in CI as a gate: don't merge if the analyser flags anything new.
Compiler warnings
The cheapest static analyser in the world is your existing compiler. Building with -Wall -Wextra -Wpedantic on GCC or Clang turns on a large family of warnings that catch obvious bugs at compile time: uninitialised-variable uses, unused results of functions that shouldn't be ignored, obvious buffer-size mismatches, questionable type conversions. Warnings aren't errors by default; promoting them with -Werror forces the codebase to stay clean.
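Adding -Wshadow, -Wconversion, and -Wnull-dereference catches more. For strict aliasing, GCC's -Wstrict-aliasing is already part of -Wall at level 3, the most precise level; the lower levels (-Wstrict-aliasing=2 or =1) flag more suspicious casts at the cost of more false positives. -Wcast-align helps in both languages, and C++ code can add -Wold-style-cast. You should know what your compiler can warn about.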
No single tool catches everything, and that's the point. Different classes of UB live in different regions of the program's behaviour, and different tools probe different regions. In a serious codebase, you'd run several of these in different CI pipelines to get coverage across the board.
A discipline for staying out of UB
Beyond tools, most UB is avoidable through habits. A short list of the habits that matter:
Initialise every variable. int x = 0; is a free lunch. Don't declare locals uninitialised and hope you'll write to them before reading. The UB category of "uninitialised read" is almost entirely preventable with this one rule.
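Use modern containers instead of raw buffers. std::vector, std::array, and std::string offer bounds-checked access through .at() and free their storage automatically, and vector and string grow on append instead of overrunning. Plain operator[] is still unchecked, and iterators or references into a vector can still dangle after it reallocates, so the containers narrow the problem without removing it. In C, which has none of these, you need buffer-carrying struct patterns with explicit length checks at every access.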
Use smart pointers. Part 10 was the whole argument: ownership in the type system means fewer chances to use-after-free. Don't free raw pointers in modern C++.
Turn warnings into errors. -Wall -Wextra -Werror is the absolute minimum. Anything the compiler can flag for free should be a build-stopper.
Run sanitizers in CI. Every test run. Not just on your machine when you remember. ASan plus UBSan in one job, TSan in another, and a nightly job on static analysis. The cost is a handful of CPU-hours per day; the benefit is catching bugs before users do.
Treat "works on my machine" as a warning sign for UB. If your code works at -O0 but breaks at -O2, the likely cause is UB. If it breaks on ARM but works on x86, likely UB. If it breaks after a compiler upgrade, likely UB. These are not coincidences; they are the signature of a dormant UB that just woke up.
When debugging a heisenbug, reach for the sanitizers first. Before adding print statements, before running Valgrind, build with -fsanitize=address,undefined and run your test. Most heisenbugs stop hiding when the sanitizers are watching.
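For the C side of the containers habit, the "buffer-carrying struct" can be as small as a pointer plus a length, with every access going through a checked helper. This is a sketch of one possible shape; int_span, span_get, and span_set are names invented here, not a standard API.

```c
#include <stdbool.h>
#include <stddef.h>

/* A pointer that travels with its element count. */
typedef struct {
    int   *data;
    size_t len;
} int_span;

/* Reads element i into *out. Returns false instead of reading out of bounds. */
bool span_get(int_span s, size_t i, int *out) {
    if (s.data == NULL || out == NULL || i >= s.len)
        return false;
    *out = s.data[i];
    return true;
}

/* Writes value to element i. Returns false instead of writing out of bounds. */
bool span_set(int_span s, size_t i, int value) {
    if (s.data == NULL || i >= s.len)
        return false;
    s.data[i] = value;
    return true;
}
```

A caller wraps an existing array, for example int_span s = { storage, 3 };, and from then on an off-by-one index becomes a false return value rather than a buffer overrun. The span does not own the memory, so lifetime is still the caller's responsibility.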
A word about C++ and safer languages
UB is a C heritage. C++ inherited nearly all of C's UB and added some of its own (notably around object lifetime, virtual-call timing, and exception unwinding). For a long time, this was just how systems programming worked: accept UB as the price of the thin abstraction, use tools to mitigate, and ship.
Recent languages have pushed back hard on this premise. Rust eliminates most memory UB at compile time through its borrow checker: it reasons about ownership and lifetimes the way C++ smart pointers try to, but statically, and refuses to compile code that might produce use-after-free or data races. You still have unsafe blocks for low-level work, and UB is still possible there, but the default path is safe.
Swift, Kotlin, and Go all make different trade-offs but share the same direction: narrow the surface area where UB is possible, at some runtime or compile-time cost. C++ itself is evolving: C++20 added std::span, which carries a length alongside the pointer, standard library implementations ship hardened modes that bounds-check container access, and Clang has an experimental -fbounds-safety extension for C. The Safe C++ proposal and Google's "secure C++" are live discussions about how far the language can go without losing its identity.
The pragmatic view for working programmers: the languages aren't going away, and you'll write C and C++ for years yet, because they run the infrastructure the newer languages depend on. The skill is to understand UB deeply enough to recognise it, avoid it, and catch it quickly when it happens. That's what this series has been building toward.
Summing up
Undefined behaviour is C's deal with compilers: you follow a specific set of rules, the compiler gets to generate fast code. The rules aren't trivial, and violating them produces not "a bug you can spot" but "behaviour the standard places no constraint on." The compiler is legally allowed to assume you followed the rules, and uses that assumption to optimise, which means a small UB can become a big runtime surprise.
Every major memory-related UB we've met in this series (use-after-free, buffer overrun, const violation, aliasing, wrong-type recovery, null deref, dangling, use-after-move) has the same underlying shape: the code made a promise the compiler believed, and the promise was false.
You catch UB through a combination of discipline (modern containers, smart pointers, initialised variables, compiler warnings as errors) and tooling. ASan catches memory-access UB; UBSan catches arithmetic and language-level UB; TSan catches data races; MSan catches uninitialised reads; Valgrind covers similar ground at different trade-offs; static analysis finds bugs before they execute. In a mature codebase, some subset of these runs in CI continuously.
The reason "works on my machine" is suspicious is that UB is allowed to work until it isn't. Optimisation levels, compilers, architectures, and library upgrades can all wake a dormant UB. Treat "suddenly broke" as evidence of UB, not of bad luck.
Modern languages are pushing back on UB by eliminating the ambiguity at compile time. That's the direction the industry is moving. But C and C++ aren't going away tomorrow, and for everyone who writes them, understanding UB isn't optional. It's what separates people who ship systems code from people who just write it.
Eleven posts, one idea: memory is explicit in C and C++, and every line you write takes a position on it. The series started with "what is a pointer"; it ends with "what happens when the rules governing pointers are broken." The middle is everything between.
If you've read all eleven parts, you've covered essentially the entire pointer curriculum that a working C or C++ programmer needs. You know what pointers are and where they live; you know the villains that come with them; you know how const communicates intent and how restrict communicates guarantees; you know how ownership is encoded in modern C++ and how to pick between unique_ptr, shared_ptr, and weak_ptr; you know how void* fakes polymorphism in C; and you know what UB is, why it's hard, and what tools catch it.
The next step isn't more reading. It's writing. Build something with dynamic memory. Ship it. Run it under ASan. Fix the bugs it finds. Read someone else's codebase and argue with their ownership decisions. These are the things that turn knowledge about pointers into fluency.
Thanks for sticking with it.
Test yourself
Seven questions on UB's nature, its most common forms, and the tools for catching it. This is the last quiz in the series; five correct means you've graduated.
B is true. This is the key insight: the compiler doesn't defensively guard against UB; it assumes you avoided it and uses the assumption for optimisation.
C is false. The worst thing about UB is that it's allowed to "work." Code with UB can produce correct output for years and then break after a compiler upgrade.
D is true. This is the heisenbug pattern. The source is unchanged; the compiler's new optimisation passes exploit a UB that was dormant before.
E is false. C++ inherits nearly all of C's UB and adds more (around object lifetime, virtual-call timing, exception unwinding, etc.).
Describe what the compiler may do to this function at -O2, and why.

    int process(int* p) {
        int v = *p;
        if (!p) {
            log_error("null pointer!");
            return -1;
        }
        return v * 2;
    }

Line 2 dereferences p. Dereferencing a null pointer is UB. Therefore, for any execution where line 2 is reached, p is not null. Therefore, !p on line 3 is always false. Therefore, the true branch (lines 4 and 5) is unreachable. Dead code elimination removes it entirely.

What the compiler may emit:

    int process(int* p) {
        return (*p) * 2;
    }

No null check, no log_error call. If the caller passes null, the program dereferences null and crashes (or worse). The "defensive" code was silently removed because it was unreachable under the compiler's assumptions.

The fix: check before dereferencing, not after. Writing if (!p) return -1; int v = *p; keeps the null check alive because the dereference is now after the check.
A developer reports a crash that appears only in release builds (-O2) and only on customer machines. Debug builds pass all tests; their local machine is fine. Which of the following is the most probable diagnosis?
B is correct. Debug builds typically turn off optimisations, and fresh memory in them often happens to hold zeros or a fixed fill pattern, which masks UB like uninitialised reads. Release builds with -O2 enable aggressive optimisation that turns dormant UB into visible crashes. "Breaks only in release" is the classic UB fingerprint.
C is unlikely. OS differences can cause portability bugs, but they're usually visible in debug builds too. The optimisation-level signature points elsewhere.
D is naive. Different memory layouts can expose UB (uninitialised reads happening to be zero on one machine and garbage on another), but the underlying problem is still UB. The fix is to remove the UB, not to blame the customer's machine.
The next step: build with -fsanitize=address,undefined, run the test suite, and read the sanitizer reports. That's the thing that turns a customer-only heisenbug into a CI failure you can fix.
Select every correct matching.
B: UBSan instruments arithmetic operations to check for overflow, div-by-zero, shift-out-of-range, and similar. Signed overflow is a core UB that UBSan catches reliably.
C: TSan logs every memory access and every synchronisation operation, and reports when two threads touch the same address without ordering. It's the standard tool for data races, though you can't run it alongside ASan.
D: MSan instruments every load to track whether the bytes were written. Valgrind's memcheck does similar work via dynamic binary translation. MSan is faster but Clang-only and needs dependencies to be MSan-built.
E: ASan catches both heap-buffer-overflow and stack-buffer-overflow. It puts "red zones" around each allocation (stack or heap) and traps on any access into them.
In practice, most projects run ASan+UBSan in one CI job and TSan in another. The separate jobs are because ASan and TSan instrument the same things and can't coexist.
    int sum_with_limit(int a, int b) {
        int s = a + b;
        if (s < a) {    // "overflow detection"
            return INT_MAX;
        }
        return s;
    }

The intent: compute a + b; if it overflowed (wrapping around and producing a value smaller than a), return INT_MAX as a safe ceiling. This would work for unsigned integers, because unsigned overflow is defined to wrap.

The problem: for signed integers, overflow is undefined behaviour. The compiler is allowed to assume that, for any valid execution, a + b does not overflow. Under that assumption, a + b < a holds exactly when b < 0, so the optimiser may rewrite the test as b < 0. For b >= 0 the check then never fires, no matter how large the operands are. (As written, the test also misfires for every negative b, because a negative b lowers the sum without any overflow.) The key point: the optimiser concluded that the overflow detection was redundant and removed it.

The compiler upgrade introduced a more aggressive version of this optimisation pass, which is why the bug didn't surface until then. Before the upgrade, the check was kept (maybe the old pass wasn't clever enough to spot it); after, it wasn't.

The correct overflow check: use __builtin_add_overflow (GCC/Clang), or test against INT_MAX and INT_MIN before adding:

    if (b > 0 && a > INT_MAX - b) return INT_MAX;
    if (b < 0 && a < INT_MIN - b) return INT_MIN;
    return a + b;

These checks happen before the overflowing operation, so they don't rely on the UB to detect it. UBSan would have caught the original code's UB the moment it ran on overflowing inputs.
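For completeness, the builtin route mentioned above might look like the sketch below. It relies on __builtin_add_overflow, a GCC and Clang extension, so it is not portable to every compiler; sum_saturating is a made-up name.

```c
#include <limits.h>

/* Adds a and b, clamping to INT_MAX or INT_MIN when the exact sum
   does not fit in an int. Requires GCC or Clang. */
int sum_saturating(int a, int b) {
    int result;
    if (__builtin_add_overflow(a, b, &result)) {
        /* Overflow is only possible when a and b share a sign,
           so b's sign tells us which limit was crossed. */
        return (b > 0) ? INT_MAX : INT_MIN;
    }
    return result;
}
```

The builtin reports whether the exact sum fits, without ever performing a signed addition that overflows, so the optimiser has no UB-based assumption it could use to delete the check.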
    void f() {
        int* p = static_cast<int*>(std::malloc(sizeof(int)));  // C++ needs the cast
        *p = 42;
        std::free(p);
        std::cout << *p;    // use-after-free
    }

A (ASan): yes. ASan marks p's bytes as "freed" after the free(), and traps on the subsequent read. You get a detailed report showing the allocation site, the free site, and the use-after-free site.
B (Valgrind): yes. memcheck tracks the same kind of information as ASan and catches use-after-free with similar precision. Slower, but correct.
C (Clang Static Analyzer): yes, usually. A use-after-free this simple is well within static analysis's reach. More elaborate use-after-free (where the free and the use are in different functions) may escape it, but this case is detected reliably.
D (TSan): no. TSan is for data races and synchronisation. Single-threaded use-after-free is not its domain.
E (compiler warnings): not reliably. GCC 12 and later have -Wuse-after-free, which can flag simple same-function cases like this one, but Clang has no equivalent warning, and once the free and the use sit in different functions the pattern is too inter-procedural for compiler warnings. Static analysis or runtime sanitizers are the right tools.
The practical recipe: compile-warnings-as-errors is the first line of defence; ASan in CI is the second; static analysis as a gate is the third. Each catches bugs the others miss.
    // Job 1: correctness sanitizers
    flags: -O1 -g -fsanitize=address,undefined
    runs:  full test suite

    // Job 2: concurrency sanitizer
    flags: -O1 -g -fsanitize=thread
    runs:  full test suite (ASan and TSan can't coexist)

    // Job 3: release build + static analysis
    flags: -O2 -Wall -Wextra -Werror
    runs:  static analysis (clang-tidy, cppcheck, or similar); full test suite at -O2; binary-size and performance regression checks

Why the split:
Job 1 catches the largest category of bugs: memory errors (use-after-free, buffer overruns, double-frees) and arithmetic or language-level UB (signed overflow, null derefs, alignment). Combining ASan and UBSan in one job is cheap (they compose) and comprehensive.
Job 2 is separate because ThreadSanitizer is incompatible with ASan; they instrument overlapping memory operations in conflicting ways. You have to pick one per build, so you run TSan separately to catch data races.
Job 3 tests what actually ships: optimised release builds behave differently from sanitized debug builds (sanitizers change allocation patterns and timing). Static analysis adds a third detection layer that finds bugs even in code paths the tests don't exercise. Running the test suite at -O2 also catches UB that only manifests after aggressive optimisation.
Optional additions: a nightly MSan job if uninitialised reads are a concern; a fuzz-testing job (libFuzzer, AFL) on any parser or deserialiser; a cross-platform matrix if you ship on multiple architectures. But three well-chosen jobs cover the vast majority of UB that a project is likely to produce.