Pointers: Thinking in Memory

Aliasing

Two pointers, one piece of memory. This innocent-sounding arrangement quietly shapes how the compiler reorders your code, why reinterpreting bytes through a different type is a one-way ticket to undefined behaviour, and why one unassuming keyword can make your inner loop twice as fast.

The word itself

In English, an alias is just another name for the same thing. When someone goes by Bob at work and Robert on their tax return, both names refer to the same person. In C, two pointers alias when they refer to the same chunk of memory. Different names, same address, same object.

Here's the simplest possible example:

int x = 10;
int* p = &x;
int* q = &x;   // p and q alias. They name the same int.

*p = 42;         // writing through p ...
printf("%d\n", *q);   // ... is visible through q. Prints 42.

Nothing mysterious on the surface. The word "aliasing" sounds technical, but the idea is just "more than one pointer to the same thing." You already write aliasing code every time you pass a pointer to a function: the function's parameter and the caller's variable alias for the duration of the call.

So why does aliasing get its own chapter? Because aliasing is the thing compilers worry about when they try to make your code fast, and it's the thing that turns innocent-looking type casts into undefined behaviour. Understanding what the compiler assumes about aliasing is the difference between code that runs and code that runs three times faster, and between code that works and code that quietly stops working when the optimiser turns up.

Two pointers that refer to the same memory are aliases. Every write through one is visible through the other. The compiler has to guess which pairs of pointers in your code might alias, and it guesses pessimistically by default.

Why the compiler cares

Look at this function:

void add_to_all(int* arr, int* delta, int n) {
    for (int i = 0; i < n; i++) {
        arr[i] += *delta;
    }
}

Looks simple. The loop adds *delta to every element of arr. A natural optimisation is to read *delta once, into a register, and reuse it across the loop. That would save n-1 memory loads. Easy win.

Except the compiler can't do it. Why? Because delta might point inside arr. Imagine the caller does this:

int nums[] = {1, 2, 3, 4};
add_to_all(nums, &nums[0], 4);   // delta aliases arr[0]

Now the first iteration reads *delta (which is 1), writes arr[0] = 1 + 1 = 2, and in doing so changes what *delta reads next time. On iteration two, *delta is 2, not 1. The cached value would have been wrong. The compiler, not knowing whether the caller will do this or not, has to play safe: re-read *delta every iteration.

That's the cost of aliasing. The compiler's default assumption is that any two pointers to the same type might alias, so it can't hoist loads, can't reorder reads and writes freely, and can't fold redundant memory accesses into registers. Every pointer write is a potential surprise for every pointer read.
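To see that the cached read really would change the result, here is a small sketch that puts the literal loop next to a hand-hoisted one. The name add_to_all_hoisted is made up for this illustration; it does by hand what the compiler would like to do automatically.

```c
/* Literal version from the text: *delta is re-read on every iteration. */
void add_to_all(int* arr, int* delta, int n) {
    for (int i = 0; i < n; i++) {
        arr[i] += *delta;
    }
}

/* Hypothetical hand-hoisted variant: reads *delta once, up front.
   Only equivalent to add_to_all when delta does not point into arr. */
void add_to_all_hoisted(int* arr, int* delta, int n) {
    int d = *delta;
    for (int i = 0; i < n; i++) {
        arr[i] += d;
    }
}
```

Called on {1, 2, 3, 4} with delta pointing at element 0, add_to_all leaves {2, 4, 5, 6}, because *delta becomes 2 after the first write. The hoisted version leaves {2, 3, 4, 5}. The two only agree when delta lives outside arr, which is exactly what the compiler can't prove.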

The compiler optimises as if any two same-typed pointers might alias. Until it can prove they don't, every write is a wall and every read is fresh from memory.

The strict aliasing rule

Given that aliasing blocks optimisation, the C standard had a choice. Either leave things as they are and let the compiler stay conservative forever, or give the compiler some aliasing assumptions for free. The standard went with the second option.

The deal is called the strict aliasing rule, and it says, roughly: you can access an object of type T through a pointer of type T, or through a char*, but not through a pointer of some other unrelated type. If you violate this, behaviour is undefined, which in practice means the compiler is allowed to assume you didn't do it, and optimise on that basis.

The benefit is huge. With strict aliasing in force, the compiler can say: "this int* and that float* can never point at the same memory, because if they did, reading through one after writing through the other would be UB, and the user promises they don't write UB." That assumption unlocks real optimisation. Different-typed pointers are now guaranteed not to alias, and the compiler can reorder and cache across them freely.

Here's the canonical example of strict aliasing at work:

float f = 3.14f;
int* p = (int*)&f;      // pointing at a float through an int pointer
printf("%d\n", *p);      // UNDEFINED BEHAVIOUR

You've taken a float's bytes and told the compiler to read them as an int. That's not a conversion, that's a reinterpretation, and the standard forbids it outright. Most machines will happily go through with it and give you some integer made of the float's bits, but it's UB, and at high optimisation the compiler might do something stranger. It might assume p doesn't point at a float (because that would be UB, which you promised to avoid), and optimise accordingly.

The exceptions you're allowed

The rule has a small list of exceptions that keep ordinary code legal:

Same type. You can access a T through a T*. That's the normal case.
Signed/unsigned of the same type. You can access an int through an unsigned int* and vice versa. The standard treats these as compatible for aliasing.
A const or volatile qualified version of the type. A const int* and an int* can point at the same object without trouble.
A character type. char*, unsigned char*, and (in C) signed char* can legally be used to inspect the bytes of any object. This is the escape hatch for writing memcpy, hash functions, byte-level dumpers, and so on.
A struct or union that contains the type. Specific structural rules that let you navigate through aggregates sensibly.

Everything else is undefined. Reading a float through an int pointer: UB. Reading a struct through an unrelated struct's pointer: UB. Reading a long through a short*: UB. The rule is stricter than most beginners guess, and the failures tend to hide in release builds because optimisation is what makes the UB bite.
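Two of the exceptions above are easy to show in code. This is a minimal sketch; byte_sum and read_as_unsigned are names invented for the example, not library functions.

```c
#include <stddef.h>

/* Character-type exception: unsigned char* may inspect the bytes
   of any object, whatever its real type is. */
unsigned byte_sum(const void* obj, size_t size) {
    const unsigned char* bytes = obj;
    unsigned sum = 0;
    for (size_t i = 0; i < size; i++) {
        sum += bytes[i];
    }
    return sum;
}

/* Signed/unsigned exception: an int may be read through unsigned*. */
unsigned read_as_unsigned(int* p) {
    return *(unsigned*)p;
}
```

You could pass byte_sum the address of a float, a struct, or an array and it stays within the rules. Swap unsigned char for short or int in that function and the same loop becomes undefined for most callers.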

You can read an object through a pointer of its own type, through a compatible signed/unsigned or const variant, or through a char*. Everything else is undefined behaviour. The compiler relies on this to optimise.

Type punning (and why you should memcpy)

Sooner or later, you'll want to reinterpret some bytes. Maybe you're serialising a float into a network packet and you need its raw bits. Maybe you're reading a binary file and you want to view a buffer as a struct. This is called type punning, and there's a right way to do it.

The wrong way is the pointer-cast we just saw:

// WRONG: strict aliasing violation
float f = 3.14f;
unsigned bits = *(unsigned*)&f;   // UB

The right way is memcpy:

// CORRECT: defined behaviour, and usually just as fast (memcpy is declared in <string.h>)
float f = 3.14f;
unsigned bits;
memcpy(&bits, &f, sizeof(bits));

This looks like it ought to be slower, but in practice every modern optimising compiler recognises this pattern and compiles it down to a single register move. You pay nothing at runtime, and you get defined behaviour. This is the idiom the standard wants you to use, and it's what quality codebases use for every kind of type-punning need.

Why does this work when the pointer cast doesn't? Because memcpy reads bytes, and byte-level access to any object is one of the explicit exceptions in the strict aliasing rule. Inside memcpy, the source is treated as a stream of bytes, not as a float, so no type rule is violated. The compiler can see what you're doing and generate the same efficient code as the cast, but without the UB.
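Wrapped up as a pair of helpers, the idiom might look like the sketch below. The function names are hypothetical, and the code assumes a 32-bit IEEE-754 float, which the static assertion checks only as far as the size goes.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "float must be 32 bits wide");

/* Copy the float's bytes into an integer of the same size. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* The reverse direction: rebuild a float from its bit pattern. */
float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

On an IEEE-754 machine, float_to_bits(1.0f) gives 0x3F800000 and bits_to_float(0x40000000) gives 2.0f. Keeping the punning inside two small functions also means there's one place to look if the size assumption ever fails.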

In C, unions are also a legal option (the standard has said so explicitly since a C99 corrigendum, and C11 carries the wording forward):

union FloatBits { float f; unsigned u; };
union FloatBits fb = {.f = 3.14f};
unsigned bits = fb.u;         // legal in C

Note that this is a C guarantee. In C++ it's technically undefined behaviour to read from a union member other than the last one written, although most compilers tolerate it as an extension. When in doubt, across both languages, use memcpy. It's the universally defined choice.

Every type-punning need can be expressed as a memcpy. It's defined behaviour, it compiles to the same instructions as the cast you were tempted to write, and it works the same in C and C++. Use it.

restrict: promising the compiler that you won't alias

Remember the add_to_all function from earlier? The compiler couldn't hoist *delta out of the loop because delta might have aliased arr. C99 introduced a keyword to let you promise the compiler that you won't do that:

void add_to_all(int* restrict arr, int* restrict delta, int n) {
    for (int i = 0; i < n; i++) {
        arr[i] += *delta;
    }
}

The restrict keyword is a promise from you to the compiler: "for the lifetime of this pointer, if the object it points at is modified, every access to that object goes through this pointer (or one derived from it)." With that promise in hand, the compiler can now hoist *delta out of the loop, vectorise freely, and produce the optimised code you wanted all along.

The catch is that restrict is a promise you make, and the compiler trusts it without checking. If you lie, if the caller passes in overlapping pointers, you've written undefined behaviour. The program might work, might produce wrong results, might crash. The compiler did exactly what you said it could do; the bug is yours.

This is why restrict shows up all through the standard library's signatures. For instance:

void *memcpy(void *restrict dest, const void *restrict src, size_t n);

That restrict is the reason memcpy is allowed to be so fast: the standard declares it illegal to call with overlapping buffers, and the compiler can generate code that wouldn't handle overlap correctly. (If you do need to copy overlapping regions, the standard library gives you memmove, which handles overlap at some cost.) When you see restrict in an API, it's telling you "do not pass aliasing pointers into this function."
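A typical case where the buffers overlap is shifting elements within a single array. Here is a minimal sketch using memmove; shift_right_by_one is a hypothetical helper, and it assumes the caller has room for count + 1 elements.

```c
#include <stddef.h>
#include <string.h>

/* Move buf[0..count-1] to buf[1..count]. Source and destination
   overlap, so memmove is required; memcpy here would be undefined.
   Assumption: buf has space for at least count + 1 ints. */
void shift_right_by_one(int* buf, size_t count) {
    memmove(buf + 1, buf, count * sizeof *buf);
}
```

Starting from {1, 2, 3, 4, 0} with count 4, the buffer ends up as {1, 1, 2, 3, 4}. Element 0 is left as it was, ready to be overwritten with whatever you're inserting.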

When does restrict actually help?

Not every function benefits. The compiler already knows that two differently-typed pointers can't alias (thanks to strict aliasing), so restrict is redundant there. Where restrict earns its keep is when you have multiple pointers of the same type going into a function, and the compiler would otherwise have to assume they might overlap.

Classic candidates: numerical kernels, image filters, linear algebra routines, encoders and decoders, anything where the inner loop reads from one same-typed buffer and writes to another. In those loops, restrict can be the difference between a scalar loop and a fully vectorised one, and benchmark differences of 2x or more are common.
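As a concrete instance of that kind of kernel, here is a sketch of the familiar "a times x plus y" loop. The function name and parameter order are chosen for this example only; how much restrict helps depends on your compiler and target, so measure before relying on it.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for i in [0, n).
   The restrict qualifiers are the caller's promise that the x and y
   buffers do not overlap, so stores to y cannot change later x reads. */
void saxpy(size_t n, float a, const float* restrict x, float* restrict y) {
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Both buffers are float, so strict aliasing gives the compiler nothing to work with here. The restrict qualifiers are the only source of the no-overlap guarantee.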

A few more practical notes:

It's not available in standard C++. C++ borrows the idea via vendor extensions (__restrict in MSVC; GCC and Clang accept both __restrict and __restrict__) but the standard itself has no spelling for it. Use the extension your compiler provides.
It's a function-level promise. Typically you put it on function parameters; it says the pointer and its pointees won't be aliased during the function's execution.
It's easy to get wrong in refactoring. If you split a function and pass the same restrict-qualified pointer to two places where they can now alias, you've lied to the compiler without realising it. Keep restrict close to the loop that actually benefits.

restrict is a one-sided contract. You swear the pointer won't alias anything else in its scope; the compiler optimises on your word. Useful for hot numeric loops with same-typed inputs and outputs. Wrong promise, silent bug.

Practical debugging: when aliasing bugs strike

Aliasing bugs have a signature. Code works in debug builds. Code breaks in release builds. Adding a printf makes the bug go away. The numbers look almost right. You stare at the source, run it in gdb, and everything seems fine; you run it without gdb and it crashes. If this has ever happened to you, strict aliasing is a suspect.

What makes aliasing bugs hard is that the compiler's optimisations, not your code, are what expose the UB. At -O0, the compiler makes few assumptions and executes your code more or less literally. At -O2 or -O3, it starts aggressively using strict aliasing to reorder and cache, and that's when the UB in your cast finally matters. This is why "it worked fine yesterday" often means "I bumped the optimisation level."

A short toolkit for diagnosing:

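Compile with -Wstrict-aliasing (GCC). It's included in -Wall, and it only does anything when strict aliasing is enabled, which by default means -O2 and above. Levels 1 to 3 trade coverage against false positives: level 1 flags the most casts and raises the most false alarms, level 3 (the default) is the most precise. It misses plenty of cases, so a clean build proves little, but a warning is worth chasing. Clang accepts the flag for GCC compatibility but reports far less here.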
Build with -fno-strict-aliasing as a diagnostic. If your bug vanishes, strict aliasing was the cause. This flag tells the compiler to assume everything might alias everything, which kills some optimisations but makes the code behave as most people naively expect. Many large C codebases (notably the Linux kernel) build with this flag permanently.
Replace casts with memcpy. Every *(T*)p that punned a type becomes memcpy. Defined behaviour, usually same machine code.
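UBSan (-fsanitize=undefined). The undefined-behaviour sanitiser doesn't check the strict aliasing rule itself, but it catches UB that often travels with bad casts, such as misaligned loads and stores through a cast pointer, at runtime and with source-line reports. A clean UBSan run rules out the neighbours of the bug, not the aliasing violation.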

The general lesson is: if you're reinterpreting bytes, use memcpy. If you're telling the compiler two pointers don't alias to unlock an optimisation, use restrict and make sure you're not lying. If a program behaves differently at different optimisation levels, suspect UB first and aliasing second.
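The "replace casts with memcpy" step matters most in buffer-parsing code, where a cast like *(uint32_t*)(buf + 1) breaks the aliasing rule and is usually misaligned as well. A minimal sketch of the replacement follows; load_u32 and store_u32 are hypothetical helper names, and both use the host's byte order.

```c
#include <stdint.h>
#include <string.h>

/* Read a uint32_t from any byte offset, aligned or not,
   in the host's byte order. */
uint32_t load_u32(const unsigned char* p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a uint32_t to any byte offset, in the host's byte order. */
void store_u32(unsigned char* p, uint32_t v) {
    memcpy(p, &v, sizeof v);
}
```

A value written with store_u32(buf + 1, v) reads back unchanged through load_u32(buf + 1), whatever the alignment of buf. If the data comes from a file or the network, you'd still need to handle byte order separately.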

A brief C++ note

C++ inherits C's strict aliasing rule almost word for word, with one important addition: std::bit_cast (C++20). It's the statically-typed, constexpr-friendly spelling of "reinterpret these bytes as that type." It replaces the memcpy idiom for most uses, with cleaner syntax:

// C++20: clean and checked at compile time (std::bit_cast lives in <bit>)
float f = 3.14f;
auto bits = std::bit_cast<unsigned>(f);

It requires the source and destination types to be the same size and both trivially copyable, which the compiler checks for you. No UB, no memcpy boilerplate. If you're writing modern C++, this is the preferred tool.

A second C++ note: unions for type punning are not portable in C++, even though they work in C. The C++ standard says reading from a union member other than the last one written is undefined; GCC documents it as a supported extension and other major compilers generally tolerate it, but the standard gives you no guarantee. In C++ code, reach for bit_cast (C++20) or memcpy (any version). Avoid unions unless you genuinely need a tagged-union type.

Summing up

Two pointers alias when they name the same memory. Aliasing is fine; what's not fine is lying to the compiler about it, which you can do in two ways. You can cast through an incompatible type (violating strict aliasing), or you can put restrict on a parameter that really does alias. Both are UB, both are hard to debug.

The strict aliasing rule lets the compiler assume that different-typed pointers don't refer to the same memory. That unlocks optimisation, but it makes casts-through-types undefined behaviour. When you need to reinterpret bytes, memcpy is the defined-behaviour idiom, and every modern compiler turns it into efficient code.

restrict is the opposite direction: you tell the compiler your same-typed pointers don't alias, and it optimises accordingly. Use it in numerical inner loops where the default assumption of possible aliasing is what's stopping the compiler from caching values and vectorising.

When aliasing bugs appear, they usually look like optimisation-level-dependent heisenbugs. Build with strict-aliasing warnings turned on, reach for -fno-strict-aliasing to bisect, and replace pointer-cast punning with memcpy.

What's next

Part 9 moves into C++ and picks up the thread of smart pointers. We've now seen that raw pointers can lie about aliasing, can dangle, can leak. Smart pointers encode ownership directly in the type, so the compiler (not your memory, not your comments) tracks who's responsible for freeing what. It's one of the few places where C++ actually makes memory management easier than C.

You've seen how two pointers can secretly share memory. Next: pointers that carry ownership in their types, so the sharing is no longer a secret.

Test yourself

Seven questions on the strict aliasing rule, type punning, restrict, and aliasing debugging. Five correct means you're ready for Part 9.

Which of these accesses are legal under the strict aliasing rule? (Select all that apply.)

int x = 42;
float f = 3.14f;

// (A)
unsigned int* up = (unsigned int*)&x; unsigned v = *up;

// (B)
char* cp = (char*)&f; char c = *cp;

// (C)
int* ip = (int*)&f; int n = *ip;

// (D)
const int* cip = &x; int n = *cip;

Legal: A, B, D. Undefined behaviour: C.
A is legal. The standard allows signed and unsigned versions of the same integer type to alias. Reading an int through an unsigned int* is fine.
B is legal. A char* can legally read the bytes of any object. This is the byte-inspection exception, and it's what memcpy depends on.
C is UB. int and float are unrelated types. The cast is syntactically allowed, but the dereference violates strict aliasing. The bit pattern of 3.14f does not become a sensible int; at high optimisation, the compiler may do something entirely unexpected.
D is legal. Adding const is always safe for aliasing. You can always read an object through a more-qualified version of its type.

You need to extract the raw IEEE-754 bit pattern of a float as a uint32_t. Which of the following is the recommended approach in portable C?

C is correct.
A is UB. A cast through an unrelated pointer type is exactly the strict aliasing violation we've been warning against. It may appear to work; it may silently misbehave at -O2.
B is wrong. This doesn't extract bits; it performs a value conversion from float to uint32_t. (uint32_t)3.14f yields 3, not the bit pattern 0x4048F5C3.
C is correct. memcpy is defined, portable, and compiles down to a single move on any modern optimising compiler. This is the idiomatic answer.
D is meaningless. That casts the address of f to an integer, not the contents. Useful for printing pointer values, not for reading IEEE bits.

A function is declared:

void blend(int* restrict dst,
           const int* restrict a,
           const int* restrict b,
           int n);

Which of the following calls are safe? (Select all that apply.)

Safe: A, C. Undefined behaviour: B, D.
A is safe. All three pointers point at distinct, non-overlapping objects. This is exactly what restrict expects.
B is UB. dst has restrict, and so does a. Passing the same buffer for both violates the restrict contract: the function is permitted to write through dst assuming no read through a sees that write.
C is safe. This one surprises people. Two read-only inputs being the same buffer is fine, because the function's writes only go to dst. Neither a nor b is written through, so the restrict promise (no write-then-read through aliased pointers) isn't violated. The formal definition of restrict only constrains objects that are modified during the call, so restrict parameters that are only ever read through may share a buffer.
D is UB. dst and a overlap in memory, which directly contradicts the restrict promise on both.
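The read-only sharing case from answer C can be sketched like this. The quiz only gives blend's signature, so the body below (an element-wise average) is invented for the illustration.

```c
/* Hypothetical body for the quiz's blend: averages a and b into dst.
   Only dst is written; a and b are read-only inside the function. */
void blend(int* restrict dst,
           const int* restrict a,
           const int* restrict b,
           int n) {
    for (int i = 0; i < n; i++) {
        dst[i] = (a[i] + b[i]) / 2;
    }
}
```

A call such as blend(out, src, src, n) is defined, because the src buffer is never modified during the call. Passing out as either input is what breaks the contract.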

A colleague says: "My code works at -O0 but segfaults at -O2. The compiler must have a bug." What's the most likely explanation, and how would you investigate?

The code almost certainly has undefined behaviour. Strict aliasing is a strong suspect. Compiler bugs that cause optimisation-level-dependent crashes are very rare; UB in user code is very common. The optimiser exposes UB that the unoptimised build happened to tolerate. Aliasing UB is a classic culprit here because it's invisible in debug builds (no optimisation means no aliasing assumptions) but suddenly matters at -O2.

Diagnostic steps, in order:
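1. Rebuild with -O2 -Wall -Wextra -Wstrict-aliasing=2. Check every warning. (GCC only issues strict-aliasing warnings when strict aliasing is enabled, so keep optimisation on.)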
2. Try -O2 -fno-strict-aliasing. If the crash goes away, strict aliasing is confirmed.
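3. Run under UBSan (-fsanitize=undefined). It won't flag a strict aliasing violation directly, but it catches UB that often accompanies one (misaligned accesses through cast pointers, signed overflow, out-of-bounds indexing) and prints the source line.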
4. Grep the code for C-style pointer casts between pointer types ((T*)) and for reinterpret_cast in C++. Each one is a suspect; replace with memcpy or bit_cast.
5. Only after all of the above comes up empty should you suspect the compiler. File a bug with a minimal reproducer; expect the maintainers to show that your code was UB.

Why does the standard library declare memcpy as

void *memcpy(void *restrict dest,
               const void *restrict src,
               size_t n);

and what should you call instead if you need overlapping copies?

B is correct. The restrict qualifiers on memcpy's parameters tell the compiler (and document to the caller) that the source and destination must not overlap. With that promise in hand, the compiler can emit a fast forward copy, often using wide SIMD loads and stores that would be incorrect if the buffers overlapped.

For the overlapping case, C provides memmove, which is specified to handle overlap correctly. It's slightly slower than memcpy because it has to check direction and potentially copy backwards, but it's defined. strcpy copies null-terminated strings and makes no overlap guarantees of its own; don't use it as an overlap-safe copy.

Read this C99 code. What does the compiler output for the expression in printf at -O2, assuming strict aliasing is enforced?

int compute(int* a, float* b) {
    *a = 1;
    *b = 2.0f;
    return *a;
}

// Caller:
int x;
printf("%d\n", compute(&x, (float*)&x));

D. The caller violates strict aliasing. Behaviour is undefined. The cast (float*)&x produces a float* that names an object of type int. The moment compute dereferences b to write 2.0f, the program is accessing an int through a float*. That's forbidden under strict aliasing, and the compiler is permitted to assume it didn't happen.

In practice, under -O2, the compiler will assume a and b can't alias (different types!) and may forward the 1 it just stored through a straight into the return value without re-reading memory, returning 1 even after *b has clobbered the memory. Under -O0, the write happens through memory and the return reads back whatever bits the float write left behind, so the function might return the IEEE pattern of 2.0f as an int (0x40000000, which is 1073741824). Both outcomes are legal. The behaviour is undefined.

The fix: don't type-pun through pointer casts. Use memcpy if you really need to reinterpret bits.

You're writing a vector operation that reads two input arrays and writes to an output array:

void vec_add(float* out, float* a, float* b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

Benchmarking shows the loop isn't being auto-vectorised. Rewrite the signature to give the compiler what it needs, and explain the tradeoff.

Add restrict to all three pointer parameters.

void vec_add(float* restrict out,
             float* restrict a,
             float* restrict b,
             size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

Without restrict, all three pointers are float*, and the compiler must assume that out might overlap with a or b. That assumption blocks vectorisation because a SIMD store could clobber values the next iteration wants to read. With restrict, the compiler is permitted to assume no overlap and can emit wide SIMD code (4 or 8 floats per instruction).

The tradeoff is on the caller. They must now guarantee that out doesn't overlap with a or b. Calling vec_add(buf, buf, x, n) to do an in-place add is no longer safe; it's UB. If you need the in-place variant, provide a second function without restrict, or document the restriction clearly so callers don't trip over it. A common pattern is to have two functions: the restrict one for the fast path, and a slower overlap-safe one for the general case, modelled on memcpy vs memmove.
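The two-function pattern might look like the sketch below. The name vec_add_overlap_ok is made up for this example, and the inputs are const-qualified here to make the read-only intent visible, which goes slightly beyond the signature in the question.

```c
#include <stddef.h>

/* Fast path: the caller promises out does not overlap a or b. */
void vec_add(float* restrict out,
             const float* restrict a,
             const float* restrict b,
             size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* General path (hypothetical name): no restrict, so out may be the
   same buffer as a or b, at the cost of a more cautious compiler. */
void vec_add_overlap_ok(float* out,
                        const float* a,
                        const float* b,
                        size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

An in-place add is then written as vec_add_overlap_ok(buf, buf, x, n), and the fast version stays reserved for callers who know their buffers are separate.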

How did you do?

Five or more correct and you're ready for Part 9. Fewer than that, and the common trouble spots are Q3 (restrict with multiple parameters) and Q6 (why the UB originates with the caller's cast, even though the offending access happens inside the function). Re-read "The strict aliasing rule" and "restrict" sections, then try again.
