Addresses, & and *
The syntax you've been dreading, explained by what it actually means. Three characters do almost all the work, and the asterisk is playing two roles at once.
The cast of characters
In the last post we built the motivation for pointers without writing a single line of pointer code. This post is where we finally earn the right to show syntax, because you now know what it's trying to say. Three symbols do almost everything in this territory. It's honestly embarrassing how small the vocabulary is compared to how much confusion it generates.
& is the address-of operator. Put it in front of a variable and you get back its address.
* is the dereference operator. Put it in front of a pointer and you get back the thing it points to.
int*, char*, float* are types: "pointer to int", "pointer to char", and so on.
Three things. That's the whole alphabet. Every complicated pointer expression you've ever stared at is a combination of these. The trouble is that the asterisk shows up twice for two totally different reasons, and every beginner has to learn to tell the two uses apart.
The address-of operator: &
Let's start with the simpler one. When you write &x, you're asking the compiler "give me the address of x." That's it. The result is a value, specifically a value of type "pointer to whatever x is."
    int x = 42;   // x is a normal int, living at some address the compiler chose
    &x            // this expression evaluates to x's address, e.g. 0x7ffe4a1b
                  // type of &x is int*, read "pointer to int"
You've almost certainly seen & before without thinking of it as an operator. Every scanf call uses it: scanf("%d", &n). The reason you write &n and not n is that scanf needs to modify your variable, and to do that it needs to know where your variable lives. You're handing it the address so it can reach in and write there.
If you tried scanf("%d", n), you'd be passing the current value of n, which is garbage if n hasn't been set yet, and scanf would interpret that value as an address and try to write to it. That's where many mysterious crashes in beginner C programs come from. The missing & is the bug. You've been doing pointer programming all along without knowing the vocabulary.
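To make the "hand over the address" idea concrete, here is a minimal sketch of a function that, like scanf, can only do its job because it receives an address. The names store_answer, parse_int and demo_address_of are made up for this illustration. sscanf is the standard-library sibling of scanf that reads from a string instead of the keyboard.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: stores 42 in whatever int lives at `dest`.
   It can only do that because the caller handed over an address. */
void store_answer(int* dest) {
    *dest = 42;
}

/* Parses a decimal int out of `text`. sscanf, like scanf, needs the
   address of the variable it should fill in. Returns -1 on failure,
   which is an arbitrary choice for this sketch. */
int parse_int(const char* text) {
    int n = 0;
    if (sscanf(text, "%d", &n) != 1) {
        return -1;
    }
    return n;
}

void demo_address_of(void) {
    int n = 0;
    store_answer(&n);                 /* pass the address, not the value */
    assert(n == 42);                  /* the callee changed our variable */
    assert(parse_int("17") == 17);
}
```

If store_answer took a plain int instead of an int*, it would receive a copy of n, and the caller's variable would never change.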
The type: int* and friends
Once you have an address, you need a place to put it. That place is a variable of pointer type.
    int x = 42;
    int* p = &x;   // p is a pointer to int. It holds the address of x.
Read int* as "pointer to int." The type tells the compiler two things at once. First, p is a pointer, so it's the same size as any other object pointer on a given machine (8 bytes on typical 64-bit systems, regardless of what it points to). Second, what lives at the address is an int, which tells the compiler how many bytes to read when you dereference, and how to interpret them.
This is why int* and char* are different types even though both are 8 bytes. They point to different sized things. Pointer arithmetic, casts, and strict aliasing all depend on this distinction. More on all of that later. For now, just know that the type sticks with the pointer.
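A quick way to see the two halves of a pointer type is to ask sizeof about the pointer and about what it points to. This is a small sketch with a made-up function name (print_sizes). The exact pointer sizes it prints depend on your platform, so no output is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

void print_sizes(void) {
    int   i = 42;
    char  c = 'A';
    int*  ip = &i;
    char* cp = &c;

    /* The pointers themselves: the same size on mainstream platforms. */
    printf("sizeof(ip)  = %zu\n", sizeof(ip));
    printf("sizeof(cp)  = %zu\n", sizeof(cp));

    /* What they point at: the sizes differ, and the pointer type knows it. */
    printf("sizeof(*ip) = %zu\n", sizeof(*ip));
    printf("sizeof(*cp) = %zu\n", sizeof(*cp));

    assert(sizeof(*ip) == sizeof(int));
    assert(sizeof(*cp) == 1);   /* sizeof(char) is 1 by definition */
}
```

sizeof does not evaluate its operand here, so sizeof(*ip) never actually follows the pointer. It only asks the compiler what type lives at the other end.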
The dereference operator: *
Now the fun part. Once p holds an address, how do you actually use what's at that address? You dereference with *.
    int x = 42;
    int* p = &x;
    int y = *p;   // read: follow p, give me what's there. y is now 42.
    *p = 99;      // write: follow p, put 99 there. x is now 99.
That last line is the whole point of pointers, right there. You wrote through p and it changed x, even though you never mentioned x. The pointer was holding x's address, and dereferencing it reaches back to x's location in memory.
When you first see this, it feels magical. It isn't. It's the most basic operation in the language. You're just asking the CPU to use the address stored in p to do a memory access, the same kind it does every time you use any variable. The only difference is that the address came from a pointer variable instead of being fixed at compile time.
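The classic use of writing through a pointer is a function that changes its caller's variables. Here is a minimal sketch. swap_ints and demo_swap are names chosen for this example, not anything from a library.

```c
#include <assert.h>

/* Swaps the two ints that a and b point to. The function never sees
   the caller's variable names, only their addresses. */
void swap_ints(int* a, int* b) {
    int tmp = *a;   /* read through a */
    *a = *b;        /* read through b, write through a */
    *b = tmp;       /* write through b */
}

void demo_swap(void) {
    int x = 1;
    int y = 2;
    swap_ints(&x, &y);
    assert(x == 2 && y == 1);
}
```

Every star inside swap_ints is a dereference: a memory access at an address that was only known once the caller passed it in.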
The variable name x is a label for a specific address. &x gives you that address. *p follows the address back to the value.
The asterisk paradox
Here's the thing that trips everyone up. In a declaration, * is part of the type. In an expression, * is an operator that dereferences. Same symbol, different jobs.
    int* p;    // declaration: * means "p is a pointer"
    *p = 5;    // expression: * means "go fetch what p points to"
You can tell them apart by context. If there's a type right before the star, it's a declaration. If there's nothing or an operator before it, it's dereferencing. Compilers figure this out by knowing what kind of place they're parsing. For humans, it just takes a little practice.
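The spot where the two roles are easiest to confuse is a declaration with an initialiser, because it looks like an assignment to *p when it is really an assignment to p. The sketch below walks through both roles side by side. The function name demo_star_roles is invented for this example.

```c
#include <assert.h>

int demo_star_roles(void) {
    int x = 1;
    int y = 2;

    int* p = &x;    /* declaration: the * is part of p's type, and the
                       initialiser sets p itself (not *p) to &x */
    *p = 10;        /* expression: follow p and write 10 into x */

    p = &y;         /* no star: change where p points */
    *p = 20;        /* star: write 20 into y through p */

    assert(x == 10);
    assert(y == 20);
    return x + y;
}
```

Reading int* p = &x; as "p, which is an int*, starts out as &x" keeps the two meanings of the star apart.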
There's a historical reason for the overloading, actually a charming one. The designers of C wanted "declaration mirrors use." When you write int* p, you're saying "the expression *p will have type int." So the declaration looks like the usage. This is why C types are read inside-out and get twisted into pretzels for complex cases. It was clever in 1972 and has confused students for 50 years.
A note on style. K&R-style C often writes int *p with the star on the variable side. It's meant to remind you that *p is an int, and it helps beginners avoid the multi-declaration trap, which we'll hit in a moment. Much modern C++ code, and Bjarne Stroustrup himself, prefers int* p with the star next to the type, to emphasise that "pointer to int" is a single thing. Both styles are fine. Be consistent.
The multi-declaration trap
Now for the most popular rite of passage in C. What does this declare?
int* a, b;
Nine out of ten beginners read this as "two pointers to int." They're wrong. It declares a as int* and b as plain int. The asterisk binds to the variable name, not to the type. Even though you wrote int* as a visual unit, the compiler parses it as int with a * that sticks to a.
This is why many style guides recommend writing one declaration per line, or at least this form when you must bundle them:
    int *a, *b;   // both are pointers, because each has its own *

    int *a;       // or just give each its own line. honestly the best option.
    int *b;
If you take only one thing from this section: the * in a declaration is glued to the variable, not to the type. Once you see that, the rest of C declaration syntax stops being mysterious and starts being merely annoying.
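If you'd rather have the compiler confirm this than take it on faith, C11's _Generic can report the static type of an expression. The sketch below is one way to do that. TYPE_NAME, int_ptr and demo_multi_declaration are names made up for this example, and it assumes a C11 (or later) compiler.

```c
#include <assert.h>
#include <string.h>

/* TYPE_NAME is a made-up macro for this sketch. It uses C11's _Generic
   to pick a string based on the static type of its argument. */
#define TYPE_NAME(expr) _Generic((expr), \
    int*:    "int*",                     \
    int:     "int",                      \
    default: "something else")

/* A typedef turns "pointer to int" into a genuine single unit, so it
   applies to every name in the declaration. int_ptr is hypothetical. */
typedef int* int_ptr;

void demo_multi_declaration(void) {
    int x = 0;

    int* a = &x, b = 0;        /* the trap: a is int*, b is plain int */
    int *c = &x, *d = &x;      /* one star per name: both are int* */
    int_ptr e = &x, f = &x;    /* typedef: both are int* */

    assert(strcmp(TYPE_NAME(a), "int*") == 0);
    assert(strcmp(TYPE_NAME(b), "int")  == 0);
    assert(strcmp(TYPE_NAME(c), "int*") == 0);
    assert(strcmp(TYPE_NAME(d), "int*") == 0);
    assert(strcmp(TYPE_NAME(e), "int*") == 0);
    assert(strcmp(TYPE_NAME(f), "int*") == 0);
}
```

The typedef line is the one case where the type really does distribute over every name, because int_ptr is a complete type on its own rather than int plus a star attached to one declarator.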
Dereferencing is navigation
Here's a mental model that will serve you for the rest of this series. Think of memory as a city, and addresses as street numbers. A regular variable like x = 42 is like saying "the house at 1000 Maple Street contains 42 people."
A pointer p = &x is saying "here's a note that says '1000 Maple Street.'" The note is not the house. The note is not the people. It's a piece of paper with an address written on it.
When you dereference, *p, you're saying "follow the note to the house, then look inside." You hop from the note to the address, from the address to what lives there. Reading with *p means "how many people live in that house?" Writing with *p = 99 means "change the number of people in that house to 99."
The note itself could be anywhere. p might be sitting at 2000 Oak Street, holding a note that says 1000 Maple Street. Dereferencing is always a hop from wherever the pointer lives to wherever it's aimed.
Notice that x and *p are different ways of saying the same thing. They land on the same house. &x and p are also the same: they're both the note pointing at x. And p itself lives somewhere too, at address &p. Everything has an address, all the way down.
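The house-and-note picture maps directly onto code. Here is a small sketch that prints each of the expressions involved and checks which ones agree. The function name demo_note_and_house is invented for this example, and the addresses it prints will differ from run to run and machine to machine.

```c
#include <assert.h>
#include <stdio.h>

void demo_note_and_house(void) {
    int  x = 42;    /* the house */
    int* p = &x;    /* the note with the house's address on it */

    printf("x  (what's in the house)  : %d\n", x);
    printf("*p (follow the note)      : %d\n", *p);
    printf("&x (the house's address)  : %p\n", (void*)&x);
    printf("p  (what the note says)   : %p\n", (void*)p);
    printf("&p (where the note lives) : %p\n", (void*)&p);

    assert(*p == x);                  /* same house */
    assert(p == &x);                  /* same address */
    assert((void*)&p != (void*)&x);   /* the note is stored somewhere else */
}
```

The casts to void* are there because %p expects a void*, not because the addresses change.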
Initialising pointers: don't skip this
A local pointer variable, like any local variable, starts out holding whatever garbage happened to be in that memory. An uninitialised pointer is a wild pointer, and dereferencing one is the programmer equivalent of driving blindfolded.
    int* p;    // p holds garbage. it could point anywhere.
    *p = 5;    // undefined behaviour. might crash. might silently corrupt something.
Always initialise. If you don't have a real target yet, initialise to null. Null means "this pointer is intentionally pointing at nothing," which at least gives you a well-defined state you can check for.
    // Classic C style:
    int* p = NULL;
    if (p != NULL) {
        *p = 5;
    }

    // Modern C++11 and later:
    int* p = nullptr;
    if (p != nullptr) {
        *p = 5;
    }
In C, NULL is usually defined as ((void*)0). In C++ it's a macro for an integer-style null pointer constant such as 0 or 0L, so in overloaded contexts it can select an integer overload instead of a pointer one, which causes surprises. C++11 introduced nullptr, a keyword with its own type (std::nullptr_t), specifically to dodge those surprises. If you're writing C++, use nullptr. If you're in C, use NULL. Either way, don't leave pointers uninitialised.
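Null earns its keep when a function may or may not have something to point at. Here is a minimal C sketch of that pattern. The helpers first_negative, clamp_first_negative and demo_null_check are hypothetical names for this example, not standard functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns a pointer to whichever of the two ints
   is negative (checking a first), or NULL if neither is. */
int* first_negative(int* a, int* b) {
    if (*a < 0) return a;
    if (*b < 0) return b;
    return NULL;
}

/* Sets the first negative value to zero, if there is one.
   Returns 1 if something was changed, 0 otherwise. */
int clamp_first_negative(int* a, int* b) {
    int* target = first_negative(a, b);
    if (target == NULL) {
        return 0;          /* nothing to do, and nothing dereferenced */
    }
    *target = 0;           /* safe: we checked for NULL first */
    return 1;
}

void demo_null_check(void) {
    int a = 3;
    int b = -4;
    assert(clamp_first_negative(&a, &b) == 1);
    assert(b == 0);
    assert(clamp_first_negative(&a, &b) == 0);   /* no negatives left */
}
```

The check only works because NULL is a known, testable value. A garbage pointer would sail straight past it.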
Where C++ adds references
C++ introduced something that C doesn't have: references. A reference is syntactic sugar that gives you most of what pointers offer, with fewer ways to hurt yourself. You've almost certainly seen them in function signatures.
    // C and C++: pass a pointer
    void increment(int* x) { (*x)++; }

    int n = 5;
    increment(&n);   // must pass &n
    // n is now 6

    // C++ only: pass a reference
    void increment(int& x) { x++; }

    int n = 5;
    increment(n);    // no &, no *
    // n is now 6
Under the hood, a reference is almost always implemented as a pointer. But the language hides the indirection from you. No & at the call site. No * inside the function. The reference just behaves like another name for the same object.
References come with restrictions that make them safer. A reference must be initialised when it's declared, it can't be reseated to point at something else, and it can't be null. Those three rules eliminate a huge class of pointer bugs at the cost of flexibility. We'll dig into references properly in a later post, because they deserve their own room. For now, just know that C++ gives you two tools where C gives you one, and many C++ style guides recommend reaching for a reference first when null isn't a meaningful value.
Putting it all together
Let's close with a tiny program that uses every idea in this post. Read it slowly. There's more going on here than it looks.
    // C version (C++ is identical except nullptr)
    #include <stdio.h>

    int main(void) {
        int x = 42;
        int* p = NULL;   // safe start
        p = &x;          // p now holds x's address

        printf("x lives at: %p\n", (void*)&x);
        printf("p holds: %p\n", (void*)p);
        printf("value of x: %d\n", x);
        printf("value via *p: %d\n", *p);

        *p = 99;         // reach through p and change x
        printf("after *p = 99: x = %d\n", x);
        return 0;
    }
Run it. The two %p lines will print the same address, because &x and p are the same value. The first two %d lines both print 42, because x and *p both read from the same memory. And then *p = 99 writes through the pointer, and when you print x, you'll see 99, because x and *p are two ways of naming the same location.
If you've followed this far, you now have the whole mental model. Variables live at addresses. & gives you the address. A pointer is a variable that stores one. * follows the address back to the value. The syntax is weird in spots for historical reasons, but the ideas are small and sharp.
What's next
Now that we can declare, assign, read, and write through pointers, the next thing to understand is one of the most productive sources of bugs in the language: the relationship between pointers and arrays. They look almost interchangeable. They aren't. Next post, we'll dig into why int arr[5] and int* p feel like the same thing and behave like the same thing right up until the moment they don't.
Test yourself
Before you move on, five questions to check the mental model. No syntax tricks, no rote recall. Each one probes something the post actually taught. Try them honestly before peeking at the answer. If you can't get at least four of these, re-read the relevant section before heading into Part 3.
What does this print?
    int x = 10;
    int* p = &x;
    *p = 20;
    printf("%d\n", x);
It prints 20. p holds x's address. *p = 20 means "go to that address and write 20 there." Since that address is x's location, x is now 20. The point: x and *p are two names for the same memory.
What are the types of a and b?
    int* a, b;
* binds to the variable name, not the type. So a picks up the star and becomes int*, but b is just a plain int. If you wanted two pointers, you'd write int *a, *b; (each one with its own star) or, better, put them on separate lines.
What's wrong with this code?
    int* p;
    *p = 5;
p was declared but never initialised, so it holds whatever garbage was left on the stack. Writing through it (*p = 5) sends the value 5 to some arbitrary address the machine will happily try to write to. The result is undefined behaviour: it might crash immediately, silently corrupt unrelated memory, or appear to work fine until it doesn't. There's no guaranteed symptom, which is what makes it dangerous. Always initialise pointers to NULL (C) or nullptr (C++) if you don't have a real target yet.
p holds the address of variable x. Which of these pairs evaluate to the same value? Select all that apply.
A (x and *p): x reads the value at x's address. *p follows p (which holds x's address) and reads what's there. Same value.
B (&x and p): &x is x's address. p stores x's address. Same value.
C and D are wrong because &p is a different address entirely: it's where the pointer variable itself lives, not where it points. Every variable has its own address, including pointer variables.
    int x = 10;
    const int* p1;               // pointer to const int
    int* const p2 = &x;          // const pointer to int
    const int* const p3 = &x;    // const pointer to const int
For each, answer: can you change the pointer itself (make it point elsewhere)? Can you change the value it points to?
p1 (const int* p1): "p1 is a pointer to a const int." You can change what p1 points at (reseat it), but you cannot change the int through p1.
p2 (int* const p2): "p2 is a const pointer to int." You cannot reseat p2, but you can change the int it points to.
p3 (const int* const p3): "p3 is a const pointer to a const int." Neither the pointer nor the pointee can be changed.
Trick: find the *. Everything to its left describes the pointee; everything to its right describes the pointer itself. Part 7 goes deep on this.
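If you want to check these answers against a compiler, here is a small sketch. The function name demo_const_pointers is made up, and the lines that would fail to compile are left commented out so the rest builds. Uncomment them one at a time to see the errors.

```c
#include <assert.h>

int demo_const_pointers(void) {
    int x = 1;
    int y = 2;

    const int* p1 = &x;          /* pointer to const int */
    p1 = &y;                     /* OK: reseat the pointer */
    /* *p1 = 5; */               /* error: can't write through p1 */

    int* const p2 = &x;          /* const pointer to int */
    *p2 = 10;                    /* OK: change the pointee */
    /* p2 = &y; */               /* error: can't reseat p2 */

    const int* const p3 = &y;    /* const pointer to const int */
    /* *p3 = 5;  p3 = &x; */     /* both are errors */

    assert(*p1 == 2);            /* p1 now points at y */
    assert(x == 10);             /* x was changed through p2 */
    assert(*p3 == 2);
    return x + *p1 + *p3;
}
```

Note that const on the pointee only restricts access through that pointer. x and y themselves are ordinary ints and can still be changed directly.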