Pointers: Thinking in Memory

Pointers and Functions

First we pass pointers into functions. Then we return them out. Then we discover that functions themselves have addresses, which means we can point at them too. That last step is where pointers graduate from "a thing you store data with" to "a tool for building programs that compose."

Functions only see copies

Here's the rule that catches every beginner at least once. When you call a function and pass it an argument, the function receives a copy. The original is untouched. If the function modifies its parameter, it's modifying the copy, and when the function returns, the copy vanishes along with the frame. This is called pass by value, and it's how every argument is passed in C.

The canonical demonstration is the swap function that doesn't swap:

#include <stdio.h>

void swap(int a, int b) {
    int tmp = a;
    a = b;
    b = tmp;
}

int main(void) {
    int x = 1, y = 2;
    swap(x, y);
    printf("%d %d\n", x, y);  // prints 1 2
}

The function gets its own a and b on its stack frame, holding copies of x and y. It shuffles its copies. Its frame pops. The originals in main are exactly as they were. To actually touch the caller's variables, the function has to be told where they live, not just what they contain:

void swap(int* a, int* b) {
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

int main(void) {
    int x = 1, y = 2;
    swap(&x, &y);
    printf("%d %d\n", x, y);  // 2 1
}

The function still gets copies. But now those copies are addresses. Dereferencing them reaches back into main's frame and mutates the originals. That's the trick. You can't modify someone else's variable directly, but you can follow a map to where it lives and modify it there.

Pass by value copies the data. Pass by pointer copies the address. Only the second one lets you reach back and change the original.

In the value version, a and b inside swap are fresh variables holding copies of x's and y's values, and modifying them touches nothing else. In the pointer version, a and b hold the addresses of x and y, so dereferencing them mutates main's frame.
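
The same reach-back trick is how a C function can hand back more than one result. Here is a minimal sketch of that idea. The name divide_checked and its 1/0 status convention are made up for this example and do not come from any library. The function returns a success flag and writes the quotient and remainder through pointers the caller supplies:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper for illustration: returns 1 on success, 0 on failure.
   On success it writes both results through the caller's pointers. */
int divide_checked(int num, int den, int *quot, int *rem) {
    if (quot == NULL || rem == NULL || den == 0)
        return 0;            /* nothing is written on failure */
    if (num == INT_MIN && den == -1)
        return 0;            /* the quotient would overflow int */
    *quot = num / den;       /* reaches back into the caller's frame */
    *rem  = num % den;
    return 1;
}
```

A caller would declare int q, r; and call divide_checked(17, 5, &q, &r), after which q holds 3 and r holds 2. The return value is still a copy; the two results travel through the addresses.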

Returning pointers safely

Functions can return pointers too, and this is where lifetime thinking earns its keep. Every returned pointer is a promise to the caller: "this will still be valid when you read through it." Keep the promise, you've given the caller useful access to memory you prepared. Break it, you've handed them a dangling pointer.

From Part 4, you already know the four kinds of memory. Here are the three safe patterns and the one unsafe one, attached to the signatures you'll actually see in code:

#include <stdlib.h>  // malloc
#include <string.h>  // strlen, memcpy

// SAFE: heap. Caller owns it and must free.
char* duplicate(const char* src) {
    size_t n = strlen(src) + 1;
    char* out = malloc(n);
    if (!out) return NULL;
    memcpy(out, src, n);
    return out;
}

// SAFE: pointer into caller's own memory. No ownership transfer.
char* find_char(char* s, int c) {
    while (*s) { if (*s == c) return s; s++; }
    return NULL;
}

// SAFE: static/global memory. Lives forever.
const char* weekday_name(int d) {
    static const char* names[] = {"Sun","Mon","Tue","Wed","Thu","Fri","Sat"};
    if (d < 0 || d > 6) return NULL;
    return names[d];
}

// UNSAFE: pointer to a local. Dangling on return.
int* broken(void) {
    int x = 42;
    return &x;   // x dies at return. pointer dangles.
}

Mainstream compilers such as GCC and Clang will typically warn you about the last one. Listen to the warning. Never return a pointer to a local.
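
If a function like broken really does need to produce a fresh int for its caller, there are two common repairs. The sketch below shows both. The names make_answer and fill_answer are invented for this example. One version allocates on the heap and hands ownership to the caller; the other lets the caller supply the storage:

```c
#include <stdlib.h>

/* Repair 1: allocate on the heap. The caller owns the result and must free it. */
int *make_answer(void) {
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = 42;
    return p;               /* NULL if the allocation failed */
}

/* Repair 2: the caller provides the storage. Nothing to free. */
void fill_answer(int *out) {
    *out = 42;
}
```

The second form is often the better default for small values: there is no allocation to fail and no ownership to track.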

One category worth calling out explicitly: string literals. When you write return "hello";, you're returning a pointer to a string literal, and string literals live in static memory (typically mapped read-only) for the entire life of the program. The pointer is safe to hold forever, but modifying the memory through it (s[0] = 'H';) is undefined behaviour. That is why such functions should declare their return type as const char* rather than char*: C will accept either, but const lets the compiler reject accidental writes. Returning string literals is one of the most common safe-return patterns in C, used in error-message helpers, tag-name lookups, and anywhere a function needs to hand back a fixed piece of text.
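
As a small illustration of the error-message-helper case, here is a sketch. The function name status_text and the code numbers are hypothetical. Every branch returns a string literal, so the caller never has to free anything or worry about lifetime:

```c
/* Hypothetical error-message helper: every return value is a string
   literal, so the pointer stays valid for the whole program. */
const char *status_text(int code) {
    switch (code) {
    case 0:  return "ok";
    case 1:  return "not found";
    case 2:  return "permission denied";
    default: return "unknown error";
    }
}
```

Because the return type is const char*, a caller who tries status_text(0)[0] = 'O' gets a compile error instead of undefined behaviour at runtime.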

Every returned pointer is a promise about lifetime. The function owes the caller an answer to "where does this live, and who frees it?"

Code is in memory too

Now we get to the good part. Everything so far has been about pointers to data. Stack data, heap data, string literals. But your compiled program contains more than data. It contains code, the actual machine instructions that make your functions do what they do. And that code has to live somewhere when the program runs.

When your program runs, its memory is divided into a few regions. The stack and heap we already know about. There's a static data region for globals. And there's the text segment (sometimes just "code" or ".text"), which holds the compiled machine instructions for every function in your program. It comes straight from the compiled binary: at load time, the OS maps it into your process's memory, usually as read-only and executable. Every function you've written gets laid out in that segment at some specific address.

So when we say int add(int a, int b) { return a + b; }, once compiled, add isn't just a name. It's a block of machine instructions sitting at some specific address, let's say 0x4005a0. When your code calls add(3, 4), what actually happens at the CPU level is a jump to address 0x4005a0, execution of the instructions there, and a return.

That means add has an address, just like x did. And just like we could store the address of x in a pointer variable, we can store the address of add in a pointer variable. The only difference is the type. A pointer to an int has type int*. A pointer to a function has a more elaborate type that describes the function's signature, because the compiler needs to know what arguments to pass and what return value to expect when we call through the pointer.

A function is a block of machine code at some address. If you can take the address of data, you can take the address of code.

Here's the simplest possible demonstration:

int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

int main(void) {
    printf("address of add: %p\n", (void*)add);
    printf("address of mul: %p\n", (void*)mul);
}

// Output, approximately:
// address of add: 0x55c3d85da145
// address of mul: 0x55c3d85da158

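They're real addresses, though the exact values will differ on your machine and often from run to run. The cast to void* is there because that is what %p expects; the C standard doesn't guarantee that a function pointer converts to void*, but mainstream platforms allow it. Notice that in this run mul sits right after add in memory. That makes sense: the toolchain laid them out one after the other in the text segment, which is common but not something to rely on.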

Now, if I can print those addresses, I can also store them. That's what a function pointer is.

int (*op)(int, int);   // declares op as a pointer to a function
op = add;                 // now op holds add's address
printf("%d\n", op(3, 4));  // 7.  Calls add through the pointer.

op = mul;                 // now op holds mul's address
printf("%d\n", op(3, 4));  // 12. Calls mul through the pointer.

The variable op is just a pointer. It holds whichever address you put in it. Calling op(3, 4) means "jump to whatever address op currently holds, with arguments 3 and 4." The same line of code can invoke completely different functions depending on what op points at.

This is the entire magic trick. You've decoupled "what function gets called" from "the point in the code where the call happens." That decoupling is the foundation for the patterns we're about to see.

Functions have addresses. Pointers can hold them.

Picture the text segment as a row of compiled functions, each occupying a chunk of memory at its own address, and op as a function pointer sitting on the stack. Assigning op = add copies the address of add into op. Calling op(3, 4) jumps to that address. Same call site, different function, just by changing what the pointer holds.

Your first function pointer

Let's start as small as possible. One function. One pointer to it. We'll call the function directly, then call it through the pointer, and see that the result is the same.

void greet(void) {
    printf("hello\n");
}

int main(void) {
    greet();           // normal call. Prints "hello".

    void (*gp)(void);  // gp is a pointer to a function that takes no args and returns nothing
    gp = greet;        // gp now points at greet
    gp();              // call through the pointer. Also prints "hello".
}

Read those two bottom lines slowly. gp = greet; doesn't copy the function. It copies the address of the function. After this line, gp holds whatever address the machine has decided to put greet at. Then gp() says "go to whatever address gp holds and execute the code there." Which is greet. So you get "hello."

That's the whole mechanic. A function pointer is just a variable that holds a function's address. Calling through it means "jump to whatever address I have."

Notice what you didn't write. You didn't write &greet. In C, the name of a function decays to its address in most contexts, the same way an array name decays to a pointer. So gp = greet and gp = &greet both work and both mean the same thing. Same for calling: gp() and (*gp)() are equivalent. Pick a style and stick to it; the community mostly uses the short forms.
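
If you'd rather check those equivalences than take them on faith, a short sketch like this one does it. The names forty_two and spellings_agree are invented for the example:

```c
static int forty_two(void) { return 42; }

/* Returns 1 if every spelling below yields the same pointer and result. */
int spellings_agree(void) {
    int (*a)(void) = forty_two;    /* function name decays to a pointer */
    int (*b)(void) = &forty_two;   /* explicit address-of, same value */
    return a == b                  /* both hold the same address */
        && a() == 42               /* short call form */
        && (*b)() == 42;           /* explicit dereference form */
}
```

Both pointers compare equal and both call forms reach the same function, so spellings_agree() returns 1.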

A function pointer is a variable. It holds the address of a function. Call through it and the CPU jumps there.

Making the pointer point somewhere else

A pointer variable can be reassigned. So can a function pointer. Add a second function and watch what happens when we change what gp points at.

void greet(void) { printf("hello\n"); }
void shout(void) { printf("HELLO!\n"); }

int main(void) {
    void (*gp)(void);

    gp = greet;
    gp();            // "hello"

    gp = shout;
    gp();            // "HELLO!"
}

The line gp() appears twice. It's the same line of code. But the output is different, because the pointer pointed at two different functions when the line ran. The call site is fixed. The behaviour is fluid.

This is worth sitting with for a moment. In every call you've written before this, the function being called was fixed the moment the source code was written. printf(...) always calls printf. sqrt(9) always calls sqrt. You couldn't write a line of code and have the identity of the function being called be decided later. With function pointers, you can.

Choosing a function at runtime

Now let's use that flexibility for something concrete. We'll pick which function to point at based on a decision the program makes while running.

void greet(void) { printf("hello\n"); }
void shout(void) { printf("HELLO!\n"); }

int main(void) {
    int loud = 1;           // imagine this came from user input

    void (*gp)(void);
    if (loud) gp = shout;
    else     gp = greet;

    gp();                  // prints whichever one we picked
}

This is a toy example, but it's exactly the shape of every runtime-configured behaviour you'll write for the rest of your career. Based on some condition, set a function pointer. Then call through it. The call site doesn't care which function it ended up pointing at.

If your mental model is still shaky, trace what's happening on the machine. gp is a variable on the stack, big enough to hold an address. The if statement writes one of two addresses into it. The call gp() reads that address and jumps there. No magic. Just a pointer doing what pointers do, with code on the receiving end instead of data.

The decision of which function to call can happen at runtime, not at the call site. This single capability changes what kinds of programs you can write.

Reading the syntax

Okay, now that you've written a few of these, let's face the declaration. The syntax for function pointers is notoriously unfriendly, and it's unfriendly for a reason: C's parentheses are doing structural work that changes meaning.

Anatomy of a function pointer declaration

int (*fp)(int, int)

fp is a pointer. When you call it, you pass (int, int) and you get back an int. Read it starting from the name: figure out what fp is first (a pointer, because of the *), then what the call returns (on the left), then what it takes (on the right).

The parentheses around *fp matter. A lot. Here are two declarations that look almost the same and mean completely different things:

int (*fp)(int, int);  // fp IS a pointer. It points at a function. This is what we want.

int *fp(int, int);    // fp IS a function. It returns an int*. Different role entirely.

Without the parens, the * binds to the return type, making the declaration mean "function named fp that takes two ints and returns int*." fp here is not a storable variable. It's a function declaration, like any other prototype.

With the parens, the * binds to the name, making it "fp is a pointer, and when called, returns int." Now fp is a variable.

Same characters, different role for fp. Both involve pointers somewhere, but one makes fp itself a pointer while the other makes fp a function that returns a pointer. Read from the name outward.
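
To see both declarations living side by side, here is a sketch with illustrative names (larger and add are not from the text above). One is a function that returns a pointer, the other is a variable that points at a function:

```c
static int add(int a, int b) { return a + b; }

/* No parens around the *: larger is a FUNCTION that returns an int*.
   It hands back one of the caller's own pointers, so nothing dangles. */
int *larger(int *a, int *b) {
    return (*a >= *b) ? a : b;
}

/* Parens around the *: fp is a VARIABLE that holds a function's address. */
int (*fp)(int, int) = add;
```

You can reassign fp to any other function with the same signature. You cannot reassign larger, because it is not a variable at all.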

Typedef and using are mercy

The declaration syntax is so ugly that essentially nobody writes it raw in real code. Name the type once with a typedef, then use the name naturally:

typedef void (*Greeter)(void);  // name the type "Greeter"

Greeter gp = shout;         // reads like a normal declaration now
gp();

In C++11 and later, using is even nicer because it reads left-to-right:

using Greeter = void(*)(void);

Commit to one style across your codebase. Raw function pointer declarations scattered through the code are a smell; somebody missed the typedef memo.
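
The typedef pays off most once function pointers show up inside structs or as return types. The sketch below assumes a made-up BinaryOp type, op_table, and find_op lookup; none of these names come from a real library. It stores function pointers in a table and returns one from a function:

```c
#include <stddef.h>

typedef int (*BinaryOp)(int, int);   /* name the pointer type once */

static int add(int a, int b) { return a + b; }
static int sub(int a, int b) { return a - b; }
static int mul(int a, int b) { return a * b; }

/* A table pairing a symbol with the function that implements it. */
struct op_entry { char symbol; BinaryOp fn; };

static const struct op_entry op_table[] = {
    { '+', add }, { '-', sub }, { '*', mul },
};

/* Returns the function for `symbol`, or NULL if the symbol is unknown.
   Without the typedef this line would read:
   int (*find_op(char symbol))(int, int) */
BinaryOp find_op(char symbol) {
    for (size_t i = 0; i < sizeof op_table / sizeof op_table[0]; i++)
        if (op_table[i].symbol == symbol)
            return op_table[i].fn;
    return NULL;
}
```

A caller writes BinaryOp op = find_op('+'); and, after checking for NULL, calls op(3, 4). Compare the raw declaration in the comment with the typedef'd one and the case for the typedef makes itself.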

Passing a function to another function

Here's the step where function pointers go from "neat trick" to "genuinely useful." If a variable can hold a function, you can pass that variable as an argument. Which means you can pass a function to another function.

The simplest possible example. A function that takes another function, and calls it twice.

void greet(void) { printf("hello\n"); }
void shout(void) { printf("HELLO!\n"); }

void do_twice(void (*fn)(void)) {
    fn();
    fn();
}

int main(void) {
    do_twice(greet);   // prints "hello" twice
    do_twice(shout);   // prints "HELLO!" twice
}

Study do_twice. Its parameter is called fn, and its type is "pointer to a function that takes nothing and returns nothing." Inside the body, we treat fn like any other function: call it. The parameter is a function, for the duration of the call.

At the call site, do_twice(greet) passes greet's address as the argument. Then do_twice calls through the parameter. You, the caller of do_twice, chose what behaviour happens inside it.

That's the idea. do_twice doesn't know what function you'll pass. It just calls whatever you hand it. You can pass greet, or shout, or any function that matches the signature. A function that takes nothing and returns nothing? do_twice will call it twice, no questions asked.

You gave do_twice a function. It called yours for you. That's the whole shape.

The pattern has a name: callback

What you just wrote has a name. The function you passed to do_twice is called a callback. The name comes from the fact that do_twice "calls back" into your code through the function pointer you handed it. Some other piece of code is temporarily in charge of when and how your function runs.

That's all a callback is: a function you pass to another function so it can call you. No more, no less. Every fancy callback system you'll ever encounter is a variation of the do_twice example. The differences are in what gets passed, when it gets called, and why. The mechanic is identical.

Before we see why callbacks matter, let's write one more. A function that calls your callback once per element in an array. That way the callback does something useful to each piece of data.

void for_each(int* arr, int n, void (*fn)(int)) {
    for (int i = 0; i < n; i++) {
        fn(arr[i]);
    }
}

void print_it(int x) { printf("%d ", x); }
void print_squared(int x) { printf("%d ", x*x); }

int arr[] = {1, 2, 3, 4};
for_each(arr, 4, print_it);        // prints: 1 2 3 4
for_each(arr, 4, print_squared);   // prints: 1 4 9 16

Same for_each. Two different behaviours, just by swapping the callback. The loop lives in for_each; what to do with each element lives in the callback. Two separate concerns, cleanly divided. That's the first real taste of why callbacks matter.
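
A callback can also return a value that steers the outer function. As a sketch in the same spirit as for_each (count_if, is_even, and is_negative are names chosen for this example), here the loop counts the elements the callback approves of:

```c
/* Counts the elements for which the caller's predicate returns nonzero. */
int count_if(const int *arr, int n, int (*pred)(int)) {
    int count = 0;
    for (int i = 0; i < n; i++)
        if (pred(arr[i]))      /* the callback decides, the loop tallies */
            count++;
    return count;
}

int is_even(int x)     { return x % 2 == 0; }
int is_negative(int x) { return x < 0; }
```

With the array {1, 2, 3, 4, -6}, count_if with is_even returns 3 and count_if with is_negative returns 1. The counting logic is written once; the question being asked is supplied by the caller.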

Why this is a big deal

Picture a realistic scenario. You've written a sorting function for integers. Twenty clean lines. Then someone asks you to also sort strings. Fine, copy it, change the types and the comparison. Then structs by one field. Copy. Then structs by a different field. Copy. Your codebase now has four nearly-identical sort functions. Bug in the algorithm? You have to find and fix it in four places.

void sort_ints(int* arr, int n) {
    for (int i = 0; i < n-1; i++)
        for (int j = 0; j < n-1-i; j++)
            if (arr[j] > arr[j+1]) {    // <-- only this line is specific
                int t = arr[j]; arr[j] = arr[j+1]; arr[j+1] = t;
            }
}

void sort_strings(char** arr, int n) {
    for (int i = 0; i < n-1; i++)
        for (int j = 0; j < n-1-i; j++)
            if (strcmp(arr[j], arr[j+1]) > 0) {  // <-- only this line is specific
                char* t = arr[j]; arr[j] = arr[j+1]; arr[j+1] = t;
            }
}

These are the same function. The loops are identical. The swap is the same shape. The only real difference is how two elements get compared. Everything else is copy-paste. This doesn't scale. Add a third sorter, it's three places. Fix a bug, you'd better find all three.

The fix should be obvious now. What changes between the two is the comparison. So let the caller supply the comparison, and write the algorithm once.

// One sort function. Works for any type. Caller supplies the comparison.
void generic_sort(void* arr, int n, size_t sz,
                  int (*cmp)(const void*, const void*));

That signature is, apart from its name and an int count where the library uses size_t, the signature of a real function in the standard library. It's called qsort, and we just re-invented it from first principles by following the duplication all the way to its conclusion. The loops, the swaps, and the loop bounds all live inside qsort. The comparison (the one line that changed between sorters) lives in a callback the caller provides.
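
The generic_sort declaration above has no body, so here is one possible body as a rough sketch. It keeps the bubble sort from sort_ints and only replaces the comparison and the swap; real qsort implementations generally use faster algorithms. The helpers swap_bytes, cmp_int, and cmp_str are names introduced for this example:

```c
#include <stddef.h>
#include <string.h>

/* Swap two elements of sz bytes each, one byte at a time. */
static void swap_bytes(void *a, void *b, size_t sz) {
    unsigned char *pa = a, *pb = b;
    for (size_t k = 0; k < sz; k++) {
        unsigned char t = pa[k];
        pa[k] = pb[k];
        pb[k] = t;
    }
}

/* Same loops as sort_ints. The type-specific line is now a callback. */
void generic_sort(void *arr, int n, size_t sz,
                  int (*cmp)(const void *, const void *)) {
    unsigned char *base = arr;               /* byte-wise view of the array */
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < n - 1 - i; j++) {
            void *left  = base + (size_t)j * sz;
            void *right = base + (size_t)(j + 1) * sz;
            if (cmp(left, right) > 0)        /* caller decides the order */
                swap_bytes(left, right, sz);
        }
}

/* Callback for int elements. */
int cmp_int(const void *a, const void *b) {
    int ia = *(const int *)a, ib = *(const int *)b;
    return (ia > ib) - (ia < ib);
}

/* Callback for char* elements: each element is itself a pointer. */
int cmp_str(const void *a, const void *b) {
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}
```

Calling generic_sort(nums, 4, sizeof nums[0], cmp_int) sorts an int array, and generic_sort(words, 3, sizeof words[0], cmp_str) sorts an array of strings, with no second copy of the loops.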

It's worth being precise about the division of labour here, because it's the mental model for every callback you'll ever use.

What qsort knows: the base address of the array (arr), the number of elements (n), and the size of a single element (sz, for example sizeof(int)). Three facts. With those, it can compute the address of any element and move elements' bytes around during sorting. That's enough to run a sorting algorithm.

What qsort does NOT know: the element's type (int? double? struct?), what the bytes mean when interpreted as that type, and how to order two elements (ascending? descending? by which field?). All three gaps are filled by the callback. When qsort needs to decide "does element A come before element B?", it calls your callback with pointers to the two elements. The callback knows the real type, casts the pointers back, reads the data, and returns a negative, zero, or positive value.

qsort owns the algorithm. Your callback owns the type-specific logic. Neither works without the other. This division shows up in every generic algorithm, every event loop, every data-structure library in C.

#include <stdlib.h>

// Your callback. qsort will call this whenever it needs to compare two elements.
int compare_int(const void* a, const void* b) {
    int ia = *(const int*)a;
    int ib = *(const int*)b;
    return (ia > ib) - (ia < ib);   // -1, 0, or 1
}

int arr[] = {5, 2, 8, 1, 9, 3};
qsort(arr, 6, sizeof(int), compare_int);

Want to sort in descending order? Write a different callback, flip the sign. Want to sort structs? A callback that reads the right field. Want to sort strings? A callback that calls strcmp. qsort never changes. One function in the standard library covers an enormous range of sorting needs, because the ordering logic lives in the callback you pass in.

The return-value convention (negative / zero / positive for "a comes first / equal / b comes first") is the one you'll see in every comparison callback in C. The exact numbers don't matter, only the sign.
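
To make the struct and descending-order cases concrete, here is a sketch that follows the sign convention. The struct player type and both callback names are invented for this example; only qsort and strcmp are real library functions:

```c
#include <stdlib.h>
#include <string.h>

struct player {
    const char *name;
    int score;
};

/* Highest score first: the usual comparison with the operands swapped. */
int by_score_desc(const void *a, const void *b) {
    const struct player *pa = a;
    const struct player *pb = b;
    return (pb->score > pa->score) - (pb->score < pa->score);
}

/* Alphabetical by name: strcmp already follows the sign convention. */
int by_name(const void *a, const void *b) {
    const struct player *pa = a;
    const struct player *pb = b;
    return strcmp(pa->name, pb->name);
}
```

Given struct player team[3], the call qsort(team, 3, sizeof team[0], by_score_desc) ranks by score and qsort(team, 3, sizeof team[0], by_name) re-sorts the same array by name. A tempting shortcut is to return pa->score - pb->score, but that subtraction can overflow for values of large magnitude; the comparison form does not.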

A quick note on void*. The callback takes const void* because qsort has no idea what element type you're working with. void* is C's way of saying "pointer to something; I promise I won't look at it, you cast it back when you need to." It's how C expresses a generic pointer without real generics. Full treatment in Part 8.

The algorithm doesn't know the data. The data doesn't know the algorithm. The callback is the bridge.

C++ references: pointers with the sharp edges filed off

We touched on references in Part 2. They deserve a proper look now that we're talking about passing things into functions, because C++ references are really a safer way to give a function access to the caller's object than handing it a pointer.

C (pointer)

void increment(int* x) {
    (*x)++;
}

int n = 5;
increment(&n);
// n is now 6

C++ (reference)

void increment(int& x) {
    x++;
}

int n = 5;
increment(n);
// n is now 6

Under the hood, a reference is almost always implemented as a pointer. But the language hides the indirection. No & at the call site, no * inside the function. The reference acts like another name for the same object.

This has an important consequence worth pausing on: references cost nothing extra at runtime compared to pointers. Compilers typically generate the same instructions for passing a reference as for passing a pointer, so performance is the same. The safety you get (no null, no wild, no reseating) comes from rules the compiler checks at compile time, not from runtime checks. So there's no "references are slower because they're fancier" trade-off. Use them freely; you're not paying anything.

References come with restrictions that make them safer than pointers:

Must be initialised at declaration. No uninitialised reference. No wild references.
Can't be null. No null references.
Can't be reseated. Once bound, stays bound.

Those rules kill two villains (wild and null) at the language level. And because they're compile-time rules, the compiler rejects violations before your program ever runs.

References are "pointers that can't lie." You lose flexibility and gain safety, at zero runtime cost.

When pointers still beat references

Optional outputs. If the function might not produce a value, a nullable pointer is cleaner than a reference plus a separate bool.
Reseating. If you need to change which object is being referred to, pointers are the way.
C interop. C APIs speak pointers.
Function pointers. C++ has function references (int (&fp)(int)), but they're rare. Function pointers dominate.
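
The optional-output case from the list above is the easiest to show. The sketch is written in C to match the rest of the examples, and the same shape works unchanged in C++. The name first_negative is made up for this example. NULL is the "no value" answer, which a reference could not express:

```c
#include <stddef.h>

/* Returns a pointer to the first negative element, or NULL if there is none.
   The pointer refers into the caller's array, so no ownership changes hands. */
const int *first_negative(const int *arr, int n) {
    for (int i = 0; i < n; i++)
        if (arr[i] < 0)
            return &arr[i];
    return NULL;             /* "nothing found" needs no separate bool */
}
```

The caller checks the result once: if it is not NULL, dereferencing it gives the value and subtracting arr from it gives the index.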

Modern C++ rule: use references when you can, pointers when you must.

Summing up

Functions in C only see copies of their arguments. To modify the caller's data, you pass addresses and dereference to reach back. Returning pointers follows Part 4's rules: heap, caller-provided, and static memory are safe; locals are not.

Functions themselves have addresses. The text segment holds compiled code, and every function is a block of machine instructions at some specific address. A function pointer is a variable that holds such an address. Once you can store a function in a variable, you can reassign it, choose which function to point at based on a runtime decision, and most importantly, pass it to other functions.

A function you pass to another function so it can call you is called a callback. That's the whole name, the whole concept. The algorithm lives inside the outer function; the piece of behaviour that changes case-to-case lives in your callback. The classic example is qsort, where the sorting algorithm is fixed and the comparison is your callback. The same shape shows up in dozens of other places: GUI events, signal handlers, thread entry points, and any library that wants to let its users plug in custom behaviour.

C++ adds references on top of all of this. They're essentially pointers that can't be null, can't be uninitialised, and can't be reseated. Use them when you can, raw pointers when you must.

What's next

We've covered almost all the structural ideas behind pointers. What's left is the language-level machinery that makes pointer code safer or more subtle. Next post is the const puzzle, which sounds small but catches people constantly. const int*, int* const, and const int* const all mean different things, and learning to read them properly is a small investment that pays off for the rest of your career.

You can now hand pointers to functions and hand functions to pointers. Next: how to say what they can and can't change.

Test yourself

Six questions on pass-by-pointer, returning pointers, function pointers, callbacks, and references. Four correct means you're ready for Part 7.

What does this print, and why?

void change(int x) {
    x = 100;
}

int main(void) {
    int n = 5;
    change(n);
    printf("%d\n", n);
}

A. Prints 5. C is pass-by-value. change gets its own x on its stack frame, a copy of n's value. Setting x = 100 modifies the copy. When change returns, the copy vanishes. To modify n, the function would need to take an int* and the caller would pass &n.

Which of these return a safe pointer? Select all that apply.

Safe: B, C, D. Unsafe: A.
A is unsafe. Returns pointer to a local. Dangling the instant the function returns.
B is safe. Heap memory. Caller owns it and must eventually free it.
C is safe. Pointer into caller-owned memory. Valid as long as the caller's buffer is.
D is safe. String literals live in static memory forever. The const matters because modifying a string literal is UB.

Read this declaration out loud. What is fp?

double (*fp)(double);

fp is a pointer to a function taking a double and returning a double. Read inside-out. (*fp) means "fp is a pointer." (double) to the right means "when called, takes a double." double on the far left is the return type. So: "pointer to a function from double to double." You could assign sqrt, sin, cos, or any other one-double-in-one-double-out function to fp. The parens around *fp are required; without them, double* fp(double) declares a function, not a pointer.

Rewrite the following code to use a function pointer instead of branching on an enum, so that the caller supplies the operation and apply doesn't need to know about any specific operations:

enum Op { OP_DOUBLE, OP_SQUARE };

void apply(int* arr, int n, enum Op op) {
    for (int i = 0; i < n; i++) {
        if (op == OP_DOUBLE) arr[i] *= 2;
        else if (op == OP_SQUARE) arr[i] *= arr[i];
    }
}

Pass the operation as a function pointer.

void apply(int* arr, int n, int (*fn)(int)) {
    for (int i = 0; i < n; i++) {
        arr[i] = fn(arr[i]);
    }
}

int double_it(int x) { return x * 2; }
int square_it(int x) { return x * x; }

apply(arr, n, double_it);
apply(arr, n, square_it);

Now adding a new operation means writing a new function and passing it. apply doesn't change. The enum goes away. The if/else chain goes away. If the caller wants negate or triple or abs, they write a one-line function and pass it in. That's the pattern.

In the qsort call qsort(arr, n, sizeof(int), compare), what specific information does qsort not know about the data it's sorting, and how does the callback fill in those gaps?

qsort knows the layout. The callback knows the meaning. What qsort knows: the base address of the array (arr), how many elements it has (n), and the size of one element (sizeof(int)). With those three things, it can compute the address of any element and memcpy elements around during sorting.

What qsort does NOT know: the element's type, what the bytes mean, or how to compare two of them. It has no idea whether it's looking at an int, a double, a struct, or a string. From its perspective, every element is just some bytes.

How the callback fills the gap: when qsort needs to know "should element A come before element B?", it calls your compare(const void* a, const void* b). Your callback knows the real type, casts both pointers back, reads the actual data, and returns -1/0/1. This is the division of labour: qsort owns the algorithm, your callback owns the type-specific logic. Same division in every generic data-structure library.

True or false, with a short reason for each.

Select the ones that are true.

True: A, C, E. False: B, D.
A true. C is pass-by-value. To modify, the function must be handed the address.
B false. The parens are required by precedence. int (*fp)(int) is a pointer to a function. int *fp(int) is a function returning int*. Very different.
C true. qsort has no clue what element type you're working with, so it uses void* as a type-erased pointer. Your callback casts to the real type.
D false. A reference is almost always implemented as a pointer under the hood. Same machine code, same performance. The safety is at the language level.
E true. That really is the whole definition. Every fancy callback system you'll meet is a variation of this one idea.

How did you do?

Four or more correct, and you can use function pointers fluently. Move on to Part 7. Fewer than that, and the usual trouble is Q4 (writing your own) or Q5 (the qsort / callback division of labour). Re-read "Passing a function to another function" and try again.
