Pointers: Thinking in Memory

Arrays Are Not Pointers

They look the same, they feel the same, they even compile to almost the same machine code. But confusing them is one of the most productive sources of bugs in C, and the day you see clearly why is the day a lot of mysterious things stop being mysterious.

The comfortable lie

At some point early in your C journey, somebody told you that arrays and pointers are basically the same thing. It was a forgivable thing to say. Most of the time, the language behaves as if it were true. You can use array indexing on a pointer. You can use pointer arithmetic on an array. You can pass an array to a function and it quietly becomes a pointer. Everything seems fine.

It isn't fine. It's a convenient half-truth that works most of the time and then blows up in your face at exactly the moments that matter. When you call sizeof. When you pass an array to a function. When you think you're checking bounds but you aren't. When you free memory, or when you try to reason about memory layout. The point of this post is to make the distinction sharp enough that you never get fooled again.

Arrays and pointers are not the same thing. They're two different things that often dress like each other.

What an array actually is

When you write int arr[5], you're asking the compiler to set aside a contiguous block of memory big enough to hold five ints. On a typical system that's 20 bytes, laid out as one unbroken strip. The name arr doesn't refer to any single one of those ints. It refers to the whole block, with a specific type: "array of 5 ints."

int arr[5] = {10, 20, 30, 40, 50};
// 20 bytes in one contiguous strip
// arr has type "int[5]", which is different from "int*"

The type int[5] is distinct from int*. An array type carries its size around with it at compile time. The compiler knows, when it sees arr, that this thing has five ints in it. That knowledge is why sizeof(arr) can tell you "20 bytes" without having to look anything up at runtime.

A pointer, on the other hand, is just a variable that holds an address. It doesn't know anything about how much memory lives at that address. int* p could point at a single int, at the middle of a giant buffer, at nothing at all, or at an array of 10,000 ints. The pointer itself cannot tell. Its type says "points to an int", full stop.

An array is a region. A pointer is a finger. They're related but not the same.
See the difference in memory.
[Interactive figure: an array box for int arr[5] = {10,20,30,40,50}; drawn as one contiguous 20-byte region, and a pointer box for int* p = &arr[0]; drawn as a single 8-byte cell holding one address, aimed at the array's first element.]

The array box is the thing itself. All 20 bytes of it. The pointer box is a tiny 8-byte container that happens to hold an address. They look interchangeable in code because the pointer is aimed at the array's first element. They are not the same object. One owns its memory; the other just knows where to find someone else's.


Array-to-pointer decay

Here's the rule that causes all the confusion. In almost every context where you use an array's name, C silently converts it to a pointer to its first element. This is called array-to-pointer decay, and it's the single most important mechanic in this whole post.

int arr[5] = {10, 20, 30, 40, 50};
int* p = arr;   // arr decays to &arr[0], an int*

p[2];      // 30. you can index a pointer just like an array.
arr[2];    // 30. same result.
*(p + 2);  // 30. pointer arithmetic works on both.
*(arr + 2); // 30. arr decayed here too.

The decay is why all those expressions look equivalent. When you write arr[i], what's actually happening under the hood is *(arr + i), and the arr in that expression has already decayed into a pointer. Array indexing is defined in terms of pointer arithmetic, not the other way around. Which, as a fun aside, is why 0[arr] is valid C and gives you the same thing as arr[0]. Commutative. Horrifying. True.

But decay doesn't happen everywhere. There are a few contexts where the array keeps its identity:

  • sizeof(arr). The compiler needs the type, so no decay happens.
  • &arr. Gives you a pointer to the whole array (int (*)[5]), not to its first element. More on this in a moment.
  • typeof(arr) in C23 (a GNU extension before that), or decltype(arr) in C++. The real type is preserved.

Everywhere else, the moment the array name appears in an expression, it decays. You didn't ask for it. You can't stop it. It just happens.

Arrays decay into pointers in almost every expression. The places they don't are where the bugs live.

The sizeof trap

This is the most famous place the two diverge, and the one that's burned every C programmer at least once. Let's look at it with no commentary first and then unpack.

int arr[5] = {10, 20, 30, 40, 50};
int* p = arr;

sizeof(arr);   // 20. the whole array.
sizeof(p);     // 8. just a pointer.
sizeof(arr[0]); // 4. one int.

Why the difference? Because sizeof is evaluated at compile time (variable-length arrays are the one exception). It looks at the type of its argument, not the runtime value. arr has type int[5], so the compiler returns 20. p has type int*, so the compiler returns the size of a pointer on your platform (8 on 64-bit). The pointer doesn't know how many ints live at its target, so neither does sizeof.

This is why sizeof(arr) / sizeof(arr[0]) is the classic idiom for "how many elements in this array." It's 20 / 4 = 5. Bulletproof, as long as arr is a real array in scope. The moment it decays, the idiom breaks silently.

Try sizeof on different things.
// Our setup:
int arr[5] = {10, 20, 30, 40, 50};
int* p = arr;
char s[] = "hello";   // 6 chars including the null terminator
char* sp = s;
[Interactive widget: click an expression to see what it evaluates to and why.]

Click through each expression and pay attention to when you get the whole thing and when you get just 8. Any expression involving p or sp gives you pointer size, because that's what they are. Any expression on arr or s gives you the whole array, because sizeof is the rare place where arrays don't decay. This single rule explains 90% of C sizeof confusion.


The function parameter lie

Now we come to the cruelest part of the language. When you declare a function parameter as an array, C does something sneaky: it secretly converts it to a pointer. Look closely.

// These three declarations are IDENTICAL to the compiler:
void print_array(int arr[5]);
void print_array(int arr[]);
void print_array(int* arr);

Inside a function declared any of these three ways, arr is a plain int*. The [5] is a documentation-only lie. The compiler accepts it, then silently throws away the size. It's as if you wrote int* the whole time. This is called parameter decay, and it's the reason the following code is one of the classic C bug traps.

void print_array(int arr[]) {
    int n = sizeof(arr) / sizeof(arr[0]);  // BUG
    for (int i = 0; i < n; i++) {
        printf("%d\n", arr[i]);
    }
}

int main(void) {
    int data[5] = {10, 20, 30, 40, 50};
    print_array(data);  // prints... what?
}

Inside print_array, sizeof(arr) is 8 (pointer size on a 64-bit system), and sizeof(arr[0]) is 4. So n is 2. The function prints the first two elements and stops, and if you didn't know about parameter decay, you could spend an hour staring at this wondering where the other three went.

The fix is the one every serious C codebase uses: pass the size alongside the pointer. Always.

Broken
void print_array(int arr[]) {
    int n = sizeof(arr) / sizeof(int);
    for (int i=0; i<n; i++)
        printf("%d\n", arr[i]);
}
Correct
void print_array(int* arr, size_t n) {
    for (size_t i=0; i<n; i++)
        printf("%d\n", arr[i]);
}
// call site:
print_array(data, 5);

This pattern, a pointer paired with a length, is the universal currency of C array-handling. memcpy, fread, fwrite, strncpy, and qsort all take the pointer and the size separately, and so does practically every POSIX call that accepts a buffer, because inside a function, the pointer has no idea how big the array was.

Once an array is passed to a function, its size is gone. Pass the size yourself, or live with the consequences.

In C++, this is where you start to appreciate std::array<int, 5>, std::vector<int>, and std::span<int>. All three preserve size information across function boundaries because they're objects, not raw arrays. Modern C++ code avoids raw arrays for this exact reason. We'll meet these later in the series, but if you're writing new C++, prefer them.


Pointer arithmetic

Pointer arithmetic is the other place where "arrays and pointers behave the same" comes from. You can do it on both, and the rules are identical: arithmetic is measured in elements, not bytes.

int arr[5] = {10, 20, 30, 40, 50};
int* p = arr;      // p points at arr[0]

p + 1;               // advances by sizeof(int) bytes, i.e. 4. points at arr[1].
p + 3;               // advances by 12 bytes. points at arr[3].
*(p + 2);            // 30. equivalent to arr[2].

When you write p + 1, the compiler doesn't add literal 1 to the address. It adds sizeof(*p) to the address. For int* that's 4 bytes. For char* it's 1 byte. For double* it's 8 bytes. The stride is determined by the pointer's type.

This is why char* is the universal choice for byte-level work. Incrementing a char* moves exactly one byte, which is what you want when you're poking around in raw memory.

Drag the slider. Watch the stride change with the type.
[Interactive widget: choose a pointer type and an offset k. Starting from p = 0x1000, it shows p + k = 0x1000 + k × sizeof(*p) bytes.]

Pick a pointer type, then drag the offset slider. Notice how the same p + 3 lands on wildly different addresses depending on whether p is a char*, int*, or double*. Pointer arithmetic speaks in elements. The type is what tells the compiler how big one element is.

Subtraction between two pointers is also defined, and it returns the number of elements between them, not bytes:

int arr[5];
int* first = &arr[0];
int* last  = &arr[4];

last - first;       // 4, not 16. elements, not bytes.

The one-past-the-end fence

Here's a rule that most tutorials skip or bury in a footnote, and which catches almost every C programmer at least once. Pointer arithmetic isn't a free-for-all. There's a fence, and crossing it is undefined behaviour even if you never actually touch the memory.

The C standard defines pointer arithmetic only inside an array, plus exactly one position past the end. That one-past-the-end position is legal to form and legal to compare against. Dereferencing it, however, is still undefined behaviour. That "one past the end" exists specifically so this iteration idiom works:

int arr[5] = {10, 20, 30, 40, 50};

for (int* p = arr; p < arr + 5; p++) {
    printf("%d ", *p);
}
// arr + 5 is the fence. p reaches it, the loop ends, nothing gets dereferenced.

Here's where the trap opens up. You might think that computing an address beyond the fence is fine as long as you don't dereference it. After all, the CPU just does integer addition. Nothing should go wrong. But the C standard is stricter than the hardware. Look at this:

int arr[5];

arr + 0;   // valid. points at arr[0].
arr + 4;   // valid. points at arr[4].
arr + 5;   // valid. the one-past-the-end sentinel.
arr + 6;   // UNDEFINED BEHAVIOUR. even just forming this is illegal.
arr + 10;  // UB. doesn't matter if you dereference or not.

This feels wrong the first time you see it. The code runs. The output looks fine. But the standard says the compiler is allowed to assume you never step past the fence, and modern compilers actively exploit that assumption when optimising. A compiler that sees arr + k can reason "k must be between 0 and 5, or the code would be UB, so I can skip some bounds checks." If your runtime value happens to be 10, the program's behaviour is undefined and the optimiser was well within its rights to delete code that would have caught the problem.

Pointer arithmetic has a fence at one-past-the-end. Even forming a pointer beyond that fence is undefined, not just dereferencing it.

The same rule applies to subtraction. You can take the difference of two pointers as long as they both live in the same array (or one is the one-past-the-end of that array). Subtract pointers to two completely unrelated objects and you're in UB territory, even though the machine would happily give you an answer. This is why tools like AddressSanitizer and UBSan exist. They catch this kind of silent corruption before it ships.

If you take one thing away from this section, take this: pointer arithmetic lives inside arrays. The fence is at one-past-the-end. Stay inside the fence or accept that the compiler might do anything.


arr, &arr[0], and &arr

Before we close out, one more distinction that trips up almost everyone the first time they meet it. Given int arr[5], you might reasonably call any of these "the address of the array":

  • arr. The array name, which decays to a pointer to the first element.
  • &arr[0]. Explicitly, the address of element zero. Same as above, just spelled out.
  • &arr. The address of the whole array as a single object.

Numerically, all three are the same address. They all point at the first byte of the array. But the C type system treats them differently, and that difference bites the moment you do arithmetic.

arr      // type: int* (after decay)   pointer to one int
&arr[0]  // type: int*                 same
&arr     // type: int (*)[5]           pointer to an ARRAY of 5 ints

That last type, int (*)[5], reads from the inside out as "pointer to an array of 5 ints." The parentheses matter. Without them, int* [5] would mean "array of 5 pointers to int", which is a completely different thing.

The place this distinction becomes most visible is pointer arithmetic. + 1 advances by the size of whatever the pointer points at. For arr, that's one int (4 bytes). For &arr, that's one entire array of 5 ints (20 bytes). Don't take my word for it. Run the code and watch the numbers.

Code (C)
#include <stdio.h>

int main(void) {
    int arr[5] = {10, 20, 30, 40, 50};

    // 1. Same numeric address from all three
    printf("arr       = %p\n", (void*)arr);
    printf("&arr[0]   = %p\n", (void*)&arr[0]);
    printf("&arr      = %p\n\n", (void*)&arr);

    // 2. But sizeof reveals different types
    printf("sizeof(arr)     = %zu\n", sizeof(arr));
    printf("sizeof(&arr[0]) = %zu\n", sizeof(&arr[0]));
    printf("sizeof(&arr)    = %zu\n\n", sizeof(&arr));

    // 3. Pointer arithmetic is the real tell
    printf("arr + 1      = %p  (+4 bytes: one int)\n",
           (void*)(arr + 1));
    printf("&arr + 1     = %p  (+20 bytes: one whole array!)\n\n",
           (void*)(&arr + 1));

    // 4. Confirm the byte distance
    long d1 = (char*)(arr + 1) - (char*)arr;
    long d2 = (char*)(&arr + 1) - (char*)&arr;
    printf("arr+1  - arr   = %ld bytes\n", d1);
    printf("&arr+1 - &arr  = %ld bytes\n", d2);

    return 0;
}
Output on a 64-bit Linux machine
$ gcc -Wall -o demo demo.c && ./demo

# 1. Same numeric address from all three
arr       = 0x7ffd2e4a8ab0
&arr[0]   = 0x7ffd2e4a8ab0
&arr      = 0x7ffd2e4a8ab0

# 2. But sizeof reveals different types
sizeof(arr)     = 20
sizeof(&arr[0]) = 8
sizeof(&arr)    = 8

# 3. Pointer arithmetic is the real tell
arr + 1      = 0x7ffd2e4a8ab4  (+4 bytes: one int)
&arr + 1     = 0x7ffd2e4a8ac4  (+20 bytes: one whole array!)

# 4. Confirm the byte distance
arr+1  - arr   = 4 bytes
&arr+1 - &arr  = 20 bytes

Read the output top to bottom. Section 1 confirms all three expressions point at the same byte. Section 2 shows that even though they're addresses of the same thing, sizeof(arr) sees the full array (20 bytes), while the other two are plain 8-byte pointers. Section 3 is the punchline. arr + 1 walks one int forward (4 bytes), but &arr + 1 walks an entire array forward (20 bytes), because each "step" of an int (*)[5] is the size of the whole array. Section 4 confirms the byte math directly.

Same starting address, different stride. The type is what tells + 1 how far to jump.

The only everyday use for &arr is when you're working with 2D arrays, where each "element" of the outer array actually is a whole inner array. We'll get to that in a later post. For now, what you need to remember is simple. arr and &arr[0] are the same thing. &arr is a different type that happens to share the same starting address.


A small tour of what this all explains

Now that you have the decay rule and pointer arithmetic clear, a bunch of C idioms that looked arbitrary start to make sense. Consider string handling.

char s[] = "hello";     // array of 6 chars (including '\0')
char* p = s;             // decays to &s[0]

while (*p) {                 // loop until we hit '\0'
    putchar(*p);
    p++;                      // advance by sizeof(char) = 1 byte
}

This is how strlen, strcpy, and every classic string function work internally. They take a char*, they walk through memory one byte at a time using pointer arithmetic, and they stop at the null terminator. They don't know the length up front because they can't. All they got is a pointer. This is also why buffer overflows exist. If the caller passed in a buffer that isn't null-terminated, or one that's smaller than the function assumes, the loop keeps going into memory it shouldn't touch.

Same pattern for traversing any buffer:

int sum(const int* arr, size_t n) {
    int total = 0;
    for (const int* p = arr; p < arr + n; p++) {
        total += *p;
    }
    return total;   // hand the result back to the caller
}

arr + n is the one-past-the-end fence we just talked about. Legal to form, not legal to dereference. This is the standard C iteration idiom, and it's exactly how std::begin and std::end work in C++. A pointer to the start and a pointer to one-past-the-end. Two pointers describe a range, and arithmetic walks you through it.

A lot of C is just "walk a pointer through memory until you hit a sentinel or a bound." Once you see that, half the standard library writes itself.

The C++ note

C++ inherits all of this, but pushes you toward safer alternatives. Raw arrays and pointers still work exactly the same way. Array-to-pointer decay still happens. The sizeof trap still applies. The one-past-the-end fence is still the same fence. All of it.

But C++ gives you tools that preserve size information and optionally bounds-check:

  • std::array<int, 5>. A fixed-size array that's a real object. It doesn't decay. arr.size() always works.
  • std::vector<int>. A dynamic array. Knows its size at runtime.
  • std::span<int>. A non-owning view (pointer + length). The safer replacement for raw (int*, size_t) parameters.

If you're writing modern C++, reach for these first. Raw C arrays are appropriate for tight performance code, interop with C, and low-level systems work. For everything else, the standard library has a better option.


Summing up

An array is a contiguous block of memory with a type that remembers its size. A pointer is a variable holding one address, and its type says nothing about how much memory sits at that address. They look interchangeable because arrays decay into pointers in most contexts, and because indexing is defined in terms of pointer arithmetic. The exceptions are sizeof, &, and declarations. That's where the distinction bites.

The specific things to never forget:

  • sizeof(arr) is the whole array. sizeof(p) is always 8 (or whatever your pointer size is). They are not the same.
  • Function parameters declared as int arr[5] are secretly int*. The size is a lie. Always pass size alongside.
  • Pointer arithmetic is in elements, not bytes. p + 1 advances by sizeof(*p).
  • Pointer arithmetic has a fence at one-past-the-end. Even forming a pointer beyond the fence is undefined behaviour.
  • Array-to-pointer decay happens in almost every expression. The places it doesn't are where bugs hide.

What's next

So far we've treated memory as something that just exists, without asking where it came from or when it goes away. That's the next step. Some memory lives on the stack, tied to function calls. Some lives on the heap, allocated and freed by you. Getting this wrong is the single biggest source of pointer bugs in real code. Next post: the stack, the heap, and what "lifetime" actually means.

Every pointer points at something. That something has a lifetime. That's next.

Test yourself

Six questions on what you just read. Try them before peeking. The first four test the core ideas; the last two are traps that catch almost everyone the first time.

Q1
What does this print, and why?
int arr[10];
int* p = arr;
printf("%zu %zu\n", sizeof(arr), sizeof(p));
Answer: 40 8 on a 64-bit system. sizeof(arr) sees an array of 10 ints and returns 10 * sizeof(int) = 40. sizeof(p) sees a pointer and returns the size of a pointer (8 on 64-bit, 4 on 32-bit). p doesn't remember where it came from. It's just an address.
Q2
What's wrong with this function, and what will it print when called with a 5-element array?
void print_all(int arr[]) {
    int n = sizeof(arr) / sizeof(arr[0]);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);
}
Parameter decay. It prints 2 elements on a 64-bit system. Inside the function, arr is an int*, not an array. The [] in the parameter is a lie the compiler tolerates. So sizeof(arr) is 8 (pointer size) and sizeof(arr[0]) is 4 (int). n = 2. The loop prints only the first two elements. Fix: pass the size as a separate parameter. This is why every standard C function that takes a buffer also takes a length.
Q3
Given int arr[5] and int* p = arr, what does p + 2 give you, and how many bytes forward does it move in memory?
p + 2 points to arr[2], which is 8 bytes forward. Pointer arithmetic is in elements, not bytes. p + 2 means "advance by 2 elements of *p's type." Since *p is an int (typically 4 bytes), the address moves forward by 2 * 4 = 8 bytes. If p were a char*, p + 2 would move 2 bytes. If it were a double*, 16 bytes. The type is what sets the stride.
Q4
Which of these statements are true? Select all that apply.
Answers: A, C, D.
A is true. That's array-to-pointer decay, the rule that makes them look interchangeable.
B is false. They have different types. int[5] vs int*. sizeof sees the difference, and so does anything that inspects the type (typeof, decltype, template deduction).
C is true. sizeof, &arr, and declarations are the three places arrays keep their identity.
D is true. The [5] in a parameter is stripped away. The compiler treats it as int* regardless of what size you write.
Q5
Given int arr[5], what's the difference between arr, &arr[0], and &arr? Which two are the same, and how does the third one differ?
Same address, different types. All three evaluate to the same numeric address, which is the start of the array. But they have different types.
arr (when it decays) is int*, a pointer to the first int.
&arr[0] is also int*, the same thing spelled out.
&arr is int (*)[5], a pointer to the whole array of 5 ints.
The difference bites when you do arithmetic. arr + 1 moves forward by one int (4 bytes). &arr + 1 moves forward by one whole array (20 bytes), landing just past the end. If you skipped the runnable demo above, go back and run it. The output shows this concretely.
Q6
Given int arr[5] = {10, 20, 30, 40, 50};, which of these are valid C, and which are undefined behaviour?

Select the expressions that are valid C. Leave the UB ones unselected.

Only A is valid. B, C, and D are all UB.
A is valid. arr + 5 is the one-past-the-end sentinel. You can form it, hold it, and compare against it. This is the whole reason loop idioms like for (p = arr; p < arr + n; p++) work.
B is UB. The pointer is valid but dereferencing it reads memory you don't own.
C is the one that catches most people. Even forming a pointer beyond one-past-the-end is undefined behaviour, not just dereferencing it. The C standard draws the fence at one past the last element. Step beyond it and the compiler is free to assume you never did, which can produce surprising bugs when optimisers exploit that assumption.
D is UB twice over. Pointer formation is UB, dereferencing would be UB. Either would sink you.
How did you do?
5 or 6 correct? You've got arrays and pointers locked in. Move on to Part 4. Got tripped up by Q5 or Q6? Those are the advanced corners, and they stick with another read-through.
