Friday, June 13, 2008

The Fine Distinction Between a Pointer and an Array

I keep finding crashes caused by people not comprehending what C pointers and arrays really are.

Definitions:
1. array = a variable that was declared as a collection of objects:
char x[100];
be it a global or, even worse, on the stack;
2. pointer = a variable that holds the address of another object, such as a malloc'ed buffer or an existing variable:
char* x1 = (char*)malloc(100);
or
char c;
char* x2 = &c;
A pointer may live as a global or on the stack; either way its own memory footprint is 4 bytes (on a 32-bit architecture), no matter how large the object it points to.
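
A quick way to see the difference in footprint is to ask sizeof. This is only a sketch: array_footprint and pointer_footprint are names I made up for illustration, and the pointer size depends on the platform (typically 4 bytes on 32-bit, 8 on 64-bit).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

char x[100];   /* array: the variable itself is 100 bytes of storage */

/* sizeof applied to an array yields the size of the whole array. */
static_assert(sizeof(x) == 100, "an array is its own storage");

/* Hypothetical helper: footprint of the array object. */
size_t array_footprint(void)
{
    return sizeof(x);
}

/* Hypothetical helper: footprint of a pointer variable, which does not
   depend on how much memory it points to. */
size_t pointer_footprint(void)
{
    char *x1 = malloc(100);
    size_t n = sizeof(x1);   /* same as sizeof(char *), not 100 */
    free(x1);
    return n;
}
```

So sizeof(x) reports the buffer, while sizeof(x1) reports only the variable that holds the buffer's address.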

To many people pointers are arrays and arrays are pointers. True in most expressions, where an array name is converted to a pointer to its first element, but not all the time.

To simplify the discussion let us assume that a segfault occurs whenever we go beyond the boundaries of a buffer, regardless of the access type (read/write), size and alignment.

Let's assume you populated x and x1 with a 99-byte message and you say:
write(STDOUT_FILENO, &x, 99); // mind the \0!
This is correct.

If you say:
write(STDOUT_FILENO, &x[0], 99); // mind the \0!
This is also correct.

If you say:
write(STDOUT_FILENO, x1, 99); // mind the \0!
This is correct.

If you say:
write(STDOUT_FILENO, &x1, 99); // mind the \0!
This is a SEGFAULT because you will try to access the (99-4) bytes on the stack that follow x1; there is nothing mapped there and your program will crash and burn!
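
Part of the problem is that write() takes a const void * buffer, so any object pointer is accepted silently, &x1 included. One possible guard, sketched here under the assumption that a small wrapper is acceptable in your code base, is a typed wrapper that only takes a pointer to char. The names write_chars and send_both are hypothetical, not part of any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical typed wrapper: the buffer must be a pointer to char,
   so passing a char ** draws a compiler diagnostic. */
ssize_t write_chars(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    return write(fd, buf, len);
}

/* Sends the same kind of 99-byte message from an array and from a
   malloc'ed buffer. Returns the total bytes written, or -1 on failure. */
ssize_t send_both(int fd)
{
    char x[100];
    char *x1 = malloc(100);
    if (x1 == NULL)
        return -1;

    memset(x, 'A', 99);
    x[99] = '\0';
    memset(x1, 'B', 99);
    x1[99] = '\0';

    ssize_t a = write_chars(fd, x, 99);    /* array converts to &x[0] */
    ssize_t b = write_chars(fd, x1, 99);   /* pointer value: the buffer */
    /* write_chars(fd, &x1, 99);   char ** is not const char *: diagnosed */

    free(x1);
    return (a == 99 && b == 99) ? a + b : -1;
}
```

The price is that &x is rejected too, since its type is char (*)[100]; you pass x or &x[0] instead.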

The explanation is that, as address values (the types differ),
x == &x == &x[0]
but
x1 != &x1
as x1 means the allocated memory buffer that x1 points to, whereas &x1 means the address in memory where the pointer x1 itself lives!
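
The relations above can be checked directly. A minimal sketch, with check_array and pointer_addresses_differ as made-up helper names; the casts to void * are needed because x, &x and &x[0] have different types even though they hold the same address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* For an array, x, &x and &x[0] all hold the same address. */
void check_array(void)
{
    char x[100];

    assert((void *)x == (void *)&x);
    assert((void *)x == (void *)&x[0]);

    /* The types still differ: &x + 1 steps over the whole array,
       &x[0] + 1 steps over a single char. */
    assert((char *)(&x + 1) - (char *)x == 100);
    assert((&x[0] + 1) - x == 1);
}

/* For a pointer, x1 is the address of the buffer and &x1 is the
   address of the pointer variable itself. */
bool pointer_addresses_differ(void)
{
    char *x1 = malloc(100);
    bool differ = (void *)x1 != (void *)&x1;
    free(x1);
    return differ;
}
```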

In real life it gets even worse if x1 lives on the stack and you write to &x1 -- here you put garbage on the stack and you may go past the red page. If there are some other auto vars on the stack after it and they happen to be pointers then you have a recipe for a clusterfuck.

If x1 is a global then you will corrupt other globals that live after x1. This is even more fun!

-ulianov