From K&R page 99:
As formal parameters in a function definition,
char s[];
and
char *s;
are equivalent; we prefer the latter because it says more explicitly that the variable is a pointer.
I need some clarification as to why "saying more explicitly that the variable is a pointer" would matter. I would expect the reason to be that the program is faster rather than more explicit, as stated on page 97:
The pointer version will in general be faster but, at least to the uninitiated, somewhat harder to understand
But in that case, why would the pointer version be faster? If arr[i] is just equivalent to *(arr+i), why would the program be faster? Is it because C doesn't need to convert the array to the pointer version?
Outside of a parameter list, char s[] and char *s are different.
void foo( void ) {
    char a[] = "abcdef"; // An array of 7 `char` elements.
    char *p = "abcdef";  // A pointer that points to a sequence of 7 `char` objects.
}
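One way to see the difference is to compare sizes. The following is a minimal sketch, not code from the book; `local_sizes` is a hypothetical helper name, and the pointer is declared `const char *` here only because string literals must not be modified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration): reports the size of a
   local array and of a local pointer through two output parameters. */
void local_sizes(size_t *array_size, size_t *pointer_size) {
    assert(array_size != NULL && pointer_size != NULL);

    char a[] = "abcdef";        /* array: 6 letters plus the terminating '\0' */
    const char *p = "abcdef";   /* pointer to the first char of a string literal */

    a[0] = 'A';                 /* fine: the array is our own writable copy */

    *array_size = sizeof a;     /* size of the whole array: 7 bytes */
    *pointer_size = sizeof p;   /* size of the pointer itself: sizeof(char *) */
}
```

The array's size is always 7 here, while the pointer's size depends on the platform and has nothing to do with the length of the string.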
In a parameter list, char s[] and char *s are 100% equivalent by definition.
void foo(
    char p1[], // A pointer that points to a `char` object.
    char *p2   // A pointer that points to a `char` object.
);
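A small sketch of what that equivalence allows in practice. The function names are hypothetical; the point is only that a prototype written with array syntax matches a definition written with pointer syntax, and that an "array" parameter can be incremented like any other pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Declared with array syntax... */
size_t count_chars(char s[]);

/* ...and defined with pointer syntax. The compiler accepts the pair because
   both declarations have the same type. */
size_t count_chars(char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (*s != '\0') {
        ++s;
        ++n;
    }
    return n;
}

/* Here s is written as an array but is still a pointer, so ++s is legal.
   Incrementing a real array would not compile. */
size_t count_chars_array_syntax(char s[]) {
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        ++n;
    }
    return n;
}
```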
It's saying that since s is a pointer, it's clearer to use char *s. Using char s[], while equivalent, could mislead "the uninitiated" into thinking it's an array when it's not.
Note that not everyone agrees with always using char *s instead of char s[] for parameters. I'm just explaining the opinion expressed by the book as requested.
Since char s[] and char *s are 100% equivalent in a parameter list, there is no difference in performance.
When talking about performance, I believe the book is referring to code that indexes pointers that stay fixed vs code that advances the pointers themselves, such as
for (size_t i = 0; i < n; ++i) dst[i] = src[i];
-vs-
for (size_t i = 0; i < n; ++i) *(dst++) = *(src++);
This is just a guess since the book doesn't explain to what it's referring.
I would largely ignore that statement. Even if it was true at the time the book was written, there have been lots of changes to CPUs and compilers since. More importantly, this focus on micro-optimizations is best avoided except when justified.
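For reference, here are the two loop styles wrapped in complete functions. This is a sketch under my own assumptions: the names `copy_indexed` and `copy_pointer` are hypothetical, and nothing here measures speed. It only shows that the two forms do the same work, so either can be chosen for readability.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed form: dst and src stay fixed, i selects the element. */
void copy_indexed(char *dst, const char *src, size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}

/* Pointer form: dst and src themselves advance on every iteration. */
void copy_pointer(char *dst, const char *src, size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i) {
        *dst++ = *src++;
    }
}
```

If the speed of a loop like this really matters, the reliable approach is to measure both versions with your own compiler and optimization flags.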
Here's a little more context for the quote about the pointer version being faster:
Any operation that can be achieved by array subscripting can also be done with pointers. The pointer version will in general be faster but, at least to the uninitiated, somewhat harder to understand.
And, a few pages later:
There is one difference between an array name and a pointer that must be kept in mind. A pointer is a variable.... But an array name is not a variable.
[My page numbers don't match, so we must be reading different editions.]
If you have an array name, and you want to dereference to a particular offset, you first have to (1) load the array address into a register and (2) do the arithmetic to get to the offset. If you already have a pointer to the first element, then you've effectively already done the first step.
This doesn't matter for one-off accesses, but it might have mattered for an algorithm that makes repeated accesses, either sequentially or jumping around.
But the details are specific to the hardware and the compiler, both of which have evolved a lot since K&R wrote the comment about speed. Over the years, I've seen demonstrations comparing array-indexing and pointer-dereferencing versions of the same algorithm. Sometimes the faster one was indexed, sometimes it was by pointer. As the years went by, the difference was generally less and less significant.
Both methods are valid, and nowadays there usually isn't a performance difference. It's important to understand both. But when deciding which to use, pick the one that more naturally expresses the algorithm.
we prefer the latter because it says more explicitly that the variable is a pointer.
K&R's preference for the pointer notation is an attempt to remind the reader that pointer decay happens. If you forget that a function parameter is actually a pointer to the first element of an array rather than the array itself, you can get tripped up.
For example, it is (or once was) common for C programmers to use a preprocessor macro like this to determine how many elements are in an array:
#define ARRAYSIZE(a) (sizeof(a) / sizeof(a[0]))
That works as long as a is actually an array. But if a is actually a pointer to the array's first element, you'll get an incorrect answer. So you cannot rewrite that macro as a function, because the argument will decay to a pointer.
#include <stdio.h>

/* This function is wrong. */
size_t arraysize(int a[]) {
    return sizeof(a)/sizeof(a[0]); /* a is an int *, so this is sizeof(int *)/sizeof(int) */
}

int main(void) {
    int a[3];
    printf("%zu\n", ARRAYSIZE(a)); /* prints 3 */
    printf("%zu\n", arraysize(a)); /* prints something else */
}
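The usual workaround is to compute the element count where the array is still an array and pass it along as a separate argument. A minimal sketch, assuming a hypothetical `sum_ints` function that is not part of the original example:

```c
#include <assert.h>
#include <stddef.h>

#define ARRAYSIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical function: sums n ints. The caller supplies the count because
   the function only receives a pointer to the first element. */
long sum_ints(const int *values, size_t n) {
    assert(values != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

A caller that owns `int a[3]` would write `sum_ints(a, ARRAYSIZE(a))`, applying the macro before the array decays.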