Let's say we have the following code:

    #include <stdlib.h>
    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        char *w = malloc(sizeof(long));
        long *n;
        *w = 'x';
        printf("some character: %c\n", *w);
        n = (long *)w;
        *n = LONG_MAX;
        printf("LONG_MAX: %ld", *n);
        return 0;
    }

on a system where long is more strictly aligned than char (which will be most systems).
If we didn't print the long object *n, the result of malloc would only be used for a char (which has alignment 1). That is, it seems that an intelligent compiler wouldn't have to be bound by the constraint that the result of malloc be compatible with any object type (see this question: how does malloc understand alignment?); it could thus have malloc return an arbitrarily aligned slice of memory.
With the additional lines of code that deal with n, such a compiler wouldn't be able to do that anymore, since the result of malloc now also has to be suitable for a long. As far as I understand, the above code doesn't violate alignment requirements or strict aliasing rules.
Is my understanding correct?
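Can a compiler optimize calls to malloc to return more weakly aligned memory if it knows what the memory is used for?
Similarly, is the compiled code allowed to rearrange memory behind the scenes to weaker alignment, as long as there are no alignment violations? That is: in the above case, if we omitted all lines dealing with n, would it be legal for the machine to change w to an arbitrarily aligned address immediately after the line char *w = malloc(sizeof(long));?
The opposite behavior would be that once memory has been assigned by malloc, it has to keep its maximally compatible alignment.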
As far as I understand, the above code doesn't violate alignment requirements or strict aliasing rules.
Is my understanding correct?
Yes, the dynamically allocated memory is guaranteed to have alignment sufficient for long, and the standard permits using dynamically allocated memory flexibly; storing to dynamically allocated memory with a new non-character object type changes its effective type to be the type used for the store.
We can analyze the requirements of the standard in three steps.
The C standard specifies the behavior of a program in an abstract machine. In this abstract machine:
- char *w = malloc(sizeof(long)); allocates memory suitably aligned for a long in all C standards from C 1999 to the prospective C 2023 (possibly more restrictively aligned in some versions of the standard). For purposes of discussion, we assume the malloc succeeds.
- *w = 'x'; writes the character “x” into the memory.
- printf("some character: %c\n", *w); prints the message with the character “x”.
- n = (long *)w; converts the address to long *. This is defined by C 2018 6.3.2.3 7, which supports converting between pointers to object types provided the address is suitably aligned, which we know it is.
- *n = LONG_MAX; writes LONG_MAX into the memory. This does not alias the memory as a different type because C 2018 6.5 6 says the effective type for dynamically allocated memory becomes the type used to store to it, so the memory is treated as having type long for this access.
- printf("LONG_MAX: %ld", *n); accesses the memory with type long, which is the current effective type of the memory, so this prints the message with the LONG_MAX value.

C 2018 5.1.2.3 says a conforming C implementation is only required to produce the observable behavior of the program; it does not have to reproduce every step of the abstract machine.
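The abstract-machine steps above can be sketched as a small C fragment. This is only an illustration, not part of the original question: the helper names is_aligned_for_long and store_char_then_long are hypothetical, and the alignment check assumes that converting a pointer to uintptr_t yields a plain numeric address, which is implementation-defined but holds on common platforms.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if p meets the alignment requirement of long.
   Assumes the pointer-to-uintptr_t conversion gives a flat numeric address,
   which is implementation-defined rather than guaranteed by the standard. */
int is_aligned_for_long(const void *p) {
    return (uintptr_t)p % _Alignof(long) == 0;
}

/* Hypothetical helper mirroring the question's sequence: store a char,
   then store a long into the same allocated memory.
   Returns the long that was read back, or 0 if malloc fails. */
long store_char_then_long(void) {
    char *w = malloc(sizeof(long));
    if (w == NULL) {
        return 0;
    }
    assert(is_aligned_for_long(w)); /* malloc memory is suitably aligned for long */
    *w = 'x';                       /* access through a character type */
    long *n = (long *)w;            /* conversion is defined: address is aligned */
    *n = LONG_MAX;                  /* effective type of the memory becomes long */
    long result = *n;               /* read through the current effective type */
    free(w);
    return result;
}
```

When the allocation succeeds, store_char_then_long() returns LONG_MAX, which matches the value the original program prints.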
So the observable behavior of the code is:
- Print “some character: x” and a new-line character.
- Print “LONG_MAX: ” followed by LONG_MAX in decimal.

Can a compiler optimize calls to malloc to return more weakly aligned memory if it knows what the memory is used for?
If the compiler produces a program that:
- prints “some character: x” and a new-line character, and
- prints “LONG_MAX: ” followed by LONG_MAX in decimal,

then it has satisfied the requirements of the C standard. So, if the compiler produces a program that uses more weakly aligned memory than it might be abstractly required to in C 2018 and the program prints those messages, then it has conformed to the C standard. For example, if the fundamental alignment requirement is 16 bytes, then the malloc in the abstract machine is required to produce an address aligned to a multiple of 16 bytes. However, that is not observable behavior, so the actual program produced by the compiler is not required to do that. As long as it prints the required messages, it conforms.
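To make the point about observable behavior concrete, here is a minimal sketch with two hypothetical functions, answer_via_heap and answer_direct. The allocation, store, load, and free in the first are not observable behavior, so an implementation is permitted (though not required) to compile it as if it were the second. Whether a particular compiler actually does so depends on its version and optimization settings, so treat this as an illustration of what the standard allows, not a claim about any specific toolchain.

```c
#include <assert.h>
#include <stdlib.h>

/* Abstract machine: allocate, store, load, free. None of these steps is
   observable behavior; only the returned value can affect program output. */
int answer_via_heap(void) {
    int *p = malloc(sizeof *p);
    if (p == NULL) {
        return 42;   /* keep the result identical if allocation fails */
    }
    *p = 42;
    int value = *p;
    free(p);
    return value;
}

/* An implementation may treat the function above as if it were this one,
   with no allocation (and hence no alignment decision) at all. */
int answer_direct(void) {
    return 42;
}

/* Both functions yield the same result, so a caller cannot tell them apart. */
void check_same_result(void) {
    assert(answer_via_heap() == answer_direct());
}
```

Since no caller can distinguish the two functions through their results, the address and alignment that malloc would have produced in the abstract machine never need to exist in the compiled program.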
Similarly, is the compiled code allowed to rearrange memory behind the scenes to weaker alignment, as long as there are no alignment violations? That is: in the above case, if we omitted all lines dealing with n, would it be legal for the machine to change w to an arbitrarily aligned address immediately after the line char *w = malloc(sizeof(long));?
If the n lines were removed, the observable behavior of the program would be:
- Print “some character: x” and a new-line character.
As long as the compiler produced a program that printed that message, the alignment of any memory used to do that would be irrelevant to whether it conformed to the C standard or not. All the C standard requires is that that message be printed.