Recently I stumbled over a comparison between Rust and C and they use the following code:
    bool f(int* a, const int* b) {
        *a = 2;
        int ret = *b;
        *a = 3;
        return ret != 0;
    }

In Rust (same code, but with Rust syntax), it produces the following assembly:
    cmp dword ptr [rsi], 0
    mov dword ptr [rdi], 3
    setne al
    ret

While with gcc it produces the following:
    mov DWORD PTR [rdi], 2
    mov eax, DWORD PTR [rsi]
    mov DWORD PTR [rdi], 3
    test eax, eax
    setne al
    ret

The text claims that the C compiler can't optimize the first store (*a = 2) away, because a and b could point to the same number. In Rust this is not allowed, so the compiler can optimize it away.
Now to my question:
The function takes a const int*, which is a pointer to a const int. I read this question and it states that modifying a const int through a pointer should result in a compiler warning and, in the worst case, in UB.
Could this function result in UB if I call it with two pointers to the same integer?
Why can't the C compiler optimize the first line away, under the assumption, that two pointers to the same variable would be illegal/UB?
Link to godbolt
Because the data type being pointed to is const, the value being pointed to can't be changed through that pointer. We can also make the pointer itself constant. A const pointer is a pointer whose stored address cannot be changed after initialization.
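When an object is itself defined as const, the compiler can optimize it away by not providing storage for it; instead the value can be kept in the symbol table, so a subsequent read just needs a lookup in the symbol table rather than instructions to fetch the value from memory. That does not apply to a const int* parameter, which may point to an object that is not const.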
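To make the difference between a pointer to const and a const pointer concrete, here is a minimal sketch. The function name pointer_kinds and the variable names are made up for illustration; the lines marked "would not compile" are left as comments.

```c
// Hypothetical demo function, not from the original post.
int pointer_kinds(void) {
    int x = 1;
    int y = 2;

    const int *p = &x;  // pointer to const int: *p is read-only through p
    int *const q = &x;  // const pointer to int: q always points at x

    p = &y;             // fine: p itself may be re-pointed
    // *p = 5;          // would not compile: assignment through pointer to const

    *q = 10;            // fine: x is not const, so writing through q is allowed
    // q = &y;          // would not compile: q itself is const

    return *p + *q;     // reads y (2) and x (10)
}
```

Note that x is modified through q even though p pointed at it as a const int earlier: the const in const int* restricts the pointer, not the object.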
Why can't the C Compiler optimize the first line away, under the assumption, that two pointers to the same variable would be illegal/UB?
Because you haven't instructed the C compiler to do so -- that it is allowed to make that assumption.
C has a type qualifier for exactly this called restrict which roughly means: this pointer does not overlap with other pointers (not exactly, but play along).
The assembly output for
    bool f(int* restrict a, const int* b) {
        *a = 2;
        int ret = *b;
        *a = 3;
        return ret != 0;
    }

is
    mov eax, DWORD PTR [rsi]
    mov DWORD PTR [rdi], 3
    test eax, eax
    setne al
    ret

... which removes/optimizes-away the assignment *a = 2
From https://en.wikipedia.org/wiki/Restrict
In the C programming language, restrict is a keyword that can be used in pointer declarations. By adding this type qualifier, a programmer hints to the compiler that for the lifetime of the pointer, only the pointer itself or a value directly derived from it (such as pointer + 1) will be used to access the object to which it points.
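As a rough sketch of what that promise means for callers: with restrict on a, the caller is responsible for passing pointers to distinct objects. The names f_restrict and call_distinct below are hypothetical and only serve the illustration.

```c
#include <stdbool.h>

// The restrict variant from this answer, renamed for the sketch.
bool f_restrict(int *restrict a, const int *b) {
    *a = 2;
    int ret = *b;
    *a = 3;
    return ret != 0;
}

// Hypothetical caller that honours the restrict promise:
// x and y are separate objects, so a and b never overlap.
bool call_distinct(int b_value) {
    int x = 0;
    int y = b_value;
    return f_restrict(&x, &y);
    // f_restrict(&x, &x) would break the promise: the object is written
    // through a and read through b, which is undefined behavior.
}
```

Because the compiler may now assume *b is unaffected by *a = 2, it is free to drop that store, which matches the shorter assembly shown above.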
The function int f(int *a, const int *b); promises to not change the contents of b through that pointer... It makes no promises regarding access to variables through the a pointer.
If a and b point to the same object, changing it through a is legal (provided the underlying object is modifiable, of course).
Example:
int val = 0; f(&val, &val);
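Here is a small sketch of why the first store matters in that call. It reuses f from the question unchanged; the wrapper call_aliased is a hypothetical helper added only for the demonstration.

```c
#include <stdbool.h>

// Same function as in the question.
bool f(int *a, const int *b) {
    *a = 2;
    int ret = *b;   // if b aliases a, this reads the 2 just stored
    *a = 3;
    return ret != 0;
}

// Hypothetical helper: passes the same object through both parameters.
bool call_aliased(int *val) {
    return f(val, val);
}
```

With val starting at 0, call_aliased returns true because *b observes the value 2 written through a. With two separate zero-initialized ints, f returns false. A compiler that dropped *a = 2 would return false in the aliased case as well, changing the result of a well-defined program.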