Suppose we take a very big array of unsigned chars.
std::array<uint8_t, 100500> blob;
// ... fill array ...
(Note: it is already aligned; the question is not about alignment.)
Then we treat it as a uint64_t[] and try to access it:
const auto ptr = reinterpret_cast<const uint64_t*>(blob.data());
std::cout << ptr[7] << std::endl;
Casting to uint64_t* and then reading through it looks suspicious to me.
But neither UBSan nor -Wstrict-aliasing complains about it.
Google uses this technique in FlatBuffers.
Cap'n Proto uses it too.
Is it undefined behavior?
You cannot access the value of an unsigned char object through a glvalue of another type. The opposite is allowed: you can access the value of any object through an unsigned char glvalue [basic.lval]:
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: [...]
- a char, unsigned char, or std::byte type.
So, to be 100% standard compliant, the idea is to reverse the direction of the access: copy the bytes into a uint64_t instead of reading the bytes through a uint64_t pointer:
uint64_t i;
std::memcpy(&i, blob.data() + 7*sizeof(uint64_t), sizeof(uint64_t));
std::cout << i << std::endl;
With optimizations enabled, mainstream compilers typically generate the same assembly for this as for the cast version.
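To make the memcpy approach reusable, it can be wrapped in a small helper. This is only a sketch: read_u64 and its parameters are hypothetical names, not something from the question, and the bounds check is an added assumption about how one would want to guard the read.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original post): returns the index-th
// uint64_t-sized chunk of a byte buffer without reading through a
// uint64_t pointer, so no strict-aliasing rule is violated.
inline std::uint64_t read_u64(const std::uint8_t* bytes, std::size_t size,
                              std::size_t index) {
    const std::size_t offset = index * sizeof(std::uint64_t);
    // Debug-only guard: the whole chunk must lie inside the buffer.
    assert(offset + sizeof(std::uint64_t) <= size);
    std::uint64_t value;
    // Copying the object representation byte by byte is always permitted.
    std::memcpy(&value, bytes + offset, sizeof(value));
    return value;
}
```

With this, the read in the question would become read_u64(blob.data(), blob.size(), 7). The value is interpreted in the host's byte order, exactly as in the cast version.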
The cast itself is well defined (a reinterpret_cast never has UB), but the lvalue-to-rvalue conversion in the expression "ptr[7]" would be UB if no uint64_t object has been constructed at that address.
Since "// ... fill array ..." is not shown, a uint64_t object could have been constructed at that address (assuming, as you say, that the address is sufficiently aligned):
const uint64_t* p = new (blob.data() + 7 * sizeof(uint64_t)) uint64_t();
If a uint64_t object has been constructed at that address, then the code in question has well defined behaviour.
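Note that a placement new with uint64_t() value-initializes the object, so it overwrites whatever bytes were already there with zero. A minimal sketch of one way to start the object's lifetime while keeping the existing bytes follows; make_u64_at is a hypothetical name, and the approach (save the bytes with memcpy, then construct from the saved value) is an assumption about what one would want here, not something stated in the answer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical helper: constructs a uint64_t at `where`, initialized from
// the bytes currently stored there, and returns a pointer to the new object.
// The caller must guarantee 8 writable bytes at a suitably aligned address.
inline std::uint64_t* make_u64_at(unsigned char* where) {
    // Debug-only guard: placement new requires a suitably aligned address.
    assert(reinterpret_cast<std::uintptr_t>(where) % alignof(std::uint64_t) == 0);
    std::uint64_t saved;
    std::memcpy(&saved, where, sizeof(saved));  // keep the existing bytes
    return new (where) std::uint64_t(saved);    // begin the object's lifetime
}
```

Reads should then go through the pointer returned by the new-expression rather than through a separately cast pointer.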