As the following code shows, using uint32_t for the index prevents the compiler (GCC 12.1 with -O3) from auto-vectorizing the loop. Why? See godbolt.
#include <cstdint>

// no auto vectorization
void test32(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to) {
    for (uint32_t i = from; i < to; i++) {
        array[nread++] = i;
    }
}

// auto vectorization
void test64(uint32_t *array, uint64_t &nread, uint32_t from, uint32_t to) {
    for (uint32_t i = from; i < to; i++) {
        array[nread++] = i;
    }
}

// no auto vectorization
void test_another_32(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to) {
    uint32_t index = nread;
    for (uint32_t i = from; i < to; i++) {
        array[index++] = i;
    }
    nread = index;
}

// auto vectorization
void test_another_64(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to) {
    uint64_t index = nread;
    for (uint32_t i = from; i < to; i++) {
        array[index++] = i;
    }
    nread = index;
}
After running g++ -O3 -fopt-info-vec-missed -c test.cc -o /dev/null, I got the following result. How should I interpret it?
bash> g++ -O3 -fopt-info-vec-missed -c test.cc -o /dev/null
test.cc:5:31: missed: couldn't vectorize loop
test.cc:6:24: missed: not vectorized: not suitable for scatter store *_5 = i_18;
test.cc:21:31: missed: couldn't vectorize loop
test.cc:22:24: missed: not vectorized: not suitable for scatter store *_4 = i_22;
Look at the function
void test32(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to)
and how it should behave if you call it like this:
uint32_t arr[16] = {};
test32(arr, arr[3], 0, 15);
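This is called aliasing. The nread parameter might alias an element of array because they have the same type. Every store to array could therefore change nread, so the compiler has to reload nread on each iteration and cannot prove that the stores go to consecutive addresses. That is most likely why GCC reports "not suitable for scatter store". But when you have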
void test64(uint32_t *array, uint64_t &nread, uint32_t from, uint32_t to)
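then the compiler may assume that no aliasing occurs: under the strict aliasing rule, a store through a uint32_t* cannot legally modify a uint64_t object, so writing to array cannot change nread.
Note: passing a reference to a function internally passes the address, so a reference is equivalent to a pointer as far as aliasing is concerned.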
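To make the aliasing case concrete, here is a minimal sketch. fill_split and demo_aliasing are hypothetical names, not part of the question. fill_split is a variant of test32 with the store and the increment written as separate statements, an assumption made so that the behavior stays well defined even when nread refers to an element of array.

```cpp
#include <array>
#include <cstdint>

// Hypothetical variant of test32: the store and the increment are separate
// statements so the behavior stays well defined even when nread aliases array.
void fill_split(uint32_t *array, uint32_t &nread, uint32_t from, uint32_t to) {
    for (uint32_t i = from; i < to; i++) {
        array[nread] = i;  // may overwrite nread itself
        ++nread;           // has to use the value of nread after the store
    }
}

// Passes arr[2] as the counter, so nread lives inside the array being filled.
std::array<uint32_t, 16> demo_aliasing() {
    std::array<uint32_t, 16> arr{};  // all zeros, so the counter starts at 0
    fill_split(arr.data(), arr[2], 10, 15);
    return arr;
}
```

In this call the stores land at indices 0, 1, 2, 13 and 14: the third store overwrites the counter with 12, and the next store goes to index 13. A compiler that assumed consecutive stores would produce a different result, which is why it has to treat the addresses as unknown.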
Some types have special rules and are called aliasing types. The C++ standard says that you can cast a uint32_t* to char* and then access the raw memory underlying the uint32_t. That means a uint32_t* and a char* can legally point at the same address. char* is an aliasing type because it aliases with any other type of (data) pointer. The same holds for unsigned char* and std::byte*.
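A small sketch of what the aliasing-type rule permits. fill_bytes is a hypothetical helper introduced only for illustration: it modifies a uint32_t through an unsigned char*, so the compiler must assume that such a pointer can touch any object.

```cpp
#include <cstddef>
#include <cstdint>

// Overwrites every byte of value through an unsigned char*, which is allowed
// to alias an object of any type.
void fill_bytes(uint32_t &value, unsigned char byte) {
    unsigned char *raw = reinterpret_cast<unsigned char *>(&value);
    for (std::size_t k = 0; k < sizeof value; k++) {
        raw[k] = byte;
    }
}
```

Calling fill_bytes(v, 0xFF) on a uint32_t v leaves v with all bits set, regardless of byte order.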
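But you can tell the compiler that two pointers are not allowed to alias, even if their types would permit it, by using restrict. restrict is a C keyword and not part of standard C++; GCC and Clang accept it in C++ as the extension __restrict (or __restrict__), which can also be applied to references.
void test32(uint32_t *array, uint32_t &__restrict nread, uint32_t from, uint32_t to)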
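As a sketch, the full function might look like the following. It uses the __restrict spelling, which GCC and Clang accept in C++ as an extension (standard C++ has no restrict keyword), and it marks both parameters. test32_restrict is a hypothetical name, and whether GCC 12.1 actually vectorizes this version has not been checked here.

```cpp
#include <cstdint>

// __restrict is a compiler extension (GCC, Clang), not standard C++.
// It promises that array and nread never refer to the same memory.
void test32_restrict(uint32_t *__restrict array, uint32_t &__restrict nread,
                     uint32_t from, uint32_t to) {
    for (uint32_t i = from; i < to; i++) {
        array[nread++] = i;
    }
}
```

The promise is the caller's responsibility: passing an element of array as nread to this version is undefined behavior.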
PS: test_another_32 looks like a missed compiler optimization.