We permute a vector in a few places, and we need an all-zero vector to pass to the vec_perm built-in. We have not been able to find a vec_zero() or similar, so we would like to know how to handle this.
The code currently uses two strategies. The first strategy is a vector load:
__attribute__((aligned(16)))
static const uint8_t z[16] =
    { 0,0,0,0,  0,0,0,0,  0,0,0,0,  0,0,0,0 };
const uint8x16_p8 zero = vec_ld(0, z);
The second strategy is an xor using the mask we intend to use:
__attribute__((aligned(16)))
static const uint8_t m[16] =
    { 15,14,13,12,  11,10,9,8,  7,6,5,4, 3,2,1,0 };
const uint8x16_p8 mask = vec_ld(0, m);
const uint8x16_p8 zero = vec_xor(mask, mask);
We have not run benchmarks yet, so we don't know whether one is better than the other. The first strategy uses a VMX load, which could be expensive. The second strategy avoids the load but introduces a data dependency on the mask.
How do we obtain a VSX value of zero?
I'd suggest letting the compiler handle it for you. Just initialise to zero:
const uint8x16_p8 zero = {0};
This will likely compile to an xor.
For example, a simple test:
vector char foo(void)
{
    const vector char zero = {0};
    return zero;
}
On my machine, this compiles to:
0000000000000000 <foo>:
   0:   d7 14 42 f0     xxlxor  vs34,vs34,vs34
   4:   20 00 80 4e     blr
    ...
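If you want to convince yourself that the brace initialiser really zeroes every lane, here is a minimal sketch. It assumes GCC or Clang. On POWER it uses the native __vector type; elsewhere it falls back to the generic vector_size extension so the check can be run on any machine. The helper names make_zero and all_lanes_zero are made up for this example, and the uint8x16_p8 typedef only mirrors the name used in the question.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: GCC or Clang. Use the native AltiVec/VSX type when it is
   available, otherwise the generic GCC vector extension. */
#if defined(__ALTIVEC__) || defined(__VSX__)
typedef __vector unsigned char uint8x16_p8;
#else
typedef uint8_t uint8x16_p8 __attribute__((vector_size(16)));
#endif

/* Hypothetical helper: build the zero vector with a brace initialiser.
   Lane 0 is set explicitly; the remaining lanes are zero-filled,
   just as with a partially initialised array. */
static uint8x16_p8 make_zero(void)
{
    const uint8x16_p8 zero = {0};
    return zero;
}

/* Hypothetical helper: copy the vector out to bytes and check each lane. */
static int all_lanes_zero(uint8x16_p8 v)
{
    uint8_t lanes[16];
    size_t i;

    memcpy(lanes, &v, sizeof lanes);
    for (i = 0; i < sizeof lanes; ++i) {
        if (lanes[i] != 0)
            return 0;
    }
    return 1;
}
```

Note that {0} works because the unnamed lanes are zero-filled, not because the value is splatted: {1} would set only the first lane to 1 and leave the other fifteen at 0.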