Here is an SSCCE:
class Vec final {
    public:
        float data[4];
        inline Vec(void) {}
        inline ~Vec(void) {}
};
Vec operator*(float const& scalar, Vec const& vec) {
    Vec result;
    #if 1
        for (int k=0;k<4;++k) result.data[k]=scalar*vec.data[k];
    #else
        float const*__restrict src = vec.data;
        float *__restrict dst = result.data;
        for (int k=0;k<4;++k) dst[k]=scalar*src[k];
    #endif
    return result;
}
int main(int /*argc*/, char* /*argv*/[]) {
    Vec vec;
    Vec scaledf = 2.0f * vec;
    return 0;
}
When compiling, MSVC 2013 informs me (/Qvec-report:2) that
main.cpp(11) : info C5002: loop not vectorized due to reason '1200'
This means that the "[l]oop contains loop-carried data dependences".
I have noticed that commenting out either the constructor or the destructor of Vec (edit: or defaulting them, e.g. Vec()=default;) makes the loop vectorize successfully. My question: why?
Note: Toggling the #if (so that the loop goes through the __restrict pointers) also makes it vectorize. The __restrict qualifiers are important; without them it does not.
Note: Changing float const& scalar to float const scalar makes the vectorizer report 1303 instead (vectorization wouldn't be a win). I suspect this is because the reference can be passed directly into an SSE register, while pass-by-value needs another copy.
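Why do you declare an empty non-virtual destructor inline ~Vec(void) {} together with an empty default constructor inline Vec(void) {}?
Because the destructor is user-declared, the compiler does not generate a move constructor or move assignment operator for Vec. The copy constructor is still generated implicitly (that behaviour is deprecated since C++11), so return result; compiles, but whenever the copy is not elided it copies result into a temporary returned object, which may not be what you want. The user-provided constructor and destructor also make Vec a non-trivial type, and that may be what prevents the vectorizer from proving the loop free of dependences.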
Either define a copy constructor, or don't define the empty constructor and destructor at all.
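The second option (dropping or defaulting the empty constructor and destructor) can be sketched as follows. This is only an illustration under stated assumptions: the names VecUserProvided and VecDefaulted are hypothetical and not from the question, the scalar is taken by value for brevity, and the static_asserts only show what the language guarantees about triviality. They do not demonstrate what the MSVC vectorizer does with either type.

```cpp
#include <type_traits>

// Hypothetical name: same shape as the Vec in the question, with
// user-provided (empty) constructor and destructor.
class VecUserProvided final {
public:
    float data[4];
    VecUserProvided() {}
    ~VecUserProvided() {}
};

// Hypothetical name: the same class with both members defaulted.
class VecDefaulted final {
public:
    float data[4];
    VecDefaulted() = default;
    ~VecDefaulted() = default;
};

// An empty user-provided body still makes the member non-trivial.
static_assert(!std::is_trivially_default_constructible<VecUserProvided>::value,
              "user-provided constructor is not trivial");
static_assert(!std::is_trivially_destructible<VecUserProvided>::value,
              "user-provided destructor is not trivial");

// Defaulting on the first declaration keeps both members trivial.
static_assert(std::is_trivially_default_constructible<VecDefaulted>::value,
              "defaulted constructor is trivial");
static_assert(std::is_trivially_destructible<VecDefaulted>::value,
              "defaulted destructor is trivial");

// The copy constructor is implicitly generated in both cases.
static_assert(std::is_copy_constructible<VecUserProvided>::value,
              "implicit copy constructor is still available");

// Same loop as in the question, written against the defaulted type.
VecDefaulted operator*(float scalar, VecDefaulted const& vec) {
    VecDefaulted result;
    for (int k = 0; k < 4; ++k) result.data[k] = scalar * vec.data[k];
    return result;
}
```

With this variant, 2.0f * vec returns a VecDefaulted whose elements are twice those of vec. Whether the loop then vectorizes should still be checked with /Qvec-report:2 on the compiler in question.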