Here is an SSCCE:
class Vec final {
    public:
        float data[4];
        inline Vec(void) {}
        inline ~Vec(void) {}
};
Vec operator*(float const& scalar, Vec const& vec) {
    Vec result;
    #if 1
        for (int k=0;k<4;++k) result.data[k]=scalar*vec.data[k];
    #else
        float const*__restrict src =    vec.data;
        float      *__restrict dst = result.data;
        for (int k=0;k<4;++k) dst[k]=scalar*src[k];
    #endif
    return result;
}
int main(int /*argc*/, char* /*argv*/[]) {
    Vec vec;
    Vec scaledf = 2.0f * vec;
    return 0;
}
When compiling, MSVC 2013 informs me (/Qvec-report:2) that
main.cpp(11) : info C5002: loop not vectorized due to reason '1200'
This means that the "[l]oop contains loop-carried data dependences".
I have noticed that commenting out either the constructor or the destructor for Vec (edit: or defaulting them, e.g. Vec()=default;) causes it to vectorize successfully. My question: why?
Note: Toggling the #if will also make it work.  The __restrict is important.
Note: Changing float const& scalar to float const scalar causes the vectorization to report 1303 (vectorization wouldn't be a win), I suspect because the reference can be passed directly into an SSE register while the pass-by-value needs another copy.
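Why do you declare an empty non-virtual destructor inline ~Vec(void) {} together with an empty default constructor inline Vec(void) {}?
Because you provide them yourself, Vec is no longer a trivial type, and the user-declared destructor stops the compiler from generating an implicit move constructor and move assignment operator. The copy constructor is still generated implicitly (although relying on that is deprecated once a destructor is user-declared), so return result; compiles, but it formally copies result into the returned temporary unless the copy is elided, which may not be what you want.
Either define the copy constructor explicitly, or don't define the empty constructor and destructor at all (defaulting them, as noted in the question, has the same effect).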
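As a rough sketch of the second option, here is the question's Vec with the empty constructor and destructor replaced by defaulted ones. The static_assert checks only confirm that the type is then trivial as far as the language is concerned; whether a particular compiler vectorizes the loop as a result still has to be checked with its own report (the question states that MSVC 2013 does). The helper self_check is a hypothetical name added purely for illustration and is not part of the original code.

```cpp
#include <cassert>
#include <type_traits>

// Same layout as the question's Vec, but the special members are
// defaulted rather than user-provided empty bodies.
class Vec final {
    public:
        float data[4];
        Vec() = default;
        ~Vec() = default;
};

// With no user-provided special members, the type stays trivial.
static_assert(std::is_trivially_destructible<Vec>::value,
              "Vec should be trivially destructible");
static_assert(std::is_trivially_copyable<Vec>::value,
              "Vec should be trivially copyable");

// Unchanged from the question (the #if 1 branch).
Vec operator*(float const& scalar, Vec const& vec) {
    Vec result;
    for (int k=0;k<4;++k) result.data[k]=scalar*vec.data[k];
    return result;
}

// Hypothetical helper (not in the original): fills a Vec, scales it,
// and checks each component. Call it from main to run the check.
void self_check() {
    Vec v;
    for (int k=0;k<4;++k) v.data[k]=static_cast<float>(k+1);
    Vec scaled = 2.0f * v;
    for (int k=0;k<4;++k) assert(scaled.data[k]==2.0f*static_cast<float>(k+1));
}
```

Removing both declarations entirely should behave the same way as defaulting them, since in both cases the compiler supplies the special members.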