
Specify the SIMD level the compiler can use for a function

Tags: c, gcc, simd

I wrote some code and compiled it with gcc using the native architecture option (-march=native).

Typically I can take this code and run it on an older computer that doesn't have AVX2 (only AVX), and it works fine. It seems however that the compiler is actually emitting AVX2 instructions (finally!), rather than me needing to include SIMD intrinsics myself.

I'd like to modify the program so that both pathways are supported (AVX2 and non-AVX2). In other words, I'd like something like the following pseudocode.

if (AVX2){
   callAVX2Version();
}else if (AVX){
   callAVXVersion();
}else{
   callSSEVersion();
}

void callAVX2Version(){
#pragma gcc -mavx2
}

void callAVXVersion(){
#pragma gcc -mavx
}

I know how to do the runtime detection part. My question is whether it is possible to do the function-specific SIMD selection part.

Jimbo asked Sep 19 '25 06:09


1 Answer

The simple and clean option

The gcc target attribute can be used directly, like so:

[[gnu::target("avx")]]
void foo(){}

[[gnu::target("default")]]
void foo(){}

[[gnu::target("arch=sandybridge")]]
void foo(){}

The call then becomes:

foo();

This option does away with the need to name each version differently. If you check the generated assembly on godbolt, for example, you will see that the compiler emits foo as a @gnu_indirect_function and ties it to a .resolver function. The resolver reads __cpu_model to find out what the CPU supports and binds the indirect function to the matching version, so any subsequent call is a simple indirect function call. Simple, isn't it? But you might need to stay closer to your original code base, so there are other ways.

Function switching

If you do need function switching like in your original example, the following can be used. It relies on clearly named builtins, so it is obvious that you are switching on architecture.
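When every version has the same body and you only want the compiler to vectorize it differently per target, the related target_clones attribute may save the repetition. The following is a minimal sketch, not part of the original answer: the function sum_squares is hypothetical, and it assumes GCC (or a compatible compiler) on x86-64 with indirect-function support, such as Linux with glibc. As far as I know, GCC documents the same-name target versions above as a C++ feature on x86, whereas target_clones is also available in C, which may matter for the question's c tag.

```cpp
#include <cstddef>

// Hypothetical example function; the name and body are illustrative only.
// Assumes x86-64 and a platform with ifunc support (e.g. Linux with glibc).
// The compiler generates one clone per listed target plus a resolver that
// picks the best supported clone for the running CPU.
[[gnu::target_clones("avx2", "avx", "default")]]
int sum_squares(const int* values, std::size_t count)
{
    int total = 0;
    // A plain loop: each clone may be auto-vectorized for its own target.
    for (std::size_t i = 0; i < count; ++i)
        total += values[i] * values[i];
    return total;
}
```

Callers simply write sum_squares(data, n); the dispatch happens through the same indirect-function mechanism described above.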

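// Each version has its own name, so no multiversioning is involved.
[[gnu::target("avx")]]
int foo_avx(){ return 1; }

// Named foo_default so the name foo stays free for a dispatcher.
[[gnu::target("default")]]
int foo_default(){ return 0; }

[[gnu::target("arch=sandybridge")]]
int foo_sandy(){ return 2; }

int main ()
{
    // Check the most specific target first, then fall back.
    if (__builtin_cpu_is("sandybridge"))
        return foo_sandy();
    else if (__builtin_cpu_supports("avx"))
        return foo_avx();
    else
        return foo_default();
}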

Define your own indirect function

You may want to be more explicit for other readers, or you may target a platform where indirect functions are not supported. Below is a way that does the same as the first option, but entirely in C++ code, using a static local function pointer. Here the default version is named foo_default so that the name foo is free for the dispatcher. This approach lets you order the priority of the targets to your own liking, and in cases where the builtin isn't supported you can supply your own detection.

auto foo()
{
    using T = decltype(foo_default);
    // Cached across calls; selected on the first call only.
    static T* pointer = nullptr;
    //static int (*pointer)() = nullptr;
    if (pointer == nullptr)
    {
        if (__builtin_cpu_is("sandybridge"))
            pointer = &foo_sandy;
        else if (__builtin_cpu_supports("avx"))
            pointer = &foo_avx;
        else
            pointer = &foo_default;
    }
    return pointer();
}

As a bonus note

The templated example on godbolt uses template<class ... Ts> to deal with overloads of your functions. This means that if you define a family of callXXXVersion(int) functions, then foo(int) will happily call the overloaded version for you, as long as you have defined the entire family.
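The godbolt example itself is not reproduced here, so the following is only a sketch of what such a template might look like. The callXXXVersion overloads are hypothetical and named after the question's pseudocode, and their bodies are deliberately identical so the result does not depend on the CPU. It assumes GCC or a compatible compiler on x86.

```cpp
// Hypothetical family of overloads; every target provides the same set.
[[gnu::target("avx2")]] int    callAVX2Version(int x)    { return x * 2; }
[[gnu::target("avx2")]] double callAVX2Version(double x) { return x * 2; }

[[gnu::target("avx")]] int    callAVXVersion(int x)    { return x * 2; }
[[gnu::target("avx")]] double callAVXVersion(double x) { return x * 2; }

int    callSSEVersion(int x)    { return x * 2; }
double callSSEVersion(double x) { return x * 2; }

template <class... Ts>
auto foo(Ts... args)
{
    // Function-pointer type matching the overload for these argument types.
    using R  = decltype(callSSEVersion(args...));
    using Fn = R (*)(Ts...);

    // One cached pointer per instantiation, i.e. per overload signature.
    static Fn pointer = nullptr;
    if (pointer == nullptr)
    {
        // Assigning to Fn selects the overload with the matching signature.
        if (__builtin_cpu_supports("avx2"))
            pointer = &callAVX2Version;
        else if (__builtin_cpu_supports("avx"))
            pointer = &callAVXVersion;
        else
            pointer = &callSSEVersion;
    }
    return pointer(args...);
}
```

With this in place, foo(3) dispatches to the int overloads and foo(1.5) to the double overloads. If one target lacks an overload for a given signature, the corresponding assignment fails to compile, which is why the entire family has to be defined.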

Mellester answered Sep 20 '25 22:09