Are there any caveats to this use of thread_local storage duration:
template <class T>
inline T &thread_local_get()
{
  thread_local T t;
  return t;
}
Then, in different threads (for example):
thread_local_get<float>() += 1.f;
The doc at cppreference says this about thread local storage duration:
thread storage duration. The object is allocated when the thread begins and deallocated when the thread ends. Each thread has its own instance of the object. Only objects declared thread_local have this storage duration. thread_local can appear together with static or extern to adjust linkage.
Does this correctly allocate one thread_local instance for each T (instantiated at compile time) and each calling thread? Is there any situation that can lead to, e.g., undefined behavior?
I don't see any theoretical caveats: once instantiated, the template should behave, from the compiler's point of view, exactly like a normal function. Each instantiation thread_local_get<T> therefore has its own t, and each calling thread gets its own copy of it.
Still, I would recommend checking your compiler's support for thread_local before using it. For example, GCC had a bug with static thread_local class members that still seems to be present at least in the latest TDM-GCC distribution, which ships GCC 5.1.0. I don't know whether this particular bug also affects thread_local variables declared inside functions (it should not), and you are probably using a different compiler, but I would still suggest running some experiments before relying on this feature.
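One such experiment could look like the sketch below. It reuses thread_local_get from the question and adds a helper, run_counters, which is a hypothetical name introduced here only for illustration. The helper starts a few threads, lets each one increment its own float instance, and collects the value each thread ends up with. It relies on the fact that objects with thread storage duration are zero-initialized, so a float starts at 0 in every thread. Treat it as a quick sanity check for your toolchain, not as proof that the compiler is free of thread_local bugs.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Same function as in the question: one instance of t per type T and per thread.
template <class T>
inline T &thread_local_get()
{
  thread_local T t;  // zero-initialized for arithmetic types (thread storage duration)
  return t;
}

// Hypothetical helper (not part of the question): starts n_threads threads,
// each of which increments its own float instance `increments` times, and
// returns the final value observed by each thread.
inline std::vector<float> run_counters(std::size_t n_threads, int increments)
{
  std::vector<float> results(n_threads, -1.f);
  std::vector<std::thread> threads;
  threads.reserve(n_threads);

  for (std::size_t i = 0; i < n_threads; ++i) {
    threads.emplace_back([i, increments, &results] {
      for (int k = 0; k < increments; ++k)
        thread_local_get<float>() += 1.f;
      // Each thread writes only to its own slot, so no locking is needed.
      results[i] = thread_local_get<float>();
    });
  }
  for (auto &th : threads)
    th.join();
  return results;
}
```

If thread_local works as expected, every entry returned by run_counters(4, 3) equals 3, and thread_local_get<float>() in the calling thread is unaffected by what the worker threads did. One thing to keep in mind when using this pattern: the returned reference is only valid while the thread that obtained it is still running, so it should not be handed to code that outlives that thread.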