 

Why should I use thread-specific data?

Since each thread has its own stack, its private data can be put there. For example, each thread can allocate some heap memory to hold a data structure and use the same interface to manipulate it. So why is thread-specific data helpful?

The only case I can think of is that each thread may have many kinds of private data. If we need to access that private data in every function called within the thread, we have to pass it as arguments to all those functions, which is tedious and error-prone.
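To make this concrete, here is the kind of thing I mean, as a minimal sketch using POSIX thread-specific data (pthread keys). The struct and function names are made up for illustration; the point is that a deeply nested helper can fetch the calling thread's private data without it being passed down through every call:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    static pthread_key_t ctx_key;
    static pthread_once_t key_once = PTHREAD_ONCE_INIT;

    struct ctx { int thread_id; };

    static void make_key(void) {
        pthread_key_create(&ctx_key, free);   /* destructor runs at thread exit */
    }

    static void deep_helper(void) {
        /* No context argument needed: look up this thread's private data. */
        struct ctx *c = pthread_getspecific(ctx_key);
        printf("helper running in thread %d\n", c->thread_id);
    }

    static void *worker(void *arg) {
        pthread_once(&key_once, make_key);
        struct ctx *c = malloc(sizeof *c);
        c->thread_id = (int)(long)arg;
        pthread_setspecific(ctx_key, c);
        deep_helper();                        /* called without passing c */
        return NULL;
    }

    int main(void) {
        pthread_t t[2];
        for (long i = 0; i < 2; i++) pthread_create(&t[i], NULL, worker, (void *)i);
        for (int i = 0; i < 2; i++)  pthread_join(t[i], NULL);
        return 0;
    }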

asked Oct 19 '25 by pjhades

2 Answers

Thread-local storage is a solution for avoiding global state. If data isn't shared across threads but is accessed by several functions, you can make it thread-local. There's no need to worry about breaking reentrancy, and it makes debugging that much easier.
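As a minimal sketch (C11 _Thread_local on top of pthreads; the "last error" names are made up for illustration), several functions can share per-thread state without a global that would break as soon as two threads call them concurrently:

    #include <pthread.h>
    #include <stdio.h>

    static _Thread_local int last_error = 0;   /* one independent copy per thread */

    static void set_error(int code) { last_error = code; }
    static int  get_error(void)     { return last_error; }

    static void *worker(void *arg) {
        int id = (int)(long)arg;
        set_error(id * 10);                    /* doesn't disturb other threads */
        printf("thread %d sees error %d\n", id, get_error());
        return NULL;
    }

    int main(void) {
        pthread_t t[2];
        for (long i = 1; i <= 2; i++) pthread_create(&t[i - 1], NULL, worker, (void *)i);
        for (int i = 0; i < 2; i++)   pthread_join(t[i], NULL);
        return 0;
    }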

From a performance point of view, using thread-local data is a way of avoiding false sharing. Say you have two threads: one writes to a variable x, the other reads a variable y. If you define these as global variables, they could end up on the same cache line. Every write to x then invalidates that line in the reading core's cache, so the reader has to fetch it again even though y never changed, and cache performance degrades.

If you used thread-local data, one thread would only store the variable x and the other would only store the variable y, thus avoiding false sharing. Bear in mind, though, that there are other ways to go about this, e.g. cache line padding.
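Here is a minimal sketch of the layouts in question, assuming a 64-byte cache line (typical on x86, not guaranteed). The adjacent globals can be falsely shared, while the padded and thread-local variants each get their own line; the volatile qualifiers only keep the compiler from optimizing the loops away. Compile with -pthread:

    #include <pthread.h>
    #include <stdalign.h>
    #include <stdio.h>

    /* Adjacent globals: x and y will usually land on the same cache line. */
    static volatile long x, y;

    /* Cache-line padding: each counter gets a 64-byte line of its own. */
    static struct { alignas(64) volatile long v; } px, py;

    /* Thread-local: each thread has its own storage, so sharing a line
     * between threads doesn't arise at all. */
    static _Thread_local long local_counter;

    static void *writer(void *arg) {
        for (long i = 0; i < 50000000; i++) {
            x++;        /* invalidates the line that also holds y */
            px.v++;     /* padded: no interference with py */
            local_counter++;
        }
        return NULL;
    }

    static void *reader(void *arg) {
        long sum = 0;
        for (long i = 0; i < 50000000; i++) {
            sum += y;   /* keeps re-fetching the line the writer invalidates */
            sum += py.v;
        }
        return (void *)sum;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, writer, NULL);
        pthread_create(&b, NULL, reader, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("x=%ld\n", x);
        return 0;
    }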

answered Oct 21 '25 by someguy


Like the stack, thread-local data is dedicated to each thread, but unlike stack data it persists across function calls: a stack variable may already be overwritten once the function that owns it returns.

The alternative would be to use adjacent pieces of global data dedicated to each thread, but that has performance implications where CPU caches are concerned. Since different threads are likely to run on different cores, such "sharing" of a global piece of data can cause undesirable performance degradation: a write from one core invalidates the cache line held by another, which in turn generates extra inter-core traffic to keep the caches coherent.

In contrast, working with thread-local data conceptually doesn't involve interfering with other cores' caches.
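A minimal sketch of that persistence (the function names are made up): a function can hand back a pointer to a thread-local static buffer. A stack buffer would be dead once the function returns, and a plain static buffer would be clobbered by other threads; the thread-local one stays valid for the calling thread for as long as that thread lives.

    #include <pthread.h>
    #include <stdio.h>

    static const char *format_id(int id) {
        static _Thread_local char buf[32];    /* lives as long as the thread */
        snprintf(buf, sizeof buf, "worker-%d", id);
        return buf;                           /* safe to return: per-thread storage */
    }

    static void *worker(void *arg) {
        const char *name = format_id((int)(long)arg);
        /* Many calls later, the pointer is still valid in this thread. */
        printf("%s\n", name);
        return NULL;
    }

    int main(void) {
        pthread_t t[2];
        for (long i = 0; i < 2; i++) pthread_create(&t[i], NULL, worker, (void *)i);
        for (int i = 0; i < 2; i++)  pthread_join(t[i], NULL);
        return 0;
    }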

answered Oct 21 '25 by Blagovest Buyukliev