Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do mutex lock waits for unlock at low level?

I want to know how the mutex (or the other locking implementation) implements the wait functionallity for the lock function. I mean, is that a cpu instruction that queue the mutex.lock calls, is that implemented in OS only or what?

In the tests I did, I think this wait functionallity is done only in OS layer and is made creating some sort of spinning, checking is the lock is available to proceed and if not putting the thread to sleep. Is that right?

Thank you very much.

like image 836
andresantacruz Avatar asked Nov 25 '25 16:11

andresantacruz


1 Answers

It's platform dependent. Typically there is a spin lock portion which falls back to blocking in the operating system if a fixed spin limit is reached.

The spinlock is typically implemented by reading a memory address that contains a particular value when the mutex is unlocked. If it is seen as unlocked, an attempt is made to atomically change that value from the unlocked value to the locked value. If that atomic exchange succeeds, the mutex is locked. Typically the number of spins is counted and if a limit is reached, we switch to blocking in the OS.

The block in the OS is typically implemented much the same way except that instead of sleeping, the thread adds itself to a list of things waiting for the lock. When a thread releases the lock, it checks if anything is waiting in the OS and, if so, unblocks it. This causes the OS to schedule that thread. It typically then tries to perform the same atomic exchange that a spinlock would try, blocking again in the OS if it fails.

In pseudo-code:

Lock:

  1. Check the memory location to see if the lock is locked. If so, go to step 3.
  2. Try to atomically switch the memory location from unlocked to locked. If we succeed, stop, we hold the lock.
  3. Increment the spin count. If we haven't spun too many times, go to step 1.
  4. Atomically increment the number of threads waiting for this lock.
  5. Try to atomically switch the memory location from unlocked to locked. If we succeed, decrement the number of waiting threads and stop, we hold the lock.
  6. Conditionally block in the OS.
  7. Go to step 5.

Unlock:

  1. Atomically set the memory location holding the lock state to unlocked.
  2. If the number of threads waiting for this lock in the OS is greater than zero, tell the OS to unblock any threads waiting for this lock.

Note that the OS has to implement some mechanism to avoid the race where the request to unblock any threads waiting in the OS happens just before a thread manages to block. The method varies from OS to OS. For example, Linux has something called a "futex", which is essentially a way to implement steps 4, 5, and 6 of the locking pseudo-code atomically.

Caution: If you attempt to implement this algorithm in code, understand that you will likely produce a toy that will not perform nearly as well as a proper implementation. You need deep, platform-specific knowledge to avoid nasty performance-sucking traps you can fall into. For example, it's easy to code a spinlock such that it steals core execution resources from another thread sharing the physical core in CPUs with hyper-threading. And it's easy to code the successful exchange so that the CPU's branch prediction predicts it will fail and you take a horrible branch misprediction penalty when you acquire the lock.

like image 196
David Schwartz Avatar answered Nov 27 '25 06:11

David Schwartz



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!