This came as a complete surprise to me. Can someone explain why readIORef blocks while an atomicModifyIORef is in flight? I understand the assumption is that the modifying function supplied to the latter is supposed to be very quick, but that is beside the point.
Here is a sample piece of code that reproduces what I am talking about:
{-# LANGUAGE NumericUnderscores #-}
module Main where

import Control.Concurrent
import Control.Concurrent.Async
import Control.Monad
import Data.IORef
import Say (sayString)
import Data.Time.Clock
import System.IO.Unsafe

main :: IO ()
main = do
  ref <- newIORef (10 :: Int)
  before <- getCurrentTime
  race_ (threadBusy ref 10_000_000) (threadBlock ref)
  after <- getCurrentTime
  sayString $ "Elapsed: " ++ show (diffUTCTime after before)

threadBlock :: IORef Int -> IO ()
threadBlock ref = do
  sayString "Below threads are totally blocked on a busy IORef"
  race_ (forever $ sayString "readIORef: Wating ..." >> threadDelay 500_000) $ do
    -- need to give a bit of time to ensure ref is set to busy by another thread
    threadDelay 100_000
    x <- readIORef ref
    sayString $ "Unblocked with value: " ++ show x

threadBusy :: IORef Int -> Int -> IO ()
threadBusy ref n = do
  sayString $ "Setting IORef to busy for " ++ show n ++ " μs"
  y <- atomicModifyIORef' ref (\x -> unsafePerformIO (threadDelay n) `seq` (x * 10000, x))
  -- threadDelay is not required above, a simple busy loop that takes a while works just as well
  sayString $ "Finished blocking the IORef, returned with value: " ++ show y
Running this piece of code produces:
$ stack exec --package time --package async --package say --force-dirty --resolver nightly -- ghc -O2 -threaded atomic-ref.hs && ./atomic-ref
Setting IORef to busy for 10000000 μs
Below threads are totally blocked on a busy IORef
readIORef: Wating ...
Unblocked with value: 100000
readIORef: Wating ...
Finished blocking the IORef, returned with value: 10
Elapsed: 10.003357215s
Note that readIORef: Wating ... is printed only twice: once before blocking and once afterwards. This is very unexpected, since that action runs in a completely separate thread. It means that blocking on the IORef affects threads other than the one that invoked readIORef, which is even more surprising.
Are those semantics expected, or is it a bug? If it is not a bug, why is this expected? I'll open a GHC bug later, unless someone has an explanation for this behavior that I can't think of. I won't be surprised if this is some limitation of the GHC runtime, in which case I will provide an answer here later. Regardless of the outcome, it is very useful to know about this behavior.
Edit 1
The busy loop I tried, which does not require unsafePerformIO, was requested in the comments, so here it is:
-- add to the imports: import Data.Bits (xor)

threadBusy :: IORef Int -> Int -> IO ()
threadBusy ref n = do
  sayString $ "Setting IORef to busy for " ++ show n ++ " μs"
  y <- atomicModifyIORef ref (\x -> busyLoop 10000000000 `seq` (x * 10000, x))
  sayString $ "Finished blocking the IORef, returned with value: " ++ show y

-- A pure loop that only burns CPU time; its result is irrelevant.
busyLoop :: Int -> Int
busyLoop n = go 1 0
  where
    go acc i
      | i < n = go (i `xor` acc) (i + 1)
      | otherwise = acc
The outcome is exactly the same, except the runtime is slightly different.
Setting IORef to busy for 10000000 μs
Below threads are totally blocked on a busy IORef
readIORef: Wating ...
Unblocked with value: 100000
readIORef: Wating ...
Finished blocking the IORef, returned with value: 10
Elapsed: 8.545412986s
Edit 2
It turns out that sayString was the reason the output was not appearing. Here is the output when sayString is swapped for putStrLn:
Below threads are totally blocked on a busy IORef
Setting IORef to busy for 10000000 μs
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
Finished blocking the IORef, returned with value: 10
Unblocked with value: 100000
Elapsed: 10.002272691s
That still does not answer the question of why readIORef blocks. In fact, I just stumbled upon a passage in the book "Haskell High Performance" by Samuli Thomasson which tells us that blocking should not happen.

I think I understand what happens now. TLDR, readIORef is not a blocking operation! Big thanks to everyone who commented on the question.
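The claim that readIORef itself never evaluates what it reads can be checked in a single thread. Below is a minimal sketch using only base; the helper name readThenForce and the error message are hypothetical, made up for this illustration. It stores a thunk that would throw if forced, reads it back, and only then forces it.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import Data.IORef (atomicModifyIORef, newIORef, readIORef)

-- Hypothetical helper for this sketch: store an unevaluated thunk in an
-- IORef, read it back, and only afterwards force it.
readThenForce :: IO [String]
readThenForce = do
  ref <- newIORef (0 :: Int)
  -- The lazy variant stores the new value as a thunk without evaluating it.
  atomicModifyIORef ref (\_ -> (error "thunk was forced", ()))
  -- readIORef only fetches the reference; it does not evaluate the thunk,
  -- so no exception is raised here.
  v <- readIORef ref
  let step1 = "readIORef returned"
  -- Forcing the value is the step that actually runs the computation.
  r <- try (evaluate v) :: IO (Either ErrorCall Int)
  let step2 = either (const "forcing ran the thunk") (const "unexpected value") r
  pure [step1, step2]

main :: IO ()
main = readThenForce >>= mapM_ putStrLn
```

If readIORef evaluated its contents, the exception would escape from the read itself; instead it only surfaces inside evaluate.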
The way I break down the logic mentally is (same as in question, but with added Thread names):
threadBlock :: IORef Int -> IO ()
threadBlock ref = do
  race_ ({- Thread C -} forever $ sayString "readIORef: Wating ..." >> threadDelay 500_000) $ do
    {- Thread B -}
    threadDelay 100_000
    x <- readIORef ref
    sayString $ "Unblocked with value: " ++ show x

threadBusy :: IORef Int -> Int -> IO ()
threadBusy ref n = do {- Thread A -}
  sayString $ "Setting IORef to busy for " ++ show n ++ " μs"
  y <- atomicModifyIORef' ref (\x -> unsafePerformIO (threadDelay n) `seq` (x * 10000, x))
  sayString $ "Finished blocking the IORef, returned with value: " ++ show y
Thread A updates the contents of ref with a thunk that will be filled once this computation is done: unsafePerformIO (threadDelay n) `seq` (x * 10000, x). The important part is that atomicModifyIORef' is most likely implemented with CAS (compare-and-swap), and the swap succeeded: the expected value matched, and the new value was set to a thunk that has not been evaluated yet. Because atomicModifyIORef' is strict, it has to wait until the value is computed, which takes 10 sec, before returning. So thread A blocks.

Thread B reads the contents of ref with readIORef WITHOUT blocking. However, once it attempts to print x, which is that thunk, it has to stop and wait until the thunk is filled with a value, and that value is still in the process of being computed. So it waits, and it looks as if it were blocked.

Thread C tries to print a message with sayString, but fails to do so and therefore behaves as if it were blocked as well. From a quick look at the say package and GHC.IO.Handle, it looks like the Handle for stdout gets blocked by thread B, because printing in the say package is supposed to happen without interleaving. For that reason thread C could not print anything either, so it looked like it was blocked too. That is why switching to putStrLn unblocked thread C and allowed it to print a message every 0.5 sec.

This definitely convinces me, but if anyone has a better explanation I'll be happy to accept another answer.
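To separate "readIORef returned" from "the value was forced" for threads A and B, here is a minimal sketch that logs each event into an MVar instead of printing, so no stdout Handle is involved. It uses only base; the helper name eventOrder, the event strings, and the 1 sec and 0.1 sec delays are assumptions chosen for this illustration, not part of the original program.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar
import Control.Exception (evaluate)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical helper for this sketch: run threads A and B and return the
-- events in the order in which they were logged.
eventOrder :: IO [String]
eventOrder = do
  ref    <- newIORef (10 :: Int)
  events <- newMVar []                 -- event log, newest first
  doneA  <- newEmptyMVar
  doneB  <- newEmptyMVar
  let record e = modifyMVar_ events (pure . (e :))
  -- Thread A: strict modify whose new value takes about 1 sec to compute
  -- (x is 10, so the delay is 10 * 100000 microseconds).
  _ <- forkIO $ do
    _ <- atomicModifyIORef' ref
           (\x -> unsafePerformIO (threadDelay (x * 100000)) `seq` (x * 10000, x))
    record "A: atomicModifyIORef' returned"
    putMVar doneA ()
  -- Thread B: read while A is still busy, then force the value.
  _ <- forkIO $ do
    threadDelay 100000                 -- give A time to install its thunk
    x <- readIORef ref
    record "B: readIORef returned"
    _ <- evaluate x                    -- this is the step that has to wait
    record "B: value forced"
    putMVar doneB ()
  takeMVar doneA
  takeMVar doneB
  reverse <$> readMVar events

main :: IO ()
main = eventOrder >>= mapM_ putStrLn
```

Under these timing assumptions, "B: readIORef returned" should be logged first, well before thread A finishes; the relative order of the other two events is a race and may vary between runs.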