Suppose that I want to use a boolean status flag for cooperative cancellation between threads. (I realize that one should preferably use CancellationTokenSource instead; that is not the point of this question.)
private volatile bool _stopping;
public void Start()
{
    var thread = new Thread(() =>
    {
        while (!_stopping)
        {
            // Do computation lasting around 10 seconds.
        }
    });
    thread.Start();
}
public void Stop()
{
    _stopping = true;
}
Question: If I call Start() at 0s and Stop() at 3s on another thread, is the loop guaranteed to exit at the end of the current iteration at around 10s?
The overwhelming majority of sources I've seen indicate that the above should work as expected; see: MSDN; Jon Skeet; Brian Gideon; Marc Gravell; Remus Rusanu.
However, volatile only generates an acquire-fence on reads and a release-fence on writes:
A volatile read has “acquire semantics”; that is, it is guaranteed to occur prior to any references to memory that occur after it in the instruction sequence. A volatile write has “release semantics”; that is, it is guaranteed to happen after any memory references prior to the write instruction in the instruction sequence. (C# Specification)
Therefore, there is no guarantee that a volatile write and a volatile read will not (appear to) be swapped, as observed by Joseph Albahari. Consequently, it is possible that the background thread would keep reading the stale value of _stopping (namely, false) after the end of the current iteration. Concretely, if I call Start() at 0s and Stop() at 3s, it is possible that the background task will not terminate at 10s as expected, but at 20s, or 30s, or never at all.
Based on acquire and release semantics, there are two issues here. First, the volatile read would be constrained to refresh the field from memory (abstractly speaking) not at the end of the current iteration, but at the end of the subsequent one, since the acquire-fence occurs after the read itself. Second, more critically, there is nothing to force the volatile write to ever commit the value to memory, so there is no guarantee that the loop will ever terminate at all.
Consider the following sequence flow:
Time   |     Thread 1                     |     Thread 2
       |                                  |
 0     |     Start() called:              |        read value of _stopping
       |                                  | <----- acquire-fence ------------
 1     |                                  |     
 2     |                                  |             
 3     |     Stop() called:               |             ↑
       | ------ release-fence ----------> |             ↑
       |        set _stopping to true     |             ↑
 4     |             ↓                    |             ↑
 5     |             ↓                    |             ↑
 6     |             ↓                    |             ↑
 7     |             ↓                    |             ↑
 8     |             ↓                    |             ↑
 9     |             ↓                    |             ↑
 10    |             ↓                    |        read value of _stopping
       |             ↓                    | <----- acquire-fence ------------
 11    |             ↓                    |    
 12    |             ↓                    |             
 13    |             ↓                    |             ↑
 14    |             ↓                    |             ↑
 15    |             ↓                    |             ↑
 16    |             ↓                    |             ↑
 17    |             ↓                    |             ↑
 18    |             ↓                    |             ↑
 19    |             ↓                    |             ↑
 20    |                                  |        read value of _stopping
       |                                  | <----- acquire-fence ------------
The most important parts are the memory fences, marked with --> and <--, which represent the thread synchronization points. The volatile read of _stopping can only (appear to) be moved up to its thread's previous acquire-fence at most. However, the volatile write can (appear to) be moved down indefinitely, since there is no other release-fence following it on its thread. In other words, there is no “synchronizes-with” (“happens-before”, “is-visible-to”) relation between the write to _stopping and any of its reads.
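For comparison, here is a minimal sketch of the same flag with explicit full fences instead of the volatile modifier, so that a fence does follow the write on its thread. FencedWorker, its Join method, and the 50 ms sleep are hypothetical names and values chosen for illustration. Whether the extra fences are actually required for the loop to terminate is exactly what is being asked, so this only shows the shape of the alternative.

```csharp
using System.Threading;

// Hypothetical variant of the class in the question; names and timings are illustrative.
public class FencedWorker
{
    private bool _stopping;   // deliberately not volatile; the fences are explicit
    private Thread _thread;

    public void Start()
    {
        _thread = new Thread(() =>
        {
            while (true)
            {
                Thread.MemoryBarrier();   // full fence before reading the flag
                if (_stopping) break;
                Thread.Sleep(50);         // stand-in for the long computation
            }
        });
        _thread.Start();
    }

    public void Stop()
    {
        _stopping = true;
        Thread.MemoryBarrier();           // full fence after writing the flag
    }

    // Waits for the worker to exit; returns false if the timeout elapses first.
    public bool Join(int millisecondsTimeout) => _thread.Join(millisecondsTimeout);
}
```

Thread.MemoryBarrier() is a full fence, so unlike the half-fences described above it does not permit the write in Stop() to be reordered with later memory operations on the same thread.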
P.S. I am aware that MSDN gives very strong guarantees on the volatile keyword. However, the expert consensus is that MSDN is incorrect (and not backed up by the ECMA spec):
The MSDN documentation states that use of the volatile keyword “ensures that the most up-to-date value is present in the field at all times”. This is incorrect, since as we’ve seen [in the previous example], a write followed by a read can be reordered. (Joseph Albahari)
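The kind of example that quote refers to has roughly the following shape: each thread performs a volatile write to its own field and then a volatile read of the other thread's field. This is a sketch with illustrative names (StoreLoadDemo, RunOnce), not code quoted from the cited text.

```csharp
using System.Threading;

// Illustrative only: class and member names are assumptions, not taken from the cited text.
public static class StoreLoadDemo
{
    private static volatile int _x, _y;

    // Runs one trial and returns the value each thread read from the other's field.
    public static (int A, int B) RunOnce()
    {
        _x = 0;
        _y = 0;
        int a = -1, b = -1;

        var t1 = new Thread(() =>
        {
            _x = 1;   // volatile write (release semantics)
            a = _y;   // volatile read (acquire semantics)
        });
        var t2 = new Thread(() =>
        {
            _y = 1;   // volatile write (release semantics)
            b = _x;   // volatile read (acquire semantics)
        });

        t1.Start();
        t2.Start();
        t1.Join();    // Join makes the captured locals safe to read afterwards
        t2.Join();

        return (a, b);
    }
}
```

Acquire and release semantics alone do not exclude the outcome where both A and B are 0, because nothing orders a write before a later read on the same thread. A single trial is unlikely to show it, so this is a statement about what is permitted rather than about what a given run will print.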
If I call Start() at 0s and Stop() at 3s on another thread, is the loop guaranteed to exit at the end of the current iteration at around 10s?
Yes, 7 seconds is definitely sufficient for one thread to perceive a change to the _stopping variable.
For every variable that provides any kind of visibility barrier (memory order), the specification of any language should guarantee the following:
Any change to the variable (with a special memory order) made by one thread will be observed by other threads within a finite and bounded period of time.
Without this guarantee, even the memory-order features of the variable are useless.
The C# specification definitely provides such a guarantee for volatile variables, but I cannot find the corresponding text.
Note that this guarantee about finite time is unrelated to the memory-order guarantees ("acquire", "release", and so on), and it cannot be deduced from the definitions of barriers and memory orders.
When one says
I call Stop() at 3s
one implies that there was some visible effect (e.g., information printed to the terminal) that justifies the 3s timestamp (because the print statement was issued after the call to Stop()).
The C# spec handles this case gracefully ("10.10 Execution order"):
Execution shall proceed such that the side effects of each executing thread are preserved at critical execution points. A side effect is defined as a read or write of a volatile field, a write to a non-volatile variable, a write to an external resource, and the throwing of an exception. The critical execution points at which the order of these side effects shall be preserved are references to volatile fields (§17.4.3), lock statements (§15.12), and thread creation and termination.
Assuming that printing is a critical execution point (it likely uses locks), you can be confident that by that moment the assignment to the volatile variable _stopping, being a side effect, is visible to the other thread, which checks that variable.
While a compiler is allowed to move the assignment to a volatile variable forward in the code, it cannot do so indefinitely:
The assignment cannot be moved past a function call, because the compiler cannot assume anything about the function's body.
If the assignment is performed inside a loop, it must be completed before the same assignment in the next iteration.
While one can imagine code with 1000 consecutive simple assignments (to other variables), so that the volatile assignment could be deferred by 1000 instructions, the compiler simply does not perform such deferral. And even if it did, executing 1000 simple instructions on a modern CPU takes no more than several microseconds.
From the CPU's side, the situation is simpler: no CPU will defer an assignment to a memory cell by more than a limited number of instructions.
In total, an assignment to a volatile variable can be deferred by only a very limited number of instructions.
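The scenario can be tried out empirically with a scaled-down harness along these lines. StopFlagDemo, its parameters, and the 10-second safety timeout are assumptions made for illustration, and Thread.Sleep stands in for the 10-second computation. A run like this is evidence about one runtime and one machine, not a proof of the guarantee.

```csharp
using System.Threading;

// Illustrative harness; the class name and timings are assumptions, scaled down from 10 s / 3 s.
public static class StopFlagDemo
{
    private static volatile bool _stopping;

    // Returns how many iterations the worker completed before it saw the flag,
    // or -1 if it had not exited within the safety timeout.
    public static int Run(int iterationMs, int stopAfterMs)
    {
        _stopping = false;
        int iterations = 0;

        var worker = new Thread(() =>
        {
            while (!_stopping)
            {
                Thread.Sleep(iterationMs);   // stand-in for the long computation
                iterations++;
            }
        });

        worker.Start();
        Thread.Sleep(stopAfterMs);
        _stopping = true;                    // same write as Stop() in the question

        // Join also makes the final value of 'iterations' safe to read here.
        return worker.Join(10000) ? iterations : -1;
    }
}
```

Calling StopFlagDemo.Run(500, 100) mirrors the question at a smaller scale: the flag is set partway through the first iteration, so a worker that sees the write promptly completes at most one iteration.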