New to this website, so let me know if I'm not posting in an accepted manner.
I've frequently coded something along the lines of the sample below(with stuff like Dispose ommited for clarity. ). My question is, are the volatiles needed as shown? Or does the ManualResetEvent.Set have an implicit memory barrier as I've read Thread.Start does? Or would an explicit MemoryBarrier call be better than the volatiles? Or is it completely wrong? Also, the fact that the "implicit memory barrier behavior" in some operations is not documented as far as I've seen is quite frutrating, is there a list of these operations somewhere?
Thanks, Tom
:
class OneUseBackgroundOp
{
   // background args
   private string _x;
   private object _y;
   private long _z;
   // background results
   private volatile DateTime _a
   private volatile double _b;
   private volatile object _c;
   // thread control
   private Thread _task;
   private ManualResetEvent _completedSignal;
   private volatile bool _completed;
   public bool DoSomething(string x, object y, long z, int initialWaitMs)
   {
      bool doneWithinWait;
      _x = x;
      _y = y;
      _z = z;
      _completedSignal = new ManualResetEvent(false);
      _task = new Thread(new ThreadStart(Task));
      _task.IsBackground = true;
      _task.Start()
      doneWithinWait = _completedSignal.WaitOne(initialWaitMs);
      return doneWithinWait;
   }
   public bool Completed
   {
      get
      {
         return _completed;
      }
   }
   /* public getters for the result fields go here, with an exception
      thrown if _completed is not true; */
   private void Task()
   {
      // args x, y, and z are written once, before the Thread.Start
      //    implicit memory barrier so they may be accessed freely.
      // possibly long-running work goes here
      // with the work completed, assign the result fields _a, _b, _c here
      _completed = true;
      _completedSignal.Set();
   }
}
Note that this is off the cuff, without studying your code closely. I don't think Set performs a memory barrier, but I don't see how that's relevant in your code? Seems like more important would be if Wait performs one, which it does. So unless I missed something in the 10 seconds I devoted to looking at your code, I don't believe you need the volatiles.
Edit: Comments are too restrictive. I'm now referring to Matt's edit.
Matt did a good job with his evaluation, but he's missing a detail. First, let's provide some definitions of things thrown around, but not clarified here.
A volatile read reads a value and then invalidates the CPU cache. A volatile write flushes the cache, and then writes the value. A memory barrier flushes the cache and then invalidates it.
The .NET memory model ensures that all writes are volatile. Reads, by default, are not, unless an explicit VolatileRead is made, or the volatile keyword is specified on the field. Further, interlocked methods force cache coherency, and all of the synchronization concepts (Monitor, ReaderWriterLock, Mutex, Semaphore, AutoResetEvent, ManualResetEvent, etc.) call interlocked methods internally, and thus ensure cache coherency.
Again, all of this is from Jeffrey Richter's book, "CLR via C#".
I said, initially, that I didn't think Set performed a memory barrier. However, upon further reflection about what Mr. Richter said, Set would be performing an interlocked operation, and would thus also ensure cache coherency.
I stand by my original assertion that volatile is not needed here.
Edit 2: It looks as if you're building a "future". I'd suggest you look into PFX, rather than rolling your own.
The volatile keyword should not confused as to making _a, _b, and _c thread-safe. See here for a better explanation. Further, the ManualResetEvent does not have any bearing on the thread-safety of _a, _b, and _c. You have to manage that separately.
EDIT: With this edit, I am attempting to distill all of the information that has been put in various answers and comments regarding this question.
The basic question is whether or not the result variables (_a, _b, and _c) will be 'visible' at the time the flag variable (_completed) returns true.
For a moment, let's assume that none of the variables are marked volatile. In this case, it would be possible for the result variables to be set after the flag variable is set in Task(), like this:
   private void Task()
   {
      // possibly long-running work goes here
      _completed = true;
      _a = result1;
      _b = result2;
      _c = result3;
      _completedSignal.Set();
   }
This is clearly not what we want, so how do we deal with this?
If these variables are marked volatile, then this reordering will be prevented. But that is what prompted the original question - are the volatiles required or does the ManualResetEvent provide an implicit memory barrier such that reordering does not occur, in which case the volatile keyword is not really necessary?
If I understand correctly, wekempf's position is that the WaitOne() function provides an implicit memory barrier which fixes the problem. BUT that doesn't seem sufficient to me. The main and background threads could be executing on two separate processors. So, if Set() does not also provide an implicit memory barrier, then the Task() function could end up being executed like this on one of the processors (even with the volatile variables):
   private void Task()
   {
      // possibly long-running work goes here
      _completedSignal.Set();
      _a = result1;
      _b = result2;
      _c = result3;
      _completed = true;
   }
I have searched high and low for information regarding memory barriers and the EventWaitHandles, and I have come up with nothing. The only reference I have seen is the one wekempf has made to Jeffrey Richter's book. The problem I have with this is that the EventWaitHandle is meant to synchronize threads, not access to data. I have never seen any example where EventWaitHandle (e.g., ManualResetEvent) is used to synchronize access to data. As such, I'm hard-pressed to believe that EventWaitHandle does anything with regard to memory barriers. Otherwise, I would expect to find some reference to this on the internet.
EDIT #2: This is a response to wekempf's response to my response... ;)
I managed to read the section from Jeffrey Richter's book at amazon.com. From page 628 (wekempf quotes this too):
Finally, i should point out that whenever a thread calls an interlocked method, the CPU forces cache coherency. So if you are manipulating variables via interlocked methods, you do not have to worry about all of this memory model stuff. Furthermore, all thread synchronization locks (Monitor, ReaderWriterLock, Mutex, Semaphone, AutoResetEvent, ManualResetEvent, etc.) call interlocked methods internally.
So it would seem that, as wekempf pointed out, that the result variables do not require the volatile keyword in the example as shown since the ManualResetEvent ensures cache coherency.
Before closing this edit, there are two additional points I'd like to make.
First, my initial assumption was that the background thread would potentially run multiple times. I obviously overlooked the name of the class (OneUseBackgroundOp)! Given that it is only run once, it is not clear to me why the DoSomething() function calls WaitOne() in the manner that it does. What is the point of waiting initialWaitMs milliseconds if the background thread may or may not be done at the time DoSomething() returns? Why not just kickoff the background thread and use a lock to synchronize access to the results variables OR simply execute the contents of the Task() function as part of the thread that calls DoSomething()? Is there a reason not to do this?
Second, it seems to me that not using some kind of locking mechanism on the results variables is still a bad approach. True, it is not needed in the code as shown. But at some point down the road, another thread may come along and try to access the data. It would be better in my mind to prepare for this possibility now rather than try to track down mysterious behavior anomalies later.
Thanks to everyone for bearing with me on this. I've certainly learned a lot by participating in this discussion.
Wait functions have an implicit memory barrier. See http://msdn.microsoft.com/en-us/library/ms686355(v=vs.85).aspx
First, I'm not sure if I should "Answer my own question" or use a comment for this, but here goes:
My understanding is that volatile prevents code/memory optimizations from moving the accesses to my result variables (and the completed boolean) such that the the thread that reads the result will see upt-to-date data.
You wouldn't want the _completed boolean made visible to all threads after the Set() due to compiler or emmpry optimaztions/reordering. Likewise, you wouldn't want the writes to the results _a, _b, _c being seen after the Set().
EDIT: Further explaination/clarification on the question, in regards to items mentioned by Matt Davis:
Finally, i should point out that whenever a thread calls an interlocked method, the CPU forces cache coherency. So if you are manipulating variables via interlocked methods, you do not have to worry about all of this memory model stuff. Furthermore, all thread synchronization locks (Monitor, ReaderWriterLock, Mutex, Semaphone, AutoResetEvent, ManualResetEvent, etc.) call interlocked methods internally.
So it would seem that, as wekempf pointed out, that the result variables do not require the volatile keyword in the example as shown since the ManualResetEvent ensures cache coherency.
So you are both in agreement that such an operation takes care of caching between processors or in registers etc.
But does it prevent reording to guarantee such that BOTH the results are assigned before the completed flag, and that the completed flag is assigned true before the ManualResetEvent is Set?
First, my initial assumption was that the background thread would potentially run multiple times. I obviously overlooked the name of the class (OneUseBackgroundOp)! Given that it is only run once, it is not clear to me why the DoSomething() function calls WaitOne() in the manner that it does. What is the point of waiting initialWaitMs milliseconds if the background thread may or may not be done at the time DoSomething() returns? Why not just kickoff the background thread and use a lock to synchronize access to the results variables OR simply execute the contents of the Task() function as part of the thread that calls DoSomething()? Is there a reason not to do this?
The concept of the sample is to execute a possibly long-runnig task. If the task can be completed within an exceptable amount of time, then the calling thread will get access to the result and continue with normal processing. But sometime a task can take quite a long time and the claiing thread cannot be blocked for that period and can take reasonable steps to deal with that. That can include checking back later on the operation using the Completed property.
A concrete example: A DNS resolve is often very quick (subsecond) and worth waiting for even from a GUI, but sometimes it can take many many seconds. So by using a utility class like the sample, one could gets a result easily from the point-of-view of the caller 95% of the time and not lock up the GUI the other 5%. One could use a Background worker, but that can be overkill for an operation that the vast majority of the time doesn't need all that plumbing.
Second, it seems to me that not using some kind of locking mechanism on the results variables is still a bad approach. True, it is not needed in the code as shown.
The result (and completed flag) data is meant to be write-once, read-many. If I added a lock to assign the results and flag, I'd also have to lock in my result getters, and I never liked seeing getters lock just to return a data point. From my reading, such fine-grained locking is not appropriate. If an operation has 5 or 6 results, the caller has to take and release the lock 5 or 6 times needlessly.
But at some point down the road, another thread may come along and try to access the data. It would be better in my mind to prepare for this possibility now rather than try to track down mysterious behavior anomalies later.
Because I have a volatile completed flag that is guarenteed to be set before the volatile results are, and the only access to the results is through the getters, and as mentioned in the smaple, an exception is thrown if the getter is called and the operation is not yet complete, I'd expect that the Completed and result getters CAN be invoked by a thread other than the one that called DoSomething(). That's my hope anyway. I believe this to be true with the volatiles anyway.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With