This article: http://www.drdobbs.com/parallel/volatile-vs-volatile/212701484?pgno=2
says that the compiler can't apply any optimization to volatile accesses, not even this one (where: volatile int& v = *(address);):
v = 1;                // C: write to v
local = v;            // D: read from v
can't be optimized to this:
v = 1;                // C: write to v
local = 1;            // D: read from v  // but it can be done for std::atomic<>
This can't be done because, between the 1st and 2nd lines, the value of v may be changed (sequentially or concurrently) by a hardware device mapped to this memory location. That device is not a CPU, so cache coherence may not apply to it: a network adapter, GPU, FPGA, etc. But this only makes sense if v can't be cached in the CPU's L1/L2/L3 cache, because for an ordinary (non-volatile) variable the time between the 1st and 2nd lines is so short that the read is likely to be served from the cache.
Does the volatile qualifier guarantee no caching for this memory location?
ANSWER:
volatile doesn't guarantee no caching for this memory location, and there is nothing about this in the C/C++ Standards or the compiler manuals.
volatile makes the compiler perform both of those lines as written, and it also bypasses the cache of the device (e.g. the GPU L2 cache), so neither a GPU cache flush nor a CPU cache flush is needed. On the CPU you might also need std::atomic_thread_fence(std::memory_order_seq_cst); if the L3 cache (LLC) is coherent with DMA over PCIe but L1/L2 are not. For nVidia CUDA we can use: void __threadfence_system();
(Related routines for Windows drivers: KeFlushIoBuffers(), FlushAdapterBuffers())
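As a rough illustration of where such a fence would sit, here is a minimal sketch that fills a buffer and then writes a volatile "doorbell". The Mailbox type, its fields and publish() are hypothetical names invented for this example, and the mailbox is ordinary memory rather than a real device mapping. Whether the fence is enough for a particular non-coherent device is platform-specific, so treat this as a sketch under those assumptions, not a recipe.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical mailbox shared with a device. In this sketch it is plain
// memory; on real hardware it would be a memory-mapped or DMA region.
struct Mailbox {
    std::uint32_t payload[4];
    volatile std::uint32_t doorbell;  // stands in for a device register
};

// Fill the payload, then ring the doorbell.
inline void publish(Mailbox& box, std::uint32_t value) {
    for (auto& word : box.payload) {
        word = value;
    }
    // The standard only defines this fence relative to atomic operations.
    // In practice it also acts as a compiler and CPU barrier, so the payload
    // stores are not moved after the doorbell store. It does not, by itself,
    // flush anything to a device that is not cache coherent.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    box.doorbell = 1;  // volatile: this store is always emitted
}
```

On the CUDA side, __threadfence_system() would play the corresponding role inside a kernel.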
The volatile qualifier declares a data object that can have its value changed in ways outside the control or detection of the compiler (such as a variable updated by the system clock or by another program).
Conclusion: According to both Intel and AMD, cache consistency is managed by the hardware, so volatile has nothing to do with caches. The claim that "volatiles are forced to live in main memory" is a myth. volatile does, however, probably cause additional cache invalidations indirectly, since stores are issued more frequently.
Volatile and Non-Volatile Memory are both types of computer memory. Volatile Memory is used to store the programs and data that the CPU needs in real time, and it is erased once the computer is switched off. RAM and Cache memory are volatile memory.
The volatile keyword is intended to prevent the compiler from applying any optimizations on objects that can change in ways that cannot be determined by the compiler. Objects declared as volatile are omitted from optimization because their values can be changed by code outside the scope of current code at any time.
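To make that concrete, here is a minimal sketch of the two-line example from the question, once through a volatile reference and once through std::atomic<>. The function names are made up for illustration.

```cpp
#include <atomic>

// volatile: both accesses must appear in the generated code.
inline int write_then_read_volatile(volatile int& v) {
    v = 1;     // C: the store must be emitted
    return v;  // D: the load must be emitted; it may not be folded to `return 1`
}

// std::atomic<>: the compiler is permitted to fold the load into the
// constant 1, although in practice many compilers still emit it.
inline int write_then_read_atomic(std::atomic<int>& a) {
    a.store(1);       // C
    return a.load();  // D
}
```

With no device or other thread touching the variable, both functions return 1. The difference is only in what the compiler is allowed to assume.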
volatile ensures that the variable won't be "cached" in a CPU register. The CPU cache is transparent to the programmer: if one CPU writes to memory that is held in another CPU's cache, the second CPU's cache line is invalidated, so it reloads the value from memory on its next access.
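A small sketch of the register point: a bounded polling loop over a volatile flag. wait_for_flag and its parameters are hypothetical names for this example. Without volatile, the compiler could read the flag once, keep it in a register, and never look at memory again.

```cpp
// Polls the flag until it becomes nonzero or the poll budget runs out.
// Because the flag is volatile, every iteration performs a real load
// instead of reusing a value held in a register.
inline bool wait_for_flag(const volatile int& flag, int max_polls) {
    for (int i = 0; i < max_polls; ++i) {
        if (flag != 0) {
            return true;
        }
    }
    return false;
}
```

Note that this only concerns the compiler. Whether the load is served from the CPU cache or from main memory is decided by the hardware.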
Something about Cache coherence
As for the external memory writes (via DMA or another CPU-independent channel), you might need to flush the cache manually (see this SO question)
C Standard §6.7.3 7:
What constitutes an access to an object that has volatile-qualified type is implementation-defined.