I have multiple processes that communicate with each other running on different cores of a dual-processor X86-64 Linux machine. The content of the communication includes timestamps. I want to write time-related logic of the programs with the simple assumption that all the timestamps are from the same global clock. Can I count on clock_gettime(CLOCK_MONOTONIC) to give me monotonic timestamps even across different threads running on different cores?
In particular, suppose Process A takes a timestamp X and sends it to Process B via shared memory. Process B reads it and then takes a timestamp Y. I need the guarantee that X is never greater than Y.
Does the timestamp taken using clock_gettime(CLOCK_MONOTONIC) have the above property? If not, what are some other types of monotonic timestamp that have this property?
Can I count on clock_gettime(CLOCK_MONOTONIC) to give me monotonic timestamps even across different threads running on different cores?
The timestamps are guaranteed monotonic only on the same core. That is, if you have
Thread on CPU A core C                 Thread on CPU B core D
pthread_mutex_lock(&lock);
T1 = clock_gettime(CLOCK_MONOTONIC);
pthread_mutex_unlock(&lock);           pthread_mutex_lock(&lock);
                                       T2 = clock_gettime(CLOCK_MONOTONIC);
                                       pthread_mutex_unlock(&lock);
there is no absolute guarantee that T2 > T1.
The Linux kernel does its best to ensure that T2 > T1, but the issue is the hardware: some hardware just doesn't have a time source that is kept in sync well enough. On such hardware, creating a reliably monotonic clock that is kept in sync across all CPUs and cores would require an inter-processor interrupt or some other way to keep a single clock value somewhere, and that is just too slow to be efficient.
There are some configurations where the clock source is known to be synchronized across all CPU cores. For example, if all physical id fields are the same in /proc/cpuinfo, and all flags fields include tsc_reliable, then the Time Stamp Counter register is known to be synchronized across all cores and is used as the time source. However, in practice, you do not do such checks, because the results are inferred, not guaranteed by the kernel, and can therefore be wrong.
In practice, we calculate things as if we assumed that CLOCK_MONOTONIC is monotonic across cores, but are pragmatic and check.
For timing message passing or signals between threads on potentially different cores, we measure the round-trip time. Use a large number of round-trips, and pick the median of the times: that gives you a reliable result, and you can say "at least half the round-trips complete within time T" with confidence and without ambiguity. (Often you might pick a much higher point, say 68.3% or 95%.)
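As a rough sketch of picking the median (or a higher point) from a set of measured round-trip times, something like the following could be used. The names compare_u64 and percentile_ns are made up for this illustration, and the sketch assumes the round-trip durations have already been collected into an array of nanosecond values.

```c
#include <stdint.h>
#include <stdlib.h>

/* qsort() comparator for uint64_t values, ascending order. */
static int compare_u64(const void *a, const void *b)
{
    const uint64_t x = *(const uint64_t *)a;
    const uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: sorts samples[] in place and returns the sample at
   index floor(fraction * count), clamped to the last element.
   With fraction = 0.5 this is the median (the upper one for even counts),
   so at least half of the samples are less than or equal to the result.
   count must be nonzero. */
static uint64_t percentile_ns(uint64_t *samples, size_t count, double fraction)
{
    size_t index;

    qsort(samples, count, sizeof samples[0], compare_u64);

    index = (size_t)(fraction * (double)count);
    if (index >= count)
        index = count - 1;

    return samples[index];
}
```

Calling percentile_ns(rtt, n, 0.5) gives the median round-trip time, and percentile_ns(rtt, n, 0.95) gives the time within which roughly 95% of the round-trips completed. A single outlier (say, one round-trip that was interrupted by the scheduler) does not move the median.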
If you need reliable CLOCK_MONOTONIC-derived timestamps across processes that have access to the same shared memory segment, you can implement them by storing the "current" timestamp in that shared memory.
Whenever a process wants a timestamp, it does the equivalent of
Do:
    T0 = clock_gettime(CLOCK_MONOTONIC)
    Ts = shared timestamp
    T = max(T0, Ts)
While CompareExchange(shared timestamp, Ts, T) fails.
Use T as timestamp.
That is, it compares the local monotonic clock and the shared timestamp, updates the shared timestamp to the higher of the two, and uses that as the timestamp as well.
You can use GCC's __atomic_compare_exchange_n() built-in to update the shared timestamp, without holding any locks.  (It is not necessary to atomically read the timestamp from the shared memory, because the atomic compare-and-exchange takes care of the atomicity.)
The only downside is that if many threads do this often, you do get some overhead due to cacheline ping-pong.
Note that if you use uint64_t (in nanoseconds) for the timestamp, you will want to account for wraparound in the max function:
static inline uint64_t  max_wraparound(const uint64_t  a, const uint64_t  b)
{
    return ((uint64_t)(a - b) < UINT64_C(9223372036854775808)) ? a : b;
}
This way, the difference between any two such timestamps, before and after, is always (uint64_t)(after - before), even if the timestamp value wrapped around in between.
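To make the wraparound arithmetic concrete, here is a small sketch. max_wraparound is repeated only so that the snippet compiles on its own; elapsed_ns is a name made up for this illustration.

```c
#include <stdint.h>

/* Repeated from above so that this sketch compiles on its own. */
static inline uint64_t max_wraparound(const uint64_t a, const uint64_t b)
{
    return ((uint64_t)(a - b) < UINT64_C(9223372036854775808)) ? a : b;
}

/* Hypothetical helper: elapsed nanoseconds between two timestamps.
   Unsigned subtraction is modulo 2^64, so the result is correct even if
   the counter wrapped around between before and after. */
static inline uint64_t elapsed_ns(const uint64_t before, const uint64_t after)
{
    return (uint64_t)(after - before);
}
```

For example, with before = UINT64_MAX - 4 (five nanoseconds before the counter wraps to zero) and after = 10, elapsed_ns(before, after) is 15, and max_wraparound() picks 10 as the later timestamp regardless of argument order, even though 10 is numerically the smaller value.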
POSIX defines CLOCK_MONOTONIC in terms of a system-wide clock. System-wide means all cores, sockets, and clusters that conform to a single system image (i.e., one kernel).
I read Nominal Animal's answer with interest. It seems a bit shocking to me that one could observe CLOCK_MONOTONIC going backwards, as that would seem to break the POSIX contract.