__inline__ uint64_t rdtsc() {
    uint32_t low, high;
    __asm__ __volatile__ (
        "xorl %%eax,%%eax \n cpuid"
        ::: "%rax", "%rbx", "%rcx", "%rdx" );
    __asm__ __volatile__ (
        "rdtsc" : "=a" (low), "=d" (high));
    return (uint64_t)high << 32 | low;
}
I have used the rdtsc function above as a timer in my program. The following code reports 312-344 clock cycles:
start = rdtsc();
stop = rdtsc();
elapsed_ticks = (unsigned)((stop-start));
printf("\n%u ticks\n",elapsed_ticks);
Every time I run the above code I get a different value. Why is that?
I ran the same code in Visual C++, which uses the rdtsc function from "intrin.h", and got a constant value of 18 clocks. Yes, it was constant on every run! Can someone please explain? Thanks!
It's quite difficult to get reliable timestamps using the TSC. The main problems are:
Your function executes the cpuid instruction and ignores its result before reading the TSC, to try to mitigate the last issue, instruction reordering. cpuid is a serialising instruction, which forces in-order execution. However, it is also rather slow, so it will affect the result if you try to measure an extremely short time.
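One way to see how much of the reading is the timer's own cost is to time two back-to-back reads many times and keep the smallest gap. The sketch below assumes GCC or Clang on x86-64. The names rdtsc_serialised and timer_overhead are made up for illustration and are not from the question:

```c
#include <stdint.h>

/* Serialised TSC read: cpuid first, then rdtsc, as in the question. */
static inline uint64_t rdtsc_serialised(void) {
    uint32_t low, high;
    __asm__ __volatile__ ("xorl %%eax, %%eax\n\tcpuid"
                          ::: "%rax", "%rbx", "%rcx", "%rdx");
    __asm__ __volatile__ ("rdtsc" : "=a" (low), "=d" (high));
    return (uint64_t)high << 32 | low;
}

/* Hypothetical helper: smallest gap seen between two back-to-back reads.
   The minimum filters out runs disturbed by interrupts or cache misses. */
static uint64_t timer_overhead(int runs) {
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; i++) {
        uint64_t start = rdtsc_serialised();
        uint64_t stop = rdtsc_serialised();
        if (stop - start < best)
            best = stop - start;
    }
    return best;
}
```

Subtracting timer_overhead(1000) from a measured interval gives a rough correction. The figure depends on the CPU and on whether the code runs in a virtual machine, so treat it as an estimate rather than a constant.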
If I remove that instruction from the function to make it equivalent to the intrinsic you're using in VC++:
inline uint64_t rdtsc() {
    uint32_t low, high;
    asm volatile ("rdtsc" : "=a" (low), "=d" (high));
    return (uint64_t)high << 32 | low;
}
then I get more consistent values, but reintroduce the potential instruction-ordering issue.
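A possible middle ground, offered as a sketch rather than a definitive fix, is to bracket the read with LFENCE instead of cpuid. Intel's manual describes LFENCE before RDTSC as a way to make the read wait for earlier instructions, and it is much cheaper than cpuid; behaviour on other vendors' CPUs may differ. This assumes GCC or Clang on x86-64, and rdtsc_fenced is a name I made up:

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc and _mm_lfence on GCC and Clang */

/* Hypothetical name: TSC read bracketed by LFENCE. */
static inline uint64_t rdtsc_fenced(void) {
    _mm_lfence();             /* let earlier instructions complete first */
    uint64_t t = __rdtsc();   /* read the time-stamp counter */
    _mm_lfence();             /* hold back later instructions until the read is done */
    return t;
}
```

It drops into the timing code from the question in place of rdtsc(): start = rdtsc_fenced(); then the code under test; then stop = rdtsc_fenced();.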
Also, make sure you're compiling with optimisation (e.g. -O3 if you're using GCC), otherwise the function may not be inlined.