When we have a CPU that supports some form of multithreading, each logical CPU has its own set of registers (as a minimum), including a CR3 register.
Since different threads of the same process run in the same virtual address space, and switching between them involves no context switch (and no TLB invalidation either), why does each logical CPU need its own CR3 register pointing to the page directory and page tables?
Isn't the value always the same as the value in the CR3 of the physical CPU?
Since we are working on the virtual address space of the same process when executing different threads
That's not all HT is capable of. I think you're confusing "hardware thread" (execution context / logical core) with "software thread".
Two logical cores run on one physical core, with one physical iTLB / dTLB / L2TLB. The logical cores are very much independent, and don't have to be running threads from the same process.
This is a desirable property in an SMT design like Intel's HT: If the OS had to carefully avoid scheduling threads with different page tables onto different logical cores of the same physical core, it would require more synchronization between cores.
Two threads of different processes (with separate CR3 page tables) can share one TLB because the entries are tagged with a PCID (process-context ID). IIRC, hardware virtualization uses similar tagging (Intel calls its tag a VPID, virtual-processor identifier) to avoid needing TLB flushes on VM exits or when switching between guests.
The OS can set a PCID (the low 12 bits of CR3) to avoid needing TLB flushes on context switches; as a bonus, this enables concurrent TLB usage by two processes. See the related question "Does Linux use x86 CPU's PCID feature for TLB? If not, why?" (According to that, Linux doesn't generally use PCID, but I assume it does for HT.)
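To make the "low 12 bits of CR3" point concrete, here is a minimal sketch of how a CR3 value splits into a page-table base and a PCID. The helper names `make_cr3` and `split_cr3` are made up for illustration, and the sketch ignores everything else about the register (reserved upper bits, the no-flush bit on writes, and the fact that the low bits only hold a PCID when the OS has enabled the feature).

```python
# Illustrative only: models CR3 as "4 KiB-aligned base | 12-bit PCID".
PCID_BITS = 12
PCID_MASK = (1 << PCID_BITS) - 1  # 0xFFF


def make_cr3(table_base: int, pcid: int) -> int:
    """Combine a page-table base address and a PCID into one value."""
    if table_base & PCID_MASK:
        raise ValueError("page-table base must be 4 KiB aligned")
    if not 0 <= pcid <= PCID_MASK:
        raise ValueError("PCID must fit in 12 bits")
    return table_base | pcid


def split_cr3(cr3: int) -> tuple[int, int]:
    """Return (page-table base, PCID) from a combined value."""
    return cr3 & ~PCID_MASK, cr3 & PCID_MASK


# Two logical cores running different processes hold different values:
# a different top-level table and a different tag for TLB entries.
cr3_core0 = make_cr3(0x1234000, pcid=5)
cr3_core1 = make_cr3(0x5678000, pcid=9)
```

Each logical core needs its own copy of this register precisely because the two values above can differ at the same instant.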
Hmm, I'm not sure I have the details exactly right, but physically there is some kind of tagging of TLB entries to keep them separate even when the two logical cores have different CR3.
According to an Intel forum thread, SnB-family CPUs statically partition the iTLB (so each logical core gets half the entries). That automatically solves any sharing problems.
The dTLB and L2TLB are competitively shared, so they do need tagging.
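As a rough illustration of why tagging makes competitive sharing safe, here is a toy model of a shared TLB whose entries are keyed by a tag plus the virtual page number. The `TaggedTLB` class and the tag values are hypothetical; real TLBs are set-associative, have limited capacity and eviction, and I'm not claiming this is how the hardware actually forms its tags.

```python
# Toy model: a shared translation cache keyed by (tag, virtual page number).
class TaggedTLB:
    def __init__(self):
        self.entries = {}  # (tag, vpn) -> physical frame number

    def insert(self, tag, vpn, pfn):
        self.entries[(tag, vpn)] = pfn

    def lookup(self, tag, vpn):
        # A hit requires both the tag and the page number to match,
        # so one address space never sees another's translations.
        return self.entries.get((tag, vpn))


tlb = TaggedTLB()
# The same virtual page maps to different frames in two address spaces.
tlb.insert(tag=1, vpn=0x400, pfn=0xAAA)
tlb.insert(tag=2, vpn=0x400, pfn=0xBBB)
```

Both entries coexist in the one structure, and a lookup with a tag that has no entry simply misses instead of returning someone else's translation.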