What happens when the fields of the active TSS (i.e. the one pointed to by the TR register) are changed? In particular, does a change to the ESP0/RSP0 field take effect immediately? Or does the processor maintain a cache of the TSS, as it does with segment selectors, so that an LTR instruction is needed to force it to reload the TSS fields?
The processor uses the TSS to store the current context and load the next-to-be-scheduled context during task switching.
Changing the TSS structure won't affect any context until the CPU switches to such TSS. 
The CPU performs a task switch in the following cases, as listed in the Intel manual:
Software or the processor can dispatch a task for execution in one of the following ways:
• An explicit call to a task with the CALL instruction.
• An explicit jump to a task with the JMP instruction.
• An implicit call (by the processor) to an interrupt-handler task.
• An implicit call to an exception-handler task.
• A return (initiated with an IRET instruction) when the NT flag in the EFLAGS register is set.
You can read about the TSS in Chapter 7 of Volume 3 of the Intel manual.
The ltr instruction doesn't perform a task switch. From Volume 2 of the Intel manual:
After the segment selector is loaded in the task register, the processor uses the segment selector to locate the segment descriptor for the TSS in the global descriptor table (GDT).
It then loads the segment limit and base address for the TSS from the segment descriptor into the task register.
The task pointed to by the task register is marked busy, but a switch to the task does not occur.
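EDIT: I've actually tested whether the CPU caches the static values from the TSS.
The test consisted of a boot program (attached) that:
• Sets up a GDT with code and data segments for DPL 0 and DPL 3, a TSS, and a DPL 3 call gate into the DPL 0 code segment.
• Switches to protected mode, sets ESP0 in the TSS to a first value, and loads tr.
• Drops to CPL 3, changes ESP0 in the TSS to a second value, and calls through the call gate.
• Checks which of the two values ESP was loaded from and prints 1 for the first, 2 for the second, or 0 if neither matches.
On my Haswell and on Bochs the result is 2, meaning that the CPU read the TSS from the memory (hierarchy) when needed.
Though a test on one model cannot be generalised to the ISA, it is very likely that other implementations behave the same way.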
BITS 16
xor ax, ax          ;Most EFI CPS need the first instruction to be this
;But I like to have my offset to be close to 0, not 7c00h
jmp 7c0h : WORD __START__
__START__:
  cli
  ;Set up the segments to 7c0h
  mov ax, cs
  mov ss, ax
  xor sp, sp
  mov ds, ax
  ;Switch to PM
  lgdt [GDT]
  mov eax, cr0
  or ax, 1
  mov cr0, eax
  ;Set CS
  jmp CS_DPL0 : WORD __PM__ + 7c00h
__PM__:
  BITS 32
  ;Set segments
  mov ax, DS_DPL0
  mov ss, ax
  mov ds, ax
  mov es, ax
  mov esp, ESP_VALUE0
  ;Make a minimal TSS BEFORE loading TR
  mov eax, DS_DPL0
  mov DWORD [TSS_BASE + TSS_SS0], eax
  mov DWORD [TSS_BASE + TSS_ESP0], ESP_VALUE1
  ;Load TSS in TR
  mov ax, TSS_SEL
  ltr ax
  ;Go to CPL = 3
  push DWORD DS_DPL3 | RPL_3
  push DWORD ESP_VALUE0
  push DWORD CS_DPL3 | RPL_3
  push DWORD __PMCPL3__ + 7c00h
  retf
__PMCPL3__:
  ;UPDATE ESP IN TSS
  mov ax, DS_DPL3 | RPL_3
  mov ds, ax
  mov DWORD [TSS_BASE + TSS_ESP0], ESP_VALUE2
  ;SWITCH STACK
  call CALL_GATE : 0
  jmp $
__PMCG__:
  mov eax, esp
  mov bx, 0900h | '1'
  cmp eax, ESP_VALUE1 - 10h
  je __write
  mov bl, '2'
  cmp eax, ESP_VALUE2 - 10h
  je __write
  mov bl, '0'
__write:
  mov WORD [0b8000h + 80*5*2], bx
  cli
  hlt
GDT dw 37h
    dd GDT + 7c00h      ;GDT symbol is relative to 0 for the assembler
                ;We translate it to linear
    dw 0
    ;Index 1 (Selector 08h)
    ;TSS starting at 8000h and with length = 64KiB
    dw 0ffffh
    dw TSS_BASE
    dd 0000e900h
    ;Index 2 (Selector 10h)
    ;Code segment with DPL=3
    dd 0000ffffh, 00cffa00h
    ;Index 3 (Selector 18h)
    ;Data segment with DPL=3
    dd 0000ffffh, 00cff200h
    ;Index 4 (Selector 20h)
    ;Code segment with DPL=0
    dd 0000ffffh, 00cf9a00h
    ;Index 5 (Selector 28h)
    ;Data segment with DPL=0
    dd 0000ffffh, 00cf9200h
    ;Index 6 (Selector 30h)
    ;Call gate with DPL = 3 for SEL=20
    dw __PMCG__ + 7c00h
    dw CS_DPL0
    dd 0000ec00h
  ;Fake partition table entry
  TIMES 446-($-$$) db 0
  db 80h, 0,0,0, 07h
  TIMES 510-($-$$) db 0
  dw 0aa55h
  TSS_BASE  EQU     8000h
  TSS_ESP0  EQU     4
  TSS_SS0   EQU     8
  ESP_VALUE0    EQU 7c00h
  ESP_VALUE1    EQU 6000h
  ESP_VALUE2    EQU 7000h
  CS_DPL0   EQU 20h
  CS_DPL3   EQU 10h
  DS_DPL0   EQU 28h
  DS_DPL3   EQU 18h
  TSS_SEL   EQU 08h
  CALL_GATE EQU 30h
  RPL_3     EQU 03h
The TSS is only read when necessary, and there's no special TSS cache. (The TSS descriptor in the GDT is cached like with segment descriptors, but not the contents of the TSS itself. The TSS can be cached in the normal L1/L2/L3 memory cache like any other region of memory.)
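There are several different areas of the TSS that are read from in different circumstances. Changing any of the values in the TSS has no effect until the appropriate circumstance occurs. They are:
1. The stack pointers for privilege levels 0 to 2 (SSn:ESPn, or RSPn in 64-bit mode), read when an interrupt, exception or call gate causes a switch to a more privileged level.
2. The saved task state (general and segment registers, EIP, EFLAGS, CR3, LDT selector), read when a task switch to that TSS occurs.
3. The I/O permission bitmap, read when an I/O instruction is executed with CPL greater than IOPL.
4. The interrupt redirection bitmap, read when a software interrupt is executed in Virtual 8086 mode with the VME extensions enabled.
5. The interrupt stack table (IST), read in 64-bit mode when an interrupt or exception uses a gate with a non-zero IST index.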
Note that in 64-bit mode only cases 1, 3 and 5 can occur as 64-bit mode doesn't support Virtual 8086 mode and nor does it support task switching.
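As an illustration of which fields a 64-bit TSS actually holds, here is a minimal NASM sketch of its layout, with a store to RSP0. The field offsets follow the Intel manual; the names TSS64 and set_rsp0, and the register convention, are hypothetical and chosen only for this example.

```nasm
BITS 64

; 64-bit TSS layout (104 bytes). Offsets are shown in the comments.
struc TSS64
    .reserved0  resd 1      ; 00h
    .rsp0       resq 1      ; 04h: stack pointer used on a transition to CPL 0
    .rsp1       resq 1      ; 0Ch
    .rsp2       resq 1      ; 14h
    .reserved1  resq 1      ; 1Ch
    .ist        resq 7      ; 24h: IST1..IST7
    .reserved2  resq 1      ; 5Ch
    .reserved3  resw 1      ; 64h
    .iomap_base resw 1      ; 66h: offset of the I/O permission bitmap
endstruc

; set_rsp0 (hypothetical helper)
; In: RDI = linear address of the TSS referenced by TR
;     RSI = new top of the ring-0 stack
set_rsp0:
    mov [rdi + TSS64.rsp0], rsi     ;Ordinary memory write, nothing else to do
    ret
```

There are no saved general-purpose registers in this layout, which matches the absence of hardware task switching in 64-bit mode.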
The LTR instruction doesn't cause any region of memory to be read except the entry in the GDT corresponding to the selector given, nor is there any internal TSS cache for it to flush.
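In practice this means a kernel can change the ring-0 stack of the running task with an ordinary store. The following is a minimal 32-bit sketch, assuming a flat data segment and a TSS that TR already references; set_kernel_stack and its register convention are hypothetical names used only for illustration.

```nasm
BITS 32

TSS_ESP0 EQU 4      ;Offset of the ESP0 field in a 32-bit TSS

; set_kernel_stack (hypothetical helper)
; In: EBX = linear address of the TSS referenced by TR
;     EAX = top of the ring-0 stack for the thread about to run
set_kernel_stack:
    mov DWORD [ebx + TSS_ESP0], eax     ;Read by the CPU on the next CPL 3 -> CPL 0 transition
    ret                                 ;No ltr here: TR still selects the same TSS
```

Executing ltr again with the same selector would not help anyway: the descriptor is already marked busy, and ltr raises a general-protection fault for a busy TSS.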