I am currently trying to find the best (*) way to have two threads running alternatively and make them wait on each other.
(*) best combination of being quick while having low CPU cost
I found three ways so far which I put together in some demo application to show the problems I found.
Using a TMonitor following the classic wait/pulse pattern is performing not very well because of all the locking (according to SamplingProfiler is burns most of the time in these functions). I tried the same using Windows events (SyncObjs.TEvent) but it performed similar (i.e. bad).
Using a wait loop that calls TThread.Yield performs best but obviously burns CPU cycles like crazy. That does not matter if the switching happens very quick, but hurts when the thread is actually waiting (you can see that in the demo).
Using the TSpinWait performs great (if not best of these three) but only if the switches happen very quick. The longer it takes for switching the worse the performance gets because of how TSpinWait works.
Since multithreading is not one of my strengths I was wondering if there is some combination of these ways or some completely different approach to achieve a good performance in both scenarios (fast and slow switches).
program PingPongThreads;
{$APPTYPE CONSOLE}
{$R *.res}
uses
  Classes,
  Diagnostics,
  SyncObjs,
  SysUtils;
type
  TPingPongThread = class(TThread)
  private
    fCount: Integer;
  protected
    procedure Execute; override;
    procedure Pong; virtual;
  public
    procedure Ping; virtual;
    property Count: Integer read fCount;
  end;
  TPingPongThreadClass = class of TPingPongThread;
  TMonitorThread = class(TPingPongThread)
  protected
    procedure Pong; override;
    procedure TerminatedSet; override;
  public
    procedure Ping; override;
  end;
  TYieldThread = class(TPingPongThread)
  private
    fState: Integer;
  protected
    procedure Pong; override;
  public
    procedure Ping; override;
  end;
  TSpinWaitThread = class(TPingPongThread)
  private
    fState: Integer;
  protected
    procedure Pong; override;
  public
    procedure Ping; override;
  end;
{ TPingPongThread }
procedure TPingPongThread.Execute;
begin
  while not Terminated do
    Pong;
end;
procedure TPingPongThread.Ping;
begin
  TInterlocked.Increment(fCount);
end;
procedure TPingPongThread.Pong;
begin
  TInterlocked.Increment(fCount);
end;
{ TMonitorThread }
procedure TMonitorThread.Ping;
begin
  inherited;
  TMonitor.Enter(Self);
  try
    if Suspended then
      Start
    else
      TMonitor.Pulse(Self);
    TMonitor.Wait(Self, INFINITE);
  finally
    TMonitor.Exit(Self);
  end;
end;
procedure TMonitorThread.Pong;
begin
  inherited;
  TMonitor.Enter(Self);
  try
    TMonitor.Pulse(Self);
    if not Terminated then
      TMonitor.Wait(Self, INFINITE);
  finally
    TMonitor.Exit(Self);
  end;
end;
procedure TMonitorThread.TerminatedSet;
begin
  TMonitor.Enter(Self);
  try
    TMonitor.Pulse(Self);
  finally
    TMonitor.Exit(Self);
  end;
end;
{ TYieldThread }
procedure TYieldThread.Ping;
begin
  inherited;
  if Suspended then
    Start
  else
    fState := 3;
  while TInterlocked.CompareExchange(fState, 2, 1) <> 1 do
    TThread.Yield;
end;
procedure TYieldThread.Pong;
begin
  inherited;
  fState := 1;
  while TInterlocked.CompareExchange(fState, 0, 3) <> 3 do
    if Terminated then
      Abort
    else
      TThread.Yield;
end;
{ TSpinWaitThread }
procedure TSpinWaitThread.Ping;
var
  w: TSpinWait;
begin
  inherited;
  if Suspended then
    Start
  else
    fState := 3;
  w.Reset;
  while TInterlocked.CompareExchange(fState, 2, 1) <> 1 do
    w.SpinCycle;
end;
procedure TSpinWaitThread.Pong;
var
  w: TSpinWait;
begin
  inherited;
  fState := 1;
  w.Reset;
  while TInterlocked.CompareExchange(fState, 0, 3) <> 3 do
    if Terminated then
      Abort
    else
      w.SpinCycle;
end;
procedure TestPingPongThread(threadClass: TPingPongThreadClass; quickSwitch: Boolean);
const
  MAXCOUNT = 10000;
var
  t: TPingPongThread;
  i: Integer;
  sw: TStopwatch;
  w: TSpinWait;
begin
  t := threadClass.Create(True);
  try
    for i := 1 to MAXCOUNT do
    begin
      t.Ping;
      if not quickSwitch then
      begin
        // simulate some work
        w.Reset;
        while w.Count < 20 do
          w.SpinCycle;
      end;
      if i = 1 then
      begin
        if not quickSwitch then
        begin
          Writeln('Check CPU usage. Press <Enter> to continue');
          Readln;
        end;
        sw := TStopwatch.StartNew;
      end;
    end;
    Writeln(threadClass.ClassName, ' quick switches: ', quickSwitch);
    Writeln('Duration: ', sw.ElapsedMilliseconds, ' ms');
    Writeln('Call count: ', t.Count);
    Writeln;
  finally
    t.Free;
  end;
end;
procedure Main;
begin
  TestPingPongThread(TMonitorThread, False);
  TestPingPongThread(TYieldThread, False);
  TestPingPongThread(TSpinWaitThread, False);
  TestPingPongThread(TMonitorThread, True);
  TestPingPongThread(TYieldThread, True);
  TestPingPongThread(TSpinWaitThread, True);
end;
begin
  try
    Main;
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
  Writeln('Press <Enter> to exit');
  Readln;
end.
Update:
I came up with a combination of an event and the spinwait:
constructor TSpinEvent.Create;
begin
  inherited Create(nil, False, False, '');
end;
procedure TSpinEvent.SetEvent;
begin
  fState := 1;
  inherited;
end;
procedure TSpinEvent.WaitFor;
var
  startCount: Cardinal;
begin
  startCount := TThread.GetTickCount;
  while TInterlocked.CompareExchange(fState, 0, 1) <> 1 do
  begin
    if (TThread.GetTickCount - startCount) >= YieldTimeout then // YieldTimeout = 10
      inherited WaitFor(INFINITE)
    else
      TThread.Yield;
  end;
end;
This performs only roughly 5 to 6 times slower than a fiber based implementation when doing quick switching and less than 1% slower when adding some work between the Ping calls. It of course runs on 2 cores instead of only one when using the fiber.
Synchronization is built around an internal entity known as the lock or monitor. Every object has a lock associated with it. By convention, a thread that needs consistent access to an object's fields has to acquire the object's lock before accessing them, and then release the lock when it's done with them.
To use a thread object in your application (and to create a descendant of Classes. TThread): Choose one: File > New > Other > Delphi Projects > Delphi Files > Thread Object.
Thread synchronization is the concurrent execution of two or more threads that share critical resources. Threads should be synchronized to avoid critical resource use conflicts. Otherwise, conflicts may arise when parallel-running threads attempt to modify a common variable at the same time.
When I find myself in situations like this, I like to use Windows events. They are exposed in Delphi using a TEvent class, which you WaitForSingleObject.
So, you could use two events: Thread1NotActive and Thread2NotActive. Once Thread1 is done, it sets the Thread1NotActive flag, which is waited on by Thread2. Conversely, if Thread2 stops processing, it sets Thread2NotActive, which is monitored by Thread1.
This should allow you to avoid race conditions(which is why I am suggesting to use two events instead of 1) and should keep you sane in the process, while not consuming inordinate amounts of CPU time.
If you need a more complete example, you'll have to wait tomorrow :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With