Race condition between TROBaseSuperTCPServer.IntExecute and TROBaseSuperTCPServer.DisconnectAllClients

Hello,

I have just updated to the RO SDK 10.0.0.1573 version and am faced with access violations when deactivating a server that has clients.
The access violation occurs in a call stack that originates on the l_worker.Disconnect line inside TROBaseSuperTCPServer.DisconnectAllClients.
Tracing it, I discovered this is because the l_worker instance has been destroyed by the FreeOrDisposeOf(l_worker) call inside TROBaseSuperTCPServer.IntExecute method.

Looking at the code from both methods, I see an issue

  • TROBaseSuperTCPServer.DisconnectAllClients calls fClients.LockList but uses the first item in the list outside the protected section.

With this situation, it’s quite easy to have TROBaseSuperTCPServer.DisconnectAllClients retrieve the TROSCServerWorker instance that was created by TROBaseSuperTCPServer.IntExecute and work on it right after the moment FreeOrDisposeOf has been called on it.

I’m not too sure how to fix this situation. I could call Extract inside TROBaseSuperTCPServer.DisconnectAllClients to make sure it’s no longer in the list and then place it back but that would defeat the whole purpose of the code inside TROBaseSuperTCPServer.IntExecute that wants to destroy the instance it has created.
Alternatively, I could place the call to Disconnect inside the locked section but this would require changing TROBaseSuperTCPServer.IntExecute so that FreeOrDisposeOf is called inside a locked section as TROBaseSuperTCPServer.DisconnectAllClients could have retrieved the instance in between the calls to Add and Remove only to work on the instance once FreeOrDisposeOf has been called.

That said, I’m not too sure about this as I have a feeling that l_worker.Disconnect was meant to be outside the locked section in order not too lock the list for too long.

Any suggestion to fix this situation?

Hi,

Can you test this workaround with your environment, pls?

update uROBaseSuperTCPServer.pas as

procedure TROBaseSuperTCPServer.DisconnectAllClients;
..
      l_list.Delete(0);  //<<<<added
      l_worker.Disconnect; //<<< moved
    finally
      fClients.UnlockList;
    end;
  end;
end;

Logged as bugs://D19402.

Thanks, that’s my second suggestion with an additional call to Delete(0)
My tests show that it prevents the access violation I was seeing.

bugs://D19402 was closed as fixed.