Acces violation in TROBinMessage.WriteStream from TROEventReceiver.Invoke_GetEventsData

Hello,

I’m using TROEventReceiver from version 10.0.0.1489 of the SDK to allow my server to notify its clients (nodes) of new events.
While this works most of the time, it happens that the one of the node experiences a series of exceptions originating in TROEventReceiver.OnTimerTick that calls TROEventReceiver.Invoke_GetEventsData.
Here is an extract of a bug report generated by madExcept:

exception class    : EPrivilege
exception message  : Privileged instruction.

thread $2938 (TROThreadTimer):
00450043 +113 AGNode.exe   System.SysUtils               .TThreadLocalCounter
008474f9 +0cd AGNode.exe   uROEventReceiver      569 +27 TROEventReceiver.OnTimerTick
77ec3be1 +021 ntdll.dll                                  KiUserExceptionDispatcher
0076a9dc +0d8 AGNode.exe   uROTransportChannel   797 +21 TROTransportChannel.Dispatch
00846d29 +135 AGNode.exe   uROEventReceiver      419 +20 TROEventReceiver.Invoke_GetEventsData
008474a4 +078 AGNode.exe   uROEventReceiver      562 +20 TROEventReceiver.OnTimerTick
0076600c +014 AGNode.exe   uROThreadTimer        170  +3 TROThreadTimer.RunEvent
00765fd1 +049 AGNode.exe   uROThreadTimer        152 +28 TROThreadTimer.IntExecute
0076543f +07f AGNode.exe   uROInitializedThread   58  +7 TROInitializedThread.Execute
004b08ff +02b AGNode.exe   madExcept                     HookedTThreadExecute
0054cfdd +049 AGNode.exe   System.Classes                ThreadProc
0040b1bc +028 AGNode.exe   System               1556  +0 ThreadWrapper
004b07e5 +00d AGNode.exe   madExcept                     CallThreadProcSafe
004b084a +032 AGNode.exe   madExcept                     ThreadExceptFrame
77aa0417 +017 KERNEL32.DLL                               BaseThreadInitThunk
>> created by thread $282c (TROPooledThread) at:
0054d0a0 +018 AGNode.exe   System.Classes                TThread.Create

thread $2938 (TROThreadTimer), inner exception level 1:
>> EAccessViolation, Access violation at address 0053E0A7 in module 'AGNode.exe'. Read of address 00000000
0053e0a7 +00b AGNode.exe   System.Classes                TStream.SetPosition
00934fef +0bf AGNode.exe   uROBinMessage         494 +12 TROBinMessage.WriteStream
00934a8a +04e AGNode.exe   uROBinMessage         395  +4 TROBinMessage.WriteToStream
0076a9dc +0d8 AGNode.exe   uROTransportChannel   797 +21 TROTransportChannel.Dispatch
00846d29 +135 AGNode.exe   uROEventReceiver      419 +20 TROEventReceiver.Invoke_GetEventsData
008474a4 +078 AGNode.exe   uROEventReceiver      562 +20 TROEventReceiver.OnTimerTick
0076600c +014 AGNode.exe   uROThreadTimer        170  +3 TROThreadTimer.RunEvent
00765fd1 +049 AGNode.exe   uROThreadTimer        152 +28 TROThreadTimer.IntExecute
0076543f +07f AGNode.exe   uROInitializedThread   58  +7 TROInitializedThread.Execute
004b08ff +02b AGNode.exe   madExcept                     HookedTThreadExecute
0054cfdd +049 AGNode.exe   System.Classes                ThreadProc
0040b1bc +028 AGNode.exe   System               1556  +0 ThreadWrapper
004b07e5 +00d AGNode.exe   madExcept                     CallThreadProcSafe
004b084a +032 AGNode.exe   madExcept                     ThreadExceptFrame
77aa0417 +017 KERNEL32.DLL                               BaseThreadInitThunk
0054d0a0 +018 AGNode.exe   System.Classes                TThread.Create

The outer exception is raised as a consequence of the first one which looks as if the fStream private field from the TROBinMessage instance is nil

Note that this exception has a tragic consequence because it is not caught anywhere in the call stack and thus leads to the death of the timer thread.
In the default case, this means no further event will ever be received, but because I configured madExcept to do so, my nodes are instantly killed whenever an uncaught exception is triggered.

Is this something that you have already experienced? Is there a workaround?

Hi,

try to use personal message for TROEventReceiver.
it may solve TROBinMessage.FStream = nil issue.

I need a simple testcase that reproduces this case for more detailed solution/workaround.

Well, we create it like this:

  FMessage := TROBinMessage.Create;

  FChannel := TROSynapseSuperTCPChannel.Create(nil);
  FChannel.MaxPackageSize := 10 * FChannel.MaxPackageSize;
  FChannel.AutoReconnect := True;
  FChannel.SynchronizeEvents := False;
  FChannel.OnConnected := ChannelConnected;
  FChannel.OnDisconnected := ChannelDisconnected;

  FReceiver := TROEventReceiver.Create(nil);
  FReceiver.Message := FMessage;
  FReceiver.Channel := FChannel;
  FReceiver.ServiceName := 'NodeManagementService';
  FReceiver.LegacyEvents := False;
  FReceiver.Interval := 1;
  FReceiver.SynchronizeInvoke := False;

  FReceiver.RegisterEventHandler(EID_AGNodeEventSink, Self);

So I’m quite certain it has its own message instance.
But looking at the code, this message instance is also used for other communications, most times via a IROmessageCloneable(FMessage).Clone but not always.
I’m wondering if I should not use a separate message for the event receiver because there might be a race condition where TROBinMessage.fStream is freed and set to nil while the timer tick is triggered.

That, however, won’t help with the death of the timer thread when an exception is raised with the OnPollException handler as above. For this, I suggest the following patch:

uROThreadTimer.pas.patch.zip (799 Bytes)

I placed the try..except block at the lowest possible place, but you might feel it to be better placed around the entire body of the outer while not Terminated loop.

Hi,

it will just hide exceptions from OnPollException because you still won’t have access to TROThreadTimer.OnException
in brief, it is similar to putting code to try/except inside the OnPollException event


regarding to your issue. I think, you receive events in one thread and handle them in another thread, so you could have a new event when old one isn’t handled yet…

Not exactly as it will protect all timers based on TROTimerThread, with the added burden of giving an OnException handler to that class. But at least, it is protected against exceptions in exception handler.

There’s something like that, yes, or maybe message reuse in different threads, but clearly, a threading issue is the most likely culprit.

Thanks for the suggestions

try to put this line of TROEventReceiver.OnTimerTick method:

      cnt := Invoke_GetEventsData(eventsdata);

into a TCriticalSection/TRTLCriticalSection/etc like

  fCr.Enter;
  try
      cnt := Invoke_GetEventsData(eventsdata);
  finally
     fCr.Leave;
  end;
```

this may solve your original issue with TROBinMessage.fStream = nil
1 Like

does it solve original issue?

My patch? It avoids the death of the timer thread and thus allows the node to continue working.

It does not avoid the original access violation which happens only on rare occasions. I’ll apply different things and will see how it behaves in the future

I meaned workaround with critical section around

 cnt := Invoke_GetEventsData(eventsdata);

Alas no, it did not help.
What appears to help, though, is to make sure that no other code is using the message instance given to the event receiver.
My code was not that clean in that respect when that message could be used for calling remote methods while the events keep coming through.