Lockup issues

Hi,

I had some major issues with my server application locking up earlier this year which I spent a great deal of time resolving. It was all down to deadlocks, some within my own code and one in particular within the RO framework which was fixed.

I’m now having similar problems and trying desperately to track them down. Unfortunately, yet again, the issue is very sporadic and I can’t reproduce it myself; it only happens at customer sites, which makes diagnosis a nightmare. From the debug logs I’ve gathered from customers, it appears to have something to do with session management. When the problem “hits”, the server becomes unresponsive because it fails to create server-side sessions in response to client requests. I can see the client requests coming in and the relevant service instance being created, but it is never activated. Every request blocks in this way, leaving a stuck thread which never activates or gets any further, and the server has to be restarted.

My current suspicion is that I’m doing something which is destabilising the session management system. Many of the service functions called by clients call other server functions which create temporary “server side” sessions to do their work. I do this because the client’s own session doesn’t necessarily have sufficient permissions, and also because some of these server functions need to operate autonomously rather than in response to a client request. It seemed logical to have them operate within their own dedicated session, which is created for that purpose and destroyed afterward.

This obviously leads to a lot of sessions being created and destroyed as functions are called. It all seems to work 99% of the time but I have to admit that, whenever these lockups occur, it does seem to be around the time that these sessions are being destroyed.

I obviously need to do more investigation, but I was wondering if anyone has any ideas about what could be causing these problems?

When I create these special sessions, I’m calling the session manager’s CreateSession function with a new GUID, then ReleaseSession with True as the NewSession parameter. When I’m done with it I’m calling DeleteSession with False as the IsExpired parameter.

Is this the correct way to do this, or should I be doing it differently? Thanks for any advice. I appreciate there’s very little to go on here, but I thought I’d make an initial pre-emptive post in case any geniuses spot anything obvious I’ve missed.
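
For anyone following along, the create/release/delete sequence described above can be sketched in a language-neutral way. This is only a rough Python model, not the RO framework's API: `InMemorySessionManager`, `temporary_session` and every method name here are hypothetical stand-ins that merely mirror the CreateSession, ReleaseSession and DeleteSession calls. The point it illustrates is that the delete should sit in a finally-style block, so the temporary session is removed even if the work in between raises.

```python
import uuid
from contextlib import contextmanager


class InMemorySessionManager:
    """Hypothetical stand-in for a session manager (not the real framework API)."""

    def __init__(self):
        self._sessions = {}  # session id -> stored session data

    def create_session(self, session_id):
        # Returns a fresh session object; it is not stored until released.
        return {"id": session_id, "values": {}}

    def release_session(self, session, new_session):
        # Hands the session back to the manager, which stores it.
        # new_session only mirrors the NewSession flag; it is unused in this model.
        self._sessions[session["id"]] = session

    def delete_session(self, session_id, is_expired):
        # is_expired only mirrors the IsExpired flag; it is unused in this model.
        self._sessions.pop(session_id, None)

    def session_count(self):
        return len(self._sessions)


@contextmanager
def temporary_session(manager):
    """Create a dedicated session, yield its id, and always delete it afterwards."""
    session_id = uuid.uuid4()
    session = manager.create_session(session_id)
    manager.release_session(session, new_session=True)
    try:
        yield session_id
    finally:
        # Runs on success and on error, so no temporary session is left behind.
        manager.delete_session(session_id, is_expired=False)
```

Usage would look like `with temporary_session(manager) as sid: ...`, with the server-side work done inside the `with` body.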

Been poring over log files and it’s definitely the session manager.

It looks very much like the TROCustomSessionManager’s fCritical critical section is being acquired and not released at some point and this is then blocking everything. It looks like DeleteSession is calling the OnBeforeDeleteSession handler and then blocking when trying to acquire the critical section on the next line.

Once locked, every client call to a service which requires a session causes a service object instance to be created but never activated. I believe the framework is trying to create or obtain a session between these two events.
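
To make the suspected failure mode concrete, here is a minimal Python model of a lock that is acquired and never released on an error path. It is not the framework's code: `LeakyManager` and its methods are hypothetical, and Python's `threading.Lock` is used purely as a simplified stand-in for the critical section (it is not re-entrant, so it only models the cross-thread case). Acquiring with a timeout is shown as one way such a hang could be turned into a loggable failure.

```python
import threading


class LeakyManager:
    """Hypothetical manager guarded by a single lock (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sessions = {}

    def add_session_leaky(self, session_id, fail=False):
        # Deliberate bug: if the body raises, release() is never reached.
        self._lock.acquire()
        if fail:
            raise RuntimeError("error while holding the lock")
        self._sessions[session_id] = {}
        self._lock.release()

    def add_session_safe(self, session_id, fail=False):
        # 'with' releases the lock even when the body raises.
        with self._lock:
            if fail:
                raise RuntimeError("error while holding the lock")
            self._sessions[session_id] = {}

    def delete_session(self, session_id, timeout=0.2):
        # A timed acquire reports a leaked lock as False instead of hanging.
        if not self._lock.acquire(timeout=timeout):
            return False
        try:
            self._sessions.pop(session_id, None)
            return True
        finally:
            self._lock.release()


def run_failing_in_thread(method, session_id):
    """Call method(session_id, fail=True) on a worker thread, swallowing the error."""
    def worker():
        try:
            method(session_id, fail=True)
        except RuntimeError:
            pass

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
```

After `run_failing_in_thread(manager.add_session_leaky, "a")`, every later `delete_session` call times out, which matches the "everything blocks until restart" symptom; with `add_session_safe` the lock is released and later calls proceed.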

Any ideas on what could possibly be causing this?

Ignore me, I think I’ve found it.

What was the problem?

Would you believe, a recurrence of the bug we found earlier this year. Our build machine had been reinstalled and someone forgot to reinstate the bugfix, as we’re still using 8.0.83.1137. We need to upgrade to the latest version soon, methinks. Epic fail on our part, sorry.