Two TDAMemDataTable.Open calls blocking each other

tobygroves · February 20, 2015, 11:07am

Got a strange one here. My server connects to multiple databases. At startup, a thread is created for each database which checks connectivity and initialises various things.

The idea is that one or more of these database connections may be unavailable for some reason but I need the others to still “work”, hence using a separate initialisation thread for each. These are all spawned concurrently at startup.

In order to test connectivity I’m attempting to retrieve data by opening a TDAMemDataTable. If the SQL Server is unavailable then this can take a short while to timeout (20-ish seconds for me) which is fine.

The problem is that it seems this Open call is blocking all other Open calls in the other threads. If the first thread to try calling Open happens to be for a connection which isn’t available, then it blocks for the aforementioned 20 seconds but the other threads then block on their Open calls until the first thread’s has returned, even though their connections are ok. This is happening even if I have two connections, each to a totally different database server. This is no good to me as it defeats the whole purpose of the concurrent threads.

Why is this happening? Is there some non-reentrant code buried somewhere in the Open functionality which is blocking all other threads while it times out?

tobygroves · February 20, 2015, 11:29am

Think I’ve found the problem. The TDAConnectionManager.NewConnection function calls fConnectionCache.LockList and UnlockList. Unfortunately if the SQL Server is unavailable then my 20-second timeout is taking place within this block and preventing any other thread from creating a new connection.

This is surprising and a tad alarming to be honest. If, whilst my server is running, one of the SQL Server connections goes down for whatever reason, then the next time something tries to access it this timeout will block any other database connections from being made, effectively hanging my entire server for the duration of the timeout.

tobygroves · February 20, 2015, 2:01pm

I’m wondering whether I can change the code to move the connection creation outside the lock, like this:

    try
      for i := list.Count -1 downto 0 do begin
        tempconn := TCachedConnection(list[i]);
        if not tempconn.Connection.IsAlive then begin
           tempconn.Connection:=nil;
           list.Delete(i);
        end;
      end;
      for i := 0 to list.Count -1 do begin
        tempconn := TCachedConnection(list[i]);

        if _CompareConnection(conn,tempconn.Connection,lUser,lPass) then begin
          list.Delete(i);
          Result := tempconn.Connection;
          tempconn.Free;
          if Assigned(fOnConnectionAcquired) then fOnConnectionAcquired(Self, result);
          break;
        end;
      end;
    finally
      fConnectionCache.UnlockList;
    end;
    if Result = nil then begin
      // If it doesn't find one and the max poolsize is not reached then creates it...
      if Cardinal(fTotalConnections) < fMaxPoolSize then begin
        result := CreateNewConnection(aConnectionName, OpenConnection, UserID, Password);
        result.ConnectionPool := self;
        inc(fTotalConnections);
        if Assigned(fOnConnectionAcquired) then fOnConnectionAcquired(Self, result);
      end;
    end;

I think this is safe but I’m not 100% sure. The final IF block containing the CreateNewConnection call doesn’t seem to access fConnectionCache at all, any new connection is added to the cache by a subsequent call to ReleaseConnection. Can you confirm whether this modification is safe or not?

tobygroves · February 23, 2015, 1:11pm

Hi, don’t wish to hassle but any chance someone could have a look at the above and let me know if this modification is safe? Thanks.

EvgenyK · February 23, 2015, 2:49pm

modifying fTotalConnections outside lock can have potential problems:

          if Result = nil then begin
            if Cardinal(fTotalConnections) < fMaxPoolSize then begin   //<<<<<<<<<<<<<<
              result := CreateNewConnection(aConnectionName, OpenConnection, UserID, Password);
              result.ConnectionPool := self;
              inc(fTotalConnections);  //<<<<<<<<<<<<<<
              if Assigned(fOnConnectionAcquired) then fOnConnectionAcquired(Self, result);
            end;
          end;

tobygroves · February 23, 2015, 2:52pm

Ok so that’s a non-starter then.

Is there anything that can be done about this problem? As mentioned above, as my server can potentially connect to multiple SQL Servers, if one of these goes down for any reason then every attempt to connect to it will create a 20-ish second pause while it times out, during which no other connections can be made to other, working SQL Servers. This will bring the server to its knees

EvgenyK · February 23, 2015, 3:00pm

You can get closed connection inside lock and open it later if you passed OpenConnection as False.

tobygroves · February 23, 2015, 3:22pm

I thought about that but wasn’t sure how to make use of it. At present I’m testing connectivity by attempt to open a dataset contained within a schema in one of my services - if this fails then I know the server is inaccessible. Obviously doing it this way, there’s no way I can make use of the OpenConnection parameter in the CreateNewConnection call as this is made by the framework.

I tried calling CreateNewConnection directly (had to derive my own class from TDAConnectionManager to change the visibility) but this seems to create sporadic exceptions from the FireDAC driver so I presumed wasn’t a viable route.

Is there a recommended way of creating a new connection manually where I can defer opening the connection until after I’ve obtained it or, alternatively, could the code above be changed to open the connection outside the lock?

tobygroves · February 23, 2015, 3:37pm

Aha - just tried it by using the NewConnection public method and it seems to work without the sporadic exception I was getting earlier, but I’m now getting a small memory leak (61-68 bytes: Unknown) each time. Will try to track this down.

UPDATE: Seems the memory leak is my fault so, if I can work that out, this might work.

Question - would it make sense to change the NewConnection function to never open the connection in the call to CreateNewConnection but then attempt to open it outside the lock? Surely that would be safe and would prevent this issue of blocking all other connection creation calls when one is busy timing out trying to connect? Something like this:

  while result = nil do begin
    list := fConnectionCache.LockList;
    try
      ...
      if Result = nil then begin
        // If it doesn't find one and the max poolsize is not reached then creates it...
        if Cardinal(fTotalConnections) < fMaxPoolSize then begin
          result := CreateNewConnection(aConnectionName, False, UserID, Password);
          result.ConnectionPool := self;
          inc(fTotalConnections);
          if Assigned(fOnConnectionAcquired) then fOnConnectionAcquired(Self, result);
        end;
      end;
    finally
      fConnectionCache.UnlockList;
    end;
    if Result = nil then begin
      // Otherwise either waits and tries again or raises an error
      case PoolBehaviour of
        pbRaiseError: DAError(True, err_MaxPoolSizeReached);
        pbWait:  DAError(fConnectionWait.WaitFor(1000 * fWaitIntervalSeconds) <> wrSignaled,err_MaxPoolSizeReached);
        pbIgnoreAndReturn: break;
      end;
    end
    else if OpenConnection then
      result.Open;
  end;

RemObjectsSoftware · February 23, 2015, 4:23pm

Thanks, logged as bugs://71340

EvgenyK · February 23, 2015, 4:28pm

I’ve logged it for reviewing.
Note: it won’t come into current release because it is already locked

tobygroves · February 24, 2015, 8:12am

Thanks.

RemObjectsSoftware · March 2, 2015, 7:23pm

bugs://71340 got closed with status fixed.

tobygroves · March 4, 2015, 3:51pm

Aha, good news Is the fix as per my code above or can I get hold of the changed code so I can integrate it into my version?

EvgenyK · March 5, 2015, 7:21am

it’s based on your code
I’ve a bit refactored this method:
fOnConnectionAcquired event will be fired before except instead of firing on several places.