IpSuperTcp channels: SuperChannel TimeOut or Actively refused

Hello RemObjects support team,

We use the IpSuperTcpSuperChannel and IpSuperTcpClientChannel in our .NET WinForms application (version 9.4.109.1377). One of our clients gets the ‘SuperChannel timeout’ or ‘Actively refused’ error at least once a day, and the only remedy is restarting the application service. It’s our biggest client, with many client PCs and many server calls.

What could be the problem here? There is no virus scanner on the server, and the domain firewall is disabled (plus we made an exception in the firewall for the port we use, in case the firewall is enabled again).

Thanks for the assistance.

Wouter

Hello

If you consider that the questions below contain sensitive information, then please send the answers directly to support@

How many calls are performed by client and how many clients are there?
Do you actively use server-sent events?
Are the client channel instances recreated on every client->server call, or are they reused within the client app?
Do you use TLS encryption?
Which platform/.NET version does the host running the server application use?
Please run netstat -a -o -n several hours after the server app has started, and again once it stops accepting client requests
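Once such a capture has been saved, a quick way to compare snapshots is to count connections per TCP state. This is just a sketch over a fabricated sample capture (`netstat_sample.txt` and its contents are made up for illustration; the real capture comes from `netstat -a -o -n > netstat.txt`):

```shell
# Create a stand-in for a saved Windows netstat capture (hypothetical sample data).
cat > netstat_sample.txt <<'EOF'
  TCP    0.0.0.0:8090           0.0.0.0:0              LISTENING       1234
  TCP    10.0.0.1:8090          10.197.25.6:50001      ESTABLISHED     1234
  TCP    10.0.0.1:8090          10.197.25.6:50002      FIN_WAIT_2      1234
  TCP    10.0.0.1:8090          10.197.25.6:50003      FIN_WAIT_2      1234
EOF

# Summarize the capture by TCP state, most frequent state first.
# Column 4 of a Windows netstat line is the connection state.
awk '$1 == "TCP" {print $4}' netstat_sample.txt | sort | uniq -c | sort -rn
```

A healthy snapshot is dominated by ESTABLISHED entries; a growing pile of one abnormal state between snapshots is what to look for.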

Hello Anton,

There are 87 clients. A lot of calls are made with the RemoteDataAdapter (hundreds of calls a minute, I presume, spread over all the different clients), and quite a lot of calls are made with RODL functions (tens of calls a minute spread over all the different client PCs).

We do not use server-sent events.
The client channel instance is reused.

No TLS encryption.

Server-side .NET version: .NET Framework 4.7, running on Windows Server 2012 R2.

We’ll run the netstat command and get back to you.

Thank you.

Hello Anton,

The TCP/IP port we use is 8090.

As promised, a netstat from when the application was working:

And a netstat from when it stopped accepting client requests:
netstat.txt (10.4 KB)

Thank you.

Ok, that’s not so good. The second listing shows a lot of hung connections from a single client host.

What is different with the host 10.197.25.6 ? Different app? Flaky connection (or just a connection via Internet rather than LAN)?
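Spotting that pattern in a long capture is easier when the remote endpoints are grouped by host. A sketch, again over fabricated sample data (the host 10.197.25.6 and port 8090 are from this thread; everything else is made up):

```shell
# Stand-in for the attached netstat.txt capture (hypothetical sample data).
cat > netstat_sample.txt <<'EOF'
  TCP    10.0.0.1:8090   10.197.25.6:50001   FIN_WAIT_2    1234
  TCP    10.0.0.1:8090   10.197.25.6:50002   FIN_WAIT_2    1234
  TCP    10.0.0.1:8090   10.197.25.6:50003   FIN_WAIT_2    1234
  TCP    10.0.0.1:8090   10.197.30.9:50004   ESTABLISHED   1234
EOF

# Column 3 is the remote endpoint (host:port); strip the port and
# count connections per remote host, busiest host first.
awk '$1 == "TCP" {split($3, ep, ":"); print ep[1]}' netstat_sample.txt \
  | sort | uniq -c | sort -rn
```

One host owning a disproportionate share of the stalled connections points at that host's link or app instance rather than the server.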

Thanks, logged as bugs://82853

It’s a PC that’s connected to their network via a VPN connection. It’s the same app.

We implemented the IpHttpServerChannel. Now we see the FIN_WAIT_2 state and get a lot of ‘Connection was closed’ errors. Any ideas? Please help, the problem is getting very urgent…

Thanks.

Are FIN-WAIT-2 connections present on server or on client side? Are they present on all clients, or only on ones that are connected via VPN?

The FIN-WAIT-2 state means that the connection was closed properly on the local side, but the remote side never confirmed that the connection was actually closed.

Possible solutions:

Ideal: make the VPN connection more reliable.

If that is not possible, then you can:
a) Set the client channel’s KeepAlive property to false and check if that helps.

b) If that did not help, adjust the FIN-WAIT-2 timeout after which such stalled sockets are released by the system itself:


Set registry key:

Key : HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
Value Type : REG_DWORD

Valid Range : 30–294,967,295
Default : 120

Recommended value : 30


Disable TCP protocol autotuning:

netsh int tcp set global autotuninglevel=disabled

Restart the host

They are present on the server side… And it’s not only the clients with VPN connections who are experiencing the problem, but also some clients working in the local network.

Did you try to adjust the settings?

Setting KeepAlive to false results in a much slower program…

Setting the registry key didn’t help.
We didn’t get permission to disable TCP protocol autotuning.

We just had the SuperChannel timeout problem with another of our clients, in an entirely different program… We are starting to get worried, and that client doesn’t even use VPN connections…

So do I.

Ok. Let’s evaluate this one more time:

  1. When did this start to happen? Were there any changes in hardware or software prior to this?
  2. Is it possible to try in your environment a server and client app built with the latest v10 version (either with SuperTCP-based connections or with HTTP and KeepAlive set to true)? There were changes addressing a possible connection leak (though that was on the client side).

If the situation persists, then please add this code to the application startup (these should literally be the first lines):

		// requires: using System.Runtime.ExceptionServices;
		AppDomain.CurrentDomain.FirstChanceException +=
			(object source, FirstChanceExceptionEventArgs e) =>
			{
				Console.WriteLine("{0}: FirstChanceException event raised in {1}: {2}",
					DateTime.UtcNow, AppDomain.CurrentDomain.FriendlyName, e.Exception);
			};

Instead of Console.WriteLine there should be a call to some logger whose output persists even after the application exits. This code would let us check whether some exception prevents RO SDK from shutting down the sockets.

  1. It started happening in March of this year. At first it was a weekly or biweekly problem, but since June it’s been practically daily. There were no changes in hardware or software. Before March it worked for more than a year.

  2. We will install the v10 version next week and release a new build for both programs.

Should we add this logging code to our server and client application?

Thank you for thinking with us.

Maybe a significant increase in workload, or in the number of clients?
(BTW, is TLS enabled on the clients?)

Thanks!

Yes, if possible

Oh, one more thing.
If/when the server side is again in the FIN_WAIT_2 state, could you also run netstat on the corresponding client host to check if there is a set of CLOSE_WAIT connections?
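A sketch of that client-side check over a saved capture. The port 8090 comes from this thread; the file name `client_netstat.txt` and its contents are fabricated sample data:

```shell
# Stand-in for a netstat -a -o -n capture taken on the client host
# (hypothetical sample data).
cat > client_netstat.txt <<'EOF'
  TCP    10.197.25.6:50001   10.0.0.1:8090   CLOSE_WAIT    4321
  TCP    10.197.25.6:50002   10.0.0.1:8090   CLOSE_WAIT    4321
  TCP    10.197.25.6:50003   10.0.0.1:443    ESTABLISHED   4321
EOF

# Count client sockets toward the server port that are stuck in CLOSE_WAIT;
# these would pair up with the server-side FIN_WAIT_2 entries.
awk '$1 == "TCP" && $3 ~ /:8090$/ && $4 == "CLOSE_WAIT"' client_netstat.txt | wc -l
# prints the count (2 for this sample)
```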

The only thing that happened is that they are in the process of upgrading the client PCs to Windows 10.
But that’s not the case with the other customer who had the same issue today.

The number of clients increased over time from 65 to approximately 90.

TLS Question:
(screenshot attached)

Theoretically it could be that there is an issue with RO SDK socket management that has been revealed by the Windows 10 TCP implementation (or by a bug in it).

Anyway, when this happens next time, please try to gather both client and server netstat results.

I’ll do my best to gather the netstat results.

Hello,

An update on the issue:
The problem occurs less frequently since upgrading to RemObjects 10. BUT when it does occur, they no longer get the ‘Actively refused’ or ‘SuperChannel timeout’ errors; the application just hangs. After killing it with Task Manager and restarting, they can continue working. We are waiting for a detailed list of how often it occurs, which OS it occurs on, etc., so we can let you know.