Hello,
My current setup is as follows:
A central server using Synapse SuperTCPServer receives requests from Clients to perform tasks via ClientService.
It does not process those tasks itself but rather sends them to “Nodes” that will perform them. Those nodes connect to the server via NodeService and receive the order to perform their various actions via events.
So the usual course of action is as follows:
- Node connects to Server and waits for tasks
- Client connects to Server
- Client sends a task to Server via ClientService.PerformTask, which is a blocking call
- Server sends an event to Node with the Task to perform, which is a non-blocking call
- Server starts waiting on a signal from the Node
- Node performs the Task and calls NodeService.TaskEnded to signal the Server that it has processed the task
- Server ends its wait and finishes the PerformTask call
- Client receives the result for its task.
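To make the flow above concrete, here is a minimal sketch of the Server side. It is written in Python for brevity and is not the actual implementation: `Server`, `perform_task`, `task_ended`, `make_node` and the `timeout` parameter are hypothetical names that only mirror the roles of ClientService.PerformTask and NodeService.TaskEnded. It assumes one waiter per task id.

```python
import threading


class Server:
    """Stand-in for the central server; every name here is hypothetical."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = {}     # task_id -> Event set when the Node reports completion
        self._results = {}  # task_id -> result reported by the Node

    def perform_task(self, task_id, payload, send_event, timeout=None):
        """Blocking call made by the Client (role of ClientService.PerformTask)."""
        done = threading.Event()
        with self._lock:
            self._done[task_id] = done
        try:
            send_event(task_id, payload)   # non-blocking event to the Node
            if not done.wait(timeout):     # timeout=None means wait forever
                raise TimeoutError(f"no TaskEnded received for task {task_id}")
            with self._lock:
                return self._results.pop(task_id)
        finally:
            # Forget the task whether it succeeded or timed out.
            with self._lock:
                self._done.pop(task_id, None)
                self._results.pop(task_id, None)

    def task_ended(self, task_id, result):
        """Called by the Node (role of NodeService.TaskEnded)."""
        with self._lock:
            done = self._done.get(task_id)
            if done is None:
                return False  # nobody is waiting: unknown task or already timed out
            self._results[task_id] = result
        done.set()
        return True


def make_node(server, work):
    """Return an event handler that runs `work` on a thread and reports back."""
    def on_event(task_id, payload):
        def run():
            server.task_ended(task_id, work(payload))
        threading.Thread(target=run, daemon=True).start()
    return on_event
```

With `timeout=None` the wait in `perform_task` never ends if the event or the TaskEnded call is lost, which is exactly the behaviour I am trying to get rid of.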
This setup with events between Server and Node is made so that there is only the need to open a firewall port on the server, not on any of the potentially numerous nodes.
The tasks themselves can be quite lengthy but this works just fine on reliable networks.
However, this all goes down the drain when the network disappears between any parts of this setup.
For instance, if the network is down when the Node notifies the Server it has finished working, the Server will never receive its notification.
Conversely, when the network goes down just before sending an event to the Node, I don’t get a notification and the Server waits indefinitely.
And finally, when the network goes down in the middle of a ClientService.PerformTask call, the Client gets an EROTimeout exception and has lost all its work, even if the task itself is still running on a node and would succeed a few seconds later.
So, what I’d like to do is to have a way to make this whole setup more reliable when faced with erratic network behavior.
For the Node to Server replies, I could use the Retry parameter in the OnException handler, but that requires being able to detect the situation.
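As a rough illustration of what I mean by retrying the Node to Server reply, here is a generic sketch in Python. It does not use the Retry parameter or any real library API: `notify_with_retry` and `send` are hypothetical, `send` stands in for the NodeService.TaskEnded call, and I assume a network failure surfaces as an `OSError` (which covers Python's `ConnectionError` and `TimeoutError`).

```python
import time


def notify_with_retry(send, task_id, result,
                      attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send(task_id, result) until it succeeds or attempts run out.

    `send` is a hypothetical stand-in for the NodeService.TaskEnded call and
    is assumed to raise OSError (or a subclass) when the network is down.
    """
    for attempt in range(attempts):
        try:
            return send(task_id, result)
        except OSError:
            if attempt == attempts - 1:
                raise  # give up and let the caller decide what to do
            sleep(base_delay * 2 ** attempt)  # exponential backoff between tries
```

One caveat with any scheme like this: if the request reached the Server but the reply was lost, the same notification arrives twice, so the Server would presumably have to tolerate duplicate TaskEnded calls for the same task.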
For the Server to Node requests, I don’t see how I can have the event being sent again if it never reached its destination.
For the Client to Server connections, I believe I could use the Async interfaces, but the documentation does not tell me how they behave when the network goes down between the Invoke_ and Receive_ calls.
I would appreciate any suggestions that would allow me to make this whole setup more robust.