I’m using the SDK to transfer large content from the client to the server and am facing performance issues: the gigabit network link does not get saturated.
To illustrate, I have created the following example:
As you will see, it can create a 1 GB source file and transfer it to the remote server. For comparison, the sample also uses the CopyFile API to transfer the same file to a location of your choosing. For a fair test, that location should be a shared folder on the same machine where the server part is running.
With that setup, I get the following results:
The first line reads like this:
RO for the RemObjects transfer
71 for the number of calls to PutNextChunk
False for the value of UseCompression on both sides
15242800 for the chunk size
Those last two values are the ones I found to be ideal on my network. Indeed, if I leave compression activated, the time it takes to transfer a file is multiplied by 5!
If I use a higher or lower chunk size, performance decreases as well.
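For context, the client loop behind those figures could be sketched along these lines. This is a minimal sketch, not the sample’s actual code: only PutNextChunk and the Binary type are named in the thread, so the service interface name, how the proxy is obtained, and the ownership of the Binary parameter are all assumptions that depend on the SDK version.

```pascal
uses Classes, Math;

const
  CHUNK_SIZE = 15242800; // the chunk size that proved fastest on this network

// Hypothetical upload loop; "Service" stands for the generated proxy exposing
// a PutNextChunk(const Data: Binary) method, as referenced in the thread.
procedure UploadFile(const Service: IROPerformanceService; const FileName: string);
var
  Source: TFileStream;
  Chunk: Binary; // RO SDK Binary (a TMemoryStream descendant)
  BytesLeft: Int64;
begin
  Source := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    BytesLeft := Source.Size;
    while BytesLeft > 0 do
    begin
      Chunk := Binary.Create;
      try
        // copy the next slice of the file into the chunk buffer
        Chunk.CopyFrom(Source, Min(CHUNK_SIZE, BytesLeft));
        Dec(BytesLeft, Chunk.Size);
        Service.PutNextChunk(Chunk); // one remote call per chunk
      finally
        Chunk.Free; // ownership rules may differ depending on SDK settings
      end;
    end;
  finally
    Source.Free;
  end;
end;
```

With a 1 GB (2^30 bytes) file and a 15242800-byte chunk, such a loop makes 71 calls, consistent with the number reported above.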
The transfer via CopyFile is done to a shared folder; accessing it directly via a UNC path or via a mounted drive has little impact on performance.
There’s a 32% difference in performance between the two methods, and while I don’t expect to reach the speed of CopyFile, I would be glad if you could give me hints as to what I could try to get the best performance.
Yes, that’s because it’s done on the same machine and so you are hitting the hard drive limits.
I have changed the example so that the file is loaded into RAM before being sent, and similarly on the server, so that only the network transfer speed is measured.
Please use this one: RO Performance.zip (113.0 KB)
I also changed the factory to TROPerClientClassFactory, but it does not help much.
Just for reference, I added a button called “No Send” that does the same work as the RO path except that it never sends the data down the line. Basically, it measures the speed at which the data can be read from RAM.
I don’t know if I can do anything more, because I have to find a trade-off between speed and memory used by the server. But if there are protocol options that can be changed, then I’d be happy to know about them.
client-side:
data is loaded into binary (TClientForm.btnROClick)
method’s parameters (i.e. binary) are written to BinMessage (TROPerformanceService_Proxy.PutNextChunk)
BinMessage is written to a stream, i.e. the server call request is generated (TROTransportChannel.Dispatch)
SuperTCP request is generated from the above stream (TROBaseSuperChannelWorker.IntSendData)
Synapse’s black box: data is transferred
server-side:
request is read from the SuperTCP request (TROBaseSuperChannelWorker.ReadStream)
BinMessage is read from request (MainProcessMessage)
method’s parameters (i.e. binary) are read from BinMessage (TROPerformanceService_Invoker.Invoke_PutNextChunk)
data is written from binary to file (TROPerformanceService.PutNextChunk)
As you can see, the content of binary is copied several times before it is suitable for passing to the server and can be read from the client’s request.
That is why direct copying gives much better results.
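As a concrete endpoint for the last step of the pipeline above, the server-side handler might look like this minimal sketch; fDestFile is a hypothetical TFileStream field opened when the transfer starts, not something named in the thread.

```pascal
// Hypothetical server-side implementation of PutNextChunk matching the
// pipeline above: the received binary is appended to the destination file.
procedure TROPerformanceService.PutNextChunk(const Data: Binary);
begin
  Data.Position := 0;
  // fDestFile is an assumed TFileStream field created when the upload begins
  fDestFile.CopyFrom(Data, Data.Size);
end;
```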
Yes, I understand all that, but I’m wondering how I can get the best out of this architecture. I know I won’t reach the performance of CopyFile, but if I can approach it by tweaking a few parameters, it’s worth a shot.
You can test, say, a plain Indy/Synapse HTTP/TCP server without RO. For this specific case, i.e. uploading a large file, it can give better performance because some steps that are present in the RO architecture will be omitted.
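Such a baseline could be sketched with a bare Synapse socket as follows; the host, port, and the absence of any framing protocol are assumptions for the sketch, and error handling is reduced to a single LastError check.

```pascal
uses blcksock, Classes, SysUtils;

// Minimal sketch: send a file over a raw Synapse TCP socket, without RO's
// message framing, to measure the transport ceiling for comparison.
procedure SendFileRaw(const FileName, Host, Port: string);
var
  Sock: TTCPBlockSocket;
  Source: TFileStream;
begin
  Sock := TTCPBlockSocket.Create;
  Source := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    Sock.Connect(Host, Port);
    if Sock.LastError = 0 then
      Sock.SendStreamRaw(Source); // streams the file with no intermediate messages
  finally
    Source.Free;
    Sock.Free;
  end;
end;
```

The receiving side would need a matching raw reader; the point is only to see how much of the gap to CopyFile is due to the extra copies rather than to the network itself.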
Yes, that’s something I must think about, but it means rewriting other parts of the application and opening a second port on the server, which is not always easy on the client side.
Out of curiosity, would there be a way to “hijack” the existing Synapse socket in such a way that it could serve both modes?