Large file transfer performance

Hello,

I’m using the SDK to transfer large content from the client to the server and am facing performance issues: the gigabit network link does not get saturated.
To illustrate, I have created the following example:

RO Performance.zip (102.0 KB)

As you will see, it can create a 1 GB source file and transfer it to the remote server. For comparison, the sample also uses the CopyFile API to transfer the same file to a location of your choosing. For a fair test, the location should be a shared folder on the same machine where the server part is running.
With that setup, I get the following results:

RO - 71 - False - 15242800: 00:00:14.2603242 - 73 531,01 KB/s
CopyFile: 00:00:09.6710517 - 108 424,20 KB/s

The first line reads as follows:
  • RO for the RemObjects transfer
  • 71 for the number of calls to PutNextChunk
  • False for the value of UseCompression on both sides
  • 15242800 for the chunk size

The last two values are the ones I found to be optimal on my network. Indeed, if I leave compression enabled, the transfer takes about five times as long!
A higher or lower chunk size also decreases performance.
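
For reference, the upload loop in the sample looks roughly like this. It is a simplified sketch: the IROPerformanceService interface, the SendFile helper and the single Binary parameter of PutNextChunk follow my test project, and freeing the chunk here assumes the caller keeps ownership of input Binary parameters (adjust to your SDK version's memory-management rules).

const
  ChunkSize = 15242800; // the chunk size that gave the best throughput here

procedure TClientForm.SendFile(const aFileName: string;
  const aService: IROPerformanceService);
var
  lSource: TFileStream;
  lChunk: Binary; // RO's Binary type (TROBinaryMemoryStream)
  lBuffer: array of Byte;
  lRead: Integer;
begin
  SetLength(lBuffer, ChunkSize);
  lSource := TFileStream.Create(aFileName, fmOpenRead or fmShareDenyWrite);
  try
    repeat
      // read the next slice of the source file into the local buffer
      lRead := lSource.Read(lBuffer[0], ChunkSize);
      if lRead > 0 then
      begin
        lChunk := Binary.Create;
        try
          lChunk.WriteBuffer(lBuffer[0], lRead);
          lChunk.Position := 0;
          // one remote call per chunk; 71 calls for the 1 GB test file
          aService.PutNextChunk(lChunk);
        finally
          lChunk.Free; // assumes the proxy does not take ownership
        end;
      end;
    until lRead < ChunkSize;
  finally
    lSource.Free;
  end;
end;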

The transfer via CopyFile is done to a shared folder, and whether I address it via a UNC path or a mapped drive has little impact on performance.

There’s a 32% difference in performance between the two methods, and while I don’t expect to reach the speed of CopyFile, I would be glad if you could give me hints as to what I could try to get the best performance.

Hi,

try using a buffer size of 131072. it shows the best results for me.

also you can change the server-side to use the TROPerClientClassFactory class factory - it can give a boost of up to 1-2 seconds.
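
for example, in the _Impl unit you can replace the generated registration with the per-client factory. just a sketch: the service name, creator procedure and invoker class follow the generated unit of your sample, and the exact unit name and constructor signature can differ between SDK versions.

uses
  uROClassFactories; // where the class factories are declared (check your SDK version)

initialization
  // the generated registration usually looks like:
  //   TROClassFactory.Create('ROPerformanceService',
  //     Create_ROPerformanceService, TROPerformanceService_Invoker);
  // the per-client factory keeps one service instance per connected client
  // instead of creating and destroying it on every call:
  TROPerClientClassFactory.Create('ROPerformanceService',
    Create_ROPerformanceService, TROPerformanceService_Invoker);
end.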

Note: I’ve run the client and server on one PC.

Yes, that’s because it’s done on the same machine and so you are hitting the hard drive limits.

I have changed the example so that the file is loaded into RAM before being sent, and similarly on the server, so that only the network transfer speed is measured.
Please use this one: RO Performance.zip (113.0 KB)

I also changed the factory to TROPerClientClassFactory, but it does not help much.

Just for reference, I added a button called “No Send” that does the same work as the RO path except that it never sends the data down the line. Basically, it measures the speed at which the data can be read out of RAM, roughly like in the sketch below.
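
Simplified sketch of that handler (fSource is the in-memory test data; ChunkSize is the constant from above; names follow my test project):

procedure TClientForm.btnNoSendClick(Sender: TObject);
var
  lStart: TDateTime;
  lChunk: Binary;
  lCount: Int64;
begin
  lStart := Now;
  fSource.Position := 0; // fSource: TMemoryStream holding the 1 GB test data
  repeat
    lCount := fSource.Size - fSource.Position;
    if lCount > ChunkSize then
      lCount := ChunkSize;
    lChunk := Binary.Create;
    try
      lChunk.CopyFrom(fSource, lCount);
      // no PutNextChunk call here: this only measures how fast the data
      // can be read out of RAM and packed into chunks
    finally
      lChunk.Free;
    end;
  until fSource.Position >= fSource.Size;
  Caption := Format('No Send: %s', [FormatDateTime('nn:ss.zzz', Now - lStart)]);
end;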

I’m not sure I can do much more, because I have to find a trade-off between speed and the memory used by the server. But if there are protocol options that could be changed, I’d be happy to hear about them.

how things work:

  • client-side:
    • data is loaded into binary (TClientForm.btnROClick)
    • method’s parameters (i.e. binary) are written to BinMessage (TROPerformanceService_Proxy.PutNextChunk)
    • BinMessage is written to stream, i.e. server call request is generated (TROTransportChannel.Dispatch)
    • SuperTCP request is generated, from above stream (TROBaseSuperChannelWorker.IntSendData)
  • Synapse’s black box - data is transferred
  • server-side:
    • request is read from SuperTCP request (TROBaseSuperChannelWorker.ReadStream)
    • BinMessage is read from request (MainProcessMessage)
    • method’s parameters (i.e. binary) are read from BinMessage (TROPerformanceService_Invoker.Invoke_PutNextChunk)
    • data is written from binary to file (TROPerformanceService.PutNextChunk)

as you can see, the content of the binary is copied several times before it can be passed to the server and read out of the client’s request.
this is the reason why direct copying gives much better results
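
for example, the last step in that chain is just a stream copy in the _Impl unit. sketch only: the parameter name follows your sample, and fTarget is assumed to be a TFileStream field the sample opens when the transfer starts.

procedure TROPerformanceService.PutNextChunk(const aChunk: Binary);
begin
  // by this point the same bytes were already copied from the socket buffer
  // into the SuperTCP stream, into the BinMessage and into this Binary;
  // the implementation only appends them to the destination file.
  aChunk.Position := 0;
  fTarget.CopyFrom(aChunk, aChunk.Size);
end;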

Yes, I understand all that, but I’m wondering how I can get the best out of this architecture. I know I won’t reach the performance of CopyFile, but if I can approach it by tweaking a few parameters, it’s worth a shot.

you can test, say, a plain Indy/Synapse HTTP/TCP server w/o RO. for this specific case, i.e. uploading a large file, it can give better performance because some steps that are present in the RO architecture will be omitted.
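
for example, a bare Synapse receiver for this upload-only case could look like this. just a sketch: single connection, no framing or error recovery, and the port, buffer size and target path are placeholders.

uses
  SysUtils, Classes, blcksock;

procedure ReceiveFileRaw(const aTargetFile: string);
var
  lListener, lClient: TTCPBlockSocket;
  lTarget: TFileStream;
  lBuffer: array[0..131071] of Byte; // 128 KB, the buffer size suggested above
  lRead: Integer;
begin
  lListener := TTCPBlockSocket.Create;
  try
    lListener.CreateSocket;
    lListener.Bind('0.0.0.0', '9999'); // placeholder port
    lListener.Listen;
    if not lListener.CanRead(60000) then
      Exit; // no incoming connection within the timeout
    lClient := TTCPBlockSocket.Create;
    try
      lClient.Socket := lListener.Accept;
      lTarget := TFileStream.Create(aTargetFile, fmCreate);
      try
        repeat
          // blocks until the buffer is full, the timeout expires
          // or the peer closes the connection
          lRead := lClient.RecvBufferEx(@lBuffer[0], SizeOf(lBuffer), 30000);
          if lRead > 0 then
            lTarget.WriteBuffer(lBuffer[0], lRead);
        until (lRead <= 0) or (lClient.LastError <> 0);
      finally
        lTarget.Free;
      end;
    finally
      lClient.Free;
    end;
  finally
    lListener.Free;
  end;
end;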

Yes, that’s something I must think about, but it means rewriting other parts of the application and opening a second port on the server, which is not always easy on the client side.

Out of curiosity, would there be a way to “hijack” the existing Synapse socket in such a way that it could serve both modes?

see the Using OnCustomResponseEvent in a ROSDK Server snippet.
there you can handle unknown requests