I recently got confronted with a difficult problem, in which I was unable to find what was the issue. Nevertheless it’s something most network engineers will be confronted with eventually. I can take no credit for the solution though, a colleague solved it (congrats, G).
The problem
The problem is as following:
- Client PC A, behind a WAN link, experiences a slow file transfer rate when uploading to server C. Speed: 1,88 MB per second (MBps).
- From A to server D, uploading is fast, at 21,43 MBps.
- Local PC B experiences great uploading speeds towards both servers: 26,08 MBps to C, 108,72 MBps to D.
- Downloading on PC A from both server C and server D goes reasonable fast as well: 7,06 MBps for both.
At first sight, this is just not logical: no single component can be isolated to be the cause of the problem: the WAN link works fine towards server D and server C works fine when approached locally. On top of that, it’s only uploading to server C that is slow, downloading is faster.
Let’s start simple and look for bottlenecks. Testing shows the WAN line is capable of 200 Mbps throughput. A continuous ping from PC A to both servers averages to 8,30 ms, with little jitter. A similar continuous ping from PC B to the servers gives 0,60 ms. Vice versa, from servers to clients, it’s the same: 0,60 locally, 8,30 over the WAN link. All internal links at both the remote site and the main site use gigabit.
So the biggest bottleneck is the 200 Mbps WAN line. 200 divided by 8 gives 25 MBps. It roughly explains the speed from PC A to server D, but that’s about it. So what is causing it? Taking captures of the transfer (PC A to server C) reveals nothing at first sight:
And that for gigabytes of transferred data. Never a lost packet, never an event, just the seemingly endless data-data-ack, data-data-ack,… But, let’s take a closer look at an interesting parameter:
TCP Window, negotiated at 17kB. Would this differ for the other transfers? Let’s look at the transfer from PC A to server D:
3100kB! And when looking at the downloads from the servers on PC A, these both show 64kB. And they both have the same throughput. Now, a pattern is starting to emerge.
The explanation
Indeed, TCP Windowing becomes an important factor over WAN lines. Windowing determines how many packets of data can be sent to the receiver before an acknowledgement is needed. The bigger the window, the more packets can be sent before a return packet is needed. And the sender will not start sending more data before that acknowledgement is received. In a very low latency environment, this is barely noticeable: with 0,60 ms round-trip-time (RTT) and a TCP window of 17 kB, the sender needs to wait 0,60 ms every few packets (two in this case). Over the WAN link, it will need to wait 8,30 ms every two packets for the return packet. At a TCP window of around 3100kB, dozens of packets can be sent before a return packet is needed. Take for example 100 packets for each ACK, that would be 8,30 ms extra time for the single return packet, while an ACK every two packets over the WAN link for 100 packets would mean 415 ms of extra time, waiting for the ACK packets.
This can be put into a formula: TCP Window in bits divided by the round-trip-time in seconds. Source: http://en.wikipedia.org/wiki/TCP_tuning
This gives the following results:
This comes very close to the test results! TCP Window and latency are used to calculate throughput. The Max throughput is the slowest of both Throughput and link speed (you can’t transfer files faster than 200 Mbps over the WAN link). Data throughput is the Max throughput minus 10 percent: this is to take into account the overhead from the headers (frame header, IP header, TCP header). The actual payload throughput is then converted to MBps (divided by eight) to give the final result.
The solution
The solution to this particular situation is checking for the TCP Window size parameters on the server C. With any luck these can be changed. Server C also didn’t rewindow (sending an ACK with a new TCP Window value) during the transfer, so it never became faster. Server D, while starting with a rather low window size, did rewindow, scaling up to 3100kB eventually. Of course, packet loss wasn’t taken into account int this scenario. Should there have been packetloss, server D probably wouldn’t have rewindowed up to 3100kB and might even have had to choose lower values than the one it started with.
TCP Windowing issues can be a lot of things, among which the following:
- Settings on the receiving host TCP/IP stack, depending on the operating system: both starting window and the ability to rewindow.
- Host parameters such as CPU and memory usage, disk I/O,… If the receiving host cannot keep up with the rate at which data is received, it will not rewindow to a higher value. After all, the received data has to be stored in RAM, processed, and written to disk. If any of these resources is not available, throughput will be limited.
- Packet loss on the connection leads to missing data for which no ACK is sent. Throughput will not increase, or even go down as packet loss increases.
This concludes troubleshooting for now. The upcoming series of blog posts will be more theoretical and cover the OSI layer in greater detail. Stay tuned!