Excessive TCP Dup ACK and TCP Retransmissions
I'm doing an SFTP transfer between two servers about 70 ms RTT apart and seeing excessive TCP Dup ACKs and TCP Retransmissions. The circuit is 50 Mbit/s, but the transfer runs at 500 kbit/s or less. What could be causing this?
Comments
Could you please enable SACK on both endpoints and capture again? Without the SACK option, loss recovery is very inefficient.
Where was the "sender" capture taken? Its IP TTL of 60 is a bit unusual. Is the capture point several hops away from the endpoint, or does the sender just use a non-standard initial TTL?
The sender is an AIX server, which is why the TTL is unusual: it starts at 60. The receiver is a Linux server with SACK already enabled. I will check the setting on the AIX sender.
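For the Linux side, a minimal sketch of how to confirm the SACK setting, assuming the standard procfs location `/proc/sys/net/ipv4/tcp_sack` (the helper name `linux_sack_enabled` is made up for illustration). On AIX the corresponding setting is, to my knowledge, the `sack` tunable listed by `no -a`; please verify that against the AIX documentation for your release.

```python
from pathlib import Path
from typing import Optional


def linux_sack_enabled(path: str = "/proc/sys/net/ipv4/tcp_sack") -> Optional[bool]:
    """Return True/False for the SACK sysctl, or None if it cannot be read."""
    try:
        value = Path(path).read_text().strip()
    except OSError:
        # Not Linux, or procfs is not mounted: we simply don't know.
        return None
    return value == "1"


if __name__ == "__main__":
    state = linux_sack_enabled()
    if state is None:
        print("tcp_sack sysctl not readable on this host")
    else:
        print("SACK enabled" if state else "SACK disabled")
```

Note that the sysctl only says what the host is willing to negotiate. SACK is used on a connection only if both SYN packets carry the SACK Permitted option.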
Based on what you're seeing so far, what do you think is the most likely cause?
I need to take a closer look, but it looks like micro-bursting: the sender transmits at its 1 Gbit/s interface speed and hits a buffer or policer hard enough to cause bulk packet loss. At the same time, recovery is extremely slow because SACK is absent.
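To put rough numbers on the micro-burst idea, here is a back-of-the-envelope sketch. The 64 KiB burst size is purely an assumption for illustration (it is not taken from the capture), and the model treats the bottleneck as a simple queue that drains at the circuit rate while the burst arrives at line rate.

```python
def peak_queue_bytes(burst_bytes: float, in_rate_bps: float, out_rate_bps: float) -> float:
    """Peak backlog when a burst arrives at in_rate and drains at out_rate."""
    if out_rate_bps >= in_rate_bps:
        return 0.0  # the link drains at least as fast as the burst arrives
    # While the burst is arriving, only out/in of it can be forwarded.
    return burst_bytes * (1.0 - out_rate_bps / in_rate_bps)


def burst_duration_ms(burst_bytes: float, in_rate_bps: float) -> float:
    """Time the sender needs to put the whole burst on the wire."""
    return burst_bytes * 8 / in_rate_bps * 1000.0


if __name__ == "__main__":
    burst = 64 * 1024   # hypothetical burst size in bytes
    line_rate = 1e9     # 1 Gbit/s sender interface
    circuit = 50e6      # 50 Mbit/s circuit
    print(f"burst lasts {burst_duration_ms(burst, line_rate):.3f} ms")
    print(f"peak queue  {peak_queue_bytes(burst, line_rate, circuit):.0f} bytes")
```

With a 20:1 speed mismatch, about 95% of any burst has to be buffered at the bottleneck. If the device there has less buffer than that, or polices instead of queueing, the tail of the burst is dropped in one go, which would match the bulk loss pattern.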
The 3-way handshake in the sender capture tells us that the AIX sender doesn't support SACK. SACK may not help in this case, but as @Packet_vlad suggests, it is more efficient in general, so you should enable it if you can.
The MSS=1380 in the server's SYN-ACK is a strong clue that there's a Cisco ASA firewall in the path.
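Both observations come from the TCP options in the SYN and SYN-ACK. As an illustration of what is being read there, here is a small sketch that decodes a raw TCP options field using the standard option kinds (2 = MSS, 3 = window scale, 4 = SACK Permitted). The function name and the sample byte strings are made up for the example and are not extracted from the capture.

```python
import struct


def parse_tcp_options(raw: bytes) -> dict:
    """Decode the MSS, window scale and SACK Permitted options from a SYN."""
    opts = {"sack_permitted": False}
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:      # End of Option List
            break
        if kind == 1:      # No-Operation (single padding byte)
            i += 1
            continue
        if i + 1 >= len(raw):
            break          # truncated option, stop parsing
        length = raw[i + 1]
        if length < 2 or i + length > len(raw):
            break          # malformed length, stop parsing
        data = raw[i + 2:i + length]
        if kind == 2 and length == 4:
            opts["mss"] = struct.unpack("!H", data)[0]
        elif kind == 3 and length == 3:
            opts["wscale"] = data[0]
        elif kind == 4 and length == 2:
            opts["sack_permitted"] = True
        i += length
    return opts


if __name__ == "__main__":
    # Hypothetical SYN options: MSS 1380 only, no SACK Permitted.
    print(parse_tcp_options(bytes.fromhex("02040564")))
    # Hypothetical SYN options: MSS 1380, two NOPs, SACK Permitted.
    print(parse_tcp_options(bytes.fromhex("0204056401010402")))
```

In Wireshark the same check is simply a matter of expanding the TCP options of the SYN and SYN-ACK and looking for "SACK permitted".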
The minimum RTT of 68.1 ms means that the client and server are relatively far apart.
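The RTT also lets us relate throughput to the amount of data in flight, since a single TCP connection moves roughly one window per round trip. A quick sketch using the figures from this thread (50 Mbit/s circuit, 68.1 ms RTT, about 500 kbit/s observed, MSS 1380). The function name is made up, and the model ignores header overhead.

```python
def window_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to sustain rate_bps over a path with rtt_s."""
    return rate_bps * rtt_s / 8


if __name__ == "__main__":
    rtt = 0.0681   # minimum RTT from the capture, in seconds
    mss = 1380     # MSS seen in the SYN-ACK

    needed = window_bytes(50e6, rtt)     # to fill the 50 Mbit/s circuit
    observed = window_bytes(500e3, rtt)  # implied by the ~500 kbit/s transfer

    print(f"window needed to fill the circuit: {needed / 1000:.0f} kB")
    print(f"window implied by observed rate:   {observed:.0f} bytes "
          f"(~{observed / mss:.1f} segments)")
```

Filling the circuit needs roughly 425 kB in flight, while 500 kbit/s corresponds to only about three full-size segments per round trip. That is what you would expect if the congestion window keeps collapsing after repeated loss.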
Thanks for this very interesting capture. There are a couple of elements to the problem.