Demonstrations of tcpretrans, the Linux eBPF/bcc version.


This tool traces the kernel TCP retransmit function to show details of these
retransmits. For example:

# ./tcpretrans
TIME     PID    IP LADDR:LPORT          T> RADDR:RPORT          STATE
01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED
01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED
01:55:17 0      4  10.153.223.157:22    R> 69.53.245.40:22957   ESTABLISHED
[...]

This output shows three TCP retransmits. The first two were for an IPv4
connection from 10.153.223.157 port 22 to 69.53.245.40 port 34619. The TCP
state was "ESTABLISHED" at the time of the retransmit. The on-CPU PID at the
time of the retransmit is printed, in this case 0 (the kernel, which will
be the case most of the time).
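
For post-processing, the columns described above can be split into fields.
The following is a minimal Python sketch and is not part of tcpretrans: the
names LINE_RE, RetransEvent and parse_line are hypothetical, and the pattern
assumes the column layout shown in the sample output.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical helper, not part of tcpretrans. The pattern mirrors the
# columns in the sample output: TIME PID IP LADDR:LPORT T> RADDR:RPORT STATE.
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<pid>\d+)\s+"
    r"(?P<ip>[46])\s+"
    r"(?P<laddr>\S+):(?P<lport>\d+)\s+"
    r"(?P<type>[RL])>\s+"
    r"(?P<raddr>\S+):(?P<rport>\d+)\s+"
    r"(?P<state>\S+)\s*$"
)


class RetransEvent(NamedTuple):
    time: str
    pid: int
    ip_version: int
    laddr: str
    lport: int
    kind: str        # marker from the T> column, e.g. "R"
    raddr: str
    rport: int
    state: str


def parse_line(line: str) -> Optional[RetransEvent]:
    """Parse one data row; return None for headers, "[...]" and other text."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return RetransEvent(
        time=m["time"],
        pid=int(m["pid"]),
        ip_version=int(m["ip"]),
        laddr=m["laddr"],
        lport=int(m["lport"]),
        kind=m["type"],
        raddr=m["raddr"],
        rport=int(m["rport"]),
        state=m["state"],
    )
```

Lines that do not look like data rows, such as the column header, yield None,
so the function can be applied to the raw output line by line.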

Retransmits are usually a sign of poor network health, and this tool is
useful for investigating them. Unlike tcpdump, this tool has very low
overhead, because it traces only the retransmit function. It also prints
additional kernel details: the state of the TCP session at the time of the
retransmit.


The -l option includes TCP tail loss probe (TLP) attempts:

# ./tcpretrans -l
TIME     PID    IP LADDR:LPORT          T> RADDR:RPORT          STATE
01:55:45 0      4  10.153.223.157:22    R> 69.53.245.40:51601   ESTABLISHED
01:55:46 0      4  10.153.223.157:22    R> 69.53.245.40:51601   ESTABLISHED
01:55:46 0      4  10.153.223.157:22    R> 69.53.245.40:51601   ESTABLISHED
01:55:53 0      4  10.153.223.157:22    L> 69.53.245.40:46444   ESTABLISHED
01:56:06 0      4  10.153.223.157:22    R> 69.53.245.40:46444   ESTABLISHED
01:56:06 0      4  10.153.223.157:22    R> 69.53.245.40:46444   ESTABLISHED
01:56:08 0      4  10.153.223.157:22    R> 69.53.245.40:46444   ESTABLISHED
01:56:08 0      4  10.153.223.157:22    R> 69.53.245.40:46444   ESTABLISHED
01:56:08 1938   4  10.153.223.157:22    R> 69.53.245.40:46444   ESTABLISHED
01:56:08 0      4  10.153.223.157:22    R> 69.53.245.40:46444   ESTABLISHED
01:56:08 0      4  10.153.223.157:22    R> 69.53.245.40:46444   ESTABLISHED
[...]

Note the "L>" in the "T>" column, which marks a tail loss probe. These are
attempts: the kernel probably sent a TLP, but in some cases it might not
ultimately have been sent.
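
To see how many events in a capture were TLP attempts rather than retransmits,
the "T>" column can be tallied. This is a small Python sketch under the
assumption that data rows have the seven whitespace-separated fields shown
above; count_by_type is a hypothetical name, not something tcpretrans provides.

```python
from collections import Counter


def count_by_type(lines):
    """Tally data rows by the marker in the T> column ("R" or "L")."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Data rows have 7 fields; the 5th is "R>" or "L>". The header row
        # has "T>" in that position and is skipped, as is "[...]".
        if len(fields) == 7 and fields[4] in ("R>", "L>"):
            counts[fields[4][0]] += 1
    return counts


sample = [
    "TIME     PID    IP LADDR:LPORT          T> RADDR:RPORT          STATE",
    "01:55:46 0      4  10.153.223.157:22    R> 69.53.245.40:51601   ESTABLISHED",
    "01:55:53 0      4  10.153.223.157:22    L> 69.53.245.40:46444   ESTABLISHED",
    "01:56:06 0      4  10.153.223.157:22    R> 69.53.245.40:46444   ESTABLISHED",
    "[...]",
]
by_type = count_by_type(sample)
```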

To spot heavily retransmitting flows quickly, use the -c option. It counts
retransmits per flow.

# ./tcpretrans.py -c
Tracing retransmits ... Hit Ctrl-C to end
^C
LADDR:LPORT              RADDR:RPORT             RETRANSMITS
192.168.10.50:60366  <-> 172.217.21.194:443         700
192.168.10.50:666    <-> 172.213.11.195:443         345
192.168.10.50:366    <-> 172.212.22.194:443         211
[...]

This can help to quickly isolate congested or otherwise faulty network paths
that are limiting TCP performance.
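
As a rough illustration of what per-flow counting amounts to, the Python
sketch below aggregates saved per-event output by (local, remote) endpoint
pair and ranks the flows. It is not how tcpretrans implements -c, and
count_per_flow is a hypothetical name. It assumes the seven-column event
format shown earlier and counts only "R>" rows.

```python
from collections import Counter


def count_per_flow(lines):
    """Count retransmit rows per (LADDR:LPORT, RADDR:RPORT) pair."""
    flows = Counter()
    for line in lines:
        fields = line.split()
        # Only rows marked "R>" are retransmits; headers and TLP rows are skipped.
        if len(fields) == 7 and fields[4] == "R>":
            flows[(fields[3], fields[5])] += 1
    return flows


events = [
    "01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED",
    "01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED",
    "01:55:17 0      4  10.153.223.157:22    R> 69.53.245.40:22957   ESTABLISHED",
]
flows = count_per_flow(events)

# Print the busiest flows first, loosely following the -c layout.
for (local, remote), n in flows.most_common():
    print(f"{local:<20} <-> {remote:<22} {n:>6}")
```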

USAGE message:

# ./tcpretrans -h
usage: tcpretrans [-h] [-l] [-c]

Trace TCP retransmits

optional arguments:
  -h, --help       show this help message and exit
  -l, --lossprobe  include tail loss probe attempts
  -c, --count      count occurred retransmits per flow

examples:
    ./tcpretrans           # trace TCP retransmits
    ./tcpretrans -l        # include TLP attempts