1Demonstrations of zfsdist, the Linux eBPF/bcc version.
2
3
4zfsdist traces ZFS reads, writes, opens, and fsyncs, and summarizes their
5latency as a power-of-2 histogram. It has been written to work on ZFS on Linux
6(http://zfsonlinux.org). For example:
7
8# ./zfsdist
9Tracing ZFS operation latency... Hit Ctrl-C to end.
10^C
11
12operation = 'read'
13     usecs               : count     distribution
14         0 -> 1          : 0        |                                        |
15         2 -> 3          : 0        |                                        |
16         4 -> 7          : 4479     |****************************************|
17         8 -> 15         : 1028     |*********                               |
18        16 -> 31         : 14       |                                        |
19        32 -> 63         : 1        |                                        |
20        64 -> 127        : 2        |                                        |
21       128 -> 255        : 6        |                                        |
22       256 -> 511        : 1        |                                        |
23       512 -> 1023       : 1256     |***********                             |
24      1024 -> 2047       : 9        |                                        |
25      2048 -> 4095       : 1        |                                        |
26      4096 -> 8191       : 2        |                                        |
27
28operation = 'write'
29     usecs               : count     distribution
30         0 -> 1          : 0        |                                        |
31         2 -> 3          : 0        |                                        |
32         4 -> 7          : 0        |                                        |
33         8 -> 15         : 0        |                                        |
34        16 -> 31         : 0        |                                        |
35        32 -> 63         : 0        |                                        |
36        64 -> 127        : 0        |                                        |
37       128 -> 255        : 75       |****************************************|
38       256 -> 511        : 11       |*****                                   |
39       512 -> 1023       : 0        |                                        |
40      1024 -> 2047       : 0        |                                        |
41      2048 -> 4095       : 0        |                                        |
42      4096 -> 8191       : 0        |                                        |
43      8192 -> 16383      : 0        |                                        |
44     16384 -> 32767      : 0        |                                        |
45     32768 -> 65535      : 0        |                                        |
46     65536 -> 131071     : 13       |******                                  |
47    131072 -> 262143     : 1        |                                        |
48
49operation = 'open'
50     usecs               : count     distribution
51         0 -> 1          : 0        |                                        |
52         2 -> 3          : 2        |****************************************|
53
54This output shows a bimodal distribution for read latency, with a faster
55mode of around 5 thousand reads that took between 4 and 15 microseconds, and a
56slower mode of 1256 reads that took between 512 and 1023 microseconds. It's
57likely that the faster mode was a hit from the in-memory file system cache,
58and the slower mode is a read from a storage device (disk).
59
60The write latency is also bimodal, with a faster mode between 128 and 511 us,
61and the slower mode between 65 and 131 ms.
62
63This "latency" is measured from when the operation was issued from the VFS
64interface to the file system (via the ZFS POSIX layer), to when it completed.
65This spans everything: block device I/O (disk I/O), file system CPU cycles,
66file system locks, run queue latency, etc. This is a better measure of the
67latency suffered by applications reading from the file system than measuring
68this down at the block device interface.
69
70Note that this only traces the common file system operations previously
71listed: other file system operations (eg, inode operations including
72getattr()) are not traced.
73
74
75An optional interval and a count can be provided, as well as -m to show the
76distributions in milliseconds. For example:
77
78# ./zfsdist 1 5
79Tracing ZFS operation latency... Hit Ctrl-C to end.
80
8106:55:41:
82
83operation = 'read'
84     usecs               : count     distribution
85         0 -> 1          : 0        |                                        |
86         2 -> 3          : 0        |                                        |
87         4 -> 7          : 3976     |****************************************|
88         8 -> 15         : 1181     |***********                             |
89        16 -> 31         : 18       |                                        |
90        32 -> 63         : 4        |                                        |
91        64 -> 127        : 17       |                                        |
92       128 -> 255        : 16       |                                        |
93       256 -> 511        : 0        |                                        |
94       512 -> 1023       : 1275     |************                            |
95      1024 -> 2047       : 36       |                                        |
96      2048 -> 4095       : 3        |                                        |
97      4096 -> 8191       : 0        |                                        |
98      8192 -> 16383      : 1        |                                        |
99     16384 -> 32767      : 1        |                                        |
100
10106:55:42:
102
103operation = 'read'
104     usecs               : count     distribution
105         0 -> 1          : 0        |                                        |
106         2 -> 3          : 0        |                                        |
107         4 -> 7          : 12751    |****************************************|
108         8 -> 15         : 1190     |***                                     |
109        16 -> 31         : 38       |                                        |
110        32 -> 63         : 7        |                                        |
111        64 -> 127        : 85       |                                        |
112       128 -> 255        : 47       |                                        |
113       256 -> 511        : 0        |                                        |
114       512 -> 1023       : 1010     |***                                     |
115      1024 -> 2047       : 49       |                                        |
116      2048 -> 4095       : 12       |                                        |
117
11806:55:43:
119
120operation = 'read'
121     usecs               : count     distribution
122         0 -> 1          : 0        |                                        |
123         2 -> 3          : 0        |                                        |
124         4 -> 7          : 80925    |****************************************|
125         8 -> 15         : 1645     |                                        |
126        16 -> 31         : 251      |                                        |
127        32 -> 63         : 24       |                                        |
128        64 -> 127        : 16       |                                        |
129       128 -> 255        : 12       |                                        |
130       256 -> 511        : 0        |                                        |
131       512 -> 1023       : 80       |                                        |
132      1024 -> 2047       : 1        |                                        |
133
13406:55:44:
135
136operation = 'read'
137     usecs               : count     distribution
138         0 -> 1          : 0        |                                        |
139         2 -> 3          : 0        |                                        |
140         4 -> 7          : 81207    |****************************************|
141         8 -> 15         : 2075     |*                                       |
142        16 -> 31         : 2005     |                                        |
143        32 -> 63         : 177      |                                        |
144        64 -> 127        : 3        |                                        |
145
14606:55:45:
147
148operation = 'read'
149     usecs               : count     distribution
150         0 -> 1          : 0        |                                        |
151         2 -> 3          : 0        |                                        |
152         4 -> 7          : 74364    |****************************************|
153         8 -> 15         : 865      |                                        |
154        16 -> 31         : 4960     |**                                      |
155        32 -> 63         : 625      |                                        |
156        64 -> 127        : 2        |                                        |
157
158This workload was randomly reading from a file that became cached. The slower
159mode can be seen to disappear by the final summaries.
160
161
162USAGE message:
163
164# ./zfsdist -h
165usage: zfsdist [-h] [-T] [-m] [-p PID] [interval] [count]
166
167Summarize ZFS operation latency
168
169positional arguments:
170  interval            output interval, in seconds
171  count               number of outputs
172
173optional arguments:
174  -h, --help          show this help message and exit
175  -T, --notimestamp   don't include timestamp on interval output
176  -m, --milliseconds  output in milliseconds
177  -p PID, --pid PID   trace this PID only
178
179examples:
180    ./zfsdist            # show operation latency as a histogram
181    ./zfsdist -p 181     # trace PID 181 only
182    ./zfsdist 1 10       # print 1 second summaries, 10 times
183    ./zfsdist -m 5       # 5s summaries, milliseconds
184