1Demonstrations of fileslower, the Linux eBPF/bcc version. 2 3 4fileslower shows file-based synchronous reads and writes slower than a 5threshold. For example: 6 7# ./fileslower 8Tracing sync read/writes slower than 10 ms 9TIME(s) COMM PID D BYTES LAT(ms) FILENAME 100.000 randread.pl 4762 R 8192 12.70 data1 118.850 randread.pl 4762 R 8192 11.26 data1 1212.852 randread.pl 4762 R 8192 10.43 data1 13 14This showed a few reads from a "randread.pl" program, each 8 Kbytes in size, 15and from a "data1" file. These all had over 10 ms latency. 16 17This "latency" is measured from when the read or write was issued at the VFS 18interface, to when it completed. This spans everything: block device I/O (disk 19I/O), file system CPU cycles, file system locks, run queue latency, etc. This 20is a better measure of the latency suffered by applications reading from the 21file system than measuring this down at the block device interface. 22 23Note that this only traces file reads and writes: other file system operations 24(eg, directory operations, open(), fflush()) are not traced. 25 26 27The threshold can be provided as an argument. Eg, I/O slower than 1 ms: 28 29# ./fileslower 1 30Tracing sync read/writes slower than 1 ms 31TIME(s) COMM PID D BYTES LAT(ms) FILENAME 320.000 randread.pl 6925 R 8192 1.06 data1 330.082 randread.pl 6925 R 8192 2.42 data1 340.116 randread.pl 6925 R 8192 1.78 data1 350.153 randread.pl 6925 R 8192 2.31 data1 360.330 randread.pl 6925 R 8192 1.14 data1 370.345 randread.pl 6925 R 8192 1.52 data1 380.359 randread.pl 6925 R 8192 1.04 data1 390.532 randread.pl 6925 R 8192 2.56 data1 400.609 supervise 1892 W 18 3.65 status.new 410.610 randread.pl 6925 R 8192 1.37 data1 420.614 randread.pl 6925 R 8192 3.04 data1 430.729 randread.pl 6925 R 8192 2.90 data1 440.755 randread.pl 6925 R 8192 1.12 data1 450.762 randread.pl 6925 R 8192 2.62 data1 460.771 randread.pl 6925 R 8192 1.07 data1 470.816 randread.pl 6925 R 8192 10.50 data1 480.983 randread.pl 6925 R 8192 1.73 data1 490.989 randread.pl 6925 R 8192 2.12 data1 500.992 randread.pl 6925 R 8192 2.17 data1 511.001 randread.pl 6925 R 8192 1.93 data1 521.007 randread.pl 6925 R 8192 2.03 data1 531.210 randread.pl 6925 R 8192 1.82 data1 541.213 randread.pl 6925 R 8192 2.58 data1 551.219 randread.pl 6925 R 8192 2.20 data1 561.430 randread.pl 6925 R 8192 1.01 data1 571.448 randread.pl 6925 R 8192 2.22 data1 58[...] 59 60There's now much more output (this spans only 1.4 seconds, the previous output 61spanned 12 seconds), and the lower threshold is catching more I/O. 62 63 64In the following example, the file system caches were dropped before running 65fileslower, and then in another session a "man ls" was executed. The command 66and files read from disk can be seen: 67 68# echo 3 > /proc/sys/vm/drop_caches; ./fileslower 1 69Tracing sync read/writes slower than 1 ms 70TIME(s) COMM PID D BYTES LAT(ms) FILENAME 710.000 bash 9647 R 128 5.83 man 720.050 man 9647 R 832 19.52 libmandb-2.6.7.1.so 730.066 man 9647 R 832 15.79 libman-2.6.7.1.so 740.123 man 9647 R 832 56.36 libpipeline.so.1.3.0 750.135 man 9647 R 832 9.79 libgdbm.so.3.0.0 760.323 man 9647 R 4096 59.52 locale.alias 770.540 man 9648 R 8192 11.11 ls.1.gz 780.558 man 9647 R 72 6.97 index.db 790.563 man 9647 R 4096 5.12 index.db 800.723 man 9658 R 128 12.06 less 810.725 man 9656 R 128 14.52 nroff 820.779 man 9655 R 128 68.86 tbl 830.814 nroff 9660 R 128 14.55 locale 840.830 pager 9658 R 4096 28.27 .lesshst 850.866 man 9654 R 128 163.12 preconv 860.980 nroff 9684 R 128 13.80 groff 870.999 groff 9684 R 4096 14.29 DESC 881.036 groff 9685 R 128 5.94 troff 891.038 groff 9686 R 128 7.76 grotty 901.065 troff 9685 R 4096 6.33 R 911.082 troff 9685 R 4096 10.52 BI 921.096 troff 9685 R 4096 8.70 troffrc 931.176 troff 9685 R 4096 80.12 composite.tmac 941.195 troff 9685 R 4096 19.20 fallbacks.tmac 951.202 troff 9685 R 4096 6.79 tty.tmac 961.221 troff 9685 R 4096 7.87 man.local 972.977 supervise 1876 W 18 4.23 status.new 98 99This caught an individual I/O reaching 163.12 ms, for the "preconv" file. While 100the file system cache was flushed, causing these to need to be read from disk, 101the duration here may not be entirely disk I/O: it can include file system 102locks, run queue latency, etc. These can be explored using other commands. 103 104 105USAGE message: 106 107# ./fileslower -h 108usage: fileslower.py [-h] [-p PID] [-a] [min_ms] 109 110Trace slow synchronous file reads and writes 111 112positional arguments: 113 min_ms minimum I/O duration to trace, in ms (default 10) 114 115optional arguments: 116 -h, --help show this help message and exit 117 -p PID, --pid PID trace this PID only 118 -a, --all-files include non-regular file types 119 120examples: 121 ./fileslower # trace sync file I/O slower than 10 ms (default) 122 ./fileslower 1 # trace sync file I/O slower than 1 ms 123 ./fileslower -p 185 # trace PID 185 only 124