Demonstrations of oomkill, the Linux eBPF/bcc version.


oomkill is a simple program that traces the Linux out-of-memory (OOM) killer,
and shows basic details on one line per OOM kill:

# ./oomkill
8Tracing oom_kill_process()... Ctrl-C to end.
21:03:39 Triggered by PID 3297 ("ntpd"), OOM kill of PID 22516 ("perl"), 3850642 pages, loadavg: 0.99 0.39 0.30 3/282 22724
21:03:48 Triggered by PID 22517 ("perl"), OOM kill of PID 22517 ("perl"), 3850642 pages, loadavg: 0.99 0.41 0.30 2/282 22932

The first line shows that PID 22516, with process name "perl", was OOM killed
when it reached 3850642 pages (usually 4 Kbytes per page). This OOM kill
happened to be triggered by PID 3297, process name "ntpd", doing some memory
allocation.
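
For post-processing, the one-line output format can be split into fields and
the page count converted to bytes. The following is a minimal sketch, not part
of oomkill itself: the names parse_oomkill_line, LINE_RE, and
ASSUMED_PAGE_SIZE are hypothetical, and the 4 Kbyte page size is an assumption
that should be checked on the traced host (for example, with
os.sysconf("SC_PAGE_SIZE")).

```python
import re

# Assumption: 4 KiB pages. The real page size depends on the architecture
# and kernel configuration of the host that produced the output.
ASSUMED_PAGE_SIZE = 4096

# Hypothetical pattern matching the one-line oomkill output shown above.
LINE_RE = re.compile(
    r'^(?P<time>\d{2}:\d{2}:\d{2}) '
    r'Triggered by PID (?P<trigger_pid>\d+) \("(?P<trigger_comm>[^"]*)"\), '
    r'OOM kill of PID (?P<victim_pid>\d+) \("(?P<victim_comm>[^"]*)"\), '
    r'(?P<pages>\d+) pages, loadavg: (?P<loadavg>.*)$'
)


def parse_oomkill_line(line, page_size=ASSUMED_PAGE_SIZE):
    """Return a dict of fields from one oomkill line, or None if no match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    event = m.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("trigger_pid", "victim_pid", "pages"):
        event[key] = int(event[key])
    # Page count expressed in bytes, under the assumed page size.
    event["bytes"] = event["pages"] * page_size
    return event


sample = ('21:03:39 Triggered by PID 3297 ("ntpd"), OOM kill of PID 22516 '
          '("perl"), 3850642 pages, loadavg: 0.99 0.39 0.30 3/282 22724')
event = parse_oomkill_line(sample)
print(event["victim_pid"], event["victim_comm"], event["bytes"])
```

With 4 Kbyte pages, 3850642 pages is 15772229632 bytes, or roughly 14.7 Gbytes.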

The system log (dmesg) shows pages of details and system context about an OOM
kill. What it currently lacks, however, is context on how the system had been
changing over time. I've seen OOM kills where I wanted to know if the system
was at steady state at the time, or if there had been a recent increase in
workload that triggered the OOM event. oomkill provides some context: at the
end of the line is the load average information from /proc/loadavg. For both
of the OOM kills here, we can see that the system was getting busier at the
time (a higher 1-minute "average" of 0.99, compared to the 15-minute "average"
of 0.30).
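
The comparison described above can be sketched in a few lines of Python. This
is an illustration only: parse_loadavg, LoadAvg, and is_getting_busier are
hypothetical names, not part of oomkill. The field meanings follow the
/proc/loadavg format: the 1-, 5-, and 15-minute load averages, then the
runnable/total scheduling entities, then the most recently created PID.

```python
from collections import namedtuple

# Hypothetical container for the five /proc/loadavg fields.
LoadAvg = namedtuple("LoadAvg", "avg1 avg5 avg15 running total last_pid")


def parse_loadavg(text):
    """Parse a /proc/loadavg style string such as '0.99 0.39 0.30 3/282 22724'."""
    fields = text.split()
    # The fourth field is "runnable/total" scheduling entities.
    running, total = fields[3].split("/")
    return LoadAvg(float(fields[0]), float(fields[1]), float(fields[2]),
                   int(running), int(total), int(fields[4]))


def is_getting_busier(load):
    """True when the 1-minute average exceeds the 15-minute average."""
    return load.avg1 > load.avg15


load = parse_loadavg("0.99 0.39 0.30 3/282 22724")
print(load.avg1, load.avg15, is_getting_busier(load))
```

A simple greater-than test is a deliberately crude heuristic; a ratio or a
threshold may be more useful when the two averages are close.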

oomkill can also be the basis of other tools and customizations. For example,
you can edit it to include other task_struct details from the target PID at
the time of the OOM kill.


The following commands can be used to test this program, and invoke a
memory-consuming process that exhausts system memory and is OOM killed:

sysctl -w vm.overcommit_memory=1              # always overcommit
perl -e 'while (1) { $a .= "A" x 1024; }'     # eat all memory

WARNING: This exhausts system memory after disabling some overcommit checks.
Only test in a lab environment.