1Demonstrations of cpuunclaimed, the Linux eBPF/bcc version.
2
3
4This tool samples the length of the CPU run queues and determine when there are
5idle CPUs, yet queued threads waiting their turn. It reports the amount of idle
6(yet unclaimed by waiting threads) CPU as a system-wide percentage. For
7example:
8
9# ./cpuunclaimed.py
10Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
11%CPU  83.00%, unclaimed idle 0.12%
12%CPU  87.25%, unclaimed idle 0.38%
13%CPU  85.00%, unclaimed idle 0.25%
14%CPU  85.00%, unclaimed idle 0.25%
15%CPU  80.88%, unclaimed idle 0.00%
16%CPU  82.25%, unclaimed idle 0.00%
17%CPU  83.50%, unclaimed idle 0.12%
18%CPU  81.50%, unclaimed idle 0.00%
19%CPU  81.38%, unclaimed idle 0.00%
20[...]
21
22This shows a system running at over 80% CPU utilization, and with less than
230.5% unclaimed idle CPUs.
24
25Unclaimed idle CPUs can happen for a number of reasons:
26
27- An application has been bound to some, but not all, CPUs, and has runnable
28  threads that cannot migrate to other CPUs due to this configuration.
29- CPU affinity: an optimization that leaves threads on CPUs where the CPU
30  caches are warm, even if this means short periods of waiting while other
31  CPUs are idle. The wait period is tunale (see sysctl, kernel.sched*).
32- Scheduler bugs.
33
34An unclaimed idle of < 1% is likely to be CPU affinity, and not usually a
35cause for concern. By leaving the CPU idle, overall throughput of the system
36may be improved. This tool is best for identifying larger issues, > 2%, due
37to the coarseness of its 99 Hertz samples.
38
39
40This is an 8 CPU system, with an 8 CPU-bound threaded application running that
41has been bound to one CPU (via taskset):
42
43# ./cpuunclaimed.py
44Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
45%CPU  12.63%, unclaimed idle 86.36%
46%CPU  12.50%, unclaimed idle 87.50%
47%CPU  12.63%, unclaimed idle 87.37%
48%CPU  12.75%, unclaimed idle 87.25%
49%CPU  12.50%, unclaimed idle 87.50%
50%CPU  12.63%, unclaimed idle 87.37%
51%CPU  12.50%, unclaimed idle 87.50%
52%CPU  12.50%, unclaimed idle 87.50%
53[...]
54
55It shows that 7 of the 8 CPUs (87.5%) are idle at the same time there are
56queued threads waiting to run on CPU. This is an artificial situation caused
57by binding threads to the same CPU, to demonstrate how the tool works.
58
59
60This is an 8 CPU system running a Linux kernel build with "make -j8", and -T
61to print timestamps:
62
63# ./cpuunclaimed.py -T
64Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
6522:25:55 %CPU  98.88%, unclaimed idle 0.12%
6622:25:56 %CPU  99.75%, unclaimed idle 0.25%
6722:25:57 %CPU  99.50%, unclaimed idle 0.50%
6822:25:58 %CPU  99.25%, unclaimed idle 0.75%
6922:25:59 %CPU  99.75%, unclaimed idle 0.25%
7022:26:00 %CPU  99.50%, unclaimed idle 0.50%
7122:26:01 %CPU  99.25%, unclaimed idle 0.75%
7222:26:02 %CPU  99.25%, unclaimed idle 0.75%
7322:26:03 %CPU  99.01%, unclaimed idle 0.87%
7422:26:04 %CPU  99.88%, unclaimed idle 0.12%
7522:26:05 %CPU  99.38%, unclaimed idle 0.62%
76
77There's now a consistent, yet small, amount of unclaimed idle CPU. This is
78expected to be deliberate: CPU affinity, as mentioned earlier.
79
80
81The -j option will print raw samples: around one hundred lines of output
82every second. For the same system with a Linux kernel build of "make -j8":
83
84# ./cpuunclaimed.py -j
85TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7
86514606928954752,1,1,1,1,1,1,1,1
87514606939054312,1,1,1,1,1,1,1,2
88514606949156518,1,1,1,1,1,1,1,1
89514606959256596,2,2,1,1,1,1,1,1
90514606969357989,1,1,1,1,1,2,1,1
91514606979459700,1,2,1,1,1,2,1,1
92514606989560481,1,1,1,1,1,1,1,1
93514606999661396,1,1,1,1,1,1,2,1
94514607009795601,1,1,1,1,1,1,1,2
95514607019862711,1,1,1,1,1,1,1,1
96514607029963734,1,1,1,1,1,1,1,1
97514607040062372,1,1,1,1,1,1,1,1
98514607050197735,1,1,1,2,1,1,1,1
99514607060266464,1,1,1,1,1,1,1,2
100514607070368025,1,1,1,1,1,2,1,1
101514607080468375,1,1,1,1,1,1,1,2
102514607090570292,3,2,1,1,1,1,1,1
103514607100670725,1,1,1,1,1,2,1,1
104514607110771946,1,2,1,1,1,1,1,1
105514607120873489,1,1,1,1,2,1,2,1
106514607130973857,2,1,1,1,3,1,1,1
107514607141080056,0,1,1,1,1,2,1,3
108514607151176312,1,1,1,2,1,1,1,1
109514607161277753,1,1,1,1,1,1,2,1
110514607171379095,1,1,1,1,1,1,1,1
111514607181479262,1,1,1,1,1,1,1,1
112514607191580794,3,1,1,1,1,1,1,1
113514607201680952,1,1,1,1,1,1,2,1
114514607211783683,1,1,1,1,1,1,1,1
115514607221883274,1,1,1,1,1,1,0,1
116514607231984244,1,1,1,1,1,1,1,1
117514607242085698,1,1,1,1,1,1,1,1
118514607252216898,1,2,1,1,1,1,1,1
119514607262289420,1,1,1,1,1,2,1,1
120514607272389922,1,1,1,1,1,1,1,1
121514607282489413,1,1,1,1,1,1,1,1
122514607292589950,1,3,1,1,1,1,1,1
123514607302693367,1,1,1,1,2,1,1,1
124514607312793792,1,1,1,1,1,1,1,1
125514607322895249,1,1,1,3,1,1,3,1
126514607332994278,1,0,1,1,1,2,1,2
127514607343095836,1,1,1,1,1,2,1,1
128514607353196533,1,1,1,1,2,1,1,1
129514607363297749,1,1,1,1,1,1,1,2
130514607373399011,1,1,1,1,1,1,1,2
131514607383499730,1,1,1,1,1,1,1,2
132514607393601510,1,1,1,1,1,1,1,2
133514607403704117,2,1,1,1,1,1,1,2
134514607413802700,1,1,1,1,2,1,0,1
135514607423904559,1,1,1,1,1,1,1,1
136[...]
137
138The output is verbose: printing out a timestamp, and then the length of each
139CPU's run queue. The second last line, of timestamp 514607413802700, is an
140example of what this tool detects: CPU 4 has a run queue length of 4, which
141means one thread running and one thread queued, while CPU 6 has a run queue
142length of 0: idle. The very next sample shows all CPUs busy.
143
144
145The -J option prints raw samples with time offsets showing when the samples
146were collected on each CPU. It's mostly useful for debugging the tool itself.
147For example, during a Linux kernel build:
148
149# ./cpuunclaimed.py -J
150TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3,OFFSET_ns_CPU4,OFFSET_ns_CPU5,OFFSET_ns_CPU6,OFFSET_ns_CPU7
151514722625198188,1,1,1,1,1,1,1,2,0,28321,51655,73396,89654,111172,132803,159792
152514722635299034,1,1,1,1,1,2,1,1,0,28809,51999,74183,89552,110011,131995,153519
153514722645400274,1,1,1,1,1,1,1,2,0,28024,51333,73652,88964,110075,131973,153568
154514722655501816,1,2,1,1,1,1,1,1,0,28893,51671,75233,89496,109430,131945,153694
155514722665602594,1,1,2,1,1,2,1,1,0,28623,50988,73866,89383,109186,131786,154555
156514722675703498,1,1,1,1,1,1,1,1,0,27379,51031,73175,89625,110380,131482,104811
157514722685804942,1,1,1,1,1,2,1,1,0,27213,50501,72183,88797,108780,130659,152153
158514722695906294,1,1,1,1,1,1,1,1,0,27036,51182,73420,87861,109585,130364,155089
159514722706005778,1,1,1,1,1,1,1,1,0,28492,51851,74138,89744,110208,132462,154060
160514722716060705,1,1,1,1,1,1,1,1,0,154499,152528,155232,155046,154502,178746,200001
161514722726209615,1,1,1,1,1,1,1,1,0,28170,49580,72605,87741,108144,130723,152138
162514722736309475,1,2,1,1,1,1,1,1,0,27421,51386,73061,89358,109457,131273,153005
163514722746410845,1,2,1,1,1,2,1,1,0,27788,50840,72720,88920,109111,131143,152979
164514722756511363,1,1,1,1,1,1,2,1,0,28280,50977,73559,89848,109659,131579,152693
165514722766613044,1,1,1,1,1,1,1,1,0,28046,50812,72754,89160,110108,130735,152948
166514722776712932,1,1,1,2,1,1,1,1,0,28586,51177,73974,89588,109947,132376,154162
167514722786815477,1,1,1,1,1,1,1,1,0,27973,71104,72539,88302,108896,130414,152236
168514722796914955,1,1,1,1,1,1,1,1,0,29054,52354,74214,89592,110615,132586,153925
169514722807044060,1,1,1,1,1,1,1,1,1587130,0,24079,46633,61787,82325,104706,125278
170514722817117432,2,1,2,1,1,1,1,1,0,27628,51038,75138,89724,109340,132426,155348
171514722827218254,1,1,1,1,1,1,2,1,0,29111,51868,74347,88904,109911,132764,153851
172514722837340158,1,1,1,1,1,1,1,1,0,7366,30760,53528,68622,89317,111095,132319
173514722847421305,1,1,1,1,1,1,1,1,0,28257,51105,73841,89037,110820,131605,153368
174514722857521112,1,1,1,1,1,1,1,1,0,28544,51441,73857,89530,110497,131915,153513
175514722867626129,0,2,1,1,1,1,1,1,0,24621,47917,70568,85391,106670,128081,150329
176514722877727183,2,1,1,1,1,1,1,1,0,24869,47630,71547,84761,106048,128444,149285
177514722887824589,1,1,1,1,1,1,2,1,0,28793,51212,73863,89584,109773,132348,153194
178514722897925481,1,1,1,1,1,1,2,1,0,29278,51163,73961,89774,109592,132029,153715
179514722908026097,1,1,1,1,1,1,1,1,0,30630,35595,36210,188001,190815,190072,190732
180514722918127439,1,1,1,1,1,1,1,1,0,28544,51885,73948,89987,109763,132632,154083
181514722928227399,1,1,1,1,1,1,1,1,0,31882,51574,74769,89939,110578,132951,154356
182514722938329471,1,1,1,1,1,1,1,1,0,28498,51304,74101,89670,110278,132653,153176
183514722948430589,1,1,1,1,1,1,1,1,0,27868,50925,73477,89676,109583,132360,153014
184514722958531802,1,1,1,1,1,1,1,1,0,28505,50886,73729,89919,109618,131988,152896
185514722968632181,1,1,1,1,1,1,1,1,0,28492,51749,73977,90334,109816,132897,152890
186514722978733584,1,1,1,1,1,1,1,1,0,28847,50957,74121,90014,110019,132377,152978
187514722988834321,1,1,1,1,1,1,1,1,0,28601,51437,74021,89968,110252,132233,153623
188514722998937170,1,1,2,1,1,1,1,1,0,27007,50044,73259,87725,108663,132194,152459
189514723009036821,1,2,1,2,1,1,1,1,0,28226,50937,73983,89110,110476,131740,153663
190514723019137577,1,1,1,1,1,1,1,1,0,30261,52357,75657,87803,61823,131850,153585
191514723029238745,1,1,1,1,1,1,1,1,0,28030,50752,74452,89240,110791,132187,153327
192514723039339069,1,1,1,1,1,1,1,1,0,29791,52636,75996,90475,110414,132232,154714
193514723049439822,1,1,1,1,2,1,1,1,0,29133,56662,74153,89520,110683,132740,154708
194514723059541617,1,1,1,1,1,1,1,1,0,27932,51480,74644,89656,109176,131499,153732
195514723069642500,1,1,2,1,1,1,2,1,0,27678,51509,73984,90136,110124,131554,153459
196514723079743525,2,1,1,1,1,1,1,1,0,28029,51424,74394,90056,110087,132383,152963
197514723089844091,2,1,1,2,1,1,1,1,0,28944,51692,74440,90339,110402,132722,154083
198514723099945957,1,1,2,1,1,1,1,1,0,28425,51267,73164,89322,115048,114630,115187
199514723110047020,1,1,2,0,1,1,1,2,0,28192,50811,76814,89835,109370,131265,153511
200514723120216662,1,1,2,1,1,2,1,1,29,34,0,4514,19268,40293,62674,84009
201[...]
202
203This is a Xen guest system, and it shows that CPU 0 usually completes first (an
204offset of 0), followed by CPU 1 around 28000 nanoseconds later, and so on.
205The spread of offsets is triggered by the bcc Python library that initializes
206the timers, which steps through the CPUs in sequence, with a small delay
207between them merely from executing its own loop code.
208
209Here's more output during a Linux kernel build:
210
211# ./cpuunclaimed.py -J
212TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3,OFFSET_ns_CPU4,OFFSET_ns_CPU5,OFFSET_ns_CPU6,OFFSET_ns_CPU7
213514722625198188,1,1,1,1,1,1,1,2,0,28321,51655,73396,89654,111172,132803,159792
214515700745758947,2,1,1,1,1,1,1,1,0,19835,34891,49397,59364,71988,87571,102769
215515700755860451,2,1,1,1,1,1,1,2,0,19946,34323,49855,59844,72741,87925,102891
216515700765960560,1,1,1,1,1,1,1,1,0,20805,35339,50436,59677,73557,88661,104796
217515700776061744,1,1,1,1,1,1,1,1,1626,77,0,190,153452,154665,178218,154116
218515700786162017,1,1,1,1,1,1,1,1,0,20497,35361,51552,59787,74451,147789,104545
219515700796262811,1,1,1,1,1,1,1,2,0,20910,35657,50805,60175,73953,88492,103527
220515700806364951,1,1,1,1,1,1,1,1,0,20140,35023,50074,59726,72757,88040,102421
221515700816465253,1,1,1,1,1,1,2,1,0,20952,34899,50262,60048,72890,88067,103545
222515700826566573,1,1,1,1,1,1,1,1,0,20898,35490,50609,59805,74060,88550,103354
223515700836667480,1,1,1,1,1,1,2,1,0,20548,34760,50959,59490,73059,87820,103006
224515700846768182,1,1,1,1,1,1,2,1,0,20571,35113,50777,59962,74139,88543,103192
225515700856869468,1,1,2,1,1,2,2,1,0,20932,35382,50510,60106,73739,91818,103684
226515700866971905,1,1,1,2,1,1,1,1,0,19780,33018,49075,58375,71949,86537,102136
227515700877073459,2,1,1,1,1,1,1,1,0,20065,73966,48989,58832,71408,85714,101067
228515700887172772,1,1,1,1,1,1,1,1,0,20909,34608,51493,59890,73564,88668,103454
229515700897273292,1,2,1,1,1,1,1,1,0,20353,35292,50114,59773,73948,88615,103383
230515700907374341,1,1,2,1,1,1,1,1,0,20816,35206,50915,60062,73878,88857,103794
231515700917475331,1,1,6,1,1,2,1,1,0,20752,34931,50280,59764,73781,88329,103234
232515700927576958,1,1,1,1,1,1,1,1,0,19929,34703,50181,59364,73004,88053,103127
233515700937677298,1,1,2,2,1,1,1,1,0,21178,34724,50740,61193,73452,89030,103390
234515700947778409,2,1,1,1,1,1,1,1,0,21059,35604,50853,60098,73919,88675,103506
235515700957879196,2,1,1,1,1,1,1,1,0,21326,35939,51492,60083,74249,89474,103761
236[...]
237
238Notice the tighter range of offsets? I began executing cpuunclaimed when the
239system was idle, and it initialized the CPU timers more quickly, and then I
240began the Linux kernel build.
241
242Here's some different output, this time from a physical system with 4 CPUs,
243also doing a kernel build,
244
245# ./cpuunclaimed.py -J
246TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3
2474429382557480,1,1,1,1,0,6011,10895,16018
2484429392655042,2,1,1,1,0,8217,13661,19378
2494429402757604,1,1,1,1,0,6879,12433,18000
2504429412857809,1,1,1,1,0,8303,13190,18719
2514429422960709,2,1,1,1,0,6095,11234,17079
2524429433060391,1,1,1,2,0,6747,12480,18070
2534429443161699,1,1,1,1,0,6560,12264,17945
2544429453262002,1,2,1,1,0,6992,12644,18341
2554429463363706,1,2,1,1,0,6211,12071,17853
2564429473465571,1,1,1,1,0,5766,11495,17638
2574429483566920,1,1,1,1,0,5223,11736,16358
2584429493666279,1,1,1,1,0,6964,12653,18410
2594429503769113,1,1,1,1,0,5161,11399,16612
2604429513870744,1,1,1,1,0,5943,10583,15768
2614429523969826,1,1,1,1,0,6533,12336,18189
2624429534070311,1,1,1,1,0,6834,12816,18488
2634429544170456,1,1,1,1,0,7284,13401,19129
2644429554274467,1,2,1,1,0,5941,11160,16594
2654429564372365,1,2,1,1,0,7514,13618,19190
2664429574474406,1,2,1,1,0,6687,12650,18248
2674429584574220,1,2,1,1,0,7912,13705,19136
268[...]
269
270If the offset range becomes too great, we can no longer conclude about when
271some CPUs were idle and others had queued work. The tool will detect this,
272and print an error message and exit.
273
274
275Some systems can power down CPUs when idle, and when they wake up again the
276timed samples may resume from different offsets. If this happens, this tool
277can no longer draw conclusions about when some CPUs were idle and others
278had queued work, so it prints an error, and exits. Eg:
279
280# ./cpuunclaimed.py 1
281Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end.
282%CPU   0.25%, unclaimed idle 0.00%
283%CPU   0.75%, unclaimed idle 0.00%
284%CPU   0.25%, unclaimed idle 0.00%
285%CPU   0.00%, unclaimed idle 0.00%
286%CPU   0.00%, unclaimed idle 0.00%
287%CPU   0.12%, unclaimed idle 0.00%
288%CPU   0.00%, unclaimed idle 0.00%
289%CPU   0.25%, unclaimed idle 0.00%
290%CPU   0.00%, unclaimed idle 0.00%
291%CPU   0.12%, unclaimed idle 0.00%
292%CPU   0.13%, unclaimed idle 0.00%
293%CPU   0.12%, unclaimed idle 0.00%
294%CPU   0.00%, unclaimed idle 0.00%
295%CPU   0.00%, unclaimed idle 0.00%
296%CPU   0.00%, unclaimed idle 0.00%
297%CPU   0.00%, unclaimed idle 0.00%
298ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 4328176 ns (expected < 4040404 ns). Debug with -J, and see the man page. As output may begin to be unreliable, exiting.
299
300It's expected that this will only really occur on idle systems.
301
302USAGE:
303
304# ./cpuunclaimed.py -h
305usage: cpuunclaimed.py [-h] [-j] [-J] [-T] [interval] [count]
306
307Sample CPU run queues and calculate unclaimed idle CPU
308
309positional arguments:
310  interval         output interval, in seconds
311  count            number of outputs
312
313optional arguments:
314  -h, --help       show this help message and exit
315  -j, --csv        print sample summaries (verbose) as comma-separated values
316  -J, --fullcsv    print sample summaries with extra fields: CPU sample
317                   offsets
318  -T, --timestamp  include timestamp on output
319
320examples:
321    ./cpuunclaimed            # sample and calculate unclaimed idle CPUs,
322                              # output every 1 second (default)
323    ./cpuunclaimed 5 10       # print 5 second summaries, 10 times
324    ./cpuunclaimed -T 1       # 1s summaries and timestamps
325    ./cpuunclaimed -j         # raw dump of all samples (verbose), CSV
326