1Demonstrations of criticalstat: Find long atomic critical sections in the kernel.
2
3criticalstat traces and reports occurences of atomic critical sections in the
4kernel with useful stacktraces showing the origin of them. Such critical
5sections frequently occur due to use of spinlocks, or if interrupts or
6preemption were explicity disabled by a driver. IRQ routines in Linux are also
7executed with interrupts disabled. There are many reasons. Such critical
8sections are a source of long latency/responsive issues for real-time systems.
9
10This works by probing the preempt/irq and cpuidle tracepoints in the kernel.
11Since this uses BPF, only the root user can use this tool. Further, the kernel
12has to be built with certain CONFIG options enabled inorder for it to work:
13CONFIG_PREEMPTIRQ_EVENTS
14CONFIG_DEBUG_PREEMPT
15Additionally, the following options should be turned off on older kernels:
16CONFIG_PROVE_LOCKING
17CONFIG_LOCKDEP
18
19USAGE:
20# ./criticalstat -h
21usage: criticalstat [-h] [-p] [-i] [-d DURATION]
22
23Trace long critical sections
24
25optional arguments:
26  -h, --help            Show this help message and exit
27  -p, --preemptoff      Find long sections where preemption was off
28  -i, --irqoff          Find long sections where IRQ was off
29  -d DURATION, --duration DURATION
30                        Duration in uS (microseconds) below which we filter
31
32examples:
33    ./criticalstat          	# run with default options: irq off for more than 100 uS
34    ./criticalstat -p       	# find sections with preemption disabled for more than 100 uS
35    ./criticalstat -d 500   	# find sections with IRQs disabled for more than 500 uS
36    ./criticalstat -p -d 500	# find sections with preemption disabled for more than 500 uS
37
38The tool runs continuously until interrupted by Ctrl-C
39
40
41By default, criticalstat finds IRQ disable sections for > 100us.
42
43# ./criticalstat
44Finding critical section with IRQ disabled for > 100us
45===================================
46TASK: kworker/u16:5 (pid  5903 tid  5903) Total Time: 194.427  us
47
48Section start: __schedule -> schedule
49Section end:   _raw_spin_unlock_irq -> finish_task_switch
50  trace_hardirqs_on+0xdc
51  trace_hardirqs_on+0xdc
52  _raw_spin_unlock_irq+0x18
53  finish_task_switch+0xf0
54  __schedule+0x8c8
55  preempt_schedule_irq+0x38
56  el1_preempt+0x8
57===================================
58
59
60If too many sections are showing up, the user can raise the threshold to only
61show critical sections that are > 500us by passing "-d" option:
62
63# ./criticalstat -d 500
64Finding critical section with IRQ disabled for > 500us
65===================================
66TASK: crtc_commit:111 (pid   246 tid   246) Total Time: 580.730  us
67
68Section start: clk_enable_lock -> clk_enable
69Section end:   _raw_spin_unlock_irqrestore -> clk_enable
70  trace_hardirqs_on+0xdc
71  trace_hardirqs_on+0xdc
72  _raw_spin_unlock_irqrestore+0x24
73  clk_enable+0x80
74  msm_dss_enable_clk+0x7c
75  sde_power_resource_enable+0x578
76  _sde_crtc_vblank_enable_no_lock+0x68
77  sde_crtc_vblank+0x8c
78  sde_kms_enable_vblank+0x18
79  vblank_ctrl_worker+0xd0
80  kthread_worker_fn+0xf8
81  kthread+0x114
82  ret_from_fork+0x10
83===================================
84
85
86If instead of irq disabled sections, we want to see preempt disabled sections,
87then pass the "-p" option. Below we try to find preempt-disabled critical
88sections that are > 500us.
89
90# ./criticalstat -p -d 500
91Finding critical section with preempt disabled for > 500us
92===================================
93TASK: swapper/1 (pid     0 tid     0) Total Time: 618.437  us
94
95Section start: preempt_count_add -> preempt_count_add
96Section end:   preempt_count_sub -> preempt_count_sub
97  trace_preempt_on+0x98
98  trace_preempt_on+0x98
99  preempt_latency_stop+0x164
100  preempt_count_sub+0x50
101  schedule+0x74
102  schedule_preempt_disabled+0x14
103  cpu_startup_entry+0x84
104  secondary_start_kernel+0x1c8
105  [unknown]
106===================================
107
108
109criticalstat -p can also reflect kernel scheduler issues sometimes. These may
110show up as long preempt-off sections if the functions in the scheduler take a
111long time to run (such as pick_next_task_fair which selects the CPU for a task
112
113Follow is a report showing a preempt-off latency of 700us during the schedule
114loop's execution:
115
116===================================
117TASK: irq/296-cs35l36 (pid   666 tid   666) Total Time: 732.657  us
118
119Section start: schedule -> schedule
120Section end:   schedule -> schedule
121  trace_preempt_on+0x98
122  trace_preempt_on+0x98
123  preempt_count_sub+0xa4
124  schedule+0x78
125  schedule_timeout+0x80
126  wait_for_common+0xb4
127  wait_for_completion_timeout+0x28
128  geni_i2c_xfer+0x298
129  __i2c_transfer+0x4e0
130  i2c_transfer+0x8
131  irq_thread_fn+0x2c
132  irq_thread+0x160
133  kthread+0x118
134  ret_from_fork+0x10
135===================================
136
137
138See Also: Linux kernel's preemptoff and irqoff tracers which provide similar
139tracing but with some limitations.
140