# Taking a RAM dump of a Microdroid VM and analyzing it

A debuggable Microdroid VM creates a RAM dump of itself when the kernel panics. This
document explains how the dump can be obtained and analyzed.

## Force triggering a RAM dump

A RAM dump is created automatically when there's a kernel panic. However, for
debugging purposes, you can forcibly trigger one via the magic SysRq key.

```shell
$ adb shell /apex/com.android.virt/bin/vm run-app ...     # run a Microdroid VM
$ m vm_shell; vm_shell                                    # connect to the VM
# echo c > /proc/sysrq-trigger                            # force trigger a crash
```

Then you will see the following messages, showing that the crash was detected and
that the crashdump kernel was executed.

```
[   14.949892][  T148] sysrq: Trigger a crash
[   14.952133][  T148] Kernel panic - not syncing: sysrq triggered crash
[   14.955309][  T148] CPU: 0 PID: 148 Comm: sh Kdump: loaded Not tainted 5.15.60-android14-5-04357-gbac79d727aea-ab9013362 #1
[   14.957803][  T148] Hardware name: linux,dummy-virt (DT)
[   14.959053][  T148] Call trace:
[   14.959809][  T148]  dump_backtrace.cfi_jt+0x0/0x8
[   14.961019][  T148]  dump_stack_lvl+0x68/0x98
[   14.962137][  T148]  panic+0x160/0x3f4

----------snip----------

[   14.998693][  T148] Starting crashdump kernel...
[   14.999411][  T148] Bye!
Booting Linux on physical CPU 0x0000000000 [0x412fd050]
Linux version 5.15.44+ (build-user@build-host) (Android (8508608, based on r450784e) clang version 14.0.7 (https://android.googlesource.com/toolchain/llvm-project 4c603efb0cca074e9238af8b4106c30add4418f6), LLD 14.0.7) #1 SMP PREEMPT Thu Jul 7 02:57:03 UTC 2022
Machine model: linux,dummy-virt
earlycon: uart8250 at MMIO 0x00000000000003f8 (options '')
printk: bootconsole [uart8250] enabled

----------snip----------

Run /bin/crashdump as init process
Crashdump started
Size is 98836480 bytes
.....................................................................random: crng init done
...............................done
reboot: Restarting system with command 'kernel panic'
```

## Obtaining the RAM dump

RAM dumps are saved as tombstone files. To see which tombstone file holds
the RAM dump, look in the log.

```shell
$ adb logcat | grep SYSTEM_TOMBSTONE
09-22 17:24:28.798  1335  1504 I BootReceiver: Copying /data/tombstones/tombstone_47 to DropBox (SYSTEM_TOMBSTONE)
```

In the above example, the RAM dump is saved as `/data/tombstones/tombstone_47`.
You can download it using `adb pull`.

```shell
$ adb root && adb pull /data/tombstones/tombstone_47 ramdump && adb unroot
```

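To avoid copying the path by hand, the lookup can be scripted. The following is a minimal sketch and not part of the Microdroid tooling: `latest_tombstone_path` is a hypothetical helper that assumes the `BootReceiver` log line has exactly the format shown above. It picks the most recent match, which may belong to an unrelated crash if other tombstones were written afterwards, so check the result before pulling.

```shell
# Hypothetical helper (not part of adb or Microdroid): reads logcat output on
# stdin and prints the path of the most recently reported SYSTEM_TOMBSTONE file.
# Assumes the "Copying <path> to DropBox (SYSTEM_TOMBSTONE)" format shown above.
latest_tombstone_path() {
  sed -n 's|.*Copying \(/data/tombstones/[^ ]*\) to DropBox (SYSTEM_TOMBSTONE).*|\1|p' | tail -n 1
}

# Example usage (requires a connected device):
#   path=$(adb logcat -d | latest_tombstone_path)
#   adb root && adb pull "$path" ramdump && adb unroot
```

Here `adb logcat -d` dumps the current log and exits, so the pipeline terminates instead of following the log.
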
## Analyzing the RAM dump

### Building the crash(8) tool

You first need to build the crash(8) tool for the target architecture, which in most cases is aarch64.

Download the source code and build it as follows. This needs to be done only once.

```shell
$ wget https://github.com/crash-utility/crash/archive/refs/tags/8.0.2.tar.gz -O - | tar xzv
$ make -j -C crash-8.0.2 target=ARM64
```

### Obtaining vmlinux

You also need the image of the kernel binary with debugging enabled. The kernel
binary must be the same as the kernel that actually ran in the Microdroid
VM that crashed. To identify which kernel it was, look for the kernel version
number in the logcat log.

```
[   14.955309][  T148] CPU: 0 PID: 148 Comm: sh Kdump: loaded Not tainted 5.15.60-android14-5-04357-gbac79d727aea-ab9013362 #1
```

Here, the version number is
`5.15.60-android14-5-04357-gbac79d727aea-ab9013362`. What is important here is
the last component: `ab9013362`. The number after `ab` is the Android Build ID
of the kernel.

With the build ID, you can find the image on `ci.android.com` and download
it. The direct link to the image is `https://ci.android.com/builds/submitted/9013362/kernel_microdroid_aarch64/latest/vmlinux`.

DON'T forget to replace `9013362` with the actual build ID of the kernel you used.

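The substitution can also be scripted. The sketch below is only an illustration: `kernel_build_id` and `vmlinux_url` are hypothetical helper names, and the code assumes that the version string ends with `-ab<build ID>` as in the example and that the URL pattern shown above applies to your build. It only prints the URL; whether the link can be fetched non-interactively is not covered here.

```shell
# Hypothetical helper: print the Android Build ID embedded in a kernel version
# string, i.e. everything after the last "-ab". Fails if there is no such part.
kernel_build_id() {
  case "$1" in
    *-ab[0-9]*) printf '%s\n' "${1##*-ab}" ;;
    *) echo "no -ab<build ID> component in: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build the vmlinux link from the URL pattern shown above.
vmlinux_url() {
  id=$(kernel_build_id "$1") || return 1
  printf 'https://ci.android.com/builds/submitted/%s/kernel_microdroid_aarch64/latest/vmlinux\n' "$id"
}

# Prints https://ci.android.com/builds/submitted/9013362/kernel_microdroid_aarch64/latest/vmlinux
vmlinux_url 5.15.60-android14-5-04357-gbac79d727aea-ab9013362
```

A version string without an `-ab` component, such as the `5.15.44+` reported by the crashdump kernel in the log above, makes the helpers fail instead of producing a bogus link.
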
### Running crash(8) with the RAM dump and the kernel image

```shell
$ crash-8.0.2/crash ramdump vmlinux
```

You can now analyze the RAM dump using the various commands that crash(8) provides. For example, the `bt <pid>` command shows the stack trace of a process.

```
crash> bt
PID: 148    TASK: ffffff8001a2d880  CPU: 0   COMMAND: "sh"
 #0 [ffffffc00926b9f0] machine_kexec at ffffffd48a852004
 #1 [ffffffc00926bb90] __crash_kexec at ffffffd48a948008
 #2 [ffffffc00926bc40] panic at ffffffd48a86e2a8
 #3 [ffffffc00926bc90] sysrq_handle_crash.35db4764f472dc1c4a43f39b71f858ea at ffffffd48ad985c8
 #4 [ffffffc00926bca0] __handle_sysrq at ffffffd48ad980e4
 #5 [ffffffc00926bcf0] write_sysrq_trigger.35db4764f472dc1c4a43f39b71f858ea at ffffffd48ad994f0
 #6 [ffffffc00926bd10] proc_reg_write.bc7c2a3e70d8726163739fbd131db16e at ffffffd48ab4d280
 #7 [ffffffc00926bda0] vfs_write at ffffffd48aaaa1a4
 #8 [ffffffc00926bdf0] ksys_write at ffffffd48aaaa5b0
 #9 [ffffffc00926be30] __arm64_sys_write at ffffffd48aaaa644
#10 [ffffffc00926be40] invoke_syscall at ffffffd48a84b55c
#11 [ffffffc00926be60] do_el0_svc at ffffffd48a84b424
#12 [ffffffc00926be80] el0_svc at ffffffd48b0a29e4
#13 [ffffffc00926bea0] el0t_64_sync_handler at ffffffd48b0a2950
#14 [ffffffc00926bfe0] el0t_64_sync at ffffffd48a811644
     PC: 00000079d880b798   LR: 00000064b4afec8c   SP: 0000007ff6ddb2e0
    X29: 0000007ff6ddb360  X28: 0000007ff6ddb320  X27: 00000064b4b238e8
    X26: 00000079d9c49000  X25: 0000000000000000  X24: b40000784870fda9
    X23: 00000064b4b236f8  X22: 0000007ff6ddb340  X21: 0000007ff6ddb338
    X20: b40000784870f618  X19: 0000000000000002  X18: 00000079daea4000
    X17: 00000079d880b790  X16: 00000079d882dee0  X15: 0000000000000080
    X14: 0000000000000000  X13: 0000008f00000160  X12: 000000004870f6ac
    X11: 0000000000000008  X10: 000000000009c000   X9: b40000784870f618
     X8: 0000000000000040   X7: 000000e70000000b   X6: 0000020500000210
     X5: 00000079d883a984   X4: ffffffffffffffff   X3: ffffffffffffffff
     X2: 0000000000000002   X1: b40000784870f618   X0: 0000000000000001
    ORIG_X0: 0000000000000001  SYSCALLNO: 40  PSTATE: 00001000
```

The above shows that the shell process that executed `echo c > /proc/sysrq-trigger`
actually triggered a crash in the kernel.

For more crash(8) commands, refer to the man page or the built-in `help` command.