# heapprofd - Android Heap Profiler

Googlers, for the design doc see: http://go/heapprofd-design

**heapprofd requires Android Q.**

heapprofd is a tool that tracks native heap allocations and deallocations of an
Android process within a given time period. The resulting profile can be used
to attribute memory usage to particular function callstacks, supporting a mix
of both native and Java code. The tool should be useful to Android platform
developers and to app developers investigating memory issues.

On debug Android builds, you can profile all apps and most system services.
On "user" builds, you can only use it on apps with the debuggable or
profileable manifest flag.

## Quickstart

<!-- This uses GitHub because gitiles does not allow fetching the raw file. -->

Use the `tools/heap_profile` script to heap profile a process. If you are
having trouble, make sure you are using the [latest version](
https://raw.githubusercontent.com/catapult-project/perfetto/master/tools/heap_profile).

See all the arguments using `tools/heap_profile -h`, or use the defaults
and just profile a process (e.g. `system_server`):

```
$ tools/heap_profile --name system_server
Profiling active. Press Ctrl+C to terminate.
^CWrote profiles to /tmp/heap_profile-XSKcZ3i (symlink /tmp/heap_profile-latest)
These can be viewed using pprof. Googlers: head to pprof/ and upload them.
```

This will create a pprof-compatible heap dump when Ctrl+C is pressed.

## Viewing the data

The resulting profile proto contains four views on the data:

* space: how many bytes were allocated but not freed at this callstack at the
  moment the dump was created.
* alloc\_space: how many bytes were allocated at this callstack, including
  ones already freed at the moment of the dump.
* objects: how many allocations without matching frees were done at this
  callstack.
* alloc\_objects: how many allocations (including ones with matching frees) were
  done at this callstack.

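As a rough illustration of how the four views relate to each other, the
following Python sketch derives them from a list of allocation and free events.
The event format and the `summarize_views` name are made up for this example;
they are not part of heapprofd or of the profile proto.

```python
from collections import defaultdict


def summarize_views(events):
    """Compute the four views per callstack.

    `events` is a hypothetical list of (kind, callstack, size) tuples,
    where kind is "alloc" or "free" and a free carries the size and
    callstack of the allocation it releases.
    """
    views = defaultdict(
        lambda: {"space": 0, "alloc_space": 0, "objects": 0, "alloc_objects": 0}
    )
    for kind, callstack, size in events:
        view = views[callstack]
        if kind == "alloc":
            view["space"] += size
            view["alloc_space"] += size
            view["objects"] += 1
            view["alloc_objects"] += 1
        elif kind == "free":
            # Frees only reduce the "live" views; the cumulative ones never shrink.
            view["space"] -= size
            view["objects"] -= 1
    return dict(views)


events = [
    ("alloc", "main;foo", 100),
    ("alloc", "main;foo", 50),
    ("free", "main;foo", 100),
]
print(summarize_views(events)["main;foo"])
```

In this example only the 50-byte allocation is still live, so space is 50 and
objects is 1, while alloc\_space is 150 and alloc\_objects is 2.
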
**Googlers:** Head to http://pprof/ and upload the gzipped protos to get a
visualization. *Tip: you might want to put `libart.so` as a "Hide regex" when
profiling apps.*

[Speedscope](https://speedscope.app) can also be used to visualize the heap
dump, but will only show the space view. *Tip: Click Left Heavy on the top
left for a good visualisation.*

## Sampling interval
heapprofd samples heap allocations. Given a sampling interval of n bytes,
one allocation is sampled, on average, every n bytes allocated. This reduces
the performance impact on the target process. The default sampling interval
is 4096 bytes.

The easiest way to reason about this is to imagine the memory allocations as a
steady stream of one-byte allocations. From this stream, every n-th byte is
selected as a sample, and the corresponding allocation gets attributed the
complete n bytes. As an optimization, we sample allocations larger than the
sampling interval with their true size.

To make this statistically more meaningful, Poisson sampling is employed.
Instead of a static parameter of n bytes, the user can only choose the mean
value around which the interval is distributed. This makes sure frequent small
allocations get sampled as well as infrequent large ones.

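The scheme described above can be simulated with a short Python sketch. This is
an illustration of the idea only, not heapprofd's actual implementation: the
class and function names are invented here, and drawing the gaps between sample
points from an exponential distribution is one standard way to obtain Poisson
sampling over a byte stream.

```python
import random


class PoissonSampler:
    """Toy model of byte-stream sampling with a mean interval in bytes."""

    def __init__(self, mean_interval, seed=0):
        self.mean_interval = mean_interval
        self.rng = random.Random(seed)
        self.bytes_to_next_sample = self._next_gap()

    def _next_gap(self):
        # Exponentially distributed gaps with the requested mean.
        return self.rng.expovariate(1.0 / self.mean_interval)

    def sampled_size(self, alloc_size):
        """Return the bytes attributed to this allocation (0 if not sampled)."""
        if alloc_size > self.mean_interval:
            # Large allocations are recorded with their true size.
            return alloc_size
        self.bytes_to_next_sample -= alloc_size
        samples = 0
        while self.bytes_to_next_sample <= 0:
            samples += 1
            self.bytes_to_next_sample += self._next_gap()
        # Each sample point stands in for mean_interval bytes.
        return samples * self.mean_interval


def estimate_total(alloc_sizes, mean_interval, seed=0):
    """Estimate the total allocated bytes from sampled allocations only."""
    sampler = PoissonSampler(mean_interval, seed)
    return sum(sampler.sampled_size(size) for size in alloc_sizes)


true_total = 64 * 200_000
estimate = estimate_total([64] * 200_000, mean_interval=4096)
print(true_total, estimate)
```

Most 64-byte allocations are attributed zero bytes, yet the sum over all
sampled allocations stays close to the true total. A larger mean interval means
fewer samples and therefore a noisier estimate.
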
## Startup profiling
When a profiling session specifies processes by name and a matching process is
started, it gets profiled from the beginning. The resulting profile will
contain all allocations done between the start of the process and the end
of the profiling session.

On Android, Java apps are usually not started from scratch; instead, the zygote
forks and then specializes into the desired app. If the app's name matches a
name specified in the profiling session, profiling will be enabled as part of
the zygote specialization. The resulting profile contains all allocations done
between that point in zygote specialization and the end of the profiling
session. Some allocations done early in the specialization process are not
accounted for.

The resulting `ProfileProto` will have `from_startup` set to true in the
corresponding `ProcessHeapSamples` message. This does not get surfaced in the
converted pprof-compatible proto.

## Runtime profiling
When a profiling session is started, all matching processes (by name or PID)
are enumerated and profiling is enabled. The resulting profile will contain
all allocations done between the beginning and the end of the profiling
session.

The resulting `ProfileProto` will have `from_startup` set to false in the
corresponding `ProcessHeapSamples` message. This does not get surfaced in the
converted pprof-compatible proto.

## Concurrent profiling sessions
If multiple sessions name the same target process (either by name or PID),
only the first relevant session will profile the process. The other sessions
will report that the process was already being profiled when converting to
the pprof-compatible proto.

If you see this message but do not expect any other sessions, run
```
adb shell killall -KILL perfetto
```
to stop any concurrent sessions that may be running.

The resulting `ProfileProto` will have `rejected_concurrent` set to true in
the otherwise empty corresponding `ProcessHeapSamples` message. This does not
get surfaced in the converted pprof-compatible proto.

## Target processes
Depending on the build of Android that heapprofd is run on, some processes
are not eligible to be profiled.

On user builds, only Java applications with either the profileable or the
debuggable manifest flag set can be profiled. Profiling requests for other
processes will result in an empty profile.

On userdebug builds, all processes except for a small blacklist of critical
services can be profiled. This restriction can be lifted by disabling
SELinux by running `adb shell su root setenforce 0` or by passing
`--disable-selinux` to the `heap_profile` script.

|                         | userdebug setenforce 0 | userdebug | user |
|-------------------------|------------------------|-----------|------|
| critical native service |            y           |     n     |  n   |
| native service          |            y           |     y     |  n   |
| app                     |            y           |     y     |  n   |
| profileable app         |            y           |     y     |  y   |
| debuggable app          |            y           |     y     |  y   |

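When scripting around heapprofd, it can help to have the table above in
executable form. The Python sketch below only restates that table. The function
name, the string labels, and the `selinux_enforcing` parameter are hypothetical
and do not correspond to any heapprofd API.

```python
PROCESS_KINDS = (
    "critical native service",
    "native service",
    "app",
    "profileable app",
    "debuggable app",
)


def can_profile(build, process_kind, selinux_enforcing=True):
    """Return True if the eligibility table allows profiling this process.

    build: "user" or "userdebug".
    process_kind: one of PROCESS_KINDS.
    selinux_enforcing: False models "userdebug setenforce 0".
    """
    if process_kind not in PROCESS_KINDS:
        raise ValueError(f"unknown process kind: {process_kind!r}")
    if build == "user":
        # Only apps that opted in via their manifest can be profiled.
        return process_kind in ("profileable app", "debuggable app")
    if build == "userdebug":
        if not selinux_enforcing:
            return True
        return process_kind != "critical native service"
    raise ValueError(f"unknown build type: {build!r}")


print(can_profile("user", "app"))
print(can_profile("userdebug", "critical native service", selinux_enforcing=False))
```

The first call prints `False` and the second prints `True`, matching the
corresponding cells of the table.
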
## Troubleshooting

### Buffer overrun
If the rate of allocations is too high for heapprofd to keep up, the profiling
session will end early due to a buffer overrun. If the buffer overrun is
caused by a transient spike in allocations, increasing the shared memory buffer
size (passing `--shmem-size` to heap\_profile) can resolve the issue.
Otherwise, the sampling interval can be increased (at the expense of lower
accuracy in the resulting profile) by passing `--interval` to heap\_profile.

### Profile is empty
Check whether your target process is eligible to be profiled by consulting
[Target processes](#target-processes) above.

## Known Issues

* Does not work on x86 platforms (including the Android cuttlefish emulator).

## Ways to count memory

When using heapprofd and interpreting results, it is important to know the
precise meaning of the different memory metrics that can be obtained from the
operating system.

**heapprofd** gives you the number of bytes the target program
requested from the allocator. If you are profiling a Java app from startup,
allocations that happen early in the application's initialization will not be
visible to heapprofd. Native services that do not fork from the Zygote
are not affected by this.

**malloc\_info** is a libc function that gives you information about the
allocator. This can be triggered on userdebug builds by using
`am dumpheap -m <PID> /data/local/tmp/heap.txt`. The reported number will in
general be higher than the memory seen by heapprofd because, depending on the
allocator, not all memory is immediately freed. In particular, jemalloc retains
some freed memory in thread caches.

**Heap RSS** is the amount of memory requested from the operating system by the
allocator. This is larger than the previous two numbers because memory can only
be obtained in page-size chunks, and fragmentation causes some of that memory to
be wasted. This can be obtained by running `adb shell dumpsys meminfo <PID>` and
looking at the "Private Dirty" column.

|                     | heapprofd         | malloc\_info | RSS |
|---------------------|-------------------|--------------|-----|
| from native startup |          x        |      x       |  x  |
| after zygote init   |          x        |      x       |  x  |
| before zygote init  |                   |      x       |  x  |
| thread caches       |                   |      x       |  x  |
| fragmentation       |                   |              |  x  |

If you observe high RSS or malloc\_info metrics but heapprofd does not match,
there might be a problem with fragmentation or the allocator.

## Manual instructions
*It is not recommended to use these instructions unless you have advanced
requirements or are developing heapprofd. Proceed with caution.*

### Download trace\_to\_text
Download the latest trace\_to\_text for [Linux](
https://storage.googleapis.com/perfetto/trace_to_text-4ab1d18e69bc70e211d27064505ed547aa82f919)
or [macOS](https://storage.googleapis.com/perfetto/trace_to_text-mac-2ba325f95c08e8cd5a78e04fa85ee7f2a97c847e).
This is needed to convert the Perfetto trace to a pprof-compatible file.

Compare the `sha1sum` of this file to the one contained in the file name.

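If `sha1sum` is not at hand, the same comparison can be done with a few lines
of Python. This is a minimal sketch that assumes the downloaded file keeps its
original name, so that the hex digest is the part after the last `-`. The
function name is made up for this example.

```python
import hashlib
import os


def sha1_matches_name(path):
    """Return True if the file's SHA-1 equals the hex suffix of its name."""
    expected = os.path.basename(path).rsplit("-", 1)[-1].lower()
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks to avoid loading the whole binary into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

For example, `sha1_matches_name("trace_to_text-4ab1d18e69bc70e211d27064505ed547aa82f919")`
should return `True` for an intact Linux download.
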
### Start profiling
To start profiling the process `${PID}`, run the following sequence of commands.
Adjust the `INTERVAL` to trade off runtime impact against the accuracy of the
results. If `INTERVAL=1`, every allocation is sampled for maximum accuracy.
Otherwise, a sample is taken every `INTERVAL` bytes on average.

```bash
# Mean sampling interval in bytes. PID must already be set to the target PID.
INTERVAL=4096

# Pipe a text-format trace config into perfetto on the device.
echo '
buffers {
  size_kb: 100024
}

data_sources {
  config {
    name: "android.heapprofd"
    target_buffer: 0
    heapprofd_config {
      sampling_interval_bytes: '${INTERVAL}'
      pid: '${PID}'
    }
  }
}

duration_ms: 20000
' | adb shell perfetto --txt -c - -o /data/misc/perfetto-traces/profile

# Copy the resulting trace to the host.
adb pull /data/misc/perfetto-traces/profile /tmp/profile
```

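The same steps can be driven from Python, which is convenient when profiling is
part of a larger script. The sketch below is a hedged example: the helper names
are invented here, and it uses only the config fields and commands shown in the
shell snippet above.

```python
import subprocess

TRACE_ON_DEVICE = "/data/misc/perfetto-traces/profile"


def heapprofd_config(pid, interval_bytes=4096, duration_ms=20000, buffer_kb=100024):
    """Render the text-format trace config used in the shell example."""
    return f"""buffers {{
  size_kb: {int(buffer_kb)}
}}

data_sources {{
  config {{
    name: "android.heapprofd"
    target_buffer: 0
    heapprofd_config {{
      sampling_interval_bytes: {int(interval_bytes)}
      pid: {int(pid)}
    }}
  }}
}}

duration_ms: {int(duration_ms)}
"""


def profile_pid(pid, interval_bytes=4096, out_path="/tmp/profile"):
    """Run a profiling session over adb and pull the trace to the host."""
    config = heapprofd_config(pid, interval_bytes)
    # Equivalent to piping the config into `adb shell perfetto --txt -c -`.
    subprocess.run(
        ["adb", "shell", "perfetto", "--txt", "-c", "-", "-o", TRACE_ON_DEVICE],
        input=config.encode(),
        check=True,
    )
    subprocess.run(["adb", "pull", TRACE_ON_DEVICE, out_path], check=True)
```

Calling `profile_pid(1234)` would profile the process with PID 1234 (a
placeholder value) for the configured duration and leave the trace in
`/tmp/profile`.
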
### Convert to pprof-compatible file

While we work on UI support, you can convert the trace into pprof-compatible
heap dumps.

Use the trace\_to\_text file downloaded above, with XXXXXXX replaced with the
`sha1sum` of the file.

```
trace_to_text-linux-XXXXXXX profile /tmp/profile
```

This will create a directory in `/tmp/` containing the heap dumps. Run

```
gzip /tmp/heap_profile-XXXXXX/*.pb
```

to get gzipped protos, which tools handling pprof profile protos expect.

Follow the instructions in [Viewing the Data](#viewing-the-data) to visualise
the results.
