# heapprofd - Android Heap Profiler

Googlers, for design doc see: http://go/heapprofd-design

**heapprofd requires Android Q.**

heapprofd is a tool that tracks native heap allocations & deallocations of an
Android process within a given time period. The resulting profile can be used
to attribute memory usage to particular function callstacks, supporting a mix
of both native and Java code. The tool should be useful to Android platform
developers and to app developers investigating memory issues.

On debug Android builds, you can profile all apps and most system services.
On "user" builds, you can only use it on apps with the debuggable or
profileable manifest flag.

## Quickstart

<!-- This uses github because gitiles does not allow fetching the raw file. -->

Use the `tools/heap_profile` script to heap profile a process. If you are
having trouble, make sure you are using the
[latest version](https://raw.githubusercontent.com/catapult-project/perfetto/master/tools/heap_profile).

See all the arguments using `tools/heap_profile -h`, or use the defaults
and just profile a process (e.g. `system_server`):

```
$ tools/heap_profile --name system_server
Profiling active. Press Ctrl+C to terminate.
^CWrote profiles to /tmp/heap_profile-XSKcZ3i (symlink /tmp/heap_profile-latest)
These can be viewed using pprof. Googlers: head to pprof/ and upload them.
```

This will create a pprof-compatible heap dump when Ctrl+C is pressed.

## Viewing the data

The resulting profile proto contains four views on the data:

* space: how many bytes were allocated but not freed at this callstack at the
  moment the dump was created.
* alloc\_space: how many bytes were allocated (including ones freed at the
  moment of the dump) at this callstack.
* objects: how many allocations without matching frees were done at this
  callstack.
* alloc\_objects: how many allocations (including ones with matching frees) were
  done at this callstack.

**Googlers:** Head to http://pprof/ and upload the gzipped protos to get a
visualization. *Tip: you might want to put `libart.so` as a "Hide regex" when
profiling apps.*

[Speedscope](https://speedscope.app) can also be used to visualize the heap
dump, but will only show the space view. *Tip: Click Left Heavy on the top
left for a good visualisation.*

## Sampling interval
heapprofd samples heap allocations. Given a sampling interval of n bytes,
one allocation is sampled, on average, every n bytes allocated. This reduces
the performance impact on the target process. The default sampling interval
is 4096 bytes.

The easiest way to reason about this is to imagine the memory allocations as a
steady stream of one-byte allocations. From this stream, every n-th byte is
selected as a sample, and the corresponding allocation gets attributed the
complete n bytes. As an optimization, allocations larger than the sampling
interval are sampled with their true size.

To make this statistically more meaningful, Poisson sampling is employed.
Instead of a static parameter of n bytes, the user can only choose the mean
value around which the interval is distributed. This makes sure frequent small
allocations get sampled as well as infrequent large ones.

## Startup profiling
When a profile session specifies processes by name and a matching process is
started, it gets profiled from the beginning. The resulting profile will
contain all allocations done between the start of the process and the end
of the profiling session.

On Android, Java apps are usually not started from scratch; instead, the zygote
forks and then specializes into the desired app. If the app's name matches a
name specified in the profiling session, profiling will be enabled as part of
the zygote specialization.
The resulting profile contains all allocations done between
that point in zygote specialization and the end of the profiling session.
Some allocations done early in the specialization process are not accounted
for.

The resulting `ProfileProto` will have `from_startup` set to true in the
corresponding `ProcessHeapSamples` message. This does not get surfaced in the
converted pprof-compatible proto.

## Runtime profiling
When a profile session is started, all matching processes (by name or PID)
are enumerated and profiling is enabled. The resulting profile will contain
all allocations done between the beginning and the end of the profiling
session.

The resulting `ProfileProto` will have `from_startup` set to false in the
corresponding `ProcessHeapSamples` message. This does not get surfaced in the
converted pprof-compatible proto.

## Concurrent profiling sessions
If multiple sessions name the same target process (either by name or PID),
only the first relevant session will profile the process. The other sessions
will report that the process had already been profiled when converting to
the pprof-compatible proto.

If you see this message but do not expect any other sessions, run
```
adb shell killall -KILL perfetto
```
to stop any concurrent sessions that may be running.

The resulting `ProfileProto` will have `rejected_concurrent` set to true in
the otherwise empty corresponding `ProcessHeapSamples` message. This does not
get surfaced in the converted pprof-compatible proto.

## Target processes
Depending on the build of Android that heapprofd is run on, some processes
are not eligible to be profiled.

On user builds, only Java applications with either the profileable or the
debuggable manifest flag set can be profiled. Profiling requests for other
processes will result in an empty profile.
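
Because eligibility depends on the build type, it can help to check it before
starting a session. The following is a minimal sketch and not part of
heapprofd: `describe_build` is a hypothetical helper that maps a build-type
string to the rule stated in this section, and the usage comment assumes the
build type is read from the standard `ro.build.type` system property.

```bash
#!/bin/sh
# Hypothetical helper (not part of heapprofd): summarise which processes this
# document says can be profiled for a given Android build type.
describe_build() {
  case "$1" in
    user)
      echo "user: only profileable or debuggable apps" ;;
    userdebug)
      echo "userdebug: all processes except critical services" ;;
    *)
      echo "$1: not covered by this document" ;;
  esac
}

# Assumed usage with a connected device; tr strips the carriage return that
# some adb versions append to the output:
#   describe_build "$(adb shell getprop ro.build.type | tr -d '\r')"
```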

On userdebug builds, all processes except for a small blacklist of critical
services can be profiled. This restriction can be lifted by disabling
SELinux, either by running `adb shell su root setenforce 0` or by passing
`--disable-selinux` to the `heap_profile` script.

|                         | userdebug setenforce 0 | userdebug | user |
|-------------------------|------------------------|-----------|------|
| critical native service | y                      | n         | n    |
| native service          | y                      | y         | n    |
| app                     | y                      | y         | n    |
| profileable app         | y                      | y         | y    |
| debuggable app          | y                      | y         | y    |

## Troubleshooting

### Buffer overrun
If the rate of allocations is too high for heapprofd to keep up, the profiling
session will end early due to a buffer overrun. If the buffer overrun is
caused by a transient spike in allocations, increasing the shared memory buffer
size (passing `--shmem-size` to heap\_profile) can resolve the issue.
Otherwise the sampling interval can be increased (at the expense of lower
accuracy in the resulting profile) by passing `--interval` to heap\_profile.

### Profile is empty
Check whether your target process is eligible to be profiled by consulting
[Target processes](#target-processes) above.

## Known Issues

* Does not work on x86 platforms (including the Android cuttlefish emulator).

## Ways to count memory

When using heapprofd and interpreting results, it is important to know the
precise meaning of the different memory metrics that can be obtained from the
operating system.

**heapprofd** gives you the number of bytes the target program
requested from the allocator. If you are profiling a Java app from startup,
allocations that happen early in the application's initialization will not be
visible to heapprofd. Native services that do not fork from the Zygote
are not affected by this.

**malloc\_info** is a libc function that gives you information about the
allocator. This can be triggered on userdebug builds by using
`am dumpheap -m <PID> /data/local/tmp/heap.txt`. This will in general be more
than the memory seen by heapprofd because, depending on the allocator, not all
memory is immediately freed. In particular, jemalloc retains some freed memory
in thread caches.

**Heap RSS** is the amount of memory requested from the operating system by the
allocator. This is larger than the previous two numbers because memory can only
be obtained in page size chunks, and fragmentation causes some of that memory to
be wasted. This can be obtained by running `adb shell dumpsys meminfo <PID>` and
looking at the "Private Dirty" column.

|                     | heapprofd         | malloc\_info | RSS |
|---------------------|-------------------|--------------|-----|
| from native startup | x                 | x            | x   |
| after zygote init   | x                 | x            | x   |
| before zygote init  |                   | x            | x   |
| thread caches       |                   | x            | x   |
| fragmentation       |                   |              | x   |

If you observe high RSS or malloc\_info metrics but heapprofd does not match,
there might be a problem with fragmentation or the allocator.

## Manual instructions
*It is not recommended to use these instructions unless you have advanced
requirements or are developing heapprofd. Proceed with caution.*

### Download trace\_to\_text
Download the latest trace\_to\_text for
[Linux](https://storage.googleapis.com/perfetto/trace_to_text-4ab1d18e69bc70e211d27064505ed547aa82f919)
or [MacOS](https://storage.googleapis.com/perfetto/trace_to_text-mac-2ba325f95c08e8cd5a78e04fa85ee7f2a97c847e).
This is needed to convert the Perfetto trace to a pprof-compatible file.

Compare the `sha1sum` of this file to the one contained in the file name.

### Start profiling
To start profiling the process `${PID}`, run the following sequence of commands.
Adjust the `INTERVAL` to trade off runtime impact against accuracy of the
results. If `INTERVAL=1`, every allocation is sampled for maximum accuracy.
Otherwise, a sample is taken every `INTERVAL` bytes on average.

```bash
# PID must be set to the ID of the target process before running this.
# Mean sampling interval in bytes; 1 samples every allocation.
INTERVAL=4096

# Feed a text-format trace config to perfetto on the device via stdin.
echo '
buffers {
  size_kb: 100024
}

data_sources {
  config {
    name: "android.heapprofd"
    target_buffer: 0
    heapprofd_config {
      sampling_interval_bytes: '${INTERVAL}'
      pid: '${PID}'
    }
  }
}

duration_ms: 20000
' | adb shell perfetto --txt -c - -o /data/misc/perfetto-traces/profile

# Copy the recorded trace from the device to the host.
adb pull /data/misc/perfetto-traces/profile /tmp/profile
```

### Convert to pprof compatible file

While we work on UI support, you can convert the trace into pprof-compatible
heap dumps.

Use the trace\_to\_text file downloaded above, with XXXXXXX replaced with the
`sha1sum` of the file.

```
trace_to_text-linux-XXXXXXX profile /tmp/profile
```

This will create a directory in `/tmp/` containing the heap dumps. Run

```
gzip /tmp/heap_profile-XXXXXX/*.pb
```

to get gzipped protos, which tools handling pprof profile protos expect.

Follow the instructions in [Viewing the Data](#viewing-the-data) to visualise
the results.
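
The checksum comparison from the download step can be scripted. This is a
minimal sketch, assuming the file name ends in `-<sha1>` as in the download
links and that `sha1sum` is available on the host (other platforms may need a
different tool). `verify_sha1_suffix` is a hypothetical helper name, not part
of Perfetto.

```bash
#!/bin/sh
# Hypothetical helper: succeed only if the SHA-1 of the file matches the hash
# that follows the last '-' in its file name.
verify_sha1_suffix() {
  file="$1"
  name="$(basename "$file")"
  expected="${name##*-}"                         # text after the last '-'
  actual="$(sha1sum "$file" | cut -d ' ' -f 1)"  # first field is the hash
  [ "$expected" = "$actual" ]
}

# Assumed usage, with XXXXXXX standing for the hash as elsewhere in this doc:
#   verify_sha1_suffix ./trace_to_text-linux-XXXXXXX && echo "checksum OK"
```

The function only reports success or failure through its exit status, so it
can guard the conversion step with `&&`.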