1# Android Jank detection with FrameTimeline
2
3NOTE: **FrameTimeline requires Android 12(S) or higher**
4
5A frame is said to be janky if the time the frame was presented on screen does
6not match the predicted present time given by the scheduler.
7
8A jank can cause:
9* Unstable frame rate
10* Increased latency
11
12FrameTimeline is a module within SurfaceFlinger that detects janks and reports
13the source of the jank.
14[SurfaceViews](https://developer.android.com/reference/android/view/SurfaceView)
15are currently **not supported**, but will be, in future.
16
17## UI
18
19Two new tracks are added for every application that had at least one frame on
20screen.
21
22![](/docs/images/frametimeline/timeline_tracks.png)
23
24* Expected Timeline
25Each slice represents the time given to the app for rendering the
26frame. To avoid janks in the system, the app is expected to finish within this
27time frame.
28
29* Actual Timeline
30These slices represent the actual time an app took to complete the frame
31(including GPU work) and send it to SurfaceFlinger for composition. **Note: The
32actual frame start of apps is not yet known to FrameTimeline. Expected start
33time is used instead**. The end time of the slices here represent `max(gpu time,
34post time)`. **Post time** is the time the app's frame was posted to
35SurfaceFlinger.
36
37![](/docs/images/frametimeline/app-timelines.png)
38
39Similarly, SurfaceFlinger also gets these two new tracks representing the
40expected time it's supposed to finish within, and the actual time it took to
41finish compositing frames and presenting on-screen. Here, SurfaceFlinger's work
42represents everything underneath it in the display stack. This includes the
43Composer and the DisplayHAL. So, the slices represent SurfaceFlinger main
44thread's start to on-screen update.
45
46The names of the slices represent the token received from
47[choreographer](https://developer.android.com/reference/android/view/Choreographer).
48You can compare a slice in the actual timeline track to its corresponding slice
49in the expected timeline track to see how the app performed compared to the
50expectations. In addition, for debugging purposes, the token is added to the
51app's **doFrame** and **RenderThread** slices. For SurfaceFlinger, the same
52token is shown in **onMessageReceived**.
53
54![](/docs/images/frametimeline/app-vsyncid.png)
55
56![](/docs/images/frametimeline/sf-vsyncid.png)
57
58### Selecting an actual timeline slice
59
60![](/docs/images/frametimeline/selection.png)
61
62The selection details provide more information on what happened with the frame.
63These include:
64
65* **Present Type**
66
67Was the frame early, on time or late.
68* **On time finish**
69
70Did the application finish its work for the frame on time?
71* **Jank Type**
72
73Was there a jank observed with this frame? If yes, this shows what type of jank
74was observed. If not, the type would be **None**.
75* **Prediction type**
76
77Did the prediction expire by the time this frame was received by FrameTimeline?
78If yes, this will say **Expired Prediction**. If not, **Valid Prediction**.
79* **GPU Composition**
80
81Boolean that tells if the frame was composited by the GPU or not.
82* **Layer Name**
83
84Name of the Layer/Surface to which the frame was presented. Some processes
85update frames to multiple surfaces. Here, multiple slices with the same token
86will be shown in the Actual Timeline. Layer Name can be a good way to
87disambiguate between these slices.
88* **Is Buffer?**
89
90Boolean that tells if the frame corresponds to a buffer or an animation.
91
92### Flow events
93
94Selecting an actual timeline slice in the app also draws a line back to the
95corresponding SurfaceFlinger timeline slice.
96
97![](/docs/images/frametimeline/select-app-slice.png)
98
99Since SurfaceFlinger can composite frames from multiple layers into a single
100frame-on-screen (called a **DisplayFrame**), selecting a DisplayFrame draws
101arrows to all the frames that were composited together. This can span over
102multiple processes.
103
104![](/docs/images/frametimeline/select-sf-slice-1.png)
105![](/docs/images/frametimeline/select-sf-slice-2.png)
106
107### Color codes
108
109| Color | Image | Description    |
110| :---      | :---: | :---           |
111| Green | ![](/docs/images/frametimeline/green.png) | A good frame. No janks observed |
112| Light Green | ![](/docs/images/frametimeline/light-green.png) | High latency state. The framerate is smooth but frames are presented late, resulting in an increased input latency.|
113| Red | ![](/docs/images/frametimeline/red.png) | Janky frame. The process the slice belongs to, is the reason for the jank. |
114| Yellow | ![](/docs/images/frametimeline/yellow.png) | Used only by the apps. The frame is janky but app wasn't the reason, SurfaceFlinger caused the jank. |
115| Blue | ![](/docs/images/frametimeline/blue.png) | Dropped frame. Not related to jank. The frame was dropped by SurfaceFlinger, preferring an updated frame over this. |
116
117## Janks explained
118
119The jank types are defined in
120[JankInfo.h](http://cs/android/frameworks/native/libs/gui/include/gui/JankInfo.h?l=22).
121Since each app is written differently, there is no common way to go into the
122internals of the apps and specify what the reason for the jank was. Our goal is
123not to do this but rather, provide a quick way to tell if app was janky or if
124SurfaceFlinger was janky.
125
126### None
127
128All good. No jank with the frame. The ideal state that should be aimed for.
129
130### App janks
131
132* **AppDeadlineMissed**
133
134The app ran longer than expected causing a jank. The total time taken by the app
135frame is calculated by using the choreographer wake-up as the start time and
136max(gpu, post time) as the end time. Post time is the time the frame was sent to
137SurfaceFlinger. Since the GPU usually runs in parallel, it could be that the gpu
138finished later than the post time.
139
140* **BufferStuffing**
141
142This is more of a state than a jank. This happens if the app keeps sending new
143frames to SurfaceFlinger before the previous frame was even presented. The
144internal Buffer Queue is stuffed with buffers that are yet to be presented,
145hence the name, Buffer Stuffing. These extra buffers in the queue are presented
146only one after the other thus resulting in extra latency.
147This can also result in a stage where there are no more buffers for the app to
148use and it goes into a dequeue blocking wait.
149The actual duration of work performed by the app might still be within the
150deadline, but due to the stuffed nature, all the frames will be presented at
151least one vsync late no matter how quickly the app finishes its work.
152Frames will still be smooth in this state but there is an increased input
153latency associated with the late present.
154
155### SurfaceFlinger Janks
156
157There are two ways SurfaceFlinger can composite frames.
158* Device Composition - uses a dedicated hardware
159* GPU/Client composition - uses GPU to composite
160
161An important thing to note is that performing device composition happens as a
162blocking call on the main thread. However, GPU composition happens in parallel.
163SurfaceFlinger performs the necessary draw calls and then hands over the gpu
164fence to the display device. The display device then waits for the fence to be
165signaled, and then presents the frame.
166
167* **SurfaceFlingerCpuDeadlineMissed**
168
169SurfaceFlinger is expected to finish within the given deadline. If the main
170thread ran for longer than that, the jank is then
171SurfaceFlingerCpuDeadlineMissed. SurfaceFlinger’s CPU time is the time spent on
172the main thread. This includes the entire composition time if device composition
173was used. If GPU composition was used, this includes the time to write the draw
174calls and handing over the frame to the GPU.
175
176* **SurfaceFlingerGpuDeadlineMissed**
177
178The time taken by SurfaceFlinger’s main thread on the CPU + the GPU composition
179time together were longer than expected. Here, the CPU time would have still
180been within the deadline but since the work on the GPU wasn’t ready on time, the
181frame got pushed to the next vsync.
182
183* **DisplayHAL**
184
185DisplayHAL jank refers to the case where SurfaceFlinger finished its work and
186sent the frame down to the HAL on time, but the frame wasn’t presented on the
187vsync. It was presented on the next vsync. It could be that SurfaceFlinger did
188not give enough time for the HAL’s work or it could be that there was a genuine
189delay in the HAL’s work.
190
191* **PredictionError**
192
193SurfaceFlinger’s scheduler plans ahead the time to present the frames. However,
194this prediction sometimes drifts away from the actual hardware vsync time. For
195example, a frame might have predicted present time as 20ms. Due to a drift in
196estimation, the actual present time of the frame could be 23ms. This is called a
197Prediction Error in SurfaceFlinger’s scheduler. The scheduler corrects itself
198periodically, so this drift isn’t permanent. However, the frames that had a
199drift in prediction will still be classified as jank for tracking purposes.
200
201Isolated prediction errors are not usually perceived by the user as the
202scheduler is quick to adapt and fix the drift.
203
204### Unknown jank
205
206As the name suggests, the reason for the jank is unknown in this case. An
207example here would be that SurfaceFlinger or the App took longer than expected
208and missed the deadline but the frame was still presented early. The probability
209of such a jank happening is very low but not impossible.
210
211## SQL
212
213At the SQL level, frametimeline data is available in two tables
214* [`expected_frame_timeline_slice`](/docs/analysis/sql-tables.autogen#expected_frame_timeline_slice)
215* [`actual_frame_timeline_slice`](/docs/analysis/sql-tables.autogen#actual_frame_timeline_slice)
216
217```
218select ts, dur, surface_frame_token as app_token, display_frame_token as sf_token, process.name
219from expected_frame_timeline_slice left join process using(upid)
220```
221
222ts | dur | app_token | sf_token | name
223---|-----|-----------|----------|-----
22460230453475 | 20500000 | 3135 | 3142 | com.google.android.apps.nexuslauncher
22560241677540 | 20500000 | 3137 | 3144 | com.google.android.apps.nexuslauncher
22660252895412 | 20500000 | 3139 | 3146 | com.google.android.apps.nexuslauncher
22760284614241 | 10500000 | 0 | 3144 | /system/bin/surfaceflinger
22860295858299 | 10500000 | 0 | 3146 | /system/bin/surfaceflinger
22960297798913 | 20500000 | 3147 | 3150 | com.android.systemui
23060307075728 | 10500000 | 0 | 3148 | /system/bin/surfaceflinger
23160318297746 | 10500000 | 0 | 3150 | /system/bin/surfaceflinger
23260320236468 | 20500000 | 3151 | 3154 | com.android.systemui
23360329511401 | 10500000 | 0 | 3152 | /system/bin/surfaceflinger
23460340732956 | 10500000 | 0 | 3154 | /system/bin/surfaceflinger
23560342673064 | 20500000 | 3155 | 3158 | com.android.systemui
236
237
238```
239select ts, dur, surface_frame_token as app_token, display_frame_token, jank_type, on_time_finish, present_type, layer_name, process.name
240from actual_frame_timeline_slice left join process using(upid)
241```
242
243ts | dur | app_token | sf_token | jank_type | on_time_finish | present_type | layer_name | name
244---|-----|-----------|----------|-----------|----------------|--------------|------------|-----
24560230453475 | 26526379 | 3135 | 3142 | Buffer Stuffing | 1 | Late Present | TX - com.google.android.apps.nexuslauncher/com.google.android.apps.nexuslauncher.NexusLauncherActivity#0 | com.google.android.apps.nexuslauncher
24660241677540 | 28235805 | 3137 | 3144 | Buffer Stuffing | 1 | Late Present | TX - com.google.android.apps.nexuslauncher/com.google.android.apps.nexuslauncher.NexusLauncherActivity#0 | com.google.android.apps.nexuslauncher
24760252895412 | 2546525 | 3139 | 3142 | None | 1 | On-time Present | TX - NavigationBar0#0 | com.android.systemui
24860252895412 | 27945382 | 3139 | 3146 | Buffer Stuffing | 1 | Late Present | TX - com.google.android.apps.nexuslauncher/com.google.android.apps.nexuslauncher.NexusLauncherActivity#0 | com.google.android.apps.nexuslauncher
24960284808190 | 10318230 | 0 | 3144 | None | 1 | On-time Present | [NULL] | /system/bin/surfaceflinger
25060296067722 | 10265574 | 0 | 3146 | None | 1 | On-time Present | [NULL] | /system/bin/surfaceflinger
25160297798913 | 5239227 | 3147 | 3150 | None | 1 | On-time Present | TX - NavigationBar0#0 | com.android.systemui
25260307246161 | 10301772 | 0 | 3148 | None | 1 | On-time Present | [NULL] | /system/bin/surfaceflinger
25360318497204 | 10281199 | 0 | 3150 | None | 1 | On-time Present | [NULL] | /system/bin/surfaceflinger
25460320236468 | 2747559 | 3151 | 3154 | None | 1 | On-time Present | TX - NavigationBar0#0 | com.android.systemui
255
256## TraceConfig
257
258Trace Protos:
259[FrameTimelineEvent](/docs/reference/trace-packet-proto.autogen#FrameTimelineEvent)
260
261Datasource:
262
263```protobuf
264data_sources {
265    config {
266        name: "android.surfaceflinger.frametimeline"
267    }
268}
269```
270
271