<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "Wayland.ent">
%BOOK_ENTITIES;
]>
<chapter id="chap-Wayland-Architecture">
  <title>Wayland Architecture</title>
  <section id="sect-Wayland-Architecture-wayland_architecture">
    <title>X vs. Wayland Architecture</title>
    <para>
      A good way to understand the Wayland architecture
      and how it is different from X is to follow an event
      from the input device to the point where the change
      it affects appears on screen.
    </para>
    <para>
      This is where we are now with X:
    </para>
    <figure>
      <title>X architecture diagram</title>
      <mediaobjectco>
	<imageobjectco>
	  <areaspec id="map1" units="other" otherunits="imagemap">
	    <area id="area1_1" linkends="x_flow_1" x_steal="#step_1"/>
	    <area id="area1_2" linkends="x_flow_2" x_steal="#step_2"/>
	    <area id="area1_3" linkends="x_flow_3" x_steal="#step_3"/>
	    <area id="area1_4" linkends="x_flow_4" x_steal="#step_4"/>
	    <area id="area1_5" linkends="x_flow_5" x_steal="#step_5"/>
	    <area id="area1_6" linkends="x_flow_6" x_steal="#step_6"/>
	  </areaspec>
	  <imageobject>
	    <imagedata fileref="images/x-architecture.png" format="PNG" />
	  </imageobject>
	</imageobjectco>
      </mediaobjectco>
    </figure>
    <para>
      <orderedlist>
	<listitem id="x_flow_1">
	  <para>
	    The kernel gets an event from an input
	    device and sends it to X through the evdev
	    input driver. The kernel does all the hard
	    work here by driving the device and
	    translating the different device-specific
	    event protocols to the Linux evdev input
	    event standard.
	  </para>
	</listitem>
	<listitem id="x_flow_2">
	  <para>
	    The X server determines which window the
	    event affects and sends it to the clients
	    that have selected for the event in question
	    on that window. The X server doesn't
	    actually know how to do this right, since
	    the window location on screen is controlled
	    by the compositor and may be transformed in
	    a number of ways that the X server doesn't
	    understand (scaled down, rotated, wobbling,
	    etc).
	  </para>
	</listitem>
	<listitem id="x_flow_3">
	  <para>
	    The client looks at the event and decides
	    what to do. Often the UI will have to change
	    in response to the event - perhaps a check
	    box was clicked or the pointer entered a
	    button that must be highlighted. Thus the
	    client sends a rendering request back to the
	    X server.
	  </para>
	</listitem>
	<listitem id="x_flow_4">
	  <para>
	    When the X server receives the rendering
	    request, it sends it to the driver to let it
	    program the hardware to do the rendering.
	    The X server also calculates the bounding
	    region of the rendering, and sends that to
	    the compositor as a damage event.
	  </para>
	</listitem>
	<listitem id="x_flow_5">
	  <para>
	    The damage event tells the compositor that
	    something changed in the window and that it
	    has to recomposite the part of the screen
	    where that window is visible. The compositor
	    is responsible for rendering the entire
	    screen contents based on its scenegraph and
	    the contents of the X windows. Yet, it has
	    to go through the X server to render this.
	  </para>
	</listitem>
	<listitem id="x_flow_6">
	  <para>
	    The X server receives the rendering requests
	    from the compositor and either copies the
	    compositor back buffer to the front buffer
	    or does a pageflip. In the general case, the
	    X server has to do this step so it can
	    account for overlapping windows, which may
	    require clipping, and determine whether or
	    not it can page flip. However, for a
	    compositor, which is always fullscreen, this
	    is another unnecessary context switch.
	  </para>
	</listitem>
      </orderedlist>
    </para>
    <para>
      As suggested above, there are a few problems with this
      approach. The X server doesn't have the information to
      decide which window should receive the event, nor can it
      transform the screen coordinates to window-local
      coordinates. And even though X has handed responsibility for
      the final painting of the screen to the compositing manager,
      X still controls the front buffer and modesetting. Most of
      the complexity that the X server used to handle is now
      available in the kernel or self-contained libraries (KMS,
      evdev, mesa, fontconfig, freetype, cairo, Qt etc). In
      general, the X server is now just a middle man that
      introduces an extra step between applications and the
      compositor and an extra step between the compositor and the
      hardware.
    </para>
    <para>
      In Wayland the compositor is the display server. We transfer
      the control of KMS and evdev to the compositor. The Wayland
      protocol lets the compositor send the input events directly
      to the clients and lets the client send the damage event
      directly to the compositor:
    </para>
    <figure>
      <title>Wayland architecture diagram</title>
      <mediaobjectco>
	<imageobjectco>
	  <areaspec id="mapB" units="other" otherunits="imagemap">
	    <area id="areaB_1" linkends="wayland_flow_1" x_steal="#step_1"/>
	    <area id="areaB_2" linkends="wayland_flow_2" x_steal="#step_2"/>
	    <area id="areaB_3" linkends="wayland_flow_3" x_steal="#step_3"/>
	    <area id="areaB_4" linkends="wayland_flow_4" x_steal="#step_4"/>
	  </areaspec>
	  <imageobject>
	    <imagedata fileref="images/wayland-architecture.png" format="PNG" />
	  </imageobject>
	</imageobjectco>
      </mediaobjectco>
    </figure>
    <para>
      <orderedlist>
	<listitem id="wayland_flow_1">
	  <para>
	    The kernel gets an event and sends
	    it to the compositor. This
	    is similar to the X case, which is
	    great, since we get to reuse all the
	    input drivers in the kernel.
	  </para>
	</listitem>
	<listitem id="wayland_flow_2">
	  <para>
	    The compositor looks through its
	    scenegraph to determine which window
	    should receive the event. The
	    scenegraph corresponds to what's on
	    screen and the compositor
	    understands the transformations that
	    it may have applied to the elements
	    in the scenegraph. Thus, the
	    compositor can pick the right window
	    and transform the screen coordinates
	    to window-local coordinates, by
	    applying the inverse
	    transformations. The types of
	    transformation that can be applied
	    to a window are restricted only by
	    what the compositor can do, as long
	    as it can compute the inverse
	    transformation for the input events.
	  </para>
	</listitem>
	<listitem id="wayland_flow_3">
	  <para>
	    As in the X case, when the client
	    receives the event, it updates the
	    UI in response. But in the Wayland
	    case, the rendering happens in the
	    client, and the client just sends a
	    request to the compositor to
	    indicate the region that was
	    updated.
	  </para>
	</listitem>
	<listitem id="wayland_flow_4">
	  <para>
	    The compositor collects damage
	    requests from its clients and then
	    recomposites the screen. The
	    compositor can then directly issue
	    an ioctl to schedule a pageflip with
	    KMS.
	  </para>
	</listitem>
      </orderedlist>
    </para>
  </section>
  <section id="sect-Wayland-Architecture-wayland_rendering">
    <title>Wayland Rendering</title>
    <para>
      One of the details I left out in the above overview
      is how clients actually render under Wayland. By
      removing the X server from the picture we also
      removed the mechanism by which X clients typically
      render. But there's another mechanism that we're
      already using with DRI2 under X: direct rendering.
      With direct rendering, the client and the server
      share a video memory buffer. The client links to a
      rendering library such as OpenGL that knows how to
      program the hardware and renders directly into the
      buffer. The compositor in turn can take the buffer
      and use it as a texture when it composites the
      desktop. After the initial setup, the client only
      needs to tell the compositor which buffer to use and
      when and where it has rendered new content into it.
    </para>

    <para>
      This leaves an application with two ways to update its window contents:
    </para>
    <para>
      <orderedlist>
	<listitem>
	  <para>
	    Render the new content into a new buffer and tell the compositor
	    to use that instead of the old buffer. The application can
	    allocate a new buffer every time it needs to update the window
	    contents or it can keep two (or more) buffers around and cycle
	    between them. The buffer management is entirely under
	    application control.
	  </para>
	</listitem>
	<listitem>
	  <para>
	    Render the new content into the buffer that it previously
	    told the compositor to use. While it's possible to just
	    render directly into the buffer shared with the compositor,
	    this might race with the compositor. What can happen is that
	    repainting the window contents could be interrupted by the
	    compositor repainting the desktop. If the application gets
	    interrupted just after clearing the window but before
	    rendering the contents, the compositor will texture from a
	    blank buffer. The result is that the application window will
	    flicker between a blank window and half-rendered content. The
	    traditional way to avoid this is to render the new content
	    into a back buffer and then copy from there into the
	    compositor surface. The back buffer can be allocated on the
	    fly and just big enough to hold the new content, or the
	    application can keep a buffer around. Again, this is under
	    application control.
	  </para>
	</listitem>
      </orderedlist>
    </para>
    <para>
      In either case, the application must tell the compositor
      which area of the surface holds new contents. When the
      application renders directly to the shared buffer, the
      compositor needs to be notified that there is new content.
      But also when exchanging buffers, the compositor doesn't
      assume anything changed, and needs a request from the
      application before it will repaint the desktop. The idea is
      that even if an application passes a new buffer to the
      compositor, only a small part of the buffer may be
      different, like a blinking cursor or a spinner.
    </para>
  </section>
  <section id="sect-Wayland-Architecture-wayland_hw_enabling">
    <title>Hardware Enabling for Wayland</title>
    <para>
      Typically, hardware enabling includes modesetting/display
      and EGL/GLES2. On top of that Wayland needs a way to share
      buffers efficiently between processes. There are two sides
      to that, the client side and the server side.
    </para>
    <para>
      On the client side we've defined a Wayland EGL platform. In
      the EGL model, that consists of the native types
      (EGLNativeDisplayType, EGLNativeWindowType and
      EGLNativePixmapType) and a way to create those types. In
      other words, it's the glue code that binds the EGL stack and
      its buffer sharing mechanism to the generic Wayland API. The
      EGL stack is expected to provide an implementation of the
      Wayland EGL platform. The full API is in the wayland-egl.h
      header. The open source implementation in the mesa EGL stack
      is in wayland-egl.c and platform_wayland.c.
    </para>
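In client code the glue is small. The sequence below uses the actual wayland-egl.h entry point (wl_egl_window_create) together with standard EGL calls, but it is only a sketch: error handling and the Wayland connection boilerplate are omitted, and it will not build without the Wayland and EGL development headers.

```c
#include <wayland-client.h>
#include <wayland-egl.h>
#include <EGL/egl.h>

/* Sketch: bind a wl_surface to EGL so the client can render into it
 * directly. `display`, `surface` and `config` are assumed to have been
 * obtained elsewhere (wl_display_connect, registry binds, eglChooseConfig). */
static EGLSurface
create_wayland_egl_surface(struct wl_display *display,
                           struct wl_surface *surface,
                           EGLConfig config, int width, int height)
{
	EGLDisplay egl_display = eglGetDisplay((EGLNativeDisplayType)display);
	eglInitialize(egl_display, NULL, NULL);

	/* wl_egl_window is the Wayland EGLNativeWindowType: it ties the
	 * wl_surface to a size the EGL stack can allocate buffers for. */
	struct wl_egl_window *native =
		wl_egl_window_create(surface, width, height);

	/* From here on the client just renders with GL/GLES and calls
	 * eglSwapBuffers(), which hands the finished buffer over to the
	 * compositor via the vendor's protocol extension. */
	return eglCreateWindowSurface(egl_display, config,
	                              (EGLNativeWindowType)native, NULL);
}
```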
    <para>
      Under the hood, the EGL stack is expected to define a
      vendor-specific protocol extension that lets the client side
      EGL stack communicate buffer details with the compositor in
      order to share buffers. The point of the wayland-egl.h API
      is to abstract that away and just let the client create an
      EGLSurface for a Wayland surface and start rendering. The
      open source stack uses the drm Wayland extension, which lets
      the client discover the drm device to use and authenticate
      and then share drm (GEM) buffers with the compositor.
    </para>
    <para>
      The server side of Wayland is the compositor and core UX for
      the vertical, typically integrating the task switcher, app
      launcher and lock screen in one monolithic application. The
      server runs on top of a modesetting API (kernel modesetting,
      OpenWF Display or similar) and composites the final UI using
      a mix of EGL/GLES2 rendering and hardware overlays, if
      available. Enabling modesetting, EGL/GLES2 and overlays is
      something that should be part of standard hardware bringup.
      The extra requirement for Wayland enabling is the
      EGL_WL_bind_wayland_display extension that lets the
      compositor create an EGLImage from a generic Wayland shared
      buffer. It's similar to the EGL_KHR_image_pixmap extension
      to create an EGLImage from an X pixmap.
    </para>
    <para>
      The extension has a setup step where you have to bind the
      EGL display to a Wayland display. Then as the compositor
      receives generic Wayland buffers from the clients (typically
      when the client calls eglSwapBuffers), it will be able to
      pass the struct wl_buffer pointer to eglCreateImageKHR as
      the EGLClientBuffer argument and with EGL_WAYLAND_BUFFER_WL
      as the target. This will create an EGLImage, which can then
      be used by the compositor as a texture or passed to the
      modesetting code to use as an overlay plane. Again, this is
      implemented by the vendor-specific protocol extension, which
      on the server side will receive the driver-specific details
      about the shared buffer and turn that into an EGL image when
      the user calls eglCreateImageKHR.
    </para>
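Put together, the compositor-side sequence looks roughly as follows. The entry points (eglBindWaylandDisplayWL, eglCreateImageKHR with EGL_WAYLAND_BUFFER_WL, glEGLImageTargetTexture2DOES) are the real extension functions, but the snippet is a sketch: it assumes the extension prototypes are available (EGL_EGLEXT_PROTOTYPES) and omits all error handling.

```c
#include <wayland-server.h>
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>

/* Setup: tie the EGL display to the compositor's wl_display so the
 * vendor protocol extension can be wired up behind the scenes. */
static void bind_display(EGLDisplay egl_display, struct wl_display *display)
{
	eglBindWaylandDisplayWL(egl_display, display);
}

/* Later, for each wl_buffer a client attaches (typically produced by
 * the client's eglSwapBuffers): */
static GLuint texture_from_buffer(EGLDisplay egl_display,
                                  struct wl_buffer *buffer)
{
	EGLImageKHR image =
		eglCreateImageKHR(egl_display, EGL_NO_CONTEXT,
		                  EGL_WAYLAND_BUFFER_WL,
		                  (EGLClientBuffer)buffer, NULL);

	GLuint tex;
	glGenTextures(1, &tex);
	glBindTexture(GL_TEXTURE_2D, tex);
	/* The EGLImage becomes the storage of the texture the compositor
	 * samples when compositing; alternatively the image could be
	 * handed to the modesetting code as an overlay plane. */
	glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
	return tex;
}
```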
  </section>
</chapter>