// Copyright 2021-2023 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

= VK_KHR_dynamic_rendering
:toc: left
:refpage: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/
:sectnums:

This document details API design ideas for the VK_KHR_dynamic_rendering extension, which adds a more dynamic and flexible way to use draw commands, as a straightforward replacement for single pass render passes.


== Problem Statement

Render passes are the number one complaint from developers about Vulkan and have been almost since launch. Some of the most pointed issues are as follows:

  . Other graphics APIs offer much more flexible interfaces for the same functionality
  . Most of the render pass API in Vulkan goes unused
  . Most applications do not or cannot use subpasses, but still pay the cost of setting them up
  . The API does not fit into most existing software architectures
  . Fundamentally, other than load/store actions, they do not address real issues for IHVs or ISVs
  . When teaching Vulkan as an API, this is a huge place where people trip up

A further problem has emerged more recently: baking this state into pipeline creation actively contributes to the pipeline compilation time problem, and the ability to separate out most of this state would help enormously.

This proposal _only_ addresses single pass render passes; additional functionality to replace multiple subpasses will be in a separate proposal.

== Solution Space

The following rough options exist for addressing this issue:

  . Drastically expand the render pass compatibility options
  . Allow render pass objects to be `VK_NULL_HANDLE` until record time
  . Create a new API that pares down the information required to the bare minimum

Option 1 has the advantage of being the least invasive in terms of API changes: it only really affects a handful of VUs, whilst still solving some of the flexibility issues.
The disadvantage of this is that applications still have to manage render pass objects, and it does not really address any of the points in the problem statement directly.

Option 2 effectively allows applications to provide the same information to implementations again without any real API change, and addresses point 4 in the problem statement directly as it allows the render pass information to be provided fairly late.

Option 3 is a much more drastic change in terms of the API, requiring additional paths through the API/driver that are generally somewhat annoying to manage. This has the advantage of being able to address all points in the problem statement, however.
Render pass objects also carry a lot of baggage in terms of developer opinion, and an overhaul replacement is likely to be better received for that reason.

Developers and the Vulkan WG seem to be more enthusiastic about Option 3 than other approaches, and so it is the approach proposed here.


== Proposal

=== Begin/End Render Pass

This extension introduces new commands to begin and end a render pass:

[source,c]
----
VKAPI_ATTR void VKAPI_CALL vkCmdBeginRenderingKHR(
    VkCommandBuffer                             commandBuffer,
    const VkRenderingInfoKHR*                   pRenderingInfo);

VKAPI_ATTR void VKAPI_CALL vkCmdEndRenderingKHR(
    VkCommandBuffer                             commandBuffer);
----

Neither of these commands makes any reference to a render pass object; render passes are now fully dynamic.
These commands may be called inside secondary command buffers, but `vkCmdBeginRenderingKHR` and `vkCmdEndRenderingKHR` must always appear as a pair in the same command buffer.
Note that render passes can still span multiple command buffers via <<suspending-and-resuming,suspended render passes>>.

[source,c]
----
typedef struct VkRenderingInfoKHR {
    VkStructureType                     sType;
    const void*                         pNext;
    VkRenderingFlagsKHR                 flags;
    VkRect2D                            renderArea;
    uint32_t                            layerCount;
    uint32_t                            viewMask;
    uint32_t                            colorAttachmentCount;
    const VkRenderingAttachmentInfoKHR* pColorAttachments;
    const VkRenderingAttachmentInfoKHR* pDepthAttachment;
    const VkRenderingAttachmentInfoKHR* pStencilAttachment;
} VkRenderingInfoKHR;
----

The rendering info provided to `vkCmdBeginRenderingKHR` is the essential information needed to begin rendering, based on what is and is not currently inside the compatibility rules for render passes.
Notably, this is not a synchronization command: there is no replacement for subpass external dependencies.
Applications should use other synchronization primitives (barriers, events) to manage synchronization.

If `viewMask` is `0`, then multiview is disabled for this render pass, and `layerCount` indicates the number of layers used in each attachment.
If `viewMask` is non-zero, then multiview is enabled for this render pass, and each set bit in `viewMask` indicates a layer index in each attachment that will be rendered.
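
As a minimal sketch of how a `viewMask` value can be interpreted, the following helpers count the views selected by a mask and list the layer indices it selects. The helper names are hypothetical and not part of the extension; they only illustrate the bit-per-layer encoding described above.

```c
#include <stdint.h>

/* Hypothetical helper: number of views (set bits) selected by viewMask.
 * A result of 0 corresponds to multiview being disabled. */
static uint32_t example_view_count(uint32_t viewMask)
{
    uint32_t count = 0;
    while (viewMask != 0) {
        viewMask &= viewMask - 1; /* clear the lowest set bit */
        ++count;
    }
    return count;
}

/* Hypothetical helper: writes the layer index of every set bit, in ascending
 * order, into layers[] and returns how many indices were written. */
static uint32_t example_view_layers(uint32_t viewMask, uint32_t layers[32])
{
    uint32_t n = 0;
    for (uint32_t bit = 0; bit < 32; ++bit) {
        if (viewMask & (UINT32_C(1) << bit)) {
            layers[n++] = bit;
        }
    }
    return n;
}
```

For example, a mask of `0x5` selects two views, rendered to layers 0 and 2 of each attachment, which is a non-contiguous set that `layerCount` alone could not describe.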

==== Attachments

Depth and stencil image info are separated for API clarity (since everything else is applied independently), but they must point to the same image.
The same restriction applies to their respective resolve images.
For each attachment, the information provided is the image view to bind, layout information, resolve information, and load/store ops (including a clear color).

[source,c]
----
typedef struct VkRenderingAttachmentInfoKHR {
    VkStructureType          sType;
    const void*              pNext;
    VkImageView              imageView;
    VkImageLayout            imageLayout;
    VkResolveModeFlagBits    resolveMode;
    VkImageView              resolveImageView;
    VkImageLayout            resolveImageLayout;
    VkAttachmentLoadOp       loadOp;
    VkAttachmentStoreOp      storeOp;
    VkClearValue             clearValue;
} VkRenderingAttachmentInfoKHR;
----

There are no layout transitions or other synchronization info for images; synchronization is done exclusively by existing synchronization commands.
The layouts provided are those that the image must already be in when rendering.

Image views for any attachment may be link:{refpage}VK_NULL_HANDLE.html[VK_NULL_HANDLE], indicating that writes to the attachment are discarded, and reads return undefined values.

Note that the resolve images do not have their own load/store operations; they are treated as if they are implicitly `VK_ATTACHMENT_LOAD_OP_DONT_CARE` and `VK_ATTACHMENT_STORE_OP_STORE`, since other combinations in the existing API do not really carry any useful meaning.

`resolveMode` for color attachments must be `VK_RESOLVE_MODE_NONE` or `VK_RESOLVE_MODE_AVERAGE_BIT`.

===== Store Op None

A new store operation is provided as originally described by link:{refpage}VK_QCOM_render_pass_store_ops.html[VK_QCOM_render_pass_store_ops]:

[source,c]
----
VK_ATTACHMENT_STORE_OP_NONE_KHR = 1000301000,
----

This store operation works largely like DONT_CARE but guarantees that the store op does not access the attachment.
When a render pass accesses an attachment as read only, this can be useful in avoiding a potential write operation during the store operation, and removing the need for synchronization in some cases.
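
One way an application might pick between the three store operations is sketched below. This is an illustrative heuristic under stated assumptions, not guidance from the extension: the `ExampleStoreOp` enum is a stand-in for `VkAttachmentStoreOp` so that the snippet compiles without the Vulkan headers, and `example_choose_store_op` and its two parameters are hypothetical.

```c
#include <stdbool.h>

/* Stand-in for VkAttachmentStoreOp so the sketch is self-contained.
 * STORE and DONT_CARE use the core enum values; NONE uses the value
 * listed above for VK_ATTACHMENT_STORE_OP_NONE_KHR. */
typedef enum ExampleStoreOp {
    EXAMPLE_STORE_OP_STORE     = 0,
    EXAMPLE_STORE_OP_DONT_CARE = 1,
    EXAMPLE_STORE_OP_NONE      = 1000301000
} ExampleStoreOp;

/* Hypothetical policy:
 *  - attachment only read in the pass  -> NONE (store op never touches it)
 *  - written and needed afterwards     -> STORE
 *  - written but not needed afterwards -> DONT_CARE */
static ExampleStoreOp example_choose_store_op(bool written_in_pass,
                                              bool needed_after_pass)
{
    if (!written_in_pass) {
        return EXAMPLE_STORE_OP_NONE;
    }
    return needed_after_pass ? EXAMPLE_STORE_OP_STORE
                             : EXAMPLE_STORE_OP_DONT_CARE;
}
```

A read-only depth attachment used for depth testing in a later pass is the typical case for the first branch.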


==== Rendering Flags

Rendering flags cover the following functionality:

[source,c]
----
typedef enum VkRenderingFlagsKHR {
    VK_RENDERING_CONTENTS_SECONDARY_COMMAND_BUFFERS_BIT_KHR = 0x00000001,
    VK_RENDERING_SUSPENDING_BIT_KHR                         = 0x00000002,
    VK_RENDERING_RESUMING_BIT_KHR                           = 0x00000004,
} VkRenderingFlagsKHR;
----


===== Secondary Command Buffer Contents

`VK_RENDERING_CONTENTS_SECONDARY_COMMAND_BUFFERS_BIT_KHR` works more or less identically to `VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS`, indicating that the contents of the render pass will be entirely recorded inside a secondary command buffer and replayed.
If it is absent, the commands must be wholly recorded inside the command buffer that starts it.

This requires the introduction of a new inheritance info when dynamic rendering is used, as the render pass will no longer provide information required by implementations:

[source,c]
----
typedef struct VkCommandBufferInheritanceRenderingInfoKHR {
    VkStructureType          sType;
    const void*              pNext;
    VkRenderingFlagsKHR      flags;
    uint32_t                 viewMask;
    uint32_t                 colorAttachmentCount;
    const VkFormat*          pColorAttachmentFormats;
    VkFormat                 depthAttachmentFormat;
    VkFormat                 stencilAttachmentFormat;
    VkSampleCountFlagBits    rasterizationSamples;
} VkCommandBufferInheritanceRenderingInfoKHR;
----

Information here must match that in the render pass being executed.
If no color attachments are used or the formats are all `VK_FORMAT_UNDEFINED`, and the `variableMultisampleRate` feature is supported, the rasterization sample count is ignored.
If neither `depthAttachmentFormat` nor `stencilAttachmentFormat` is `VK_FORMAT_UNDEFINED`, they must have the same value.

This allows applications to use secondary command buffers with dynamic rendering as they would have done in the existing render pass API.

However, an alternative method of recording commands across multiple command buffers is also provided by <<suspending-and-resuming,suspending render passes>>.

[[command-buffer-inheritance-mixed-samples]]
====== Mixed Samples

If either of link:{refpage}VK_NV_framebuffer_mixed_samples.html[VK_NV_framebuffer_mixed_samples] or link:{refpage}VK_AMD_mixed_attachment_samples.html[VK_AMD_mixed_attachment_samples] is enabled, the sample counts of color and depth attachments may vary from the `rasterizationSamples`.
In this case, the sample count of each attachment can be specified by including the `VkAttachmentSampleCountInfoAMD`/`VkAttachmentSampleCountInfoNV` structure in the same `pNext` chain.

[source,c]
----
typedef struct VkAttachmentSampleCountInfoAMD {
    VkStructureType                 sType;
    const void*                     pNext;
    VkRenderingFlagsKHR             flags;
    uint32_t                        colorAttachmentCount;
    const VkSampleCountFlagBits*    pColorAttachmentSamples;
    VkSampleCountFlagBits           depthStencilAttachmentSamples;
} VkAttachmentSampleCountInfoAMD;

typedef VkAttachmentSampleCountInfoAMD VkAttachmentSampleCountInfoNV;
----

[[command-buffer-inheritance-multiview-per-view-attributes]]
====== Multiview Per-View Attributes

If link:{refpage}VK_NVX_multiview_per_view_attributes.html[VK_NVX_multiview_per_view_attributes] is enabled, the multiview per-view attributes can be specified by including the `VkMultiviewPerViewAttributesInfoNVX` structure in the same `pNext` chain.


[[suspending-and-resuming]]
===== Suspending and Resuming

`VK_RENDERING_SUSPENDING_BIT_KHR` and `VK_RENDERING_RESUMING_BIT_KHR` allow an alternative method of recording across multiple command buffers.
Applications can suspend a render pass in one command buffer using `VK_RENDERING_SUSPENDING_BIT_KHR`, and resume it in another command buffer by starting an identical render pass with `VK_RENDERING_RESUMING_BIT_KHR`.
Suspended render passes must be resumed by a render pass with identical begin parameters, other than the presence or absence of `VK_RENDERING_SUSPENDING_BIT_KHR`, `VK_RENDERING_RESUMING_BIT_KHR`, and `VK_RENDERING_CONTENTS_SECONDARY_COMMAND_BUFFERS_BIT_KHR`.

It is invalid to record action commands, synchronization commands, or additional render passes between a suspended render pass and the render pass which resumes it.
All pairs of resuming and suspending render passes must be submitted in the same batch.
Applications can resume a dynamic render pass in the same command buffer as it was suspended.
Applications can record a dynamic render pass wholly inside secondary command buffers.
A dynamic render pass can be both suspending and resuming.
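
The "identical begin parameters" rule can be sketched as a comparison that masks out the three flags allowed to differ. The following is a simplified illustration, assuming a hypothetical `ExampleRenderingInfo` stand-in that carries only the scalar members of `VkRenderingInfoKHR`; a complete check would also compare every attachment description. The flag values are those listed under Rendering Flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as listed for VkRenderingFlagsKHR in this proposal. */
#define EXAMPLE_CONTENTS_SECONDARY_BIT 0x00000001u
#define EXAMPLE_SUSPENDING_BIT         0x00000002u
#define EXAMPLE_RESUMING_BIT           0x00000004u

/* Hypothetical, reduced stand-in for VkRenderingInfoKHR (attachments omitted). */
typedef struct ExampleRenderingInfo {
    uint32_t flags;
    int32_t  areaX, areaY;
    uint32_t areaWidth, areaHeight;
    uint32_t layerCount;
    uint32_t viewMask;
    uint32_t colorAttachmentCount;
} ExampleRenderingInfo;

/* Returns true if `resuming` may resume the render pass begun with `suspended`. */
static bool example_can_resume(const ExampleRenderingInfo *suspended,
                               const ExampleRenderingInfo *resuming)
{
    /* These three bits are the only begin parameters allowed to differ. */
    const uint32_t ignored = EXAMPLE_CONTENTS_SECONDARY_BIT |
                             EXAMPLE_SUSPENDING_BIT |
                             EXAMPLE_RESUMING_BIT;

    if (!(suspended->flags & EXAMPLE_SUSPENDING_BIT)) return false;
    if (!(resuming->flags & EXAMPLE_RESUMING_BIT)) return false;

    return (suspended->flags & ~ignored) == (resuming->flags & ~ignored) &&
           suspended->areaX == resuming->areaX &&
           suspended->areaY == resuming->areaY &&
           suspended->areaWidth == resuming->areaWidth &&
           suspended->areaHeight == resuming->areaHeight &&
           suspended->layerCount == resuming->layerCount &&
           suspended->viewMask == resuming->viewMask &&
           suspended->colorAttachmentCount == resuming->colorAttachmentCount;
}
```

Because a render pass can be both suspending and resuming, the resuming info may itself set the suspending bit, which lets a chain of three or more command buffers share one render pass.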


==== Device Groups

The link:{refpage}VkDeviceGroupRenderPassBeginInfo.html[VkDeviceGroupRenderPassBeginInfo] structure can be chained from `VkRenderingInfoKHR`, with the same effect as when chained to link:{refpage}VkRenderPassBeginInfo.html[VkRenderPassBeginInfo] - setting the device mask and setting independent render areas per device.


==== Fragment Shading Rate

If link:{refpage}VK_KHR_fragment_shading_rate.html[VK_KHR_fragment_shading_rate] is enabled, when calling `vkCmdBeginRenderingKHR`, the following structure should be chained to `VkRenderingInfoKHR` to include a fragment shading rate attachment:

[source,c]
----
typedef struct VkRenderingFragmentShadingRateAttachmentInfoKHR {
    VkStructureType                     sType;
    const void*                         pNext;
    VkImageView                         imageView;
    VkImageLayout                       imageLayout;
} VkRenderingFragmentShadingRateAttachmentInfoKHR;
----


==== Fragment Density Map

If link:{refpage}VK_EXT_fragment_density_map.html[VK_EXT_fragment_density_map] is enabled, when calling `vkCmdBeginRenderingKHR`, the following structure should be chained to `VkRenderingInfoKHR` to include a fragment density map attachment:

[source,c]
----
typedef struct VkRenderingFragmentDensityMapAttachmentInfoEXT {
    VkStructureType                     sType;
    const void*                         pNext;
    VkImageView                         imageView;
    VkImageLayout                       imageLayout;
} VkRenderingFragmentDensityMapAttachmentInfoEXT;
----


=== Pipeline Creation

With the removal of render pass objects, it is now necessary to provide some of that same information to implementations at pipeline creation.
This structure is chained from link:{refpage}VkGraphicsPipelineCreateInfo.html[VkGraphicsPipelineCreateInfo]:

[source,c]
----
typedef struct VkPipelineRenderingCreateInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           colorAttachmentCount;
    const VkFormat*    pColorAttachmentFormats;
    VkFormat           depthAttachmentFormat;
    VkFormat           stencilAttachmentFormat;
    uint32_t           viewMask;
} VkPipelineRenderingCreateInfoKHR;
----

If a color or depth/stencil attachment is specified in `vkCmdBeginRenderingKHR`, its format must match that provided here.
If any format here is `VK_FORMAT_UNDEFINED`, the corresponding attachment must not be specified in `vkCmdBeginRenderingKHR`.
If neither `depthAttachmentFormat` nor `stencilAttachmentFormat` is `VK_FORMAT_UNDEFINED`, they must have the same value.

The value of `viewMask` must match the value of the `viewMask` member of `VkRenderingInfoKHR`.
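
The matching rules above can be sketched as a small compatibility check. This is an illustration only: `ExampleFormats` is a hypothetical stand-in holding formats as plain integers (with `0` standing for `VK_FORMAT_UNDEFINED`, and an unspecified attachment also represented as `0`), and the requirement that both sides have the same color attachment count is an assumption made here so the arrays can be compared element by element.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_FORMAT_UNDEFINED 0u /* VK_FORMAT_UNDEFINED has the value 0 */

/* Hypothetical stand-in for the format-related members shared by the
 * pipeline create info and the attachments bound at begin-rendering time. */
typedef struct ExampleFormats {
    uint32_t        viewMask;
    uint32_t        colorAttachmentCount;
    const uint32_t *colorFormats; /* UNDEFINED means "no attachment" */
    uint32_t        depthFormat;
    uint32_t        stencilFormat;
} ExampleFormats;

/* A bound attachment must use the pipeline's format; an unspecified
 * attachment (UNDEFINED on the rendering side) always passes this check. */
static bool example_attachment_ok(uint32_t pipelineFormat, uint32_t boundFormat)
{
    return boundFormat == EXAMPLE_FORMAT_UNDEFINED ||
           boundFormat == pipelineFormat;
}

static bool example_pipeline_compatible(const ExampleFormats *pipeline,
                                        const ExampleFormats *rendering)
{
    if (pipeline->viewMask != rendering->viewMask) return false;

    /* Assumption of this sketch: both sides declare the same count. */
    if (pipeline->colorAttachmentCount != rendering->colorAttachmentCount) return false;

    /* If both depth and stencil formats are given, they must be equal. */
    if (pipeline->depthFormat != EXAMPLE_FORMAT_UNDEFINED &&
        pipeline->stencilFormat != EXAMPLE_FORMAT_UNDEFINED &&
        pipeline->depthFormat != pipeline->stencilFormat) return false;

    for (uint32_t i = 0; i < pipeline->colorAttachmentCount; ++i) {
        if (!example_attachment_ok(pipeline->colorFormats[i],
                                   rendering->colorFormats[i])) return false;
    }
    return example_attachment_ok(pipeline->depthFormat, rendering->depthFormat) &&
           example_attachment_ok(pipeline->stencilFormat, rendering->stencilFormat);
}
```

Note that a pipeline format of `VK_FORMAT_UNDEFINED` combined with a bound attachment fails the per-attachment check, which covers the second rule without a separate branch.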

==== Multiview Per-View Attributes

If link:{refpage}VK_NVX_multiview_per_view_attributes.html[VK_NVX_multiview_per_view_attributes] is enabled, the multiview per-view attributes can be specified by including the `VkMultiviewPerViewAttributesInfoNVX` structure in the same `pNext` chain.

==== Mixed Sample Attachments

If either of link:{refpage}VK_NV_framebuffer_mixed_samples.html[VK_NV_framebuffer_mixed_samples] or link:{refpage}VK_AMD_mixed_attachment_samples.html[VK_AMD_mixed_attachment_samples] is enabled, the sample counts of color and depth attachments must be specified at pipeline creation as well.
As with <<command-buffer-inheritance-mixed-samples,command buffer inheritance>>, the sample count of each attachment can be specified by including the `VkAttachmentSampleCountInfoAMD`/`VkAttachmentSampleCountInfoNV` structure in the `pNext` chain.
If the structure is omitted, the sample count for each attachment is considered equal to link:{refpage}VkPipelineMultisampleStateCreateInfo.html[`VkPipelineMultisampleStateCreateInfo::rasterizationSamples`].
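
The fallback described above can be sketched as a lookup that uses the per-attachment count when the structure is present and `rasterizationSamples` otherwise. The `ExampleAttachmentSampleCountInfo` type and `example_color_samples` function are hypothetical stand-ins, with sample counts held as plain integers; treating an out-of-range index the same as an omitted structure is a defensive choice of this sketch rather than behavior stated by the proposal.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the sample-count members of
 * VkAttachmentSampleCountInfoAMD / VkAttachmentSampleCountInfoNV. */
typedef struct ExampleAttachmentSampleCountInfo {
    uint32_t        colorAttachmentCount;
    const uint32_t *colorAttachmentSamples;
    uint32_t        depthStencilAttachmentSamples;
} ExampleAttachmentSampleCountInfo;

/* Effective sample count of color attachment `index`.
 * `info` is NULL when the structure is not in the pNext chain. */
static uint32_t example_color_samples(const ExampleAttachmentSampleCountInfo *info,
                                      uint32_t index,
                                      uint32_t rasterizationSamples)
{
    if (info == NULL || index >= info->colorAttachmentCount) {
        return rasterizationSamples; /* structure omitted: use pipeline value */
    }
    return info->colorAttachmentSamples[index];
}
```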


==== Fragment Shading Rate

If link:{refpage}VK_KHR_fragment_shading_rate.html[VK_KHR_fragment_shading_rate] is enabled, a new rasterization state pipeline creation flag must be provided if a shading rate attachment will be used:

[source,c]
----
VK_PIPELINE_CREATE_RENDERING_FRAGMENT_SHADING_RATE_ATTACHMENT_BIT_KHR
----


==== Fragment Density Map

If link:{refpage}VK_EXT_fragment_density_map.html[VK_EXT_fragment_density_map] is enabled, a new rasterization state pipeline creation flag must be provided if a fragment density map will be used:

[source,c]
----
VK_PIPELINE_CREATE_RENDERING_FRAGMENT_DENSITY_MAP_ATTACHMENT_BIT_EXT
----


=== Features

The following features are exposed by this extension:

[source,c]
----
typedef struct VkPhysicalDeviceDynamicRenderingFeaturesKHR {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           dynamicRendering;
} VkPhysicalDeviceDynamicRenderingFeaturesKHR;
----

`dynamicRendering` is the core feature enabling this extension's functionality.


== Examples


=== Creating a Pipeline

[source,c]
----
VkFormat colorRenderingFormats[2] = {
    VK_FORMAT_R8G8B8A8_UNORM,
    VK_FORMAT_R32_UINT };

VkPipelineRenderingCreateInfoKHR rfInfo = {
    .sType = VK_STRUCTURE_TYPE_PIPELINE_RENDERING_CREATE_INFO_KHR,
    .pNext = NULL,
    .colorAttachmentCount = 2,
    .pColorAttachmentFormats = colorRenderingFormats,
    .depthAttachmentFormat = VK_FORMAT_D32_SFLOAT_S8_UINT,
    .stencilAttachmentFormat = VK_FORMAT_D32_SFLOAT_S8_UINT };

VkGraphicsPipelineCreateInfo createInfo = {
    .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
    .pNext = &rfInfo,
    .renderPass = VK_NULL_HANDLE,
    .... };

VkPipeline graphicsPipeline;

vkCreateGraphicsPipelines(device, pipelineCache, 1, &createInfo, NULL, &graphicsPipeline);
----

=== Rendering with a dynamic render pass

[source,c]
----
VkRenderingAttachmentInfoKHR colorAttachments[2] = {
    {
        .sType = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO_KHR,
        .pNext = NULL,
        .imageView = colorImageViews[0],
        .imageLayout = VK_IMAGE_LAYOUT_ATTACHMENT_OPTIMAL_KHR,
        .resolveMode = VK_RESOLVE_MODE_AVERAGE_BIT,
        .resolveImageView = resolveColorImageView,
        .resolveImageLayout = VK_IMAGE_LAYOUT_ATTACHMENT_OPTIMAL_KHR,
        .loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR,
        .storeOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
        .clearValue = {.color = {.float32 = {0.0f,0.0f,0.0f,0.0f} } }
    }, {
        .sType = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO_KHR,
        .pNext = NULL,
        .imageView = colorImageViews[1],
        .imageLayout = VK_IMAGE_LAYOUT_ATTACHMENT_OPTIMAL_KHR,
        .resolveMode = VK_RESOLVE_MODE_NONE,
        .loadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE,
        .storeOp = VK_ATTACHMENT_STORE_OP_STORE
    } };

// A single depth stencil attachment info can be used, but they can also be specified separately.
// When both are specified separately, the only requirement is that the image view is identical.
VkRenderingAttachmentInfoKHR depthStencilAttachment = {
    .sType = VK_STRUCTURE_TYPE_RENDERING_ATTACHMENT_INFO_KHR,
    .pNext = NULL,
    .imageView = depthStencilImageView,
    .imageLayout = VK_IMAGE_LAYOUT_ATTACHMENT_OPTIMAL_KHR,
    .resolveMode = VK_RESOLVE_MODE_NONE,
    .loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR,
    .storeOp = VK_ATTACHMENT_STORE_OP_DONT_CARE,
    .clearValue = {.depthStencil = {.depth = 0.0f, .stencil = 0 } } };

VkRenderingInfoKHR renderingInfo = {
    .sType = VK_STRUCTURE_TYPE_RENDERING_INFO_KHR,
    .pNext = NULL,
    .flags = 0,
    .renderArea = { ... },
    .layerCount = 1,
    .colorAttachmentCount = 2,
    .pColorAttachments = colorAttachments,
    .pDepthAttachment = &depthStencilAttachment,
    .pStencilAttachment = &depthStencilAttachment };

vkCmdBeginRenderingKHR(commandBuffer, &renderingInfo);

vkCmdDraw(commandBuffer, ...);

...

vkCmdDraw(commandBuffer, ...);

vkCmdEndRenderingKHR(commandBuffer);
----


== Issues

This section describes issues with the existing proposal, including both open issues that have not been addressed, and closed issues that are not self-evident from the proposal description.


=== Should we support multiview?

Yes; its complexity is much reduced compared to render pass objects, and it is probably worth preserving in this limited form for compatibility reasons.


=== Should there be a view mask for multiview?

Yes.
Without multiple subpasses the view mask is significantly less useful; the layer count provided is sufficient to describe the number of views.
However, the mask allows specification of a non-contiguous array, and while it is unclear if any applications use this, it has been included to maintain compatibility with existing APIs.


=== Should we have functionality to replace the on-chip storage aspect of subpasses?

No; this will be designed as a separate extension.


=== Should pipeline barriers work inside these limited render passes?

No; without input attachments or a solution for on-chip storage these are currently functionally useless.


=== Is there a preferred render area granularity for `VkRenderingInfo::renderArea` similar to `vkGetRenderAreaGranularity`?

During design discussions for this extension, no hardware vendor felt that this functionality was important enough to bring over to dynamic rendering.
If vendors have performance concerns, extensions such as link:{refpage}VK_QCOM_tile_properties.html[VK_QCOM_tile_properties] can be exposed, and there may be scope for a future cross-vendor extension.
Applications can use values for the render area freely without alignment considerations.
