// Copyright 2015-2023 The Khronos Group Inc. // // SPDX-License-Identifier: CC-BY-4.0 [[shaders]] = Shaders A shader specifies programmable operations that execute for each vertex, control point, tessellated vertex, primitive, fragment, or workgroup in the corresponding stage(s) of the graphics and compute pipelines. Graphics pipelines include vertex shader execution as a result of <>, followed, if enabled, by tessellation control and evaluation shaders operating on <>, geometry shaders, if enabled, operating on primitives, and fragment shaders, if present, operating on fragments generated by <>. In this specification, vertex, tessellation control, tessellation evaluation and geometry shaders are collectively referred to as <>s and occur in the logical pipeline before rasterization. The fragment shader occurs logically after rasterization. Only the compute shader stage is included in a compute pipeline. Compute shaders operate on compute invocations in a workgroup. Shaders can: read from input variables, and read from and write to output variables. Input and output variables can: be used to transfer data between shader stages, or to allow the shader to interact with values that exist in the execution environment. Similarly, the execution environment provides constants describing capabilities. Shader variables are associated with execution environment-provided inputs and outputs using _built-in_ decorations in the shader. The available decorations for each stage are documented in the following subsections. ifdef::VK_EXT_shader_object[] [[shaders-objects]] == Shader Objects Shaders may: be compiled and linked into pipeline objects as described in <> chapter, or if the <> feature is enabled they may: be compiled into individual per-stage _shader objects_ which can: be bound on a command buffer independently from one another. Unlike pipelines, shader objects are not intrinsically tied to any specific set of state. Instead, state is specified dynamically in the command buffer. Each shader object represents a single compiled shader stage, which may: optionally: be linked with one or more other stages. [open,refpage='VkShaderEXT',desc='Opaque handle to a shader object',type='handles'] -- Shader objects are represented by sname:VkShaderEXT handles: include::{generated}/api/handles/VkShaderEXT.adoc[] -- [[shaders-objects-creation]] === Shader Object Creation Shader objects may: be created from shader code provided as SPIR-V, or in an opaque, implementation-defined binary format specific to the physical device. [open,refpage='vkCreateShadersEXT',desc='Create one or more new shaders',type='protos'] -- To create one or more shader objects, call: include::{generated}/api/protos/vkCreateShadersEXT.adoc[] * pname:device is the logical device that creates the shader objects. * pname:createInfoCount is the length of the pname:pCreateInfos and pname:pShaders arrays. * pname:pCreateInfos is a pointer to an array of slink:VkShaderCreateInfoEXT structures. * pname:pAllocator controls host memory allocation as described in the <> chapter. * pname:pShaders is a pointer to an array of slink:VkShaderEXT handles in which the resulting shader objects are returned. When this function returns, whether or not it succeeds, it is guaranteed that every element of pname:pShaders will have been overwritten by either dlink:VK_NULL_HANDLE or a valid sname:VkShaderEXT handle. 
This means that whenever shader creation fails, the application can: determine which shader the returned error pertains to by locating the first dlink:VK_NULL_HANDLE element in pname:pShaders. It also means that an application can: reliably clean up from a failed call by iterating over the pname:pShaders array and destroying every element that is not dlink:VK_NULL_HANDLE. .Valid Usage **** * [[VUID-vkCreateShadersEXT-None-08400]] The <> feature must: be enabled * [[VUID-vkCreateShadersEXT-pCreateInfos-08401]] If pname:createInfoCount is 1, there must: be no element of pname:pCreateInfos whose pname:flags member includes ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT * [[VUID-vkCreateShadersEXT-pCreateInfos-08402]] If the pname:flags member of any element of pname:pCreateInfos includes ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT, the pname:flags member of all other elements of pname:pCreateInfos whose pname:stage is ename:VK_SHADER_STAGE_VERTEX_BIT, ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT, ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, ename:VK_SHADER_STAGE_GEOMETRY_BIT, or ename:VK_SHADER_STAGE_FRAGMENT_BIT must: also include ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * [[VUID-vkCreateShadersEXT-pCreateInfos-08403]] If the pname:flags member of any element of pname:pCreateInfos includes ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT, the pname:flags member of all other elements of pname:pCreateInfos whose pname:stage is ename:VK_SHADER_STAGE_TASK_BIT_EXT or ename:VK_SHADER_STAGE_MESH_BIT_EXT must: also include ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT * [[VUID-vkCreateShadersEXT-pCreateInfos-08404]] If the pname:flags member of any element of pname:pCreateInfos whose pname:stage is ename:VK_SHADER_STAGE_TASK_BIT_EXT or ename:VK_SHADER_STAGE_MESH_BIT_EXT includes ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT, there must: be no member of pname:pCreateInfos whose pname:stage is ename:VK_SHADER_STAGE_VERTEX_BIT and whose pname:flags member includes ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT * [[VUID-vkCreateShadersEXT-pCreateInfos-08405]] If there is any element of pname:pCreateInfos whose pname:stage is ename:VK_SHADER_STAGE_MESH_BIT_EXT and whose pname:flags member includes both ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT and ename:VK_SHADER_CREATE_NO_TASK_SHADER_BIT_EXT, there must: be no element of pname:pCreateInfos whose pname:stage is ename:VK_SHADER_STAGE_TASK_BIT_EXT and whose pname:flags member includes ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * [[VUID-vkCreateShadersEXT-pCreateInfos-08409]] For each element of pname:pCreateInfos whose pname:flags member includes ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT, if there is any other element of pname:pCreateInfos whose pname:stage is logically later than the pname:stage of the former and whose pname:flags member also includes ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT, the pname:nextStage of the former must: be equal to the pname:stage of the element with the logically earliest pname:stage following the pname:stage of the former whose pname:flags member also includes ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT * [[VUID-vkCreateShadersEXT-pCreateInfos-08410]] The pname:stage member of each element of pname:pCreateInfos whose pname:flags member includes ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT must: be unique * [[VUID-vkCreateShadersEXT-pCreateInfos-08411]] The pname:codeType member of all elements of pname:pCreateInfos whose pname:flags member includes 
ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT must: be the same * [[VUID-vkCreateShadersEXT-pCreateInfos-08867]] If pname:pCreateInfos contains elements with both ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT and ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, both elements' pname:flags include ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT, both elements' pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and the ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT stage's pname:pCode contains an code:OpExecutionMode instruction specifying the type of subdivision, it must: match the subdivision type specified in the ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT stage * [[VUID-vkCreateShadersEXT-pCreateInfos-08868]] If pname:pCreateInfos contains elements with both ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT and ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, both elements' pname:flags include ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT, both elements' pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and the ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT stage's pname:pCode contains an code:OpExecutionMode instruction specifying the orientation of triangles, it must: match the triangle orientation specified in the ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT stage * [[VUID-vkCreateShadersEXT-pCreateInfos-08869]] If pname:pCreateInfos contains elements with both ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT and ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, both elements' pname:flags include ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT, both elements' pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and the ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT stage's pname:pCode contains an code:OpExecutionMode instruction specifying code:PointMode, the ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT stage must: also contain an code:OpExecutionMode instruction specifying code:PointMode * [[VUID-vkCreateShadersEXT-pCreateInfos-08870]] If pname:pCreateInfos contains elements with both ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT and ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, both elements' pname:flags include ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT, both elements' pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and the ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT stage's pname:pCode contains an code:OpExecutionMode instruction specifying the spacing of segments on the edges of tessellated primitives, it must: match the segment spacing specified in the ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT stage * [[VUID-vkCreateShadersEXT-pCreateInfos-08871]] If pname:pCreateInfos contains elements with both ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT and ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, both elements' pname:flags include ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT, both elements' pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and the ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT stage's pname:pCode contains an code:OpExecutionMode instruction specifying the output patch size, it must: match the output patch size specified in the ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT stage **** include::{generated}/validity/protos/vkCreateShadersEXT.adoc[] -- [open,refpage='VkShaderCreateInfoEXT',desc='Structure specifying parameters of a newly created shader',type='structs'] -- :refpage: VkShaderCreateInfoEXT The sname:VkShaderCreateInfoEXT structure is defined as: include::{generated}/api/structs/VkShaderCreateInfoEXT.adoc[] * 
pname:sType is a elink:VkStructureType value identifying this structure. * pname:pNext is `NULL` or a pointer to a structure extending this structure. * pname:flags is a bitmask of elink:VkShaderCreateFlagBitsEXT describing additional parameters of the shader. * pname:stage is a elink:VkShaderStageFlagBits value specifying a single shader stage. * pname:nextStage is a bitmask of elink:VkShaderStageFlagBits specifying zero or more stages which may: be used as a logically next bound stage when drawing with the shader bound. * pname:codeType is a elink:VkShaderCodeTypeEXT value specifying the type of the shader code pointed to by pname:pCode. * pname:codeSize is the size in bytes of the shader code pointed to by pname:pCode. * pname:pCode is a pointer to the shader code to use to create the shader. * pname:pName is a pointer to a null-terminated UTF-8 string specifying the entry point name of the shader for this stage. * pname:setLayoutCount is the number of descriptor set layouts pointed to by pname:pSetLayouts. * pname:pSetLayouts is a pointer to an array of slink:VkDescriptorSetLayout objects used by the shader stage. * pname:pushConstantRangeCount is the number of push constant ranges pointed to by pname:pPushConstantRanges. * pname:pPushConstantRanges is a pointer to an array of slink:VkPushConstantRange structures used by the shader stage. * pname:pSpecializationInfo is a pointer to a slink:VkSpecializationInfo structure, as described in <>, or `NULL`. .Valid Usage **** :prefixCondition: If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, include::{chapters}/commonvalidity/shader_create_spv_common.adoc[] * [[VUID-VkShaderCreateInfoEXT-flags-08412]] If pname:stage is not ename:VK_SHADER_STAGE_TASK_BIT_EXT, ename:VK_SHADER_STAGE_MESH_BIT_EXT, ename:VK_SHADER_STAGE_VERTEX_BIT, ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT, ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, ename:VK_SHADER_STAGE_GEOMETRY_BIT, or ename:VK_SHADER_STAGE_FRAGMENT_BIT, pname:flags must: not include ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT ifdef::VK_KHR_fragment_shading_rate[] * [[VUID-VkShaderCreateInfoEXT-flags-08486]] If pname:stage is not ename:VK_SHADER_STAGE_FRAGMENT_BIT, pname:flags must: not include ename:VK_SHADER_CREATE_FRAGMENT_SHADING_RATE_ATTACHMENT_BIT_EXT * [[VUID-VkShaderCreateInfoEXT-flags-08487]] If the <> feature is not enabled, pname:flags must: not include ename:VK_SHADER_CREATE_FRAGMENT_SHADING_RATE_ATTACHMENT_BIT_EXT endif::VK_KHR_fragment_shading_rate[] ifdef::VK_EXT_fragment_density_map[] * [[VUID-VkShaderCreateInfoEXT-flags-08488]] If pname:stage is not ename:VK_SHADER_STAGE_FRAGMENT_BIT, pname:flags must: not include ename:VK_SHADER_CREATE_FRAGMENT_DENSITY_MAP_ATTACHMENT_BIT_EXT * [[VUID-VkShaderCreateInfoEXT-flags-08489]] If the <> feature is not enabled, pname:flags must: not include ename:VK_SHADER_CREATE_FRAGMENT_DENSITY_MAP_ATTACHMENT_BIT_EXT endif::VK_EXT_fragment_density_map[] ifdef::VK_VERSION_1_1,VK_EXT_subgroup_size_control[] * [[VUID-VkShaderCreateInfoEXT-flags-09404]] If pname:flags includes ename:VK_SHADER_CREATE_ALLOW_VARYING_SUBGROUP_SIZE_BIT_EXT, the <> feature must: be enabled * [[VUID-VkShaderCreateInfoEXT-flags-09405]] If pname:flags includes ename:VK_SHADER_CREATE_REQUIRE_FULL_SUBGROUPS_BIT_EXT, the <> feature must: be enabled * [[VUID-VkShaderCreateInfoEXT-flags-08992]] If pname:flags includes ename:VK_SHADER_CREATE_REQUIRE_FULL_SUBGROUPS_BIT_EXT, pname:stage must: be ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[] one of ename:VK_SHADER_STAGE_MESH_BIT_EXT,
ename:VK_SHADER_STAGE_TASK_BIT_EXT, or endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[] ename:VK_SHADER_STAGE_COMPUTE_BIT endif::VK_VERSION_1_1,VK_EXT_subgroup_size_control[] ifdef::VK_VERSION_1_1,VK_KHR_device_group[] * [[VUID-VkShaderCreateInfoEXT-flags-08485]] If pname:stage is not ename:VK_SHADER_STAGE_COMPUTE_BIT, pname:flags must: not include ename:VK_SHADER_CREATE_DISPATCH_BASE_BIT_EXT endif::VK_VERSION_1_1,VK_KHR_device_group[] ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * [[VUID-VkShaderCreateInfoEXT-flags-08414]] If pname:stage is not ename:VK_SHADER_STAGE_MESH_BIT_EXT, pname:flags must: not include ename:VK_SHADER_CREATE_NO_TASK_SHADER_BIT_EXT endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[] ifdef::VK_VERSION_1_1,VK_EXT_subgroup_size_control[] * [[VUID-VkShaderCreateInfoEXT-flags-08416]] If pname:flags includes both ename:VK_SHADER_CREATE_ALLOW_VARYING_SUBGROUP_SIZE_BIT_EXT and ename:VK_SHADER_CREATE_REQUIRE_FULL_SUBGROUPS_BIT_EXT, the local workgroup size in the X dimension of the shader must: be a multiple of <> * [[VUID-VkShaderCreateInfoEXT-flags-08417]] If pname:flags includes ename:VK_SHADER_CREATE_REQUIRE_FULL_SUBGROUPS_BIT_EXT but not ename:VK_SHADER_CREATE_ALLOW_VARYING_SUBGROUP_SIZE_BIT_EXT and no slink:VkShaderRequiredSubgroupSizeCreateInfoEXT structure is included in the pname:pNext chain, the local workgroup size in the X dimension of the shader must: be a multiple of <> endif::VK_VERSION_1_1,VK_EXT_subgroup_size_control[] * [[VUID-VkShaderCreateInfoEXT-stage-08418]] pname:stage must: not be ename:VK_SHADER_STAGE_ALL_GRAPHICS or ename:VK_SHADER_STAGE_ALL * [[VUID-VkShaderCreateInfoEXT-stage-08419]] If the <> feature is not enabled, pname:stage must: not be ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT or ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT * [[VUID-VkShaderCreateInfoEXT-stage-08420]] If the <> feature is not enabled, pname:stage must: not be ename:VK_SHADER_STAGE_GEOMETRY_BIT ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * [[VUID-VkShaderCreateInfoEXT-stage-08421]] If the <> feature is not enabled, pname:stage must: not be ename:VK_SHADER_STAGE_TASK_BIT_EXT * [[VUID-VkShaderCreateInfoEXT-stage-08422]] If the <> feature is not enabled, pname:stage must: not be ename:VK_SHADER_STAGE_MESH_BIT_EXT endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[] ifdef::VK_HUAWEI_subpass_shading[] * [[VUID-VkShaderCreateInfoEXT-stage-08425]] pname:stage must: not be ename:VK_SHADER_STAGE_SUBPASS_SHADING_BIT_HUAWEI endif::VK_HUAWEI_subpass_shading[] ifdef::VK_HUAWEI_cluster_culling_shader[] * [[VUID-VkShaderCreateInfoEXT-stage-08426]] pname:stage must: not be ename:VK_SHADER_STAGE_CLUSTER_CULLING_BIT_HUAWEI endif::VK_HUAWEI_cluster_culling_shader[] * [[VUID-VkShaderCreateInfoEXT-nextStage-08427]] If pname:stage is ename:VK_SHADER_STAGE_VERTEX_BIT, pname:nextStage must: not include any bits other than ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT, ename:VK_SHADER_STAGE_GEOMETRY_BIT, and ename:VK_SHADER_STAGE_FRAGMENT_BIT * [[VUID-VkShaderCreateInfoEXT-nextStage-08428]] If the <> feature is not enabled, pname:nextStage must: not include ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT or ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT * [[VUID-VkShaderCreateInfoEXT-nextStage-08429]] If the <> feature is not enabled, pname:nextStage must: not include ename:VK_SHADER_STAGE_GEOMETRY_BIT * [[VUID-VkShaderCreateInfoEXT-nextStage-08430]] If pname:stage is ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT, pname:nextStage must: not include any bits other than 
ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT * [[VUID-VkShaderCreateInfoEXT-nextStage-08431]] If pname:stage is ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, pname:nextStage must: not include any bits other than ename:VK_SHADER_STAGE_GEOMETRY_BIT and ename:VK_SHADER_STAGE_FRAGMENT_BIT * [[VUID-VkShaderCreateInfoEXT-nextStage-08433]] If pname:stage is ename:VK_SHADER_STAGE_GEOMETRY_BIT, pname:nextStage must: not include any bits other than ename:VK_SHADER_STAGE_FRAGMENT_BIT * [[VUID-VkShaderCreateInfoEXT-nextStage-08434]] If pname:stage is ename:VK_SHADER_STAGE_FRAGMENT_BIT or ename:VK_SHADER_STAGE_COMPUTE_BIT, pname:nextStage must: be 0 ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * [[VUID-VkShaderCreateInfoEXT-nextStage-08435]] If pname:stage is ename:VK_SHADER_STAGE_TASK_BIT_EXT, pname:nextStage must: not include any bits other than ename:VK_SHADER_STAGE_MESH_BIT_EXT * [[VUID-VkShaderCreateInfoEXT-nextStage-08436]] If pname:stage is ename:VK_SHADER_STAGE_MESH_BIT_EXT, pname:nextStage must: not include any bits other than ename:VK_SHADER_STAGE_FRAGMENT_BIT endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * [[VUID-VkShaderCreateInfoEXT-pName-08440]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, pname:pName must: be the name of an code:OpEntryPoint in pname:pCode with an execution model that matches pname:stage * [[VUID-VkShaderCreateInfoEXT-pCode-08492]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_BINARY_EXT, pname:pCode must: be aligned to `16` bytes * [[VUID-VkShaderCreateInfoEXT-pCode-08493]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, pname:pCode must: be aligned to `4` bytes * [[VUID-VkShaderCreateInfoEXT-pCode-08448]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and the identified entry point includes any variable in its interface that is declared with the code:ClipDistance code:BuiltIn decoration, that variable must: not have an array size greater than sname:VkPhysicalDeviceLimits::pname:maxClipDistances * [[VUID-VkShaderCreateInfoEXT-pCode-08449]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and the identified entry point includes any variable in its interface that is declared with the code:CullDistance code:BuiltIn decoration, that variable must: not have an array size greater than sname:VkPhysicalDeviceLimits::pname:maxCullDistances * [[VUID-VkShaderCreateInfoEXT-pCode-08450]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and the identified entry point includes any variables in its interface that are declared with the code:ClipDistance or code:CullDistance code:BuiltIn decoration, those variables must: not have array sizes which sum to more than sname:VkPhysicalDeviceLimits::pname:maxCombinedClipAndCullDistances * [[VUID-VkShaderCreateInfoEXT-pCode-08451]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and the identified entry point includes any variable in its interface that is declared with the code:SampleMask code:BuiltIn decoration, that variable must: not have an array size greater than sname:VkPhysicalDeviceLimits::pname:maxSampleMaskWords * [[VUID-VkShaderCreateInfoEXT-pCode-08452]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is ename:VK_SHADER_STAGE_VERTEX_BIT, the identified entry point must: not include any input variable in its interface that is decorated with code:CullDistance * [[VUID-VkShaderCreateInfoEXT-pCode-08453]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT or 
ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, and the identified entry point has an code:OpExecutionMode instruction specifying a patch size with code:OutputVertices, the patch size must: be greater than `0` and less than or equal to sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize * [[VUID-VkShaderCreateInfoEXT-pCode-08454]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is ename:VK_SHADER_STAGE_GEOMETRY_BIT, the identified entry point must: have an code:OpExecutionMode instruction specifying a maximum output vertex count that is greater than `0` and less than or equal to sname:VkPhysicalDeviceLimits::pname:maxGeometryOutputVertices * [[VUID-VkShaderCreateInfoEXT-pCode-08455]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is ename:VK_SHADER_STAGE_GEOMETRY_BIT, the identified entry point must: have an code:OpExecutionMode instruction specifying an invocation count that is greater than `0` and less than or equal to sname:VkPhysicalDeviceLimits::pname:maxGeometryShaderInvocations * [[VUID-VkShaderCreateInfoEXT-pCode-08456]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is a <>, and the identified entry point writes to code:Layer for any primitive, it must: write the same value to code:Layer for all vertices of a given primitive * [[VUID-VkShaderCreateInfoEXT-pCode-08457]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is a <>, and the identified entry point writes to code:ViewportIndex for any primitive, it must: write the same value to code:ViewportIndex for all vertices of a given primitive * [[VUID-VkShaderCreateInfoEXT-pCode-08458]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is ename:VK_SHADER_STAGE_FRAGMENT_BIT, the identified entry point must: not include any output variables in its interface decorated with code:CullDistance * [[VUID-VkShaderCreateInfoEXT-pCode-08459]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is ename:VK_SHADER_STAGE_FRAGMENT_BIT, and the identified entry point writes to code:FragDepth in any execution path, all execution paths that are not exclusive to helper invocations must: either discard the fragment, or write or initialize the value of code:FragDepth * [[VUID-VkShaderCreateInfoEXT-pCode-08460]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, the shader code in pname:pCode must: be valid as described by the <> after applying the specializations provided in pname:pSpecializationInfo, if any, and then converting all specialization constants into fixed constants * [[VUID-VkShaderCreateInfoEXT-codeType-08872]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, pname:pCode must: contain an code:OpExecutionMode instruction specifying the type of subdivision * [[VUID-VkShaderCreateInfoEXT-codeType-08873]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, pname:pCode must: contain an code:OpExecutionMode instruction specifying the orientation of triangles generated by the tessellator * [[VUID-VkShaderCreateInfoEXT-codeType-08874]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, pname:pCode must: contain an code:OpExecutionMode instruction specifying the spacing of segments on the edges of tessellated primitives * 
[[VUID-VkShaderCreateInfoEXT-codeType-08875]] If pname:codeType is ename:VK_SHADER_CODE_TYPE_SPIRV_EXT, and pname:stage is ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, pname:pCode must: contain an code:OpExecutionMode instruction specifying the output patch size **** include::{generated}/validity/structs/VkShaderCreateInfoEXT.adoc[] -- [open,refpage='VkShaderCreateFlagsEXT',desc='Bitmask of VkShaderCreateFlagBitsEXT',type='flags'] -- include::{generated}/api/flags/VkShaderCreateFlagsEXT.adoc[] tname:VkShaderCreateFlagsEXT is a bitmask type for setting a mask of zero or more elink:VkShaderCreateFlagBitsEXT. -- [open,refpage='VkShaderCreateFlagBitsEXT',desc='Bitmask controlling how a shader object is created',type='enums'] -- Possible values of the pname:flags member of slink:VkShaderCreateInfoEXT specifying how a shader object is created, are: include::{generated}/api/enums/VkShaderCreateFlagBitsEXT.adoc[] * ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT specifies that a shader is linked to all other shaders created in the same flink:vkCreateShadersEXT call whose slink:VkShaderCreateInfoEXT structures' pname:flags include ename:VK_SHADER_CREATE_LINK_STAGE_BIT_EXT. ifdef::VK_VERSION_1_1,VK_EXT_subgroup_size_control[] * ename:VK_SHADER_CREATE_ALLOW_VARYING_SUBGROUP_SIZE_BIT_EXT specifies that the <> may: vary in a ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[task, mesh, or] compute shader. * ename:VK_SHADER_CREATE_REQUIRE_FULL_SUBGROUPS_BIT_EXT specifies that the subgroup sizes must: be launched with all invocations active in a ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[task, mesh, or] compute shader. endif::VK_VERSION_1_1,VK_EXT_subgroup_size_control[] ifdef::VK_EXT_mesh_shader,VK_NV_mesh_shader[] * ename:VK_SHADER_CREATE_NO_TASK_SHADER_BIT_EXT specifies that a mesh shader must: only be used without a task shader. Otherwise, the mesh shader must: only be used with a task shader. endif::VK_EXT_mesh_shader,VK_NV_mesh_shader[] ifdef::VK_VERSION_1_1,VK_KHR_device_group[] * ename:VK_SHADER_CREATE_DISPATCH_BASE_BIT_EXT specifies that a compute shader can: be used with flink:vkCmdDispatchBase with a non-zero base workgroup. endif::VK_VERSION_1_1,VK_KHR_device_group[] ifdef::VK_KHR_fragment_shading_rate[] * ename:VK_SHADER_CREATE_FRAGMENT_SHADING_RATE_ATTACHMENT_BIT_EXT specifies that a fragment shader can: be used with a fragment shading rate attachment. endif::VK_KHR_fragment_shading_rate[] ifdef::VK_EXT_fragment_density_map[] * ename:VK_SHADER_CREATE_FRAGMENT_DENSITY_MAP_ATTACHMENT_BIT_EXT specifies that a fragment shader can: be used with a fragment density map attachment. 
endif::VK_EXT_fragment_density_map[] -- ifdef::VK_KHR_fragment_shading_rate,VK_EXT_fragment_density_map[] [NOTE] .Note ==== The behavior of ifdef::VK_KHR_fragment_shading_rate[] ename:VK_SHADER_CREATE_FRAGMENT_SHADING_RATE_ATTACHMENT_BIT_EXT endif::VK_KHR_fragment_shading_rate[] ifdef::VK_KHR_fragment_shading_rate+VK_EXT_fragment_density_map[and] ifdef::VK_EXT_fragment_density_map[] ename:VK_SHADER_CREATE_FRAGMENT_DENSITY_MAP_ATTACHMENT_BIT_EXT endif::VK_EXT_fragment_density_map[] differs subtly from the behavior of ifdef::VK_KHR_fragment_shading_rate[] ename:VK_PIPELINE_CREATE_RENDERING_FRAGMENT_SHADING_RATE_ATTACHMENT_BIT_KHR endif::VK_KHR_fragment_shading_rate[] ifdef::VK_KHR_fragment_shading_rate+VK_EXT_fragment_density_map[and] ifdef::VK_EXT_fragment_density_map[] ename:VK_PIPELINE_CREATE_RENDERING_FRAGMENT_DENSITY_MAP_ATTACHMENT_BIT_EXT endif::VK_EXT_fragment_density_map[] in that the shader bit allows, but does not require the shader to be used with that type of attachment. This means that the application need not create multiple shaders when it does not know in advance whether the shader will be used with or without the attachment type, or when it needs the same shader to be compatible with usage both with and without. This may: come at some performance cost on some implementations, so applications should: still only set bits that are actually necessary. ==== endif::VK_KHR_fragment_shading_rate,VK_EXT_fragment_density_map[] [open,refpage='VkShaderCodeTypeEXT',desc='Indicate a shader code type',type='enums'] -- Shader objects can: be created using different types of shader code. Possible values of slink:VkShaderCreateInfoEXT::pname:codeType, are: include::{generated}/api/enums/VkShaderCodeTypeEXT.adoc[] * ename:VK_SHADER_CODE_TYPE_BINARY_EXT specifies shader code in an opaque, implementation-defined binary format specific to the physical device. * ename:VK_SHADER_CODE_TYPE_SPIRV_EXT specifies shader code in SPIR-V format. -- [[shaders-objects-binary-code]] === Binary Shader Code [open,refpage='vkGetShaderBinaryDataEXT',desc='Get the binary shader code from a shader object',type='protos'] -- Binary shader code can: be retrieved from a shader object using the command: include::{generated}/api/protos/vkGetShaderBinaryDataEXT.adoc[] * pname:device is the logical device that the shader object was created from. * pname:shader is the shader object to retrieve binary shader code from. * pname:pDataSize is a pointer to a code:size_t value related to the size of the binary shader code, as described below. * pname:pData is either `NULL` or a pointer to a buffer. If pname:pData is `NULL`, then the size of the binary shader code of the shader object, in bytes, is returned in pname:pDataSize. Otherwise, pname:pDataSize must: point to a variable set by the user to the size of the buffer, in bytes, pointed to by pname:pData, and on return the variable is overwritten with the amount of data actually written to pname:pData. If pname:pDataSize is less than the size of the binary shader code, nothing is written to pname:pData, and ename:VK_INCOMPLETE will be returned instead of ename:VK_SUCCESS. [NOTE] .Note ==== The behavior of this command when pname:pDataSize is too small differs from how some other getter-type commands work in Vulkan. Because shader binary data is only usable in its entirety, it would never be useful for the implementation to return partial data. Because of this, nothing is written to pname:pData unless pname:pDataSize is large enough to fit the data in its entirety.
==== Binary shader code retrieved using fname:vkGetShaderBinaryDataEXT can: be passed to a subsequent call to flink:vkCreateShadersEXT on a compatible physical device by specifying ename:VK_SHADER_CODE_TYPE_BINARY_EXT in the pname:codeType member of sname:VkShaderCreateInfoEXT. The shader code returned by repeated calls to this function with the same sname:VkShaderEXT is guaranteed to be invariant for the lifetime of the sname:VkShaderEXT object. .Valid Usage **** * [[VUID-vkGetShaderBinaryDataEXT-None-08461]] The <> feature must: be enabled * [[VUID-vkGetShaderBinaryDataEXT-None-08499]] If pname:pData is not `NULL`, it must: be aligned to `16` bytes **** include::{generated}/validity/protos/vkGetShaderBinaryDataEXT.adoc[] -- [[shaders-objects-binary-compatibility]] === Binary Shader Compatibility Binary shader compatibility means that binary shader code returned from a call to flink:vkGetShaderBinaryDataEXT can: be passed to a later call to flink:vkCreateShadersEXT, potentially on a different logical and/or physical device, and that this will result in the successful creation of a shader object functionally equivalent to the shader object that the code was originally queried from. Binary shader code queried from flink:vkGetShaderBinaryDataEXT is not guaranteed to be compatible across all devices, but implementations are required to provide some compatibility guarantees. Applications may: determine binary shader compatibility using either (or both) of two mechanisms. Guaranteed compatibility of shader binaries is expressed through a combination of the pname:shaderBinaryUUID and pname:shaderBinaryVersion members of the slink:VkPhysicalDeviceShaderObjectPropertiesEXT structure queried from a physical device. Binary shaders retrieved from a physical device with a certain pname:shaderBinaryUUID are guaranteed to be compatible with all other physical devices reporting the same pname:shaderBinaryUUID and the same or higher pname:shaderBinaryVersion. Whenever a new version of an implementation incorporates any changes that affect the output of flink:vkGetShaderBinaryDataEXT, the implementation should: either increment pname:shaderBinaryVersion if binary shader code retrieved from older versions remains compatible with the new implementation, or else replace pname:shaderBinaryUUID with a new value if backward compatibility has been broken. Binary shader code queried from a device with a matching pname:shaderBinaryUUID and lower pname:shaderBinaryVersion relative to the device on which flink:vkCreateShadersEXT is being called may: be suboptimal for the new device in ways that do not change shader functionality, but it is still guaranteed to be usable to successfully create the shader object(s). [NOTE] .Note ==== Implementations are encouraged to share pname:shaderBinaryUUID between devices and driver versions to the maximum extent their hardware naturally allows, and are *strongly* discouraged from ever changing the pname:shaderBinaryUUID for the same hardware unless absolutely necessary. ==== In addition to the shader compatibility guarantees described above, it is valid for an application to call flink:vkCreateShadersEXT with binary shader code created on a device with a different or unknown pname:shaderBinaryUUID and/or higher pname:shaderBinaryVersion. In this case, the implementation may: use any unspecified means of its choosing to determine whether the provided binary shader code is usable.
If it is, the flink:vkCreateShadersEXT call must: return ename:VK_SUCCESS, and the created shader object is guaranteed to be valid. Otherwise, in the absence of some other error, the flink:vkCreateShadersEXT call must: return ename:VK_ERROR_INCOMPATIBLE_SHADER_BINARY_EXT to indicate that the provided binary shader code is not compatible with the device. [[shaders-objects-binding]] === Binding Shader Objects [open,refpage='vkCmdBindShadersEXT',desc='Bind shader objects to a command buffer',type='protos'] -- Once shader objects have been created, they can: be bound to the command buffer using the command: include::{generated}/api/protos/vkCmdBindShadersEXT.adoc[] * pname:commandBuffer is the command buffer that the shader object will be bound to. * pname:stageCount is the length of the pname:pStages and pname:pShaders arrays. * pname:pStages is a pointer to an array of elink:VkShaderStageFlagBits values specifying one stage per array index that is affected by the corresponding value in the pname:pShaders array. * pname:pShaders is a pointer to an array of sname:VkShaderEXT handles and/or dlink:VK_NULL_HANDLE values describing the shader binding operations to be performed on each stage in pname:pStages. When binding linked shaders, an application may: bind them in any combination of one or more calls to fname:vkCmdBindShadersEXT (i.e., shaders that were created linked together do not need to be bound in the same fname:vkCmdBindShadersEXT call). Any shader object bound to a particular stage may: be unbound by setting its value in pname:pShaders to dlink:VK_NULL_HANDLE. If pname:pShaders is `NULL`, fname:vkCmdBindShadersEXT behaves as if pname:pShaders was an array of pname:stageCount dlink:VK_NULL_HANDLE values (i.e., any shaders bound to the stages specified in pname:pStages are unbound). 
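For example, the following non-normative sketch binds a vertex and a fragment shader object and leaves the remaining pre-rasterization stages unbound. The names code:commandBuffer, code:vertShader, and code:fragShader are assumptions standing for a command buffer in the recording state and two sname:VkShaderEXT handles previously created with flink:vkCreateShadersEXT.

[source,c++]
----
// Assumed to exist already (not created here):
//   VkCommandBuffer commandBuffer;            // in the recording state
//   VkShaderEXT vertShader, fragShader;       // from vkCreateShadersEXT

const VkShaderStageFlagBits stages[] = {
    VK_SHADER_STAGE_VERTEX_BIT,
    VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT,
    VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT,
    VK_SHADER_STAGE_GEOMETRY_BIT,
    VK_SHADER_STAGE_FRAGMENT_BIT };

const VkShaderEXT shaders[] = {
    vertShader,      // bind the vertex shader object
    VK_NULL_HANDLE,  // leave tessellation control unbound
    VK_NULL_HANDLE,  // leave tessellation evaluation unbound
    VK_NULL_HANDLE,  // leave geometry unbound
    fragShader };    // bind the fragment shader object

vkCmdBindShadersEXT(commandBuffer, 5, stages, shaders);
----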
.Valid Usage **** * [[VUID-vkCmdBindShadersEXT-None-08462]] The <> feature must: be enabled * [[VUID-vkCmdBindShadersEXT-pStages-08463]] Every element of pname:pStages must: be unique * [[VUID-vkCmdBindShadersEXT-pStages-08464]] pname:pStages must: not contain ename:VK_SHADER_STAGE_ALL_GRAPHICS or ename:VK_SHADER_STAGE_ALL ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[] * [[VUID-vkCmdBindShadersEXT-pStages-08465]] pname:pStages must: not contain ename:VK_SHADER_STAGE_RAYGEN_BIT_KHR, ename:VK_SHADER_STAGE_ANY_HIT_BIT_KHR, ename:VK_SHADER_STAGE_CLOSEST_HIT_BIT_KHR, ename:VK_SHADER_STAGE_MISS_BIT_KHR, ename:VK_SHADER_STAGE_INTERSECTION_BIT_KHR, or ename:VK_SHADER_STAGE_CALLABLE_BIT_KHR endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[] ifdef::VK_HUAWEI_subpass_shading[] * [[VUID-vkCmdBindShadersEXT-pStages-08467]] pname:pStages must: not contain ename:VK_SHADER_STAGE_SUBPASS_SHADING_BIT_HUAWEI endif::VK_HUAWEI_subpass_shading[] ifdef::VK_HUAWEI_cluster_culling_shader[] * [[VUID-vkCmdBindShadersEXT-pStages-08468]] pname:pStages must: not contain ename:VK_SHADER_STAGE_CLUSTER_CULLING_BIT_HUAWEI endif::VK_HUAWEI_cluster_culling_shader[] * [[VUID-vkCmdBindShadersEXT-pShaders-08469]] For each element of pname:pStages, if pname:pShaders is not `NULL`, and the element of the pname:pShaders array with the same index is not dlink:VK_NULL_HANDLE, it must: have been created with a pname:stage equal to the corresponding element of pname:pStages ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * [[VUID-vkCmdBindShadersEXT-pShaders-08470]] If pname:pStages contains both ename:VK_SHADER_STAGE_TASK_BIT_EXT and ename:VK_SHADER_STAGE_VERTEX_BIT, and pname:pShaders is not `NULL`, and the same index in pname:pShaders as ename:VK_SHADER_STAGE_TASK_BIT_EXT in pname:pStages is not dlink:VK_NULL_HANDLE, the same index in pname:pShaders as ename:VK_SHADER_STAGE_VERTEX_BIT in pname:pStages must: be dlink:VK_NULL_HANDLE * [[VUID-vkCmdBindShadersEXT-pShaders-08471]] If pname:pStages contains both ename:VK_SHADER_STAGE_MESH_BIT_EXT and ename:VK_SHADER_STAGE_VERTEX_BIT, and pname:pShaders is not `NULL`, and the same index in pname:pShaders as ename:VK_SHADER_STAGE_MESH_BIT_EXT in pname:pStages is not dlink:VK_NULL_HANDLE, the same index in pname:pShaders as ename:VK_SHADER_STAGE_VERTEX_BIT in pname:pStages must: be dlink:VK_NULL_HANDLE endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * [[VUID-vkCmdBindShadersEXT-pShaders-08474]] If the <> feature is not enabled, and pname:pStages contains ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT or ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, and pname:pShaders is not `NULL`, the same index or indices in pname:pShaders must: be dlink:VK_NULL_HANDLE * [[VUID-vkCmdBindShadersEXT-pShaders-08475]] If the <> feature is not enabled, and pname:pStages contains ename:VK_SHADER_STAGE_GEOMETRY_BIT, and pname:pShaders is not `NULL`, the same index in pname:pShaders must: be dlink:VK_NULL_HANDLE ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * [[VUID-vkCmdBindShadersEXT-pShaders-08490]] If the <> feature is not enabled, and pname:pStages contains ename:VK_SHADER_STAGE_TASK_BIT_EXT, and pname:pShaders is not `NULL`, the same index in pname:pShaders must: be dlink:VK_NULL_HANDLE * [[VUID-vkCmdBindShadersEXT-pShaders-08491]] If the <> feature is not enabled, and pname:pStages contains ename:VK_SHADER_STAGE_MESH_BIT_EXT, and pname:pShaders is not `NULL`, the same index in pname:pShaders must: be dlink:VK_NULL_HANDLE endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * 
[[VUID-vkCmdBindShadersEXT-pShaders-08476]] If pname:pStages contains ename:VK_SHADER_STAGE_COMPUTE_BIT, the sname:VkCommandPool that pname:commandBuffer was allocated from must: support compute operations * [[VUID-vkCmdBindShadersEXT-pShaders-08477]] If pname:pStages contains ename:VK_SHADER_STAGE_VERTEX_BIT, ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT, ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT, ename:VK_SHADER_STAGE_GEOMETRY_BIT, or ename:VK_SHADER_STAGE_FRAGMENT_BIT, the sname:VkCommandPool that pname:commandBuffer was allocated from must: support graphics operations ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[] * [[VUID-vkCmdBindShadersEXT-pShaders-08478]] If pname:pStages contains ename:VK_SHADER_STAGE_MESH_BIT_EXT or ename:VK_SHADER_STAGE_TASK_BIT_EXT, the sname:VkCommandPool that pname:commandBuffer was allocated from must: support graphics operations endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[] **** include::{generated}/validity/protos/vkCmdBindShadersEXT.adoc[] -- [[shaders-objects-state]] === Setting State Whenever shader objects are used to issue drawing commands, the appropriate <> setting commands must: have been called to set the relevant state in the command buffer prior to drawing: * flink:vkCmdSetViewportWithCount * flink:vkCmdSetScissorWithCount * flink:vkCmdSetRasterizerDiscardEnable ifdef::VK_EXT_mesh_shader,VK_NV_mesh_shader[] If a shader is bound to the ename:VK_SHADER_STAGE_VERTEX_BIT stage, the following commands must: have been called in the command buffer prior to drawing: endif::VK_EXT_mesh_shader,VK_NV_mesh_shader[] * flink:vkCmdSetVertexInputEXT * flink:vkCmdSetPrimitiveTopology * flink:vkCmdSetPatchControlPointsEXT, if pname:primitiveTopology is ename:VK_PRIMITIVE_TOPOLOGY_PATCH_LIST * flink:vkCmdSetPrimitiveRestartEnable If a shader is bound to the ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT stage, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetTessellationDomainOriginEXT If pname:rasterizerDiscardEnable is ename:VK_FALSE, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetRasterizationSamplesEXT * flink:vkCmdSetSampleMaskEXT * flink:vkCmdSetAlphaToCoverageEnableEXT * flink:vkCmdSetAlphaToOneEnableEXT, if the <> feature is enabled on the device * flink:vkCmdSetPolygonModeEXT * flink:vkCmdSetLineWidth, if pname:polygonMode is ename:VK_POLYGON_MODE_LINE, or if ifdef::VK_EXT_mesh_shader,VK_NV_mesh_shader[] a shader is bound to the ename:VK_SHADER_STAGE_VERTEX_BIT stage and endif::VK_EXT_mesh_shader,VK_NV_mesh_shader[] pname:primitiveTopology is a line topology, or if a shader which outputs line primitives is bound to the ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT or ename:VK_SHADER_STAGE_GEOMETRY_BIT stage * flink:vkCmdSetCullMode * flink:vkCmdSetFrontFace * flink:vkCmdSetDepthTestEnable * flink:vkCmdSetDepthWriteEnable * flink:vkCmdSetDepthCompareOp, if pname:depthTestEnable is ename:VK_TRUE * flink:vkCmdSetDepthBoundsTestEnable, if the <> feature is enabled on the device * flink:vkCmdSetDepthBounds, if pname:depthBoundsTestEnable is ename:VK_TRUE * flink:vkCmdSetDepthBiasEnable ifdef::VK_EXT_depth_bias_control[] * flink:vkCmdSetDepthBias or flink:vkCmdSetDepthBias2EXT, endif::VK_EXT_depth_bias_control[] ifndef::VK_EXT_depth_bias_control[] * flink:vkCmdSetDepthBias, endif::VK_EXT_depth_bias_control[] if pname:depthBiasEnable is ename:VK_TRUE * flink:vkCmdSetDepthClampEnableEXT, if the <> feature is enabled on the device * 
flink:vkCmdSetStencilTestEnable * flink:vkCmdSetStencilOp, if pname:stencilTestEnable is ename:VK_TRUE * flink:vkCmdSetStencilCompareMask, if pname:stencilTestEnable is ename:VK_TRUE * flink:vkCmdSetStencilWriteMask, if pname:stencilTestEnable is ename:VK_TRUE * flink:vkCmdSetStencilReference, if pname:stencilTestEnable is ename:VK_TRUE If a shader is bound to the ename:VK_SHADER_STAGE_FRAGMENT_BIT stage, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetLogicOpEnableEXT, if the <> feature is enabled on the device * flink:vkCmdSetLogicOpEXT, if pname:logicOpEnable is ename:VK_TRUE * flink:vkCmdSetColorBlendEnableEXT, with values set for every color attachment in the render pass instance active at draw time ifdef::VK_EXT_blend_operation_advanced[] * flink:vkCmdSetColorBlendEquationEXT or flink:vkCmdSetColorBlendAdvancedEXT, endif::VK_EXT_blend_operation_advanced[] ifndef::VK_EXT_blend_operation_advanced[] * flink:vkCmdSetColorBlendEquationEXT, endif::VK_EXT_blend_operation_advanced[] for every attachment whose index in pname:pColorBlendEnables is a pointer to a value of ename:VK_TRUE * flink:vkCmdSetBlendConstants, if any index in pname:pColorBlendEnables is ename:VK_TRUE, and the same index in pname:pColorBlendEquations is a sname:VkColorBlendEquationEXT structure with any elink:VkBlendFactor member with a value of ename:VK_BLEND_FACTOR_CONSTANT_COLOR, ename:VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_COLOR, ename:VK_BLEND_FACTOR_CONSTANT_ALPHA, or ename:VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_ALPHA * flink:vkCmdSetColorWriteMaskEXT ifdef::VK_KHR_fragment_shading_rate[] If the <> feature is enabled on the device, and a shader is bound to the ename:VK_SHADER_STAGE_FRAGMENT_BIT stage, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetFragmentShadingRateKHR endif::VK_KHR_fragment_shading_rate[] ifdef::VK_EXT_transform_feedback[] If the <> feature is enabled on the device, and a shader is bound to the ename:VK_SHADER_STAGE_GEOMETRY_BIT stage, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetRasterizationStreamEXT endif::VK_EXT_transform_feedback[] ifdef::VK_EXT_discard_rectangles[] If the `apiext:VK_EXT_discard_rectangles` extension is enabled on the device, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetDiscardRectangleEnableEXT * flink:vkCmdSetDiscardRectangleModeEXT, if `discardRectangleEnable` is ename:VK_TRUE * flink:vkCmdSetDiscardRectangleEXT, if `discardRectangleEnable` is ename:VK_TRUE endif::VK_EXT_discard_rectangles[] ifdef::VK_EXT_conservative_rasterization[] If `apiext:VK_EXT_conservative_rasterization` extension is enabled on the device, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetConservativeRasterizationModeEXT * flink:vkCmdSetExtraPrimitiveOverestimationSizeEXT, if pname:conservativeRasterizationMode is ename:VK_CONSERVATIVE_RASTERIZATION_MODE_OVERESTIMATE_EXT endif::VK_EXT_conservative_rasterization[] ifdef::VK_EXT_depth_clip_enable[] If the <> feature is enabled on the device, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetDepthClipEnableEXT 
endif::VK_EXT_depth_clip_enable[] ifdef::VK_EXT_sample_locations[] If the `apiext:VK_EXT_sample_locations` extension is enabled on the device, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetSampleLocationsEnableEXT * flink:vkCmdSetSampleLocationsEXT, if pname:sampleLocationsEnable is ename:VK_TRUE endif::VK_EXT_sample_locations[] ifdef::VK_EXT_provoking_vertex[] If the `apiext:VK_EXT_provoking_vertex` extension is enabled on the device, and pname:rasterizerDiscardEnable is ename:VK_FALSE, and a shader is bound to the ename:VK_SHADER_STAGE_VERTEX_BIT stage, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetProvokingVertexModeEXT endif::VK_EXT_provoking_vertex[] ifdef::VK_EXT_line_rasterization[] If the `apiext:VK_EXT_line_rasterization` extension is enabled on the device, and pname:rasterizerDiscardEnable is ename:VK_FALSE, and if pname:polygonMode is ename:VK_POLYGON_MODE_LINE or a shader is bound to the ename:VK_SHADER_STAGE_VERTEX_BIT stage and pname:primitiveTopology is a line topology or a shader which outputs line primitives is bound to the ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT or ename:VK_SHADER_STAGE_GEOMETRY_BIT stage, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetLineRasterizationModeEXT * flink:vkCmdSetLineStippleEnableEXT * flink:vkCmdSetLineStippleEXT, if pname:stippledLineEnable is ename:VK_TRUE endif::VK_EXT_line_rasterization[] ifdef::VK_EXT_depth_clip_control[] If the <> feature is enabled on the device, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetDepthClipNegativeOneToOneEXT endif::VK_EXT_depth_clip_control[] ifdef::VK_EXT_color_write_enable[] If the <> feature is enabled on the device, and a shader is bound to the ename:VK_SHADER_STAGE_FRAGMENT_BIT stage, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetColorWriteEnableEXT, with values set for every color attachment in the render pass instance active at draw time endif::VK_EXT_color_write_enable[] ifdef::VK_EXT_attachment_feedback_loop_dynamic_state[] If the <> feature is enabled on the device, and a shader is bound to the ename:VK_SHADER_STAGE_FRAGMENT_BIT stage, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetAttachmentFeedbackLoopEnableEXT endif::VK_EXT_attachment_feedback_loop_dynamic_state[] ifdef::VK_NV_clip_space_w_scaling[] If the `apiext:VK_NV_clip_space_w_scaling` extension is enabled on the device, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetViewportWScalingEnableNV * flink:vkCmdSetViewportWScalingNV, if pname:viewportWScalingEnable is ename:VK_TRUE endif::VK_NV_clip_space_w_scaling[] ifdef::VK_NV_viewport_swizzle[] If the `apiext:VK_NV_viewport_swizzle` extension is enabled on the device, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetViewportSwizzleNV endif::VK_NV_viewport_swizzle[] ifdef::VK_NV_fragment_coverage_to_color[] If the `apiext:VK_NV_fragment_coverage_to_color` extension is enabled on the device, and a shader is bound to the ename:VK_SHADER_STAGE_FRAGMENT_BIT stage, and pname:rasterizerDiscardEnable is 
ename:VK_FALSE, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetCoverageToColorEnableNV * flink:vkCmdSetCoverageToColorLocationNV, if pname:coverageToColorEnable is ename:VK_TRUE endif::VK_NV_fragment_coverage_to_color[] ifdef::VK_NV_framebuffer_mixed_samples[] If the `apiext:VK_NV_framebuffer_mixed_samples` extension is enabled on the device, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetCoverageModulationModeNV * flink:vkCmdSetCoverageModulationTableEnableNV, if pname:coverageModulationMode is not ename:VK_COVERAGE_MODULATION_MODE_NONE_NV * flink:vkCmdSetCoverageModulationTableNV, if pname:coverageModulationTableEnable is ename:VK_TRUE endif::VK_NV_framebuffer_mixed_samples[] ifdef::VK_NV_coverage_reduction_mode[] If the <> feature is enabled on the device, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetCoverageReductionModeNV endif::VK_NV_coverage_reduction_mode[] ifdef::VK_NV_representative_fragment_test[] If the <> feature is enabled on the device, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following command must: have been called in the command buffer prior to drawing: * flink:vkCmdSetRepresentativeFragmentTestEnableNV endif::VK_NV_representative_fragment_test[] ifdef::VK_NV_shading_rate_image[] If the <> feature is enabled on the device, and pname:rasterizerDiscardEnable is ename:VK_FALSE, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetCoarseSampleOrderNV * flink:vkCmdSetShadingRateImageEnableNV * flink:vkCmdSetViewportShadingRatePaletteNV, if pname:shadingRateImageEnable is ename:VK_TRUE endif::VK_NV_shading_rate_image[] ifdef::VK_NV_scissor_exclusive[] If the <> feature is enabled on the device, the following commands must: have been called in the command buffer prior to drawing: * flink:vkCmdSetExclusiveScissorEnableNV * flink:vkCmdSetExclusiveScissorNV, if any value in pname:pExclusiveScissorEnables is ename:VK_TRUE endif::VK_NV_scissor_exclusive[] State can: be set at any time, either before or after shader objects are bound, but all required state must: be set prior to issuing drawing commands. [[shaders-objects-pipeline-interaction]] === Interaction With Pipelines Calling flink:vkCmdBindShadersEXT causes the pipeline bind points <> in pname:pStages to be disturbed, meaning that any <> that had previously been bound to those pipeline bind points are no longer bound. If ename:VK_PIPELINE_BIND_POINT_GRAPHICS is disturbed (i.e., if pname:pStages contains any graphics stage), any graphics pipeline state that the previously bound pipeline did not specify as <> becomes undefined:, and must: be set in the command buffer before issuing drawing commands using shader objects. Calls to flink:vkCmdBindPipeline likewise disturb the shader stage(s) corresponding to pname:pipelineBindPoint, meaning that any shaders that had previously been bound to any of those stages are no longer bound, even if the pipeline was created without shaders for some of those stages.
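Because binding shader objects disturbs the graphics pipeline bind point and leaves previously non-dynamic state undefined:, applications typically set the required dynamic state explicitly before drawing. The following heavily abbreviated, non-normative sketch shows only a few of the required calls; code:commandBuffer, code:width, and code:height are assumptions provided by the application, rasterization is assumed to be enabled, and a complete set of shader objects is assumed to already be bound.

[source,c++]
----
// Only a small subset of the required state-setting commands is shown here;
// an application must set every piece of state listed above that applies to
// its bound stages and enabled features before drawing.
const VkViewport viewport = { 0.0f, 0.0f, (float)width, (float)height, 0.0f, 1.0f };
const VkRect2D scissor = { { 0, 0 }, { width, height } };

vkCmdSetViewportWithCount(commandBuffer, 1, &viewport);
vkCmdSetScissorWithCount(commandBuffer, 1, &scissor);
vkCmdSetRasterizerDiscardEnable(commandBuffer, VK_FALSE);
vkCmdSetVertexInputEXT(commandBuffer, 0, NULL, 0, NULL);
vkCmdSetPrimitiveTopology(commandBuffer, VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST);
vkCmdSetPrimitiveRestartEnable(commandBuffer, VK_FALSE);
// ... remaining required rasterization, depth/stencil, and blend state ...

vkCmdDraw(commandBuffer, 3, 1, 0, 0);
----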
[[shaders-objects-destruction]] === Shader Object Destruction [open,refpage='vkDestroyShaderEXT',desc='Destroy a shader object',type='protos'] -- To destroy a shader object, call: include::{generated}/api/protos/vkDestroyShaderEXT.adoc[] * pname:device is the logical device that destroys the shader object. * pname:shader is the handle of the shader object to destroy. * pname:pAllocator controls host memory allocation as described in the <> chapter. Destroying a shader object used by one or more command buffers in the <> causes those command buffers to move into the _invalid state_. .Valid Usage **** * [[VUID-vkDestroyShaderEXT-None-08481]] The <> feature must: be enabled * [[VUID-vkDestroyShaderEXT-shader-08482]] All submitted commands that refer to pname:shader must: have completed execution * [[VUID-vkDestroyShaderEXT-pAllocator-08483]] If sname:VkAllocationCallbacks were provided when pname:shader was created, a compatible set of callbacks must: be provided here * [[VUID-vkDestroyShaderEXT-pAllocator-08484]] If no sname:VkAllocationCallbacks were provided when pname:shader was created, pname:pAllocator must: be `NULL` **** include::{generated}/validity/protos/vkDestroyShaderEXT.adoc[] -- endif::VK_EXT_shader_object[] [[shader-modules]] == Shader Modules [open,refpage='VkShaderModule',desc='Opaque handle to a shader module object',type='handles'] -- _Shader modules_ contain _shader code_ and one or more entry points. Shaders are selected from a shader module by specifying an entry point as part of <> creation. The stages of a pipeline can: use shaders that come from different modules. The shader code defining a shader module must: be in the SPIR-V format, as described by the <> appendix. Shader modules are represented by sname:VkShaderModule handles: include::{generated}/api/handles/VkShaderModule.adoc[] ifdef::VKSC_VERSION_1_0[] Shader modules are not used in Vulkan SC, but the type has been retained for compatibility <>. In Vulkan SC, the shader modules and pipeline state are supplied to an offline compiler which creates a pipeline cache entry which is loaded at <> creation time. ifdef::hidden[] // tag::scremoved[] * elink:VkStructureType ** ename:VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO <> * elink:VkObjectType ** ename:VK_OBJECT_TYPE_SHADER_MODULE <> * fname:vkCreateShaderModule, fname:vkDestroyShaderModule <> * sname:VkShaderModule, sname:VkShaderModuleCreateInfo <> * tname:VkShaderModuleCreateFlags <> * ename:VkShaderModuleCreateFlagBits <> // end::scremoved[] endif::hidden[] endif::VKSC_VERSION_1_0[] -- ifndef::VKSC_VERSION_1_0[] [open,refpage='vkCreateShaderModule',desc='Creates a new shader module object',type='protos'] -- To create a shader module, call: include::{generated}/api/protos/vkCreateShaderModule.adoc[] * pname:device is the logical device that creates the shader module. * pname:pCreateInfo is a pointer to a slink:VkShaderModuleCreateInfo structure. * pname:pAllocator controls host memory allocation as described in the <> chapter. * pname:pShaderModule is a pointer to a slink:VkShaderModule handle in which the resulting shader module object is returned. Once a shader module has been created, any entry points it contains can: be used in pipeline shader stages as described in <> and <>. 
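For example, a shader module might be created from a SPIR-V binary as in the following non-normative sketch, where code:device, code:spirvWords (a pointer to code:uint32_t SPIR-V words), and code:spirvSizeInBytes are assumptions standing for objects and data provided by the application:

[source,c++]
----
// spirvWords and spirvSizeInBytes are assumed to hold valid SPIR-V code and
// its size in bytes; device is an already-created VkDevice.
VkShaderModuleCreateInfo createInfo = {};
createInfo.sType    = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
createInfo.codeSize = spirvSizeInBytes;  // size of the code, in bytes
createInfo.pCode    = spirvWords;        // pointer to the SPIR-V words

VkShaderModule shaderModule = VK_NULL_HANDLE;
if (vkCreateShaderModule(device, &createInfo, NULL, &shaderModule) != VK_SUCCESS) {
    // handle shader module creation failure
}
----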
ifdef::VK_EXT_graphics_pipeline_libraries,VK_KHR_maintenance5[] [NOTE] .Note ==== If ifdef::VK_EXT_graphics_pipeline_libraries[] the <> feature endif::VK_EXT_graphics_pipeline_libraries[] ifdef::VK_EXT_graphics_pipeline_libraries+VK_KHR_maintenance5[or] ifdef::VK_KHR_maintenance5[] the <> feature endif::VK_KHR_maintenance5[] is enabled, shader module creation can be omitted entirely. Instead, applications should provide the slink:VkShaderModuleCreateInfo structure directly in to pipeline creation by chaining it to slink:VkPipelineShaderStageCreateInfo. This avoids the overhead of creating and managing an additional object. ==== endif::VK_EXT_graphics_pipeline_libraries,VK_KHR_maintenance5[] .Valid Usage **** * [[VUID-vkCreateShaderModule-pCreateInfo-06904]] If pname:pCreateInfo is not `NULL`, pname:pCreateInfo->pNext must: be `NULL` ifdef::VK_EXT_validation_cache[] or a pointer to a slink:VkShaderModuleValidationCacheCreateInfoEXT structure endif::VK_EXT_validation_cache[] **** include::{generated}/validity/protos/vkCreateShaderModule.adoc[] -- [open,refpage='VkShaderModuleCreateInfo',desc='Structure specifying parameters of a newly created shader module',type='structs'] -- :refpage: VkShaderModuleCreateInfo The sname:VkShaderModuleCreateInfo structure is defined as: include::{generated}/api/structs/VkShaderModuleCreateInfo.adoc[] * pname:sType is a elink:VkStructureType value identifying this structure. * pname:pNext is `NULL` or a pointer to a structure extending this structure. * pname:flags is reserved for future use. * pname:codeSize is the size, in bytes, of the code pointed to by pname:pCode. * pname:pCode is a pointer to code that is used to create the shader module. The type and format of the code is determined from the content of the memory addressed by pname:pCode. .Valid Usage **** :prefixCondition: ifdef::VK_NV_glsl_shader[] :prefixCondition: If pCode is a pointer to SPIR-V code, endif::VK_NV_glsl_shader[] include::{chapters}/commonvalidity/shader_create_spv_common.adoc[] ifdef::VK_NV_glsl_shader[] * [[VUID-VkShaderModuleCreateInfo-pCode-07912]] If the apiext:VK_NV_glsl_shader extension is not enabled, pname:pCode must: be a pointer to SPIR-V code * [[VUID-VkShaderModuleCreateInfo-pCode-01379]] If pname:pCode is a pointer to GLSL code, it must: be valid GLSL code written to the `GL_KHR_vulkan_glsl` GLSL extension specification endif::VK_NV_glsl_shader[] * [[VUID-VkShaderModuleCreateInfo-codeSize-01085]] pname:codeSize must: be greater than 0 **** include::{generated}/validity/structs/VkShaderModuleCreateInfo.adoc[] -- [open,refpage='VkShaderModuleCreateFlags',desc='Reserved for future use',type='flags'] -- include::{generated}/api/flags/VkShaderModuleCreateFlags.adoc[] tname:VkShaderModuleCreateFlags is a bitmask type for setting a mask, but is currently reserved for future use. -- ifdef::VK_EXT_validation_cache[] include::{chapters}/VK_EXT_validation_cache/shader-module-validation-cache.adoc[] endif::VK_EXT_validation_cache[] [open,refpage='vkDestroyShaderModule',desc='Destroy a shader module',type='protos'] -- To destroy a shader module, call: include::{generated}/api/protos/vkDestroyShaderModule.adoc[] * pname:device is the logical device that destroys the shader module. * pname:shaderModule is the handle of the shader module to destroy. * pname:pAllocator controls host memory allocation as described in the <> chapter. A shader module can: be destroyed while pipelines created using its shaders are still in use. 
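For example (non-normative), once every pipeline that needs a module has been created, the module can be destroyed even while those pipelines remain in use; code:device and code:shaderModule are assumed to be the objects created earlier:

[source,c++]
----
// Valid even if pipelines created from shaderModule are still in use.
vkDestroyShaderModule(device, shaderModule, NULL);
----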
.Valid Usage **** * [[VUID-vkDestroyShaderModule-shaderModule-01092]] If sname:VkAllocationCallbacks were provided when pname:shaderModule was created, a compatible set of callbacks must: be provided here * [[VUID-vkDestroyShaderModule-shaderModule-01093]] If no sname:VkAllocationCallbacks were provided when pname:shaderModule was created, pname:pAllocator must: be `NULL` **** include::{generated}/validity/protos/vkDestroyShaderModule.adoc[] -- endif::VKSC_VERSION_1_0[] ifdef::VK_EXT_shader_module_identifier[] [[shaders-identifiers]] == Shader Module Identifiers [open,refpage='vkGetShaderModuleIdentifierEXT',desc='Query a unique identifier for a shader module',type='protos'] -- Shader modules have unique identifiers associated with them. To query an implementation provided identifier, call: include::{generated}/api/protos/vkGetShaderModuleIdentifierEXT.adoc[] * pname:device is the logical device that created the shader module. * pname:shaderModule is the handle of the shader module. * pname:pIdentifier is a pointer to the returned slink:VkShaderModuleIdentifierEXT. The identifier returned by the implementation must: only depend on pname:shaderIdentifierAlgorithmUUID and information provided in the slink:VkShaderModuleCreateInfo which created pname:shaderModule. The implementation may: return equal identifiers for two different slink:VkShaderModuleCreateInfo structures if the difference does not affect pipeline compilation. Identifiers are only meaningful on different slink:VkDevice objects if the device the identifier was queried from had the same <> as the device consuming the identifier. .Valid Usage **** * [[VUID-vkGetShaderModuleIdentifierEXT-shaderModuleIdentifier-06884]] <> feature must: be enabled **** include::{generated}/validity/protos/vkGetShaderModuleIdentifierEXT.adoc[] -- [open,refpage='vkGetShaderModuleCreateInfoIdentifierEXT',desc='Query a unique identifier for a shader module create info',type='protos'] -- slink:VkShaderModuleCreateInfo structures have unique identifiers associated with them. To query an implementation provided identifier, call: include::{generated}/api/protos/vkGetShaderModuleCreateInfoIdentifierEXT.adoc[] * pname:device is the logical device that can: create a slink:VkShaderModule from pname:pCreateInfo. * pname:pCreateInfo is a pointer to a slink:VkShaderModuleCreateInfo structure. * pname:pIdentifier is a pointer to the returned slink:VkShaderModuleIdentifierEXT. The identifier returned by implementation must: only depend on pname:shaderIdentifierAlgorithmUUID and information provided in the slink:VkShaderModuleCreateInfo. The implementation may: return equal identifiers for two different slink:VkShaderModuleCreateInfo structures if the difference does not affect pipeline compilation. Identifiers are only meaningful on different slink:VkDevice objects if the device the identifier was queried from had the same <> as the device consuming the identifier. The identifier returned by the implementation in flink:vkGetShaderModuleCreateInfoIdentifierEXT must: be equal to the identifier returned by flink:vkGetShaderModuleIdentifierEXT given equivalent definitions of slink:VkShaderModuleCreateInfo and any chained pname:pNext structures. 
.Valid Usage **** * [[VUID-vkGetShaderModuleCreateInfoIdentifierEXT-shaderModuleIdentifier-06885]] <> feature must: be enabled **** include::{generated}/validity/protos/vkGetShaderModuleCreateInfoIdentifierEXT.adoc[] -- [open,refpage='VkShaderModuleIdentifierEXT',desc='A unique identifier for a shader module',type='structs'] -- slink:VkShaderModuleIdentifierEXT represents a shader module identifier returned by the implementation. include::{generated}/api/structs/VkShaderModuleIdentifierEXT.adoc[] * pname:sType is a elink:VkStructureType value identifying this structure. * pname:pNext is `NULL` or a pointer to a structure extending this structure. * pname:identifierSize is the size, in bytes, of valid data returned in pname:identifier. * pname:identifier is a buffer of opaque data specifying an identifier. Any returned values beyond the first pname:identifierSize bytes are undefined:. Implementations must: return an pname:identifierSize greater than 0, and less-or-equal to ename:VK_MAX_SHADER_MODULE_IDENTIFIER_SIZE_EXT. Two identifiers are considered equal if pname:identifierSize is equal and the first pname:identifierSize bytes of pname:identifier compare equal. Implementations may: return a different pname:identifierSize for different modules. Implementations should: ensure that pname:identifierSize is large enough to uniquely define a shader module. include::{generated}/validity/structs/VkShaderModuleIdentifierEXT.adoc[] -- [open,refpage='VK_MAX_SHADER_MODULE_IDENTIFIER_SIZE_EXT',desc='Maximum length of a shader module identifier',type='consts'] -- ename:VK_MAX_SHADER_MODULE_IDENTIFIER_SIZE_EXT is the length in bytes of a shader module identifier, as returned in slink:VkShaderModuleIdentifierEXT::pname:identifierSize. include::{generated}/api/enums/VK_MAX_SHADER_MODULE_IDENTIFIER_SIZE_EXT.adoc[] -- endif::VK_EXT_shader_module_identifier[] [[shaders-binding]] == Binding Shaders Before a shader can be used it must: be first bound to the command buffer. Calling flink:vkCmdBindPipeline binds all stages corresponding to the elink:VkPipelineBindPoint. 
ifdef::VK_EXT_shader_object[] Calling flink:vkCmdBindShadersEXT binds all stages in pname:pStages endif::VK_EXT_shader_object[] The following table describes the relationship between shader stages and pipeline bind points: [cols="1,1,1"] |==== |Shader stage |Pipeline bind point | behavior controlled a| * ename:VK_SHADER_STAGE_VERTEX_BIT * ename:VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT * ename:VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT * ename:VK_SHADER_STAGE_GEOMETRY_BIT * ename:VK_SHADER_STAGE_FRAGMENT_BIT ifdef::VK_EXT_mesh_shader[] * ename:VK_SHADER_STAGE_TASK_BIT_EXT * ename:VK_SHADER_STAGE_MESH_BIT_EXT endif::VK_EXT_mesh_shader[] ifndef::VK_EXT_mesh_shader[] ifdef::VK_NV_mesh_shader[] * ename:VK_SHADER_STAGE_TASK_BIT_NV * ename:VK_SHADER_STAGE_MESH_BIT_NV endif::VK_NV_mesh_shader[] endif::VK_EXT_mesh_shader[] | ename:VK_PIPELINE_BIND_POINT_GRAPHICS | all <> a| * ename:VK_SHADER_STAGE_COMPUTE_BIT | ename:VK_PIPELINE_BIND_POINT_COMPUTE | all <> ifdef::VK_NV_ray_tracing,VK_KHR_ray_tracing_pipeline[] a| * ename:VK_SHADER_STAGE_ANY_HIT_BIT_KHR * ename:VK_SHADER_STAGE_CALLABLE_BIT_KHR * ename:VK_SHADER_STAGE_CLOSEST_HIT_BIT_KHR * ename:VK_SHADER_STAGE_INTERSECTION_BIT_KHR * ename:VK_SHADER_STAGE_MISS_BIT_KHR * ename:VK_SHADER_STAGE_RAYGEN_BIT_KHR | ename:VK_PIPELINE_BIND_POINT_RAY_TRACING_KHR | flink:vkCmdTraceRaysKHR and flink:vkCmdTraceRaysIndirectKHR endif::VK_NV_ray_tracing,VK_KHR_ray_tracing_pipeline[] ifdef::VK_HUAWEI_subpass_shading[] a| * ename:VK_SHADER_STAGE_SUBPASS_SHADING_BIT_HUAWEI * ename:VK_SHADER_STAGE_CLUSTER_CULLING_BIT_HUAWEI | ename:VK_PIPELINE_BIND_POINT_SUBPASS_SHADING_HUAWEI | flink:vkCmdSubpassShadingHUAWEI endif::VK_HUAWEI_subpass_shading[] ifdef::VK_AMDX_shader_enqueue[] a| * ename:VK_SHADER_STAGE_COMPUTE_BIT | ename:VK_PIPELINE_BIND_POINT_EXECUTION_GRAPH_AMDX | all <> endif::VK_AMDX_shader_enqueue[] |==== [[shaders-execution]] == Shader Execution At each stage of the pipeline, multiple invocations of a shader may: execute simultaneously. Further, invocations of a single shader produced as the result of different commands may: execute simultaneously. The relative execution order of invocations of the same shader type is undefined:. Shader invocations may: complete in a different order than that in which the primitives they originated from were drawn or dispatched by the application. However, fragment shader outputs are written to attachments in <>. The relative execution order of invocations of different shader types is largely undefined:. However, when invoking a shader whose inputs are generated from a previous pipeline stage, the shader invocations from the previous stage are guaranteed to have executed far enough to generate input values for all required inputs. [[shaders-termination]] === Shader Termination A shader invocation that is _terminated_ has finished executing instructions. Executing code:OpReturn in the entry point, or executing code:OpTerminateInvocation in any function will terminate an invocation. Implementations may: also terminate a shader invocation when code:OpKill is executed in any function; otherwise it becomes a <>. In addition to the above conditions, <> are terminated when all non-helper invocations in the same <> either terminate or become <> via ifdef::VK_EXT_shader_demote_to_helper_invocation[] code:OpDemoteToHelperInvocationEXT or endif::VK_EXT_shader_demote_to_helper_invocation[] code:OpKill. A shader stage for a given command completes execution when all invocations for that stage have terminated. 
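[NOTE]
.Note
====
For illustration only, the following GLSL fragment shader sketch terminates an invocation early.
The binding and variable names are illustrative.
With the `GL_EXT_terminate_invocation` GLSL extension the statement maps to code:OpTerminateInvocation; the traditional `discard` statement has historically been translated to code:OpKill instead.

[source,glsl]
----
#version 450
#extension GL_EXT_terminate_invocation : require

layout(set = 0, binding = 0) uniform sampler2D uTexture;
layout(location = 0) in vec2 inUV;
layout(location = 0) out vec4 outColor;

void main()
{
    vec4 texel = texture(uTexture, inUV);
    if (texel.a < 0.5) {
        // Maps to OpTerminateInvocation: this invocation stops executing
        // instructions and writes no further outputs.
        terminateInvocation;
    }
    outColor = texel;
}
----
====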
[[shaders-execution-memory-ordering]] == Shader Memory Access Ordering The order in which image or buffer memory is read or written by shaders is largely undefined:. For some shader types (vertex, tessellation evaluation, and in some cases, fragment), even the number of shader invocations that may: perform loads and stores is undefined:. In particular, the following rules apply: * <> and <> shaders will be invoked at least once for each unique vertex, as defined in those sections. * <> shaders will be invoked zero or more times, as defined in that section. * The relative execution order of invocations of the same shader type is undefined:. A store issued by a shader when working on primitive B might complete prior to a store for primitive A, even if primitive A is specified prior to primitive B. This applies even to fragment shaders; while fragment shader outputs are always written to the framebuffer in <>, stores executed by fragment shader invocations are not. * The relative execution order of invocations of different shader types is largely undefined:. [NOTE] .Note ==== The above limitations on shader invocation order make some forms of synchronization between shader invocations within a single set of primitives unimplementable. For example, having one invocation poll memory written by another invocation assumes that the other invocation has been launched and will complete its writes in finite time. ==== ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] The <> appendix defines the terminology and rules for how to correctly communicate between shader invocations, such as when a write is <> a read, and what constitutes a <>. Applications must: not cause a data race. endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] ifndef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] Stores issued to different memory locations within a single shader invocation may: not be visible to other invocations, or may: not become visible in the order they were performed. The code:OpMemoryBarrier instruction can: be used to provide stronger ordering of reads and writes performed by a single invocation. code:OpMemoryBarrier guarantees that any memory transactions issued by the shader invocation prior to the instruction complete prior to the memory transactions issued after the instruction. Memory barriers are needed for algorithms that require multiple invocations to access the same memory and require the operations to be performed in a partially-defined relative order. For example, if one shader invocation does a series of writes, followed by an code:OpMemoryBarrier instruction, followed by another write, then the results of the series of writes before the barrier become visible to other shader invocations at a time earlier or equal to when the results of the final write become visible to those invocations. In practice it means that another invocation that sees the results of the final write would also see the previous writes. Without the memory barrier, the final write may: be visible before the previous writes. Writes that are the result of shader stores through a variable decorated with code:Coherent automatically have available writes to the same buffer, buffer view, or image view made visible to them, and are themselves automatically made available to access by the same buffer, buffer view, or image view. Reads that are the result of shader loads through a variable decorated with code:Coherent automatically have available writes to the same buffer, buffer view, or image view made visible to them. 
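[NOTE]
.Note
====
As a non-normative sketch of the behavior described above (buffer layouts, bindings, and names are illustrative), the following GLSL compute shader orders a series of writes before a flag write using a memory barrier, which maps to code:OpMemoryBarrier.
The barrier only constrains the order in which this invocation's writes become visible; it does not by itself guarantee that any other invocation will observe them.

[source,glsl]
----
#version 450
layout(local_size_x = 64) in;

// Declared coherent so that stores are automatically made available to, and
// loads automatically see available writes from, other invocations accessing
// the same buffer.
layout(std430, set = 0, binding = 0) coherent buffer Payload { uint payload[]; };
layout(std430, set = 0, binding = 1) coherent buffer Flags   { uint flag[];    };

void main()
{
    uint i = gl_GlobalInvocationID.x;

    payload[i] = i * 2u;   // series of writes ...
    memoryBarrierBuffer(); // ... made visible no later than the write below
    flag[i] = 1u;          // an observer that sees this write also sees payload[i]
}
----
====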
The order in which coherent writes to different locations become available is undefined:, unless enforced by a memory barrier instruction or other memory dependency.

[NOTE]
.Note
====
Explicit memory dependencies must: still be used to guarantee availability and visibility for access via other buffers, buffer views, or image views.
====

The built-in atomic memory transaction instructions can: be used to read and write a given memory address atomically.
While built-in atomic functions issued by multiple shader invocations are executed in undefined: order relative to each other, these functions perform both a read and a write of a memory address and guarantee that no other memory transaction will write to the underlying memory between the read and write.
Atomic operations ensure automatic availability and visibility for writes and reads in the same way as those to code:Coherent variables.

[NOTE]
.Note
====
Memory accesses performed on different resource descriptors with the same memory backing may: not be well-defined even with the code:Coherent decoration or via atomics, due to things such as image layouts or ownership of the resource - as described in the <> chapter.
====

[NOTE]
.Note
====
Atomics allow shaders to use shared global addresses for mutual exclusion or as counters, among other uses.
====
endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]

The SPIR-V *SubgroupMemory*, *CrossWorkgroupMemory*, and *AtomicCounterMemory* memory semantics are ignored.
Sequentially consistent atomics and barriers are not supported and *SequentiallyConsistent* is treated as *AcquireRelease*.
*SequentiallyConsistent* should: not be used.

[[shaders-inputs]]
== Shader Inputs and Outputs

Data is passed into and out of shaders using variables with input or output storage class, respectively.
User-defined inputs and outputs are connected between stages by matching their code:Location decorations.
Additionally, data can: be provided by or communicated to special functions provided by the execution environment using code:BuiltIn decorations.

In many cases, the same code:BuiltIn decoration can: be used in multiple shader stages with similar meaning.
The specific behavior of variables decorated as code:BuiltIn is documented in the following sections.

ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[]
[[shaders-task]]
== Task Shaders

Task shaders operate in conjunction with mesh shaders to produce a collection of primitives that will be processed by subsequent stages of the graphics pipeline.
Their primary purpose is to create a variable number of subsequent mesh shader invocations.

Task shaders are invoked via the execution of the <> pipeline.

The task shader has no fixed-function inputs other than variables identifying the specific workgroup and invocation.
ifdef::VK_NV_mesh_shader[]
In the code:TaskNV {ExecutionModel} the number of mesh shader workgroups to create is specified via a code:TaskCountNV decorated output variable.
endif::VK_NV_mesh_shader[]
ifdef::VK_EXT_mesh_shader[]
In the code:TaskEXT {ExecutionModel} the number of mesh shader workgroups to create is specified via the code:OpEmitMeshTasksEXT instruction.
endif::VK_EXT_mesh_shader[]
The task shader can write additional outputs to task memory, which can be read by all of the mesh shader workgroups it created.

=== Task Shader Execution

Task workloads are formed from groups of work items called workgroups and processed by the task shader in the current graphics pipeline.
A workgroup is a collection of shader invocations that execute the same shader, potentially in parallel. Task shaders execute in _global workgroups_ which are divided into a number of _local workgroups_ with a size that can: be set by assigning a value to the code:LocalSize ifdef::VK_VERSION_1_3,VK_KHR_maintenance4[or code:LocalSizeId] execution mode or via an object decorated by the code:WorkgroupSize decoration. An invocation within a local workgroup can: share data with other members of the local workgroup through shared variables and issue memory and control flow barriers to synchronize with other members of the local workgroup. ifdef::VK_EXT_mesh_shader[] ifdef::VK_VERSION_1_1,VK_KHR_multiview[] If the subpass includes multiple views in its view mask, a Task shader using code:TaskEXT {ExecutionModel} may: be invoked separately for each view. endif::VK_VERSION_1_1,VK_KHR_multiview[] endif::VK_EXT_mesh_shader[] [[shaders-mesh]] == Mesh Shaders Mesh shaders operate in workgroups to produce a collection of primitives that will be processed by subsequent stages of the graphics pipeline. Each workgroup emits zero or more output primitives and the group of vertices and their associated data required for each output primitive. Mesh shaders are invoked via the execution of the <> pipeline. The only inputs available to the mesh shader are variables identifying the specific workgroup and invocation and, if applicable, any outputs written to task memory by the task shader that spawned the mesh shader's workgroup. The mesh shader can operate without a task shader as well. The invocations of the mesh shader workgroup write an output mesh, comprising a set of primitives with per-primitive attributes, a set of vertices with per-vertex attributes, and an array of indices identifying the mesh vertices that belong to each primitive. The primitives of this mesh are then processed by subsequent graphics pipeline stages, where the outputs of the mesh shader form an interface with the fragment shader. === Mesh Shader Execution Mesh workloads are formed from groups of work items called workgroups and processed by the mesh shader in the current graphics pipeline. A workgroup is a collection of shader invocations that execute the same shader, potentially in parallel. Mesh shaders execute in _global workgroups_ which are divided into a number of _local workgroups_ with a size that can: be set by assigning a value to the code:LocalSize ifdef::VK_VERSION_1_3,VK_KHR_maintenance4[or code:LocalSizeId] execution mode or via an object decorated by the code:WorkgroupSize decoration. An invocation within a local workgroup can: share data with other members of the local workgroup through shared variables and issue memory and control flow barriers to synchronize with other members of the local workgroup. The _global workgroups_ may be generated explicitly via the API, or implicitly through the task shader's work creation mechanism. endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[] ifdef::VK_EXT_mesh_shader[] ifdef::VK_VERSION_1_1,VK_KHR_multiview[] If the subpass includes multiple views in its view mask, a Mesh shader using code:MeshEXT {ExecutionModel} may: be invoked separately for each view. endif::VK_VERSION_1_1,VK_KHR_multiview[] endif::VK_EXT_mesh_shader[] ifdef::VK_HUAWEI_cluster_culling_shader[] [[shaders-cluster-culling]] == Cluster Culling Shaders Cluster Culling shaders are invoked via the execution of the <> pipeline. 
The only inputs available to the cluster culling shader are variables identifying the specific workgroup and invocation.

Cluster Culling shaders operate in workgroups to perform cluster-based culling and produce zero or more cluster drawing commands that will be processed by subsequent stages of the graphics pipeline.
A cluster drawing command (CDC) is very similar to a multi-draw indirect (MDI) command: the invocations in a workgroup can emit zero or more CDCs to draw zero or more visible clusters.

=== Cluster Culling Shader Execution

Cluster Culling workloads are formed from groups of work items called workgroups and processed by the cluster culling shader in the current graphics pipeline.
A workgroup is a collection of shader invocations that execute the same shader, potentially in parallel.
Cluster Culling shaders execute in _global workgroups_ which are divided into a number of _local workgroups_ with a size that can: be set by assigning a value to the code:LocalSize ifdef::VK_VERSION_1_3,VK_KHR_maintenance4[or code:LocalSizeId] execution mode or via an object decorated by the code:WorkgroupSize decoration.
An invocation within a local workgroup can: share data with other members of the local workgroup through shared variables and issue memory and control flow barriers to synchronize with other members of the local workgroup.
endif::VK_HUAWEI_cluster_culling_shader[]

[[shaders-vertex]]
== Vertex Shaders

Each vertex shader invocation operates on one vertex and its associated <> data, and outputs one vertex and associated data.
ifndef::VK_NV_mesh_shader,VK_EXT_mesh_shader[]
Graphics pipelines must: include a vertex shader, and the vertex shader stage is always the first shader stage in the graphics pipeline.
endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[]
ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[]
Graphics pipelines using primitive shading must: include a vertex shader, and the vertex shader stage is always the first shader stage in the graphics pipeline.
endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[]

[[shaders-vertex-execution]]
=== Vertex Shader Execution

A vertex shader must: be executed at least once for each vertex specified by a drawing command.
ifdef::VK_VERSION_1_1,VK_KHR_multiview[]
If the subpass includes multiple views in its view mask, the shader may: be invoked separately for each view.
endif::VK_VERSION_1_1,VK_KHR_multiview[]
During execution, the shader is presented with the index of the vertex and instance for which it has been invoked.
Input variables declared in the vertex shader are filled by the implementation with the values of vertex attributes associated with the invocation being executed.

If the same vertex is specified multiple times in a drawing command (e.g. by including the same index value multiple times in an index buffer) the implementation may: reuse the results of vertex shading if it can statically determine that the vertex shader invocations will produce identical results.

[NOTE]
.Note
====
It is implementation-dependent when and if results of vertex shading are reused, and thus how many times the vertex shader will be executed.
This is true also if the vertex shader contains stores or atomic operations (see <>).
====

[[shaders-tessellation-control]]
== Tessellation Control Shaders

The tessellation control shader is used to read an input patch provided by the application and to produce an output patch.
Each tessellation control shader invocation operates on an input patch (after all control points in the patch are processed by a vertex shader) and its associated data, and outputs a single control point of the output patch and its associated data, and can: also output additional per-patch data. The input patch is sized according to the pname:patchControlPoints member of slink:VkPipelineTessellationStateCreateInfo, as part of input assembly. ifdef::VK_EXT_extended_dynamic_state2,VK_EXT_shader_object[] The input patch can also be dynamically sized with pname:patchControlPoints parameter of flink:vkCmdSetPatchControlPointsEXT. [open,refpage='vkCmdSetPatchControlPointsEXT',desc='Specify the number of control points per patch dynamically for a command buffer',type='protos'] -- To <> the number of control points per patch, call: include::{generated}/api/protos/vkCmdSetPatchControlPointsEXT.adoc[] * pname:commandBuffer is the command buffer into which the command will be recorded. * pname:patchControlPoints specifies the number of control points per patch. This command sets the number of control points per patch for subsequent drawing commands ifdef::VK_EXT_shader_object[] ifdef::VK_EXT_extended_dynamic_state2[when drawing using <>, or] ifndef::VK_EXT_extended_dynamic_state2[when drawing using <>.] endif::VK_EXT_shader_object[] ifdef::VK_EXT_extended_dynamic_state2[] when the graphics pipeline is created with ename:VK_DYNAMIC_STATE_PATCH_CONTROL_POINTS_EXT set in slink:VkPipelineDynamicStateCreateInfo::pname:pDynamicStates. endif::VK_EXT_extended_dynamic_state2[] Otherwise, this state is specified by the slink:VkPipelineTessellationStateCreateInfo::pname:patchControlPoints value used to create the currently active pipeline. :refpage: vkCmdSetPatchControlPointsEXT :requiredfeature: extendedDynamicState2PatchControlPoints .Valid Usage **** include::{chapters}/commonvalidity/dynamic_state2_optional_feature_common.adoc[] * [[VUID-vkCmdSetPatchControlPointsEXT-patchControlPoints-04874]] pname:patchControlPoints must: be greater than zero and less than or equal to sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize **** include::{generated}/validity/protos/vkCmdSetPatchControlPointsEXT.adoc[] -- endif::VK_EXT_extended_dynamic_state2,VK_EXT_shader_object[] The size of the output patch is controlled by the code:OpExecutionMode code:OutputVertices specified in the tessellation control or tessellation evaluation shaders, which must: be specified in at least one of the shaders. The size of the input and output patches must: each be greater than zero and less than or equal to sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize. [[shaders-tessellation-control-execution]] === Tessellation Control Shader Execution A tessellation control shader is invoked at least once for each _output_ vertex in a patch. ifdef::VK_VERSION_1_1,VK_KHR_multiview[] If the subpass includes multiple views in its view mask, the shader may: be invoked separately for each view. endif::VK_VERSION_1_1,VK_KHR_multiview[] Inputs to the tessellation control shader are generated by the vertex shader. Each invocation of the tessellation control shader can: read the attributes of any incoming vertices and their associated data. The invocations corresponding to a given patch execute logically in parallel, with undefined: relative execution order. 
However, the code:OpControlBarrier instruction can: be used to provide limited control of the execution order by synchronizing invocations within a patch, effectively dividing tessellation control shader execution into a set of phases. Tessellation control shaders will read undefined: values if one invocation reads a per-vertex or per-patch output written by another invocation at any point during the same phase, or if two invocations attempt to write different values to the same per-patch output in a single phase. [[shaders-tessellation-evaluation]] == Tessellation Evaluation Shaders The Tessellation Evaluation Shader operates on an input patch of control points and their associated data, and a single input barycentric coordinate indicating the invocation's relative position within the subdivided patch, and outputs a single vertex and its associated data. [[shaders-tessellation-evaluation-execution]] === Tessellation Evaluation Shader Execution A tessellation evaluation shader is invoked at least once for each unique vertex generated by the tessellator. ifdef::VK_VERSION_1_1,VK_KHR_multiview[] If the subpass includes multiple views in its view mask, the shader may: be invoked separately for each view. endif::VK_VERSION_1_1,VK_KHR_multiview[] [[shaders-geometry]] == Geometry Shaders The geometry shader operates on a group of vertices and their associated data assembled from a single input primitive, and emits zero or more output primitives and the group of vertices and their associated data required for each output primitive. [[shaders-geometry-execution]] === Geometry Shader Execution A geometry shader is invoked at least once for each primitive produced by the tessellation stages, or at least once for each primitive generated by <> when tessellation is not in use. A shader can request that the geometry shader runs multiple <>. A geometry shader is invoked at least once for each instance. ifdef::VK_VERSION_1_1,VK_KHR_multiview[] If the subpass includes multiple views in its view mask, the shader may: be invoked separately for each view. endif::VK_VERSION_1_1,VK_KHR_multiview[] [[shaders-fragment]] == Fragment Shaders Fragment shaders are invoked as a <> in a graphics pipeline. Each fragment shader invocation operates on a single fragment and its associated data. With few exceptions, fragment shaders do not have access to any data associated with other fragments and are considered to execute in isolation of fragment shader invocations associated with other fragments. [[shaders-compute]] == Compute Shaders Compute shaders are invoked via flink:vkCmdDispatch and flink:vkCmdDispatchIndirect commands. In general, they have access to similar resources as shader stages executing as part of a graphics pipeline. Compute workloads are formed from groups of work items called workgroups and processed by the compute shader in the current compute pipeline. A workgroup is a collection of shader invocations that execute the same shader, potentially in parallel. Compute shaders execute in _global workgroups_ which are divided into a number of _local workgroups_ with a size that can: be set by assigning a value to the code:LocalSize ifdef::VK_VERSION_1_3,VK_KHR_maintenance4[or code:LocalSizeId] execution mode or via an object decorated by the code:WorkgroupSize decoration. An invocation within a local workgroup can: share data with other members of the local workgroup through shared variables and issue memory and control flow barriers to synchronize with other members of the local workgroup. 
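[NOTE]
.Note
====
The following GLSL compute shader sketch (descriptor binding, buffer layout, and names are illustrative, and the dispatch is assumed to provide one value per global invocation) shows the behavior described above: a fixed local workgroup size, data shared through a code:Workgroup storage class variable, and memory and control flow barriers used to synchronize the local workgroup.

[source,glsl]
----
#version 450
layout(local_size_x = 64) in;   // corresponds to the LocalSize execution mode

layout(std430, set = 0, binding = 0) buffer Data {
    uint totalSum;
    uint values[];
};

shared uint localSum;           // Workgroup storage class

void main()
{
    if (gl_LocalInvocationIndex == 0u) {
        localSum = 0u;
    }
    // Make the initialization visible to, and wait for, the whole local workgroup.
    memoryBarrierShared();
    barrier();

    atomicAdd(localSum, values[gl_GlobalInvocationID.x]);

    memoryBarrierShared();
    barrier();

    if (gl_LocalInvocationIndex == 0u) {
        atomicAdd(totalSum, localSum);
    }
}
----
====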
ifdef::VK_NV_ray_tracing,VK_KHR_ray_tracing_pipeline[]
[[shaders-raytracing-shaders]]
[[shaders-ray-generation]]
== Ray Generation Shaders

A ray generation shader is similar to a compute shader.
Its main purpose is to execute ray tracing queries using code:OpTraceRayKHR instructions and process the results.

[[shaders-ray-generation-execution]]
=== Ray Generation Shader Execution

One ray generation shader is executed per ray tracing dispatch.
Its location in the shader binding table (see <> for details) is passed directly into
ifdef::VK_KHR_ray_tracing_pipeline[]
flink:vkCmdTraceRaysKHR using the pname:pRaygenShaderBindingTable parameter
endif::VK_KHR_ray_tracing_pipeline[]
ifdef::VK_KHR_ray_tracing_pipeline+VK_NV_ray_tracing[or]
ifdef::VK_NV_ray_tracing[]
flink:vkCmdTraceRaysNV using the pname:raygenShaderBindingTableBuffer and pname:raygenShaderBindingOffset parameters
endif::VK_NV_ray_tracing[]
.

[[shaders-intersection]]
== Intersection Shaders

Intersection shaders enable the implementation of arbitrary, application-defined geometric primitives.
An intersection shader for a primitive is executed whenever its axis-aligned bounding box is hit by a ray.

Like other ray tracing shader domains, an intersection shader operates on a single ray at a time.
It also operates on a single primitive at a time.
It is therefore the purpose of an intersection shader to compute the ray-primitive intersections and report them.
To report an intersection, the shader calls the code:OpReportIntersectionKHR instruction.

An intersection shader communicates with any-hit and closest hit shaders by generating attribute values that they can: read.
Intersection shaders cannot: read or modify the ray payload.

[[shaders-intersection-execution]]
=== Intersection Shader Execution

The order in which intersections are found along a ray, and therefore the order in which intersection shaders are executed, is unspecified.

The intersection shader of the closest AABB which intersects the ray is guaranteed to be executed at some point during traversal, unless the ray is forcibly terminated.

[[shaders-any-hit]]
== Any-Hit Shaders

The any-hit shader is executed after the intersection shader reports an intersection that lies within the current [eq]#[t~min~,t~max~]# of the ray.
The main use of any-hit shaders is to programmatically decide whether or not an intersection will be accepted.
The intersection will be accepted unless the shader calls the code:OpIgnoreIntersectionKHR instruction.
Any-hit shaders have read-only access to the attributes generated by the corresponding intersection shader, and can: read or modify the ray payload.

[[shaders-any-hit-execution]]
=== Any-Hit Shader Execution

The order in which intersections are found along a ray, and therefore the order in which any-hit shaders are executed, is unspecified.

The any-hit shader of the closest hit is guaranteed to be executed at some point during traversal, unless the ray is forcibly terminated.

[[shaders-closest-hit]]
== Closest Hit Shaders

Closest hit shaders have read-only access to the attributes generated by the corresponding intersection shader, and can: read or modify the ray payload.
They also have access to a number of system-generated values.
Closest hit shaders can: call code:OpTraceRayKHR to recursively trace rays.

[[shaders-closest-hit-execution]]
=== Closest Hit Shader Execution

Exactly one closest hit shader is executed when traversal is finished and an intersection has been found and accepted.
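[NOTE]
.Note
====
As an informal example of the interfaces described above, the following GLSL closest hit shader sketch (written against the `GL_EXT_ray_tracing` GLSL extension; the payload location and the meaning given to the attributes are illustrative) reads the attributes produced by the accepted intersection and writes the ray payload.
A closest hit shader could additionally call code:OpTraceRayKHR (`traceRayEXT` in GLSL) to trace further rays recursively.

[source,glsl]
----
#version 460
#extension GL_EXT_ray_tracing : require

// Incoming ray payload, shared with the shader that traced this ray.
layout(location = 0) rayPayloadInEXT vec3 hitValue;

// Attributes generated for the accepted hit; for triangle geometry these are
// the barycentric coordinates of the hit point.
hitAttributeEXT vec2 attribs;

void main()
{
    hitValue = vec3(1.0 - attribs.x - attribs.y, attribs.x, attribs.y);
}
----
====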
[[shaders-miss]] == Miss Shaders Miss shaders can: access the ray payload and can: trace new rays through the code:OpTraceRayKHR instruction, but cannot: access attributes since they are not associated with an intersection. [[shaders-miss-execution]] === Miss Shader Execution A miss shader is executed instead of a closest hit shader if no intersection was found during traversal. [[shaders-callable]] == Callable Shaders Callable shaders can: access a callable payload that works similarly to ray payloads to do subroutine work. [[shaders-callable-execution]] === Callable Shader Execution A callable shader is executed by calling code:OpExecuteCallableKHR from an allowed shader stage. endif::VK_NV_ray_tracing,VK_KHR_ray_tracing_pipeline[] [[shaders-interpolation-decorations]] == Interpolation Decorations Variables in the code:Input storage class in a fragment shader's interface are interpolated from the values specified by the primitive being rasterized. [NOTE] .Note ==== Interpolation decorations can be present on input and output variables in pre-rasterization shaders but have no effect on the interpolation performed. ifdef::VK_EXT_graphics_pipeline_libraries[] However, when linking graphics pipeline libraries, if the <> limit is not supported, interpolation qualifiers do need to match between the fragment shader input and the last pre-rasterization shader output. endif::VK_EXT_graphics_pipeline_libraries[] ==== An undecorated input variable will be interpolated with perspective-correct interpolation according to the primitive type being rasterized. <> and <> are interpolated in the same way as the primitive's clip coordinates. If the code:NoPerspective decoration is present, linear interpolation is instead used for <> and <>. For points, as there is only a single vertex, input values are never interpolated and instead take the value written for the single vertex. If the code:Flat decoration is present on an input variable, the value is not interpolated, and instead takes its value directly from the <>. Fragment shader inputs that are signed or unsigned integers, integer vectors, or any double-precision floating-point type must: be decorated with code:Flat. Interpolation of input variables is performed at an implementation-defined position within the fragment area being shaded. The position is further constrained as follows: * If the code:Centroid decoration is used, the interpolation position used for the variable must: also fall within the bounds of the primitive being rasterized. * If the code:Sample decoration is used, the interpolation position used for the variable must: be at the position of the sample being shaded by the current fragment shader invocation. * If a sample count of 1 is used, the interpolation position must: be at the center of the fragment area. [NOTE] .Note ==== As code:Centroid restricts the possible interpolation position to the covered area of the primitive, the position can be forced to vary between neighboring fragments when it otherwise would not. Derivatives calculated based on these differing locations can produce inconsistent results compared to undecorated inputs. It is recommended that input variables used in derivative calculations are not decorated with code:Centroid. 
====

ifdef::VK_NV_fragment_shader_barycentric,VK_KHR_fragment_shader_barycentric[]
[[shaders-interpolation-decorations-pervertexkhr]]
If the code:PerVertexKHR decoration is present on an input variable, the value is not interpolated, and instead values from all input vertices are available in an array.
Each index of the array corresponds to one of the vertices of the primitive that produced the fragment.
endif::VK_NV_fragment_shader_barycentric,VK_KHR_fragment_shader_barycentric[]

ifdef::VK_AMD_shader_explicit_vertex_parameter[]
If the code:CustomInterpAMD decoration is present on an input variable, the value cannot: be accessed directly; instead the extended instruction code:InterpolateAtVertexAMD must: be used to obtain values from the input vertices.
endif::VK_AMD_shader_explicit_vertex_parameter[]

[[shaders-staticuse]]
== Static Use

A SPIR-V module declares a global object in memory using the code:OpVariable instruction, which results in a pointer code:x to that object.
A specific entry point in a SPIR-V module is said to _statically use_ that object if that entry point's call tree contains a function containing an instruction with code:x as an code:id operand.
A shader entry point also _statically uses_ any variables explicitly declared in its interface.

[[shaders-scope]]
== Scope

A _scope_ describes a set of shader invocations, where each such set is a _scope instance_.
Each invocation belongs to one or more scope instances, but belongs to no more than one scope instance for each scope.

The operations available between invocations in a given scope instance vary, with smaller scopes generally able to perform more operations, and with greater efficiency.

[[shaders-scope-cross-device]]
=== Cross Device

All invocations executed in a Vulkan instance fall into a single _cross device scope instance_.

Whilst the code:CrossDevice scope is defined in SPIR-V, it is disallowed in Vulkan.
API <> commands can: be used to communicate between devices.

[[shaders-scope-device]]
=== Device

All invocations executed on a single device form a _device scope instance_.

ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
If the <> and <> features are enabled, this scope is represented in SPIR-V by the code:Device code:Scope, which can: be used as a code:Memory code:Scope for barrier and atomic operations.

ifdef::VK_KHR_shader_clock[]
If both the <> and <> features are enabled, using the code:Device code:Scope with the code:OpReadClockKHR instruction will read from a clock that is consistent across invocations in the same device scope instance.
endif::VK_KHR_shader_clock[]
endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]

There is no method to synchronize the execution of these invocations within SPIR-V, and this can: only be done with API synchronization primitives.

ifdef::VK_VERSION_1_1,VK_KHR_device_group[]
Invocations executing on different devices in a device group operate in separate device scope instances.
endif::VK_VERSION_1_1,VK_KHR_device_group[]
ifndef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
The scope only extends to the queue family, not the whole device.
endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]

[[shaders-scope-queue-family]]
=== Queue Family

Invocations executed by queues in a given queue family form a _queue family scope instance_.
This scope is identified in SPIR-V as the ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] code:QueueFamily code:Scope if the <> feature is enabled, or if not, the endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] code:Device code:Scope, which can: be used as a code:Memory code:Scope for barrier and atomic operations. ifdef::VK_KHR_shader_clock[] If the <> feature is enabled, ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] but the <> feature is not enabled, endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] using the code:Device code:Scope with the code:OpReadClockKHR instruction will read from a clock that is consistent across invocations in the same queue family scope instance. endif::VK_KHR_shader_clock[] There is no method to synchronize the execution of these invocations within SPIR-V, and this can: only be done with API synchronization primitives. Each invocation in a queue family scope instance must: be in the same <>. [[shaders-scope-command]] === Command Any shader invocations executed as the result of a single command such as flink:vkCmdDispatch or flink:vkCmdDraw form a _command scope instance_. For indirect drawing commands with pname:drawCount greater than one, invocations from separate draws are in separate command scope instances. ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[] For ray tracing shaders, an invocation group is an implementation-dependent subset of the set of shader invocations of a given shader stage which are produced by a single trace rays command. endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[] There is no specific code:Scope for communication across invocations in a command scope instance. As this has a clear boundary at the API level, coordination here can: be performed in the API, rather than in SPIR-V. Each invocation in a command scope instance must: be in the same <>. For shaders without defined <>, this set of invocations forms an _invocation group_ as defined in the <>. [[shaders-scope-primitive]] === Primitive Any fragment shader invocations executed as the result of rasterization of a single primitive form a _primitive scope instance_. There is no specific code:Scope for communication across invocations in a primitive scope instance. Any generated <> are included in this scope instance. Each invocation in a primitive scope instance must: be in the same <>. Any input variables decorated with code:Flat are uniform within a primitive scope instance. // intentionally no VK_NV_ray_tracing here since this scope does not exist there ifdef::VK_KHR_ray_tracing_pipeline[] [[shaders-scope-shadercall]] === Shader Call Any <> invocations that are executed in one or more ray tracing execution models form a _shader call scope instance_. The code:ShaderCallKHR code:Scope can be used as code:Memory code:Scope for barrier and atomic operations. Each invocation in a shader call scope instance must: be in the same <>. endif::VK_KHR_ray_tracing_pipeline[] [[shaders-scope-workgroup]] === Workgroup A _local workgroup_ is a set of invocations that can synchronize and share data with each other using memory in the code:Workgroup storage class. The code:Workgroup code:Scope can be used as both an code:Execution code:Scope and code:Memory code:Scope for barrier and atomic operations. Each invocation in a local workgroup must: be in the same <>. Only ifdef::VK_NV_mesh_shader,VK_EXT_mesh_shader[] task, mesh, and endif::VK_NV_mesh_shader,VK_EXT_mesh_shader[] compute shaders have defined workgroups - other shader types cannot: use workgroup functionality. 
For shaders that have defined workgroups, this set of invocations forms an _invocation group_ as defined in the <>. [[workgroup-padding]] ifdef::VK_KHR_workgroup_memory_explicit_layout[] When variables declared with the code:Workgroup storage class are explicitly laid out (hence they are also decorated with code:Block), the amount of storage consumed is the size of the largest Block variable, not counting any padding at the end. endif::VK_KHR_workgroup_memory_explicit_layout[] The amount of storage consumed by the ifdef::VK_KHR_workgroup_memory_explicit_layout[] non-Block endif::VK_KHR_workgroup_memory_explicit_layout[] variables declared with the code:Workgroup storage class is implementation-dependent. However, the amount of storage consumed may not exceed the largest block size that would be obtained if all active ifdef::VK_KHR_workgroup_memory_explicit_layout[] non-Block endif::VK_KHR_workgroup_memory_explicit_layout[] variables declared with code:Workgroup storage class were assigned offsets in an arbitrary order by successively taking the smallest valid offset according to the <> rules, and with code:Boolean values considered as 32-bit integer values for the purpose of this calculation. (This is equivalent to using the GLSL std430 layout rules.) ifdef::VK_VERSION_1_1[] [[shaders-scope-subgroup]] === Subgroup A _subgroup_ (see the subsection "`Control Flow`" of section 2 of the SPIR-V 1.3 Revision 1 specification) is a set of invocations that can synchronize and share data with each other efficiently. The code:Subgroup code:Scope can be used as both an code:Execution code:Scope and code:Memory code:Scope for barrier and atomic operations. Other <> allow the use of <> with subgroup scope. ifdef::VK_KHR_shader_clock[] If the <> feature is enabled, using the code:Subgroup code:Scope with the code:OpReadClockKHR instruction will read from a clock that is consistent across invocations in the same subgroup. endif::VK_KHR_shader_clock[] For <>, each invocation in a subgroup must: be in the same <>. In other shader stages, each invocation in a subgroup must: be in the same <>. Only <> have defined subgroups. [NOTE] .Note ==== In shaders, there are two kinds of uniformity that are of primary interest to applications: uniform within an invocation group (a.k.a. dynamically uniform), and uniform within a subgroup scope. While one could make the assumption that being uniform in invocation group implies being uniform in subgroup scope, it is not necessarily the case for shader stages without defined workgroups. For shader stages with defined workgroups however, the relationship between invocation group and subgroup scope is well defined as a subgroup is a subset of the workgroup, and the workgroup is the invocation group. If a value is uniform in invocation group, it is by definition also uniform in subgroup scope. This is important if writing code like: [source,glsl] ---- uniform texture2D Textures[]; uint dynamicallyUniformValue = gl_WorkGroupID.x; vec4 value = texelFetch(Textures[dynamicallyUniformValue], coord, 0); // subgroupUniformValue is guaranteed to be uniform within the subgroup. // This value also happens to be dynamically uniform. vec4 subgroupUniformValue = subgroupBroadcastFirst(dynamicallyUniformValue); ---- In shader stages without defined workgroups, this gets complicated. Due to scoping rules, there is no guarantee that a subgroup is a subset of the invocation group, which in turn defines the scope for dynamically uniform. 
In graphics, the invocation group is a single draw command, except for multi-draw situations, and indirect draws with drawCount > 1, where there are multiple invocation groups, one per code:DrawIndex.

[source,glsl]
----
// Assume SubgroupSize = 8, where 3 draws are packed together.
// Two subgroups were generated.
uniform texture2D Textures[];

// DrawIndex builtin is dynamically uniform
uint dynamicallyUniformValue = gl_DrawID;

// | gl_DrawID = 0 | gl_DrawID = 1 | }
// Subgroup 0: { 0, 0, 0, 0, 1, 1, 1, 1 }
// | gl_DrawID = 2 | gl_DrawID = 1 | }
// Subgroup 1: { 2, 2, 2, 2, 1, 1, 1, 1 }
uint notActuallyDynamicallyUniformAnymore =
    subgroupBroadcastFirst(dynamicallyUniformValue);

// | gl_DrawID = 0 | gl_DrawID = 1 | }
// Subgroup 0: { 0, 0, 0, 0, 0, 0, 0, 0 }
// | gl_DrawID = 2 | gl_DrawID = 1 | }
// Subgroup 1: { 2, 2, 2, 2, 2, 2, 2, 2 }

// Bug. gl_DrawID = 1's invocation group observes both index 0 and 2.
vec4 value = texelFetch(Textures[notActuallyDynamicallyUniformAnymore], coord, 0);
----

Another problematic scenario is when a shader attempts to help the compiler notice that a value is uniform in subgroup scope to potentially improve performance.

[source,glsl]
----
layout(location = 0) flat in uint dynamicallyUniformIndex;

// Vertex shader might have emitted a value that depends only on gl_DrawID,
// making it dynamically uniform.

// Give knowledge to compiler that the flat input is dynamically uniform,
// as this is not a guarantee otherwise.
uint uniformIndex = subgroupBroadcastFirst(dynamicallyUniformIndex);

// Hazard: If different draw commands are packed into one subgroup, the
// uniformIndex is wrong.
DrawData d = UBO.perDrawData[uniformIndex];
----

For implementations where subgroups are packed across draws, the implementation must make sure to handle descriptor indexing correctly.
From the specification's point of view, a dynamically uniform index does not require code:NonUniform decoration, and such an implementation will likely either promote descriptor indexing into code:NonUniform on its own, or handle non-uniformity implicitly.
====
endif::VK_VERSION_1_1[]

[[shaders-scope-quad]]
=== Quad

A _quad scope instance_ is formed of four shader invocations.

In a fragment shader, each invocation in a quad scope instance is formed of invocations in neighboring framebuffer locations [eq]#(x~i~, y~i~)#, where:

* [eq]#i# is the index of the invocation within the scope instance.
* [eq]#w# and [eq]#h# are the number of pixels the fragment covers in the [eq]#x# and [eq]#y# axes.
* [eq]#w# and [eq]#h# are identical for all participating invocations.
* [eq]#(x~0~) = (x~1~ - w) = (x~2~) = (x~3~ - w)#
* [eq]#(y~0~) = (y~1~) = (y~2~ - h) = (y~3~ - h)#
* Each invocation has the same layer and sample indices.

ifdef::VK_NV_compute_shader_derivatives[]
In a compute shader, if the code:DerivativeGroupQuadsNV execution mode is specified, each invocation in a quad scope instance is formed of invocations with adjacent local invocation IDs [eq]#(x~i~, y~i~)#, where:

* [eq]#i# is the index of the invocation within the quad scope instance.
* [eq]#(x~0~) = (x~1~ - 1) = (x~2~) = (x~3~ - 1)#
* [eq]#(y~0~) = (y~1~) = (y~2~ - 1) = (y~3~ - 1)#
* [eq]#x~0~# and [eq]#y~0~# are integer multiples of 2.
* Each invocation has the same [eq]#z# coordinate.

In a compute shader, if the code:DerivativeGroupLinearNV execution mode is specified, each invocation in a quad scope instance is formed of invocations with adjacent local invocation indices [eq]#(l~i~)#, where:

* [eq]#i# is the index of the invocation within the quad scope instance.
* [eq]#(l~0~) = (l~1~ - 1) = (l~2~ - 2) = (l~3~ - 3)# * [eq]#l~0~# is an integer multiple of 4. endif::VK_NV_compute_shader_derivatives[] ifdef::VK_VERSION_1_1[] In all shaders, each invocation in a quad scope instance is formed of invocations in adjacent subgroup invocation indices [eq]#(s~i~)#, where: * [eq]#i# is the index of the invocation within the quad scope instance. * [eq]#(s~0~) = (s~1~ - 1) = (s~2~ - 2) = (s~3~ - 3)# * [eq]#s~0~# is an integer multiple of 4. Each invocation in a quad scope instance must: be in the same <>. endif::VK_VERSION_1_1[] ifndef::VK_VERSION_1_1[] The specific set of invocations that make up a quad scope instance in other shader stages is undefined:. endif::VK_VERSION_1_1[] In a fragment shader, each invocation in a quad scope instance must: be in the same <>. ifndef::VK_VERSION_1_1[] For <>, each invocation in a quad scope instance must: be in the same <>. In other shader stages, each invocation in a quad scope instance must: be in the same <>. endif::VK_VERSION_1_1[] Fragment ifdef::VK_NV_compute_shader_derivatives,VK_VERSION_1_1[] and compute endif::VK_NV_compute_shader_derivatives,VK_VERSION_1_1[] shaders have defined quad scope instances. ifdef::VK_VERSION_1_1[] If the <> limit is supported, any <> also have defined quad scope instances. endif::VK_VERSION_1_1[] ifdef::VK_EXT_fragment_shader_interlock[] [[shaders-scope-fragment-interlock]] === Fragment Interlock A _fragment interlock scope instance_ is formed of fragment shader invocations based on their framebuffer locations [eq]#(x,y,layer,sample)#, executed by commands inside a single <>. The specific set of invocations included varies based on the execution mode as follows: * If the code:SampleInterlockOrderedEXT or code:SampleInterlockUnorderedEXT execution modes are used, only invocations with identical framebuffer locations [eq]#(x,y,layer,sample)# are included. * If the code:PixelInterlockOrderedEXT or code:PixelInterlockUnorderedEXT execution modes are used, fragments with different sample ids are also included. ifdef::VK_NV_shading_rate_image,VK_KHR_fragment_shading_rate[] * If the code:ShadingRateInterlockOrderedEXT or code:ShadingRateInterlockUnorderedEXT execution modes are used, fragments from neighbouring framebuffer locations are also included. The ifdef::VK_NV_shading_rate_image[<>] ifdef::VK_KHR_fragment_shading_rate+VK_NV_shading_rate_image[or] ifdef::VK_KHR_fragment_shading_rate[<>] determines these fragments. endif::VK_NV_shading_rate_image,VK_KHR_fragment_shading_rate[] Only fragment shaders with one of the above execution modes have defined fragment interlock scope instances. There is no specific code:Scope value for communication across invocations in a fragment interlock scope instance. However, this is implicitly used as a memory scope by code:OpBeginInvocationInterlockEXT and code:OpEndInvocationInterlockEXT. Each invocation in a fragment interlock scope instance must: be in the same <>. endif::VK_EXT_fragment_shader_interlock[] [[shaders-scope-invocation]] === Invocation The smallest _scope_ is a single invocation; this is represented by the code:Invocation code:Scope in SPIR-V. Fragment shader invocations must: be in a <>. ifdef::VK_EXT_fragment_shader_interlock[] Invocations in <> must: be in a <>. endif::VK_EXT_fragment_shader_interlock[] Invocations in <> must: be in a <>. ifdef::VK_VERSION_1_1[] Invocations in <> must: be in a <>. endif::VK_VERSION_1_1[] Invocations in <> must: be in a <>. All invocations in all stages must: be in a <>. 
ifdef::VK_VERSION_1_1[] [[shaders-group-operations]] == Group Operations _Group operations_ are executed by multiple invocations within a <>; with each invocation involved in calculating the result. This provides a mechanism for efficient communication between invocations in a particular scope instance. Group operations all take a code:Scope defining the desired <> to operate within. Only the code:Subgroup scope can: be used for these operations; the <> limit defines which types of operation can: be used. [[shaders-group-operations-basic]] === Basic Group Operations Basic group operations include the use of code:OpGroupNonUniformElect, code:OpControlBarrier, code:OpMemoryBarrier, and atomic operations. code:OpGroupNonUniformElect can: be used to choose a single invocation to perform a task for the whole group. Only the invocation with the lowest id in the group will return code:true. The <> appendix defines the operation of barriers and atomics. [[shaders-group-operations-vote]] === Vote Group Operations The vote group operations allow invocations within a group to compare values across a group. The types of votes enabled are: * Do all active group invocations agree that an expression is true? * Do any active group invocations evaluate an expression to true? * Do all active group invocations have the same value of an expression? [NOTE] .Note ==== These operations are useful in combination with control flow in that they allow for developers to check whether conditions match across the group and choose potentially faster code-paths in these cases. ==== [[shaders-group-operations-arithmetic]] === Arithmetic Group Operations The arithmetic group operations allow invocations to perform scans and reductions across a group. The operators supported are add, mul, min, max, and, or, xor. For reductions, every invocation in a group will obtain the cumulative result of these operators applied to all values in the group. For exclusive scans, each invocation in a group will obtain the cumulative result of these operators applied to all values in invocations with a lower index in the group. Inclusive scans are identical to exclusive scans, except the cumulative result includes the operator applied to the value in the current invocation. The order in which these operators are applied is implementation-dependent. [[shaders-group-operations-ballot]] === Ballot Group Operations The ballot group operations allow invocations to perform more complex votes across the group. The ballot functionality allows all invocations within a group to provide a boolean value and get as a result what each invocation provided as their boolean value. The broadcast functionality allows values to be broadcast from an invocation to all other invocations within the group. [[shaders-group-operations-shuffle]] === Shuffle Group Operations The shuffle group operations allow invocations to read values from other invocations within a group. [[shaders-group-operations-shuffle-relative]] === Shuffle Relative Group Operations The shuffle relative group operations allow invocations to read values from other invocations within the group relative to the current invocation in the group. The relative operations supported allow data to be shifted up and down through the invocations within a group. 
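[NOTE]
.Note
====
The following GLSL compute shader fragment sketches the vote, arithmetic, ballot, and relative shuffle operations described above using the `GL_KHR_shader_subgroup_*` GLSL extensions.
The buffer binding, names, and the way the results are combined are illustrative only.

[source,glsl]
----
#version 450
#extension GL_KHR_shader_subgroup_vote : require
#extension GL_KHR_shader_subgroup_arithmetic : require
#extension GL_KHR_shader_subgroup_ballot : require
#extension GL_KHR_shader_subgroup_shuffle_relative : require

layout(local_size_x = 64) in;
layout(std430, set = 0, binding = 0) buffer Data { float values[]; };

void main()
{
    float v = values[gl_GlobalInvocationID.x];

    // Vote: do all active invocations in the subgroup agree?
    bool allPositive = subgroupAll(v > 0.0);

    // Arithmetic: reduction and exclusive scan across the subgroup.
    float subgroupSum = subgroupAdd(v);
    float prefixSum   = subgroupExclusiveAdd(v);

    // Ballot: gather a per-invocation predicate, and broadcast from the
    // lowest-indexed active invocation.
    uvec4 ballot = subgroupBallot(v > 0.0);
    float first  = subgroupBroadcastFirst(v);

    // Relative shuffle: read the value held by the invocation one index lower
    // (undefined for the lowest active invocation).
    float fromBelow = subgroupShuffleUp(v, 1u);

    values[gl_GlobalInvocationID.x] = allPositive
        ? subgroupSum + prefixSum + first + fromBelow
        : float(subgroupBallotBitCount(ballot));
}
----
====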
[[shaders-group-operations-clustered]] === Clustered Group Operations The clustered group operations allow invocations to perform an operation among partitions of a group, such that the operation is only performed within the group invocations within a partition. The partitions for clustered group operations are consecutive power-of-two size groups of invocations and the cluster size must: be known at pipeline creation time. The operations supported are add, mul, min, max, and, or, xor. [[shaders-quad-operations]] == Quad Group Operations Quad group operations (code:OpGroupNonUniformQuad*) are a specialized type of <> that only operate on <>. Whilst these instructions do include a code:Scope parameter, this scope is always overridden; only the <> is included in its execution scope. Fragment shaders that statically execute quad group operations must: launch sufficient invocations to ensure their correct operation; additional <> are launched for framebuffer locations not covered by rasterized fragments if necessary. The index used to select participating invocations is [eq]#i#, as described for a <>, defined as the _quad index_ in the <>. For code:OpGroupNonUniformQuadBroadcast this value is equal to code:Index. For code:OpGroupNonUniformQuadSwap, it is equal to the implicit code:Index used by each participating invocation. endif::VK_VERSION_1_1[] [[shaders-derivative-operations]] == Derivative Operations Derivative operations calculate the partial derivative for an expression [eq]#P# as a function of an invocation's [eq]#x# and [eq]#y# coordinates. Derivative operations operate on a set of invocations known as a _derivative group_ as defined in the <>. A derivative group is equivalent to ifdef::VK_NV_compute_shader_derivatives[] the <> for a compute shader invocation, or endif::VK_NV_compute_shader_derivatives[] the <> for a fragment shader invocation. Derivatives are calculated assuming that [eq]#P# is piecewise linear and continuous within the derivative group. All dynamic instances of explicit derivative instructions (code:OpDPdx*, code:OpDPdy*, and code:OpFwidth*) must: be executed in control flow that is uniform within a derivative group. For other derivative operations, results are undefined: if a dynamic instance is executed in control flow that is not uniform within the derivative group. Fragment shaders that statically execute derivative operations must: launch sufficient invocations to ensure their correct operation; additional <> are launched for framebuffer locations not covered by rasterized fragments if necessary. ifdef::VK_NV_compute_shader_derivatives[] [NOTE] .Note ==== In a compute shader, it is the application's responsibility to ensure that sufficient invocations are launched. ==== endif::VK_NV_compute_shader_derivatives[] Derivative operations calculate their results as the difference between the result of [eq]#P# across invocations in the quad. For fine derivative operations (code:OpDPdxFine and code:OpDPdyFine), the values of [eq]#DPdx(P~i~)# are calculated as {empty}:: [eq]#DPdx(P~0~) = DPdx(P~1~) = P~1~ - P~0~# {empty}:: [eq]#DPdx(P~2~) = DPdx(P~3~) = P~3~ - P~2~# and the values of [eq]#DPdy(P~i~)# are calculated as {empty}:: [eq]#DPdy(P~0~) = DPdy(P~2~) = P~2~ - P~0~# {empty}:: [eq]#DPdy(P~1~) = DPdy(P~3~) = P~3~ - P~1~# where [eq]#i# is the index of each invocation as described in <>. 
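[NOTE]
.Note
====
As an informal, non-normative illustration, the following C sketch applies the fine derivative rules above to a single quad, where `p[i]` stands for the value of [eq]#P# in the invocation with quad index `i`.
All names and values are illustrative only.

[source,c]
----
#include <stdio.h>

int main(void)
{
    /* Value of P in each invocation of the quad, indexed by quad index i. */
    float p[4] = { 1.0f, 4.0f, 2.0f, 7.0f };
    float dpdx[4], dpdy[4];

    /* Fine x derivatives: one value per horizontally adjacent pair. */
    dpdx[0] = dpdx[1] = p[1] - p[0];   /* 3.0 */
    dpdx[2] = dpdx[3] = p[3] - p[2];   /* 5.0 */

    /* Fine y derivatives: one value per vertically adjacent pair. */
    dpdy[0] = dpdy[2] = p[2] - p[0];   /* 1.0 */
    dpdy[1] = dpdy[3] = p[3] - p[1];   /* 3.0 */

    for (int i = 0; i < 4; ++i)
        printf("i=%d: DPdx=%.1f DPdy=%.1f\n", i, dpdx[i], dpdy[i]);
    return 0;
}
----
====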
Coarse derivative operations (code:OpDPdxCoarse and code:OpDPdyCoarse) calculate their results in roughly the same manner, but may: only calculate two values instead of four (one for each of [eq]#DPdx# and [eq]#DPdy#), reusing the same result no matter the originating invocation.
If an implementation does this, it should: use the fine derivative calculations described for [eq]#P~0~#.

[NOTE]
.Note
====
Derivative values are calculated between fragments rather than pixels.
If the fragment shader invocations involved in the calculation cover multiple pixels, these operations cover a wider area, resulting in larger derivative values.
This in turn will result in a coarser LOD being selected for image sampling operations using derivatives.

Applications may want to account for this when using multi-pixel fragments; if pixel derivatives are desired, applications should use explicit derivative operations and divide the results by the size of the fragment in each dimension as follows:

{empty}:: [eq]#DPdx(P~n~)' = DPdx(P~n~) / w#
{empty}:: [eq]#DPdy(P~n~)' = DPdy(P~n~) / h#

where [eq]#w# and [eq]#h# are the size of the fragments in the quad, and [eq]#DPdx(P~n~)'# and [eq]#DPdy(P~n~)'# are the pixel derivatives.
====

The results for code:OpDPdx and code:OpDPdy may: be calculated as either fine or coarse derivatives, with implementations favouring the most efficient approach.
Implementations must: choose coarse or fine consistently between the two.

Executing code:OpFwidthFine, code:OpFwidthCoarse, or code:OpFwidth is equivalent to executing the corresponding code:OpDPdx* and code:OpDPdy* instructions, taking the absolute value of the results, and summing them.

Executing an code:OpImage*Sample*ImplicitLod instruction is equivalent to executing code:OpDPdx(code:Coordinate) and code:OpDPdy(code:Coordinate), and passing the results as the code:Grad operands code:dx and code:dy.

[NOTE]
.Note
====
It is expected that using the code:ImplicitLod variants of sampling functions will be substantially more efficient than using the code:ExplicitLod variants with explicitly generated derivatives.
====

[[shaders-helper-invocations]]
== Helper Invocations

When performing <>
ifdef::VK_VERSION_1_1[]
or <>
endif::VK_VERSION_1_1[]
operations in a fragment shader, additional invocations may: be spawned in order to ensure correct results.
These additional invocations are known as _helper invocations_ and can: be identified by a non-zero value in the code:HelperInvocation built-in.
Stores and atomics performed by helper invocations must: not have any effect on memory except for the code:Function, code:Private and code:Output storage classes, and values returned by atomic instructions in helper invocations are undefined:.

[NOTE]
.Note
====
While stores to the code:Output storage class have an effect even in helper invocations, this does not mean that helper invocations have an effect on the framebuffer.
code:Output variables in fragment shaders can be read from as well, and they behave more like code:Private variables for the duration of the shader invocation.
====

For <> other than <>
ifdef::VK_VERSION_1_1[]
and <>
endif::VK_VERSION_1_1[]
operations, helper invocations may: be treated as inactive even if they would otherwise be considered active.

ifdef::VK_VERSION_1_3,VK_EXT_shader_demote_to_helper_invocation[]
Helper invocations may: become permanently inactive if all invocations in a quad scope instance become helper invocations.
endif::VK_VERSION_1_3,VK_EXT_shader_demote_to_helper_invocation[] ifdef::VK_NV_cooperative_matrix,VK_KHR_cooperative_matrix[] == Cooperative Matrices A _cooperative matrix_ type is a SPIR-V type where the storage for and computations performed on the matrix are spread across the invocations in a scope instance. These types give the implementation freedom in how to optimize matrix multiplies. SPIR-V defines the types and instructions, but does not specify rules about what sizes/combinations are valid, and it is expected that different implementations may: support different sizes. ifdef::VK_KHR_cooperative_matrix[] [open,refpage='vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR',desc='Returns properties describing what cooperative matrix types are supported',type='protos'] -- To enumerate the supported cooperative matrix types and operations, call: include::{generated}/api/protos/vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR.adoc[] * pname:physicalDevice is the physical device. * pname:pPropertyCount is a pointer to an integer related to the number of cooperative matrix properties available or queried. * pname:pProperties is either `NULL` or a pointer to an array of slink:VkCooperativeMatrixPropertiesKHR structures. If pname:pProperties is `NULL`, then the number of cooperative matrix properties available is returned in pname:pPropertyCount. Otherwise, pname:pPropertyCount must: point to a variable set by the user to the number of elements in the pname:pProperties array, and on return the variable is overwritten with the number of structures actually written to pname:pProperties. If pname:pPropertyCount is less than the number of cooperative matrix properties available, at most pname:pPropertyCount structures will be written, and ename:VK_INCOMPLETE will be returned instead of ename:VK_SUCCESS, to indicate that not all the available cooperative matrix properties were returned. include::{generated}/validity/protos/vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR.adoc[] -- endif::VK_KHR_cooperative_matrix[] ifdef::VK_NV_cooperative_matrix[] [open,refpage='vkGetPhysicalDeviceCooperativeMatrixPropertiesNV',desc='Returns properties describing what cooperative matrix types are supported',type='protos'] -- To enumerate the supported cooperative matrix types and operations, call: include::{generated}/api/protos/vkGetPhysicalDeviceCooperativeMatrixPropertiesNV.adoc[] * pname:physicalDevice is the physical device. * pname:pPropertyCount is a pointer to an integer related to the number of cooperative matrix properties available or queried. * pname:pProperties is either `NULL` or a pointer to an array of slink:VkCooperativeMatrixPropertiesNV structures. If pname:pProperties is `NULL`, then the number of cooperative matrix properties available is returned in pname:pPropertyCount. Otherwise, pname:pPropertyCount must: point to a variable set by the user to the number of elements in the pname:pProperties array, and on return the variable is overwritten with the number of structures actually written to pname:pProperties. If pname:pPropertyCount is less than the number of cooperative matrix properties available, at most pname:pPropertyCount structures will be written, and ename:VK_INCOMPLETE will be returned instead of ename:VK_SUCCESS, to indicate that not all the available cooperative matrix properties were returned. 
include::{generated}/validity/protos/vkGetPhysicalDeviceCooperativeMatrixPropertiesNV.adoc[] -- endif::VK_NV_cooperative_matrix[] Each ifdef::VK_KHR_cooperative_matrix[slink:VkCooperativeMatrixPropertiesKHR] ifdef::VK_KHR_cooperative_matrix+VK_NV_cooperative_matrix[or] ifdef::VK_NV_cooperative_matrix[slink:VkCooperativeMatrixPropertiesNV] structure describes a single supported combination of types for a matrix multiply/add operation ( ifdef::VK_KHR_cooperative_matrix[code:OpCooperativeMatrixMulAddKHR] ifdef::VK_KHR_cooperative_matrix+VK_NV_cooperative_matrix[or] ifdef::VK_NV_cooperative_matrix[code:OpCooperativeMatrixMulAddNV] ). The multiply can: be described in terms of the following variables and types (in SPIR-V pseudocode): ifdef::VK_KHR_cooperative_matrix[] [source,c] ---- %A is of type OpTypeCooperativeMatrixKHR %AType %scope %MSize %KSize %MatrixAKHR %B is of type OpTypeCooperativeMatrixKHR %BType %scope %KSize %NSize %MatrixBKHR %C is of type OpTypeCooperativeMatrixKHR %CType %scope %MSize %NSize %MatrixAccumulatorKHR %Result is of type OpTypeCooperativeMatrixKHR %ResultType %scope %MSize %NSize %MatrixAccumulatorKHR %Result = %A * %B + %C // using OpCooperativeMatrixMulAddKHR ---- endif::VK_KHR_cooperative_matrix[] ifdef::VK_NV_cooperative_matrix[] [source,c] ---- %A is of type OpTypeCooperativeMatrixNV %AType %scope %MSize %KSize %B is of type OpTypeCooperativeMatrixNV %BType %scope %KSize %NSize %C is of type OpTypeCooperativeMatrixNV %CType %scope %MSize %NSize %D is of type OpTypeCooperativeMatrixNV %DType %scope %MSize %NSize %D = %A * %B + %C // using OpCooperativeMatrixMulAddNV ---- endif::VK_NV_cooperative_matrix[] A matrix multiply with these dimensions is known as an _MxNxK_ matrix multiply. ifdef::VK_KHR_cooperative_matrix[] [open,refpage='VkCooperativeMatrixPropertiesKHR',desc='Structure specifying cooperative matrix properties',type='structs'] -- The sname:VkCooperativeMatrixPropertiesKHR structure is defined as: include::{generated}/api/structs/VkCooperativeMatrixPropertiesKHR.adoc[] * pname:sType is a elink:VkStructureType value identifying this structure. * pname:pNext is `NULL` or a pointer to a structure extending this structure. * pname:MSize is the number of rows in matrices code:A, code:C, and code:Result. * pname:KSize is the number of columns in matrix code:A and rows in matrix code:B. * pname:NSize is the number of columns in matrices code:B, code:C, code:Result. * pname:AType is the component type of matrix code:A, of type elink:VkComponentTypeKHR. * pname:BType is the component type of matrix code:B, of type elink:VkComponentTypeKHR. * pname:CType is the component type of matrix code:C, of type elink:VkComponentTypeKHR. * pname:ResultType is the component type of matrix code:Result, of type elink:VkComponentTypeKHR. * pname:saturatingAccumulation indicates whether the code:SaturatingAccumulation operand to code:OpCooperativeMatrixMulAddKHR must: be present. * pname:scope is the scope of all the matrix types, of type elink:VkScopeKHR. If some types are preferred over other types (e.g. for performance), they should: appear earlier in the list enumerated by flink:vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR. At least one entry in the list must: have power of two values for all of pname:MSize, pname:KSize, and pname:NSize. 
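[NOTE]
.Note
====
As a non-normative usage sketch, the following C code shows the two-call enumeration pattern described above and searches the returned list for a subgroup-scope combination with 16-bit floating-point inputs and 32-bit floating-point accumulation.
The function name, the selection criteria, and the omitted error handling and extension function loading are illustrative assumptions, not requirements of this extension.

[source,c]
----
#include <stdlib.h>
#include <vulkan/vulkan.h>

/* Returns VK_TRUE and writes the first matching combination to *out. */
VkBool32 find_fp16_fp32_combination(VkPhysicalDevice physicalDevice,
                                    VkCooperativeMatrixPropertiesKHR *out)
{
    uint32_t count = 0;

    /* First call: query how many combinations are supported. */
    vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR(physicalDevice, &count, NULL);

    VkCooperativeMatrixPropertiesKHR *props = calloc(count, sizeof(*props));
    for (uint32_t i = 0; i < count; ++i)
        props[i].sType = VK_STRUCTURE_TYPE_COOPERATIVE_MATRIX_PROPERTIES_KHR;

    /* Second call: retrieve the list of supported combinations. */
    vkGetPhysicalDeviceCooperativeMatrixPropertiesKHR(physicalDevice, &count, props);

    VkBool32 found = VK_FALSE;
    for (uint32_t i = 0; i < count && !found; ++i) {
        if (props[i].AType == VK_COMPONENT_TYPE_FLOAT16_KHR &&
            props[i].BType == VK_COMPONENT_TYPE_FLOAT16_KHR &&
            props[i].CType == VK_COMPONENT_TYPE_FLOAT32_KHR &&
            props[i].ResultType == VK_COMPONENT_TYPE_FLOAT32_KHR &&
            props[i].scope == VK_SCOPE_SUBGROUP_KHR) {
            *out = props[i];   /* earlier entries are preferred by the implementation */
            found = VK_TRUE;
        }
    }
    free(props);
    return found;
}
----
====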
include::{generated}/validity/structs/VkCooperativeMatrixPropertiesKHR.adoc[] -- endif::VK_KHR_cooperative_matrix[] ifdef::VK_NV_cooperative_matrix[] [open,refpage='VkCooperativeMatrixPropertiesNV',desc='Structure specifying cooperative matrix properties',type='structs'] -- The sname:VkCooperativeMatrixPropertiesNV structure is defined as: include::{generated}/api/structs/VkCooperativeMatrixPropertiesNV.adoc[] * pname:sType is a elink:VkStructureType value identifying this structure. * pname:pNext is `NULL` or a pointer to a structure extending this structure. * pname:MSize is the number of rows in matrices A, C, and D. * pname:KSize is the number of columns in matrix A and rows in matrix B. * pname:NSize is the number of columns in matrices B, C, D. * pname:AType is the component type of matrix A, of type elink:VkComponentTypeNV. * pname:BType is the component type of matrix B, of type elink:VkComponentTypeNV. * pname:CType is the component type of matrix C, of type elink:VkComponentTypeNV. * pname:DType is the component type of matrix D, of type elink:VkComponentTypeNV. * pname:scope is the scope of all the matrix types, of type elink:VkScopeNV. If some types are preferred over other types (e.g. for performance), they should: appear earlier in the list enumerated by flink:vkGetPhysicalDeviceCooperativeMatrixPropertiesNV. At least one entry in the list must: have power of two values for all of pname:MSize, pname:KSize, and pname:NSize. include::{generated}/validity/structs/VkCooperativeMatrixPropertiesNV.adoc[] -- endif::VK_NV_cooperative_matrix[] [open,refpage='VkScopeKHR',desc='Specify SPIR-V scope',type='enums'] -- Possible values for elink:VkScopeKHR include: include::{generated}/api/enums/VkScopeKHR.adoc[] ifdef::VK_NV_cooperative_matrix[] or the equivalent include::{generated}/api/enums/VkScopeNV.adoc[] endif::VK_NV_cooperative_matrix[] * ename:VK_SCOPE_DEVICE_KHR corresponds to SPIR-V code:Device scope. * ename:VK_SCOPE_WORKGROUP_KHR corresponds to SPIR-V code:Workgroup scope. * ename:VK_SCOPE_SUBGROUP_KHR corresponds to SPIR-V code:Subgroup scope. * ename:VK_SCOPE_QUEUE_FAMILY_KHR corresponds to SPIR-V code:QueueFamily scope. All enum values match the corresponding SPIR-V value. -- [open,refpage='VkComponentTypeKHR',desc='Specify SPIR-V cooperative matrix component type',type='enums'] -- Possible values for elink:VkComponentTypeKHR include: include::{generated}/api/enums/VkComponentTypeKHR.adoc[] ifdef::VK_NV_cooperative_matrix[] or the equivalent include::{generated}/api/enums/VkComponentTypeNV.adoc[] endif::VK_NV_cooperative_matrix[] * ename:VK_COMPONENT_TYPE_FLOAT16_KHR corresponds to SPIR-V code:OpTypeFloat 16. * ename:VK_COMPONENT_TYPE_FLOAT32_KHR corresponds to SPIR-V code:OpTypeFloat 32. * ename:VK_COMPONENT_TYPE_FLOAT64_KHR corresponds to SPIR-V code:OpTypeFloat 64. * ename:VK_COMPONENT_TYPE_SINT8_KHR corresponds to SPIR-V code:OpTypeInt 8 1. * ename:VK_COMPONENT_TYPE_SINT16_KHR corresponds to SPIR-V code:OpTypeInt 16 1. * ename:VK_COMPONENT_TYPE_SINT32_KHR corresponds to SPIR-V code:OpTypeInt 32 1. * ename:VK_COMPONENT_TYPE_SINT64_KHR corresponds to SPIR-V code:OpTypeInt 64 1. * ename:VK_COMPONENT_TYPE_UINT8_KHR corresponds to SPIR-V code:OpTypeInt 8 0. * ename:VK_COMPONENT_TYPE_UINT16_KHR corresponds to SPIR-V code:OpTypeInt 16 0. * ename:VK_COMPONENT_TYPE_UINT32_KHR corresponds to SPIR-V code:OpTypeInt 32 0. * ename:VK_COMPONENT_TYPE_UINT64_KHR corresponds to SPIR-V code:OpTypeInt 64 0. 
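[NOTE]
.Note
====
As a non-normative convenience sketch, the mapping from these component types to their sizes in bytes (derived from the SPIR-V bit widths listed above) can be expressed as follows; the helper name is illustrative only.

[source,c]
----
#include <vulkan/vulkan.h>

/* Size in bytes of one matrix component of the given type.
 * Returns 0 for types not listed here. */
static uint32_t component_type_size(VkComponentTypeKHR type)
{
    switch (type) {
    case VK_COMPONENT_TYPE_SINT8_KHR:
    case VK_COMPONENT_TYPE_UINT8_KHR:
        return 1;
    case VK_COMPONENT_TYPE_FLOAT16_KHR:
    case VK_COMPONENT_TYPE_SINT16_KHR:
    case VK_COMPONENT_TYPE_UINT16_KHR:
        return 2;
    case VK_COMPONENT_TYPE_FLOAT32_KHR:
    case VK_COMPONENT_TYPE_SINT32_KHR:
    case VK_COMPONENT_TYPE_UINT32_KHR:
        return 4;
    case VK_COMPONENT_TYPE_FLOAT64_KHR:
    case VK_COMPONENT_TYPE_SINT64_KHR:
    case VK_COMPONENT_TYPE_UINT64_KHR:
        return 8;
    default:
        return 0;
    }
}
----
====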
-- endif::VK_NV_cooperative_matrix,VK_KHR_cooperative_matrix[] ifdef::VK_EXT_validation_cache[] [[shaders-validation-cache]] == Validation Cache [open,refpage='VkValidationCacheEXT',desc='Opaque handle to a validation cache object',type='handles'] -- Validation cache objects allow the result of internal validation to be reused, both within a single application run and between multiple runs. Reuse within a single run is achieved by passing the same validation cache object when creating supported Vulkan objects. Reuse across runs of an application is achieved by retrieving validation cache contents in one run of an application, saving the contents, and using them to preinitialize a validation cache on a subsequent run. The contents of the validation cache objects are managed by the validation layers. Applications can: manage the host memory consumed by a validation cache object and control the amount of data retrieved from a validation cache object. Validation cache objects are represented by sname:VkValidationCacheEXT handles: include::{generated}/api/handles/VkValidationCacheEXT.adoc[] -- [open,refpage='vkCreateValidationCacheEXT',desc='Creates a new validation cache',type='protos'] -- To create validation cache objects, call: include::{generated}/api/protos/vkCreateValidationCacheEXT.adoc[] * pname:device is the logical device that creates the validation cache object. * pname:pCreateInfo is a pointer to a slink:VkValidationCacheCreateInfoEXT structure containing the initial parameters for the validation cache object. * pname:pAllocator controls host memory allocation as described in the <> chapter. * pname:pValidationCache is a pointer to a slink:VkValidationCacheEXT handle in which the resulting validation cache object is returned. [NOTE] .Note ==== Applications can: track and manage the total host memory size of a validation cache object using the pname:pAllocator. Applications can: limit the amount of data retrieved from a validation cache object in fname:vkGetValidationCacheDataEXT. Implementations should: not internally limit the total number of entries added to a validation cache object or the total host memory consumed. ==== Once created, a validation cache can: be passed to the fname:vkCreateShaderModule command by adding this object to the slink:VkShaderModuleCreateInfo structure's pname:pNext chain. If a slink:VkShaderModuleValidationCacheCreateInfoEXT object is included in the slink:VkShaderModuleCreateInfo::pname:pNext chain, and its pname:validationCache field is not dlink:VK_NULL_HANDLE, the implementation will query it for possible reuse opportunities and update it with new content. The use of the validation cache object in these commands is internally synchronized, and the same validation cache object can: be used in multiple threads simultaneously. [NOTE] .Note ==== Implementations should: make every effort to limit any critical sections to the actual accesses to the cache, which is expected to be significantly shorter than the duration of the fname:vkCreateShaderModule command. ==== include::{generated}/validity/protos/vkCreateValidationCacheEXT.adoc[] -- [open,refpage='VkValidationCacheCreateInfoEXT',desc='Structure specifying parameters of a newly created validation cache',type='structs'] -- The sname:VkValidationCacheCreateInfoEXT structure is defined as: include::{generated}/api/structs/VkValidationCacheCreateInfoEXT.adoc[] * pname:sType is a elink:VkStructureType value identifying this structure. 
* pname:pNext is `NULL` or a pointer to a structure extending this structure. * pname:flags is reserved for future use. * pname:initialDataSize is the number of bytes in pname:pInitialData. If pname:initialDataSize is zero, the validation cache will initially be empty. * pname:pInitialData is a pointer to previously retrieved validation cache data. If the validation cache data is incompatible (as defined below) with the device, the validation cache will be initially empty. If pname:initialDataSize is zero, pname:pInitialData is ignored. .Valid Usage **** * [[VUID-VkValidationCacheCreateInfoEXT-initialDataSize-01534]] If pname:initialDataSize is not `0`, it must: be equal to the size of pname:pInitialData, as returned by fname:vkGetValidationCacheDataEXT when pname:pInitialData was originally retrieved * [[VUID-VkValidationCacheCreateInfoEXT-initialDataSize-01535]] If pname:initialDataSize is not `0`, pname:pInitialData must: have been retrieved from a previous call to fname:vkGetValidationCacheDataEXT **** include::{generated}/validity/structs/VkValidationCacheCreateInfoEXT.adoc[] -- [open,refpage='VkValidationCacheCreateFlagsEXT',desc='Reserved for future use',type='flags'] -- include::{generated}/api/flags/VkValidationCacheCreateFlagsEXT.adoc[] tname:VkValidationCacheCreateFlagsEXT is a bitmask type for setting a mask, but is currently reserved for future use. -- [open,refpage='vkMergeValidationCachesEXT',desc='Combine the data stores of validation caches',type='protos'] -- Validation cache objects can: be merged using the command: include::{generated}/api/protos/vkMergeValidationCachesEXT.adoc[] * pname:device is the logical device that owns the validation cache objects. * pname:dstCache is the handle of the validation cache to merge results into. * pname:srcCacheCount is the length of the pname:pSrcCaches array. * pname:pSrcCaches is a pointer to an array of validation cache handles, which will be merged into pname:dstCache. The previous contents of pname:dstCache are included after the merge. [NOTE] .Note ==== The details of the merge operation are implementation-dependent, but implementations should: merge the contents of the specified validation caches and prune duplicate entries. ==== .Valid Usage **** * [[VUID-vkMergeValidationCachesEXT-dstCache-01536]] pname:dstCache must: not appear in the list of source caches **** include::{generated}/validity/protos/vkMergeValidationCachesEXT.adoc[] -- [open,refpage='vkGetValidationCacheDataEXT',desc='Get the data store from a validation cache',type='protos'] -- Data can: be retrieved from a validation cache object using the command: include::{generated}/api/protos/vkGetValidationCacheDataEXT.adoc[] * pname:device is the logical device that owns the validation cache. * pname:validationCache is the validation cache to retrieve data from. * pname:pDataSize is a pointer to a value related to the amount of data in the validation cache, as described below. * pname:pData is either `NULL` or a pointer to a buffer. If pname:pData is `NULL`, then the maximum size of the data that can: be retrieved from the validation cache, in bytes, is returned in pname:pDataSize. Otherwise, pname:pDataSize must: point to a variable set by the user to the size of the buffer, in bytes, pointed to by pname:pData, and on return the variable is overwritten with the amount of data actually written to pname:pData. 
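[NOTE]
.Note
====
As a non-normative usage sketch, the following C code shows the two-call pattern described above being used to retrieve the full cache contents, for example so that an application can write them to a file and provide them as pname:pInitialData in a later run.
The function name and the omitted error handling, file I/O, and extension function loading are illustrative assumptions.

[source,c]
----
#include <stdlib.h>
#include <vulkan/vulkan.h>

/* Returns a malloc()ed copy of the validation cache data and stores its
 * size in *size_out. The caller is responsible for freeing the result. */
void *get_validation_cache_blob(VkDevice device, VkValidationCacheEXT cache,
                                size_t *size_out)
{
    size_t size = 0;

    /* First call: query the maximum size of the retrievable data. */
    vkGetValidationCacheDataEXT(device, cache, &size, NULL);

    void *data = malloc(size);

    /* Second call: retrieve the data itself. */
    vkGetValidationCacheDataEXT(device, cache, &size, data);

    *size_out = size;   /* number of bytes actually written */
    return data;
}
----
====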
If pname:pDataSize is less than the maximum size that can: be retrieved by the validation cache, at most pname:pDataSize bytes will be written to pname:pData, and fname:vkGetValidationCacheDataEXT will return ename:VK_INCOMPLETE instead of ename:VK_SUCCESS, to indicate that not all of the validation cache was returned. Any data written to pname:pData is valid and can: be provided as the pname:pInitialData member of the slink:VkValidationCacheCreateInfoEXT structure passed to fname:vkCreateValidationCacheEXT. Two calls to fname:vkGetValidationCacheDataEXT with the same parameters must: retrieve the same data unless a command that modifies the contents of the cache is called between them. [[validation-cache-header]] Applications can: store the data retrieved from the validation cache, and use these data, possibly in a future run of the application, to populate new validation cache objects. The results of validation, however, may: depend on the vendor ID, device ID, driver version, and other details of the device. To enable applications to detect when previously retrieved data is incompatible with the device, the initial bytes written to pname:pData must: be a header consisting of the following members: .Layout for validation cache header version ename:VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT [width="85%",cols="8%,21%,71%",options="header"] |==== | Offset | Size | Meaning | 0 | 4 | length in bytes of the entire validation cache header written as a stream of bytes, with the least significant byte first | 4 | 4 | a elink:VkValidationCacheHeaderVersionEXT value written as a stream of bytes, with the least significant byte first | 8 | ename:VK_UUID_SIZE | a layer commit ID expressed as a UUID, which uniquely identifies the version of the validation layers used to generate these validation results |==== The first four bytes encode the length of the entire validation cache header, in bytes. This value includes all fields in the header including the validation cache version field and the size of the length field. The next four bytes encode the validation cache version, as described for elink:VkValidationCacheHeaderVersionEXT. A consumer of the validation cache should: use the cache version to interpret the remainder of the cache header. If pname:pDataSize is less than what is necessary to store this header, nothing will be written to pname:pData and zero will be written to pname:pDataSize. include::{generated}/validity/protos/vkGetValidationCacheDataEXT.adoc[] -- [open,refpage='VkValidationCacheHeaderVersionEXT',desc='Encode validation cache version',type='enums',xrefs='vkCreateValidationCacheEXT vkGetValidationCacheDataEXT'] -- Possible values of the second group of four bytes in the header returned by flink:vkGetValidationCacheDataEXT, encoding the validation cache version, are: include::{generated}/api/enums/VkValidationCacheHeaderVersionEXT.adoc[] * ename:VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT specifies version one of the validation cache. -- [open,refpage='vkDestroyValidationCacheEXT',desc='Destroy a validation cache object',type='protos'] -- To destroy a validation cache, call: include::{generated}/api/protos/vkDestroyValidationCacheEXT.adoc[] * pname:device is the logical device that destroys the validation cache object. * pname:validationCache is the handle of the validation cache to destroy. * pname:pAllocator controls host memory allocation as described in the <> chapter. 
.Valid Usage **** * [[VUID-vkDestroyValidationCacheEXT-validationCache-01537]] If sname:VkAllocationCallbacks were provided when pname:validationCache was created, a compatible set of callbacks must: be provided here * [[VUID-vkDestroyValidationCacheEXT-validationCache-01538]] If no sname:VkAllocationCallbacks were provided when pname:validationCache was created, pname:pAllocator must: be `NULL` **** include::{generated}/validity/protos/vkDestroyValidationCacheEXT.adoc[] -- endif::VK_EXT_validation_cache[] ifdef::VK_NV_cuda_kernel_launch[] include::{chapters}/VK_NV_cuda_kernel_launch/module.adoc[] endif::VK_NV_cuda_kernel_launch[]