// Copyright (c) 2014-2020 Khronos Group.
//
// SPDX-License-Identifier: CC-BY-4.0


[[fault-handling]]
== Fault Handling

The fault handling mechanism provides a method for the implementation to
pass fault information to the application.
A fault indicates that an issue has occurred with the host or device that
could impact the implementation's ability to function correctly.
It consists of a slink:VkFaultData structure that is used to communicate
information about the fault between the implementation and the application,
with two methods to obtain the data.
The application can: obtain the fault data from the implementation using
flink:vkGetFaultData.
Alternatively, the implementation can: directly call a pre-registered fault
handler function (tlink:PFN_vkFaultCallbackFunction) in the application when
a fault occurs.

The sname:VkFaultData structure provides categories the implementation must:
set to provide basic information on a fault.
These allow the implementation to provide a coarse classification of a fault
to the application.
As the potential faults that could occur will vary between different
platforms, it is expected that an implementation would also provide
additional implementation-specific data on the fault, enabling the
application to take appropriate action.

The implementation must: also define whether a particular fault results in
the fault callback function being called, is communicated via
flink:vkGetFaultData, or both.
This will be decided by several factors including:

  * the severity of the fault,
  * the application's ability to handle the fault, and
  * how the application should handle the fault.

The implementation must: document the implementation-specific fault data,
how the faults are communicated, and expected responses from the application
for each of the faults that it can: report.

[[fault-data]]
=== Fault Data

[open,refpage='VkFaultData',desc='structure describing fault data',type='structs']
--
The information on a single fault is returned using the sname:VkFaultData
structure.
The sname:VkFaultData structure is defined as:

include::{generated}/api/structs/VkFaultData.adoc[]

  * pname:sType is a elink:VkStructureType value identifying this structure.
  * pname:pNext is `NULL` or a pointer to a structure extending this
    structure that provides implementation-specific data on the fault.
  * pname:faultLevel is a elink:VkFaultLevel that provides the severity of
    the fault.
  * pname:faultType is a elink:VkFaultType that provides the type of the
    fault.

To retrieve implementation-specific fault data, pname:pNext can: point to
one or more implementation-defined fault structures or `NULL` to not
retrieve implementation-specific data.

.Valid Usage
****
  * [[VUID-VkFaultData-pNext-05019]]
    pname:pNext must: be `NULL` or a valid pointer to an
    implementation-specific structure
****

include::{generated}/validity/structs/VkFaultData.adoc[]

--

[open,refpage='VkFaultLevel',desc='The different fault severity levels that can be returned',type='enums']
--
Possible values of slink:VkFaultData::pname:faultLevel, specifying the fault
severity, are:

include::{generated}/api/enums/VkFaultLevel.adoc[]

  * ename:VK_FAULT_LEVEL_UNASSIGNED A fault level has not been assigned.
  * ename:VK_FAULT_LEVEL_CRITICAL A fault that cannot: be recovered by the
    application.
  * ename:VK_FAULT_LEVEL_RECOVERABLE A fault that can: be recovered by the
    application.
  * ename:VK_FAULT_LEVEL_WARNING A fault that indicates a non-optimal
    condition has occurred, but no recovery is necessary at this point.

--

[open,refpage='VkFaultType',desc='The different fault types that can be returned',type='enums']
--

Possible values of slink:VkFaultData::pname:faultType, specifying the fault
type, are:

include::{generated}/api/enums/VkFaultType.adoc[]

  * ename:VK_FAULT_TYPE_INVALID The fault data does not contain a valid
    fault.
  * ename:VK_FAULT_TYPE_UNASSIGNED A fault type has not been assigned.
  * ename:VK_FAULT_TYPE_IMPLEMENTATION Implementation-defined fault.
  * ename:VK_FAULT_TYPE_SYSTEM A fault occurred in the system components.
  * ename:VK_FAULT_TYPE_PHYSICAL_DEVICE A fault occurred with the physical
    device.
  * ename:VK_FAULT_TYPE_COMMAND_BUFFER_FULL Command buffer memory was
    exhausted before flink:vkEndCommandBuffer was called.
  * ename:VK_FAULT_TYPE_INVALID_API_USAGE Invalid usage of the API was
    detected by the implementation.
--
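[NOTE]
.Note
====
As an illustration only, the severity levels and fault types listed above
could drive a first-pass dispatch in the application.
The sketch below is not part of the API: the `Demo`-prefixed enumerations
are local stand-ins that mirror the enumerant names so that the example
compiles without the Vulkan SC headers, and `DemoAction`,
`demo_choose_action`, and the policy itself are hypothetical.
The responses an application is actually expected to make are those
documented by its implementation.

```c
/* Local stand-ins mirroring the enumerants described above (assumption:
   a real application would use the types from the Vulkan SC headers). */
typedef enum DemoFaultLevel {
    DEMO_FAULT_LEVEL_UNASSIGNED,
    DEMO_FAULT_LEVEL_CRITICAL,
    DEMO_FAULT_LEVEL_RECOVERABLE,
    DEMO_FAULT_LEVEL_WARNING
} DemoFaultLevel;

typedef enum DemoFaultType {
    DEMO_FAULT_TYPE_INVALID,
    DEMO_FAULT_TYPE_UNASSIGNED,
    DEMO_FAULT_TYPE_IMPLEMENTATION,
    DEMO_FAULT_TYPE_SYSTEM,
    DEMO_FAULT_TYPE_PHYSICAL_DEVICE,
    DEMO_FAULT_TYPE_COMMAND_BUFFER_FULL,
    DEMO_FAULT_TYPE_INVALID_API_USAGE
} DemoFaultType;

/* Hypothetical application-level responses. */
typedef enum DemoAction {
    DEMO_ACTION_IGNORE,
    DEMO_ACTION_LOG,
    DEMO_ACTION_RECOVER,
    DEMO_ACTION_SAFE_STATE
} DemoAction;

/* Map the coarse classification of one fault to a response. */
DemoAction demo_choose_action(DemoFaultLevel level, DemoFaultType type)
{
    /* An entry of type INVALID does not describe a fault at all. */
    if (type == DEMO_FAULT_TYPE_INVALID) {
        return DEMO_ACTION_IGNORE;
    }

    switch (level) {
    case DEMO_FAULT_LEVEL_WARNING:
        return DEMO_ACTION_LOG;        /* no recovery needed yet */
    case DEMO_FAULT_LEVEL_RECOVERABLE:
        return DEMO_ACTION_RECOVER;    /* application can recover */
    case DEMO_FAULT_LEVEL_CRITICAL:
    case DEMO_FAULT_LEVEL_UNASSIGNED:
    default:
        return DEMO_ACTION_SAFE_STATE; /* assume the worst case */
    }
}
```

Treating an unassigned level like a critical one is a conservative choice
made by this sketch, not a requirement.
====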


[[querrying-fault]]
=== Querying Fault Status

[open,refpage='vkGetFaultData',desc='Query fault information',type='protos']
--
:refpage: vkGetFaultData

To query the number of current faults and obtain the fault data, call
flink:vkGetFaultData.

include::{generated}/api/protos/vkGetFaultData.adoc[]

  * pname:device is the logical device to obtain faults from.
  * pname:faultQueryBehavior is a elink:VkFaultQueryBehavior that specifies
    the types of faults to obtain from the implementation, and how those
    faults should be handled.
  * pname:pUnrecordedFaults is a return boolean that specifies if the logged
    fault information is incomplete and does not contain entries for all
    faults that have been detected by the implementation and may: be
    reported via flink:vkGetFaultData.
  * pname:pFaultCount is a pointer to an integer that specifies the number
    of fault entries.
  * pname:pFaults is either `NULL` or a pointer to an array of
    pname:pFaultCount slink:VkFaultData structures to be updated with the
    recorded fault data.

Access to fault data is internally synchronized, meaning
flink:vkGetFaultData can: be called from multiple threads simultaneously.

The implementation must: not record more than <<limits-maxQueryFaultCount,
pname:maxQueryFaultCount>> faults to be reported by flink:vkGetFaultData.

pname:pUnrecordedFaults is set to ename:VK_TRUE if the implementation has
detected one or more faults since the last successful retrieval of fault
data using this command, but was unable to record fault information for all
faults.
Otherwise, pname:pUnrecordedFaults is set to ename:VK_FALSE.

If pname:pFaults is `NULL`, then the number of faults with the specified
pname:faultQueryBehavior characteristics associated with pname:device is
returned in pname:pFaultCount, and pname:pUnrecordedFaults is set as
indicated above.
Otherwise, pname:pFaultCount must: point to a variable set by the user to
the number of elements in the pname:pFaults array, and on return the
variable is overwritten with the number of faults actually written to
pname:pFaults.
If pname:pFaultCount is less than the number of recorded pname:device faults
with the specified pname:faultQueryBehavior characteristics, at most
pname:pFaultCount faults will be written, and ename:VK_INCOMPLETE will be
returned instead of ename:VK_SUCCESS, to indicate that not all the available
faults were returned.

On success, the fault information stored by the implementation for the
faults that were returned will be handled as specified by
pname:faultQueryBehavior.

For each filled pname:pFaults entry, if pname:pNext is not `NULL`, the
implementation will fill in any implementation-specific structures
applicable to that fault that are included in the pname:pNext chain.

[NOTE]
.Note
====
In order to simplify the application logic, an application could have a
static allocation sized to <<limits-maxQueryFaultCount,
pname:maxQueryFaultCount>> which it passes in to each call of
flink:vkGetFaultData.
This allows an application to obtain all the faults available at this time
in a single call to flink:vkGetFaultData.
Furthermore, under this usage pattern, the command will never return
ename:VK_INCOMPLETE.
====

include::{chapters}/commonvalidity/no_dynamic_allocations_common.adoc[]

.Valid Usage
****
  * [[VUID-vkGetFaultData-pFaultCount-05020]]
    pname:pFaultCount must: be less than or equal to
    <<limits-maxQueryFaultCount,pname:maxQueryFaultCount>>
****

include::{generated}/validity/protos/vkGetFaultData.adoc[]
--
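[NOTE]
.Note
====
The static-allocation pattern described for flink:vkGetFaultData can be
sketched as follows.
To keep the example compilable without the Vulkan SC headers, everything in
it is a stand-in: `demo_get_fault_data` is a fake that imitates the
count, truncation, and get-and-clear behavior described above,
`demo_raise_fault` is a fake fault source, and `DEMO_MAX_QUERY_FAULT_COUNT`
is a hypothetical constant that assumes the value of
pname:maxQueryFaultCount is known at build time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_QUERY_FAULT_COUNT 4u /* hypothetical limit value */

typedef enum DemoResult { DEMO_SUCCESS, DEMO_INCOMPLETE } DemoResult;
typedef struct DemoFaultData { int faultLevel; int faultType; } DemoFaultData;

/* Fake implementation-side storage, present only so the sketch runs. */
static DemoFaultData g_recorded[DEMO_MAX_QUERY_FAULT_COUNT];
static uint32_t g_recordedCount;
static int g_unrecorded;

/* Fake fault source: record the fault, or flag it as unrecorded if full. */
void demo_raise_fault(int level, int type)
{
    if (g_recordedCount < DEMO_MAX_QUERY_FAULT_COUNT) {
        g_recorded[g_recordedCount].faultLevel = level;
        g_recorded[g_recordedCount].faultType = type;
        g_recordedCount++;
    } else {
        g_unrecorded = 1;
    }
}

/* Fake query imitating the behavior described for the real command. */
DemoResult demo_get_fault_data(int *pUnrecordedFaults, uint32_t *pFaultCount,
                               DemoFaultData *pFaults)
{
    uint32_t n;

    *pUnrecordedFaults = g_unrecorded;
    if (pFaults == NULL) {
        *pFaultCount = g_recordedCount; /* count-only query */
        return DEMO_SUCCESS;
    }
    n = (*pFaultCount < g_recordedCount) ? *pFaultCount : g_recordedCount;
    memcpy(pFaults, g_recorded, n * sizeof *pFaults);
    /* Returned faults are cleared; any remainder moves to the front. */
    memmove(g_recorded, g_recorded + n,
            (g_recordedCount - n) * sizeof *g_recorded);
    g_recordedCount -= n;
    g_unrecorded = 0;
    *pFaultCount = n;
    return (g_recordedCount > 0) ? DEMO_INCOMPLETE : DEMO_SUCCESS;
}

/* Application side: one static array sized to the limit, reused per poll. */
static DemoFaultData g_faults[DEMO_MAX_QUERY_FAULT_COUNT];

uint32_t demo_poll_faults(int *pUnrecordedFaults)
{
    uint32_t count = DEMO_MAX_QUERY_FAULT_COUNT; /* in: capacity */
    DemoResult result =
        demo_get_fault_data(pUnrecordedFaults, &count, g_faults);

    (void)result; /* cannot be DEMO_INCOMPLETE with a full-size array */
    return count; /* out: number of entries written to g_faults */
}
```

Because the array always has room for every recorded fault, a single call
drains the fault storage and the count-only query is unnecessary.
====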


[open,refpage='VkFaultQueryBehavior',desc='Controls how the faults are retrieved by vkGetFaultData',type='enums']
--
Possible values that can: be set in elink:VkFaultQueryBehavior, specifying
which faults to return, are:

include::{generated}/api/enums/VkFaultQueryBehavior.adoc[]

  * ename:VK_FAULT_QUERY_BEHAVIOR_GET_AND_CLEAR_ALL_FAULTS All fault types
    and severities are reported and are cleared from the internal fault
    storage after retrieval.

--

[[fault-callback]]
=== Fault Callback

The slink:VkFaultCallbackInfo structure allows an application to register a
function at device creation that the implementation can call to report
faults when they occur.
A callback function is registered by attaching a valid
sname:VkFaultCallbackInfo structure to the pname:pNext chain of the
slink:VkDeviceCreateInfo structure.
The callback function is only called by the implementation during a call to
the API, using the same thread that is making the API call.
The sname:VkFaultCallbackInfo structure provides the function pointer to be
called by the implementation, and optionally, application memory to store
fault data.

[open,refpage='VkFaultCallbackInfo',desc='Fault call back information',type='structs']
--

The sname:VkFaultCallbackInfo structure is defined as:

include::{generated}/api/structs/VkFaultCallbackInfo.adoc[]

  * pname:sType is a elink:VkStructureType value identifying this structure.
  * pname:pNext is `NULL` or a pointer to a structure extending this
    structure.
  * pname:faultCount is the number of reported faults in the array pointed
    to by pname:pFaults.
  * pname:pFaults is either `NULL` or a pointer to an array of
    pname:faultCount slink:VkFaultData structures.
  * pname:pfnFaultCallback is a function pointer to the fault handler
    function that will be called by the implementation when a fault occurs.

If provided, the implementation may: make use of the pname:pFaults array to
return fault data to the application when using the fault callback.

[NOTE]
.Note
====
Prior to Vulkan SC 1.0.11, the application was required to provide the
pname:pFaults array for fault callback data.
This proved to be unwieldy for both applications and implementations and it
was made optional as of version 1.0.11.
It is expected that most implementations will ignore this and use stack or
other preallocated memory for fault callback parameters.
====

If provided, the application memory referenced by pname:pFaults must: remain
accessible throughout the lifetime of the logical device that was created
with this structure.

[NOTE]
.Note
====
The memory pointed to by pname:pFaults will be updated by the implementation
and should not be used or accessed by the application outside of the fault
handling function pointed to by pname:pfnFaultCallback.
This restriction also applies to any implementation-specific structure
chained to an element of pname:pFaults by pname:pNext.

It is expected that implementations will maintain separate storage for fault
information and populate the array pointed to by pname:pFaults ahead of
calling the fault callback function.
====

.Valid Usage
****
  * [[VUID-VkFaultCallbackInfo-faultCount-05138]]
    pname:faultCount must: either be 0, or equal to
    <<limits-maxCallbackFaultCount,
    sname:VkPhysicalDeviceVulkanSC10Properties::pname:maxCallbackFaultCount>>
****

include::{generated}/validity/structs/VkFaultCallbackInfo.adoc[]
--
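[NOTE]
.Note
====
Registration through the pname:pNext chain, as described above, can be
sketched as follows.
The `Demo`-prefixed types are simplified local stand-ins so that the
example compiles without the Vulkan SC headers; they are not the API
definitions, and the numeric structure type values, `demo_on_fault`, and
`demo_prepare_device_create_info` are hypothetical.
The sketch takes the option of not providing a fault array, so
`faultCount` is 0 and `pFaults` is `NULL`.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins (assumption: real code uses the Vulkan SC types). */
typedef enum DemoStructureType {
    DEMO_STRUCTURE_TYPE_DEVICE_CREATE_INFO = 1,  /* hypothetical value */
    DEMO_STRUCTURE_TYPE_FAULT_CALLBACK_INFO = 2  /* hypothetical value */
} DemoStructureType;

typedef struct DemoFaultData { int faultLevel; int faultType; } DemoFaultData;

typedef void (*DemoFaultCallback)(uint32_t unrecordedFaults,
                                  uint32_t faultCount,
                                  const DemoFaultData *pFaults);

typedef struct DemoFaultCallbackInfo {
    DemoStructureType sType;
    const void *pNext;
    uint32_t faultCount;
    DemoFaultData *pFaults;
    DemoFaultCallback pfnFaultCallback;
} DemoFaultCallbackInfo;

typedef struct DemoDeviceCreateInfo {
    DemoStructureType sType;
    const void *pNext;
    /* remaining device creation members omitted from this sketch */
} DemoDeviceCreateInfo;

/* Placeholder handler; a real one would record or react to the faults. */
static void demo_on_fault(uint32_t unrecordedFaults, uint32_t faultCount,
                          const DemoFaultData *pFaults)
{
    (void)unrecordedFaults;
    (void)faultCount;
    (void)pFaults;
}

/* Fill in both caller-owned structures and link them together. */
void demo_prepare_device_create_info(DemoDeviceCreateInfo *pCreateInfo,
                                     DemoFaultCallbackInfo *pCallbackInfo)
{
    pCallbackInfo->sType = DEMO_STRUCTURE_TYPE_FAULT_CALLBACK_INFO;
    pCallbackInfo->pNext = NULL;
    pCallbackInfo->faultCount = 0;   /* no application-provided array */
    pCallbackInfo->pFaults = NULL;
    pCallbackInfo->pfnFaultCallback = demo_on_fault;

    pCreateInfo->sType = DEMO_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
    pCreateInfo->pNext = pCallbackInfo; /* registers the callback */
}
```

An application that does provide a fault array would instead set
`faultCount` to the pname:maxCallbackFaultCount limit and keep the array
accessible for the lifetime of the logical device.
====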

[open,refpage='PFN_vkFaultCallbackFunction',desc='Fault Callback Function',type='funcpointers']
--

The function pointer tlink:PFN_vkFaultCallbackFunction is defined as:

include::{generated}/api/funcpointers/PFN_vkFaultCallbackFunction.adoc[]

  * pname:unrecordedFaults is a boolean that specifies if the supplied fault
    information is incomplete and does not contain entries for all faults
    that have been detected by the implementation and may: be reported via
    tlink:PFN_vkFaultCallbackFunction since the last call to this callback.
  * pname:faultCount will contain the number of reported faults in the array
    pointed to by pname:pFaults.
  * pname:pFaults will point to an array of pname:faultCount
    slink:VkFaultData structures containing the fault information.

An implementation must: only make calls to pname:pfnFaultCallback during the
execution of an API command.
An implementation must: only make calls into the application-provided fault
callback from the same thread that called the API command.
The implementation should: not synchronize calls to the callback.
If synchronization is needed, the callback must: provide it.

The fault callback must: not call any Vulkan commands.

It is implementation-dependent whether faults reported by this callback are
also reported via flink:vkGetFaultData, but each unique fault will be
reported by at most one callback.
--
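[NOTE]
.Note
====
Since the callback may run on any thread that is executing an API command,
is not synchronized by the implementation, and is not allowed to call
Vulkan commands, one possible shape for it is to record what happened and
return.
The sketch below does this with C11 atomic counters from `<stdatomic.h>`.
It is one option among many rather than a recommended design:
`DemoFaultData` is a simplified stand-in for slink:VkFaultData, the level
values 0 to 3 are assumed to index the four severity levels, and all
`demo_` names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Simplified stand-in for the fault structure (assumption). */
typedef struct DemoFaultData { int faultLevel; int faultType; } DemoFaultData;

enum { DEMO_LEVEL_COUNT = 4 }; /* assumed number of severity levels */

/* Shared state; static storage starts zeroed. */
static atomic_uint g_faultsByLevel[DEMO_LEVEL_COUNT];
static atomic_uint g_unrecordedSeen;

/* Callback body: only counts faults, and calls no API commands. */
void demo_fault_callback(uint32_t unrecordedFaults, uint32_t faultCount,
                         const DemoFaultData *pFaults)
{
    uint32_t i;

    if (unrecordedFaults) {
        atomic_store(&g_unrecordedSeen, 1u);
    }
    for (i = 0; i < faultCount; i++) {
        int level = pFaults[i].faultLevel;
        if (level >= 0 && level < DEMO_LEVEL_COUNT) {
            atomic_fetch_add(&g_faultsByLevel[level], 1u);
        }
    }
}

/* Called later from application code: read and reset one counter. */
unsigned demo_take_fault_count(int level)
{
    if (level < 0 || level >= DEMO_LEVEL_COUNT) {
        return 0u;
    }
    return atomic_exchange(&g_faultsByLevel[level], 0u);
}

/* Called later from application code: read and reset the flag. */
unsigned demo_take_unrecorded_flag(void)
{
    return atomic_exchange(&g_unrecordedSeen, 0u);
}
```

The counters lose the per-fault details; an application that needs them
would copy the entries into its own preallocated storage inside the
callback, with whatever synchronization that storage requires.
====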

ifdef::hidden[]
// tag::scaddition[]
  * slink:VkFaultData <<SCID-6>>
  * slink:VkFaultCallbackInfo <<SCID-6>>
  * elink:VkFaultLevel <<SCID-6>>
  * elink:VkFaultType <<SCID-6>>
  * elink:VkFaultQueryBehavior <<SCID-6>>
  * tlink:PFN_vkFaultCallbackFunction <<SCID-6>>
  * flink:vkGetFaultData <<SCID-6>>
// end::scaddition[]
endif::hidden[]