# 'gpu' Dialect

Note: this dialect is more likely to change than others in the near future; use
with caution.

This dialect provides middle-level abstractions for launching GPU kernels
following a programming model similar to that of CUDA or OpenCL. It provides
abstractions for kernel invocations (and may eventually provide those for device
management) that are not present at the lower level (e.g., as LLVM IR intrinsics
for GPUs). Its goal is to abstract away the device- and driver-specific
manipulations needed to launch a GPU kernel and to provide a simple path towards
GPU execution from MLIR. It may be targeted, for example, by DSLs using MLIR.
The dialect uses `gpu` as its canonical prefix.

## Memory attribution

Memory buffers are defined at the function level, either in `gpu.launch` or in
`gpu.func` ops. This encoding makes it clear where the memory belongs and makes
the lifetime of the memory visible. The memory is accessible only while the
kernel is launched or the function is being invoked. The latter restriction is
stricter than actual GPU implementations require, but using static memory at the
function level is just a convenience. It is also always possible to pass
pointers to the workgroup memory into other functions, provided they expect the
correct memory space.

The buffers are considered live throughout the execution of the GPU function
body. The absence of memory attribution syntax means that the function does not
require special buffers. Rationale: although the underlying models declare
memory buffers at the module level, we chose to do it at the function level to
provide some structuring for the lifetime of those buffers. This avoids the
incentive to use the buffers for communicating between different kernels or
launches of the same kernel, which should be done through function arguments
instead. We chose not to use an `alloca`-style approach, which would require
more complex lifetime analysis, in keeping with the MLIR principles of
promoting structure and representing analysis results in the IR.

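As a rough illustration of the attribution syntax described above, the sketch
below declares one workgroup buffer and one private buffer on a `gpu.func`. The
kernel name `@fill`, the value names, the buffer shapes, and the numeric memory
spaces (3 for workgroup, 5 for private) are assumptions made for the example.
The exact syntax may differ between MLIR versions, so check it against the
operation reference.

```mlir
gpu.module @kernels {
  // Hypothetical kernel: all names and buffer sizes are illustrative only.
  gpu.func @fill(%value: f32)
      // Buffer shared by all work items in a workgroup (assumed memory space 3).
      workgroup(%tile: memref<32xf32, 3>)
      // Buffer local to a single work item (assumed memory space 5).
      private(%scratch: memref<1xf32, 5>)
      kernel {
    // %tile and %scratch are considered live for the whole function body.
    gpu.return
  }
}
```

A function that omits both the `workgroup` and `private` clauses simply
requires no special buffers.
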
## Operations

[include "Dialects/GPUOps.md"]