# 'std' Dialect

This document describes the operations in the Standard dialect.

Note: This dialect is a collection of operations for several different concepts,
and should be split into multiple more-focused dialects accordingly.

**Please post an RFC on the [forum](https://llvm.discourse.group/c/mlir/31)
before adding or changing any operation in this dialect.**

[TOC]

## Operations

[include "Dialects/StandardOps.md"]

### 'dma_start' operation

Syntax:

```
operation ::= `dma_start` ssa-use`[`ssa-use-list`]` `,`
               ssa-use`[`ssa-use-list`]` `,` ssa-use `,`
               ssa-use`[`ssa-use-list`]` (`,` ssa-use `,` ssa-use)?
              `:` memref-type `,` memref-type `,` memref-type
```

Starts a non-blocking DMA operation that transfers data from a source memref to
a destination memref. The operands include the source and destination memrefs,
each followed by its indices, the size of the data transfer in terms of the
number of elements (of the elemental type of the memref), a tag memref with its
indices, and optionally two additional arguments corresponding to the stride (in
terms of number of elements) and the number of elements to transfer per stride.
The tag location is used by a `dma_wait` operation to check for completion.

The indices of the source memref, destination memref, and the tag memref have
the same restrictions as any load/store operation in an affine context (whenever
DMA operations appear in an affine context). See
[restrictions on dimensions and symbols](Affine.md#restrictions-on-dimensions-and-symbols)
in affine contexts. This allows powerful static analysis and transformations in
the presence of such DMAs, including rescheduling, pipelining / overlap with
computation, and checking for matching start/end operations. The source and
destination memrefs need not be of the same dimensionality, but must have the
same elemental type.

For example, a `dma_start` operation that transfers 32 vector elements from a
memref `%src` at location `[%i, %j]` to memref `%dst` at `[%k, %l]` would be
specified as shown below.

Example:

```mlir
%size = constant 32 : index
%tag = alloc() : memref<1 x i32, affine_map<(d0) -> (d0)>, 4>
%idx = constant 0 : index
dma_start %src[%i, %j], %dst[%k, %l], %size, %tag[%idx] :
     memref<40 x 8 x vector<16xf32>, affine_map<(d0, d1) -> (d0, d1)>, 0>,
     memref<2 x 4 x vector<16xf32>, affine_map<(d0, d1) -> (d0, d1)>, 2>,
     memref<1 x i32, affine_map<(d0) -> (d0)>, 4>
```
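
The optional trailing operands from the syntax above can be illustrated with a
strided variant. The following is a minimal sketch rather than a canonical
example: the function name `@strided_dma`, the value names `%stride` and
`%elts_per_stride`, and all shapes, memory spaces, and constants are
assumptions made for illustration. It is meant to show operand order only, not
to pin down which memref the stride applies to.

```mlir
// Hypothetical function; shapes, memory spaces, and constants are illustrative.
func @strided_dma(%src: memref<256xf32>, %dst: memref<64xf32, 2>) {
  %c0 = constant 0 : index
  // Total number of f32 elements to transfer.
  %size = constant 64 : index
  // Optional operands: the stride (in elements), then the number of
  // elements to transfer per stride.
  %stride = constant 32 : index
  %elts_per_stride = constant 8 : index
  // One-element tag buffer used to track completion.
  %tag = alloc() : memref<1xi32, 4>
  dma_start %src[%c0], %dst[%c0], %size, %tag[%c0], %stride, %elts_per_stride :
      memref<256xf32>, memref<64xf32, 2>, memref<1xi32, 4>
  // Block until the transfer tied to %tag[%c0] has completed.
  dma_wait %tag[%c0], %size : memref<1xi32, 4>
  dealloc %tag : memref<1xi32, 4>
  return
}
```

The constants were chosen so that 64 elements, moved 8 at a time at a stride of
32 elements, stay within a 256-element buffer if the stride is read as applying
to the source.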

### 'dma_wait' operation

Syntax:

```
operation ::= `dma_wait` ssa-use`[`ssa-use-list`]` `,` ssa-use `:` memref-type
```

Blocks until the completion of a DMA operation associated with the tag element
specified with a tag memref and its indices. The operands include the tag memref
followed by its indices and the number of elements associated with the DMA being
waited on. The indices of the tag memref have the same restrictions as
load/store indices.

Example:

```mlir
dma_wait %tag[%idx], %size : memref<1 x i32, affine_map<(d0) -> (d0)>, 4>
```
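
Because `dma_start` is non-blocking and `dma_wait` blocks, work that does not
depend on the transferred data can be placed between the two. The sketch below
illustrates that pairing under assumed names and shapes (`@dma_overlap`, `%buf`,
the scalar arguments `%a` and `%b`, and the memory spaces are all hypothetical);
it is one possible arrangement, not a prescribed pattern.

```mlir
// Hypothetical function; names, shapes, and memory spaces are illustrative.
func @dma_overlap(%src: memref<256xf32>, %a: f32, %b: f32) -> f32 {
  %c0 = constant 0 : index
  %size = constant 256 : index
  // Destination buffer in a different memory space, plus a one-element tag.
  %buf = alloc() : memref<256xf32, 2>
  %tag = alloc() : memref<1xi32, 4>
  // Kick off the transfer; control continues without waiting for it.
  dma_start %src[%c0], %buf[%c0], %size, %tag[%c0] :
      memref<256xf32>, memref<256xf32, 2>, memref<1xi32, 4>
  // Independent work that does not touch %buf.
  %sum = addf %a, %b : f32
  // Block until the DMA associated with %tag[%c0] has completed.
  dma_wait %tag[%c0], %size : memref<1xi32, 4>
  // %buf is read only after the wait.
  %v = load %buf[%c0] : memref<256xf32, 2>
  %r = mulf %v, %sum : f32
  dealloc %tag : memref<1xi32, 4>
  dealloc %buf : memref<256xf32, 2>
  return %r : f32
}
```

Both operations refer to the same tag element `%tag[%c0]`, which is what ties
the wait to the transfer it is waiting on.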