# MLIR Language Reference

MLIR (Multi-Level IR) is a compiler intermediate representation with
similarities to traditional three-address SSA representations (like
[LLVM IR](http://llvm.org/docs/LangRef.html) or
[SIL](https://github.com/apple/swift/blob/master/docs/SIL.rst)), but which
introduces notions from polyhedral loop optimization as first-class concepts.
This hybrid design is optimized to represent, analyze, and transform high-level
dataflow graphs as well as target-specific code generated for high-performance
data-parallel systems. Beyond its representational capabilities, its single
continuous design provides a framework to lower from dataflow graphs to
high-performance target-specific code.

This document defines and describes the key concepts in MLIR, and is intended
to be a dry reference document - the [rationale
documentation](Rationale/Rationale.md),
[glossary](../getting_started/Glossary.md), and other content are hosted
elsewhere.

MLIR is designed to be used in three different forms: a human-readable textual
form suitable for debugging, an in-memory form suitable for programmatic
transformations and analysis, and a compact serialized form suitable for
storage and transport. The different forms all describe the same semantic
content. This document describes the human-readable textual form.

[TOC]

## High-Level Structure

MLIR is fundamentally based on a graph-like data structure of nodes, called
*Operations*, and edges, called *Values*. Each Value is the result of exactly
one Operation or Block Argument, and has a *Value Type* defined by the [type
system](#type-system).  [Operations](#operations) are contained in
[Blocks](#blocks) and Blocks are contained in [Regions](#regions). Operations
are also ordered within their containing block and Blocks are ordered in their
containing region, although this order may or may not be semantically
meaningful in a given [kind of region](Interfaces.md#regionkindinterfaces).
Operations may also contain regions, enabling hierarchical structures to be
represented.
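
As a minimal sketch of this nesting (not MLIR's actual in-memory classes), the containment hierarchy can be modeled with three small Python types. The class names, the `walk` helper, and the example nesting are assumptions made for illustration; values, types, and attributes are left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    blocks: List["Block"] = field(default_factory=list)


@dataclass
class Block:
    operations: List["Operation"] = field(default_factory=list)


@dataclass
class Operation:
    name: str
    regions: List[Region] = field(default_factory=list)


def walk(op, depth=0):
    """Yield (depth, name) for op and every operation nested inside it, in order."""
    yield depth, op.name
    for region in op.regions:
        for block in region.blocks:
            for inner in block.operations:
                yield from walk(inner, depth + 1)


# Hypothetical nesting: module > func > affine.for > addf
add = Operation("addf")
loop = Operation("affine.for", [Region([Block([add])])])
func = Operation("func", [Region([Block([loop, Operation("return")])])])
module = Operation("module", [Region([Block([func])])])

nesting = list(walk(module))
for depth, name in nesting:
    print("  " * depth + name)
```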

Operations can represent many different concepts, from higher-level concepts
like function definitions, function calls, buffer allocations, views or slices
of buffers, and process creation, to lower-level concepts like
target-independent arithmetic, target-specific instructions, configuration
registers, and logic gates. These different concepts are represented by
different operations in MLIR and the set of operations usable in MLIR can be
arbitrarily extended.

MLIR also provides an extensible framework for transformations on operations,
using familiar concepts of compiler [Passes](Passes.md). Enabling an arbitrary
set of passes on an arbitrary set of operations results in a significant
scaling challenge, since each transformation must potentially take into
account the semantics of any operation. MLIR addresses this complexity by
allowing operation semantics to be described abstractly using
[Traits](Traits.md) and [Interfaces](Interfaces.md), enabling transformations
to operate on operations more generically.  Traits often describe verification
constraints on valid IR, enabling complex invariants to be captured and
checked (see [Op vs
Operation](docs/Tutorials/Toy/Ch-2/#op-vs-operation-using-mlir-operations)).

One obvious application of MLIR is to represent an
[SSA-based](https://en.wikipedia.org/wiki/Static_single_assignment_form) IR,
like the LLVM core IR, with appropriate choice of Operation Types to define
[Modules](#module), [Functions](#functions), Branches, Allocations, and
verification constraints to ensure the SSA Dominance property. MLIR includes a
'standard' dialect which defines just such structures. However, MLIR is
intended to be general enough to represent other compiler-like data
structures, such as Abstract Syntax Trees in a language frontend, generated
instructions in a target-specific backend, or circuits in a High-Level
Synthesis tool.

Here's an example of an MLIR module:

```mlir
// Compute A*B using an implementation of multiply kernel and print the
// result using a TensorFlow op. The dimensions of A and B are partially
// known. The shapes are assumed to match.
func @mul(%A: tensor<100x?xf32>, %B: tensor<?x50xf32>) -> (tensor<100x50xf32>) {
  // Compute the inner dimension of %A using the dim operation.
  %n = dim %A, 1 : tensor<100x?xf32>

  // Allocate addressable "buffers" and copy tensors %A and %B into them.
  %A_m = alloc(%n) : memref<100x?xf32>
  tensor_store %A to %A_m : memref<100x?xf32>

  %B_m = alloc(%n) : memref<?x50xf32>
  tensor_store %B to %B_m : memref<?x50xf32>

  // Call function @multiply passing memrefs as arguments,
  // and getting returned the result of the multiplication.
  %C_m = call @multiply(%A_m, %B_m)
          : (memref<100x?xf32>, memref<?x50xf32>) -> (memref<100x50xf32>)

  dealloc %A_m : memref<100x?xf32>
  dealloc %B_m : memref<?x50xf32>

  // Load the buffer data into a higher level "tensor" value.
  %C = tensor_load %C_m : memref<100x50xf32>
  dealloc %C_m : memref<100x50xf32>

  // Call TensorFlow built-in function to print the result tensor.
  "tf.Print"(%C){message = "mul result"}
                  : (tensor<100x50xf32>) -> (tensor<100x50xf32>)

  return %C : tensor<100x50xf32>
}

// A function that multiplies two memrefs and returns the result.
func @multiply(%A: memref<100x?xf32>, %B: memref<?x50xf32>)
          -> (memref<100x50xf32>)  {
  // Compute the inner dimension of %A.
  %n = dim %A, 1 : memref<100x?xf32>

  // Allocate memory for the multiplication result.
  %C = alloc() : memref<100x50xf32>

  // Constant used to zero-initialize each element of the result.
  %zero = constant 0.0 : f32

  // Multiplication loop nest.
  affine.for %i = 0 to 100 {
     affine.for %j = 0 to 50 {
        store %zero, %C[%i, %j] : memref<100x50xf32>
        affine.for %k = 0 to %n {
           %a_v  = load %A[%i, %k] : memref<100x?xf32>
           %b_v  = load %B[%k, %j] : memref<?x50xf32>
           %prod = mulf %a_v, %b_v : f32
           %c_v  = load %C[%i, %j] : memref<100x50xf32>
           %sum  = addf %c_v, %prod : f32
           store %sum, %C[%i, %j] : memref<100x50xf32>
        }
     }
  }
  return %C : memref<100x50xf32>
}
```

## Notation

MLIR has a simple and unambiguous grammar, allowing it to reliably round-trip
through a textual form. This is important for development of the compiler -
e.g.  for understanding the state of code as it is being transformed and
writing test cases.

This document describes the grammar using
[Extended Backus-Naur Form (EBNF)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form).

This is the EBNF grammar used in this document, presented in yellow boxes.

```
alternation ::= expr0 | expr1 | expr2  // Either expr0 or expr1 or expr2.
sequence    ::= expr0 expr1 expr2      // Sequence of expr0 expr1 expr2.
repetition0 ::= expr*  // 0 or more occurrences.
repetition1 ::= expr+  // 1 or more occurrences.
optionality ::= expr?  // 0 or 1 occurrence.
grouping    ::= (expr) // Everything inside parens is grouped together.
literal     ::= `abcd` // Matches the literal `abcd`.
```

Code examples are presented in blue boxes.

```mlir
// This is an example use of the grammar above:
// This matches things like: ba, bana, boma, banana, banoma, bomana...
example ::= `b` (`an` | `om`)* `a`
```
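
For simple productions like the one above, the notation maps closely onto regular expressions. The following Python sketch is only an illustration and not part of MLIR; it translates `example` and checks the strings listed in the comment. The function name `matches_example` is hypothetical.

```python
import re

# `b` (`an` | `om`)* `a` translated piece by piece:
# literal -> plain text, alternation -> |, repetition0 -> *, grouping -> (?:...)
EXAMPLE = re.compile(r"b(?:an|om)*a")


def matches_example(text: str) -> bool:
    """True if the whole string is derivable from the `example` production."""
    return EXAMPLE.fullmatch(text) is not None


candidates = ["ba", "bana", "boma", "banana", "banoma", "bomana", "b", "ban", "baa"]
accepted = [s for s in candidates if matches_example(s)]
rejected = [s for s in candidates if not matches_example(s)]
print("accepted:", accepted)
print("rejected:", rejected)
```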

### Common syntax

The following core grammar productions are used in this document:

```
// TODO: Clarify the split between lexing (tokens) and parsing (grammar).
digit     ::= [0-9]
hex_digit ::= [0-9a-fA-F]
letter    ::= [a-zA-Z]
id-punct  ::= [$._-]

integer-literal ::= decimal-literal | hexadecimal-literal
decimal-literal ::= digit+
hexadecimal-literal ::= `0x` hex_digit+
float-literal ::= [-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)?
string-literal  ::= `"` [^"\n\f\v\r]* `"`   // TODO: define escaping rules
```

Although not listed above, MLIR also supports comments. They use standard BCPL
syntax, starting with a `//` and going until the end of the line.
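
The literal productions can be exercised with a few regular expressions. This is a hedged sketch that transcribes the grammar as written above (so string escapes are not handled, matching the TODO); the `classify` helper and its return strings are hypothetical and are not part of any MLIR tool.

```python
import re

# Direct transcriptions of the productions above.
INTEGER_LITERAL = re.compile(r"0x[0-9a-fA-F]+|[0-9]+")
FLOAT_LITERAL = re.compile(r"[-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)?")
STRING_LITERAL = re.compile(r'"[^"\n\f\v\r]*"')


def classify(token: str) -> str:
    """Name the literal production that matches the whole token, if any."""
    if INTEGER_LITERAL.fullmatch(token):
        return "integer-literal"
    if FLOAT_LITERAL.fullmatch(token):
        return "float-literal"
    if STRING_LITERAL.fullmatch(token):
        return "string-literal"
    return "no match"


for token in ["42", "0x1F", "1.", "-3.0e-2", ".5", '"mul result"']:
    print(f"{token!r}: {classify(token)}")
```

Note that, as written, `float-literal` requires at least one digit before the `.` and `integer-literal` carries no sign.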

### Identifiers and keywords

Syntax:

```
// Identifiers
bare-id ::= (letter|[_]) (letter|digit|[_$.])*
bare-id-list ::= bare-id (`,` bare-id)*
value-id ::= `%` suffix-id
suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))

symbol-ref-id ::= `@` (suffix-id | string-literal)
value-id-list ::= value-id (`,` value-id)*

// Uses of value, e.g. in an operand list to an operation.
value-use ::= value-id
value-use-list ::= value-use (`,` value-use)*
```

Identifiers name entities such as values, types and functions, and are
chosen by the writer of MLIR code. Identifiers may be descriptive (e.g.
`%batch_size`, `@matmul`), or may be non-descriptive when they are
auto-generated (e.g. `%23`, `@func42`). Identifier names for values may be
used in an MLIR text file but are not persisted as part of the IR - the printer
will give them anonymous names like `%42`.

MLIR guarantees identifiers never collide with keywords by prefixing identifiers
with a sigil (e.g. `%`, `#`, `@`, `^`, `!`). In certain unambiguous contexts
(e.g. affine expressions), identifiers are not prefixed, for brevity. New
keywords may be added to future versions of MLIR without danger of collision
with existing identifiers.
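
As a sketch of how the sigil separates identifier kinds, the productions above can be transcribed into regular expressions. The helper name `identifier_kind` is hypothetical, and only `bare-id`, `value-id`, and `symbol-ref-id` are covered; this is an illustration, not MLIR's lexer.

```python
import re

LETTER = "a-zA-Z"
ID_PUNCT = r"$._\-"  # id-punct ::= [$._-]
SUFFIX_ID = rf"(?:[0-9]+|[{LETTER}{ID_PUNCT}][{LETTER}{ID_PUNCT}0-9]*)"

BARE_ID = re.compile(rf"[{LETTER}_][{LETTER}0-9_$.]*")
VALUE_ID = re.compile(rf"%{SUFFIX_ID}")
SYMBOL_REF_ID = re.compile(rf'@(?:{SUFFIX_ID}|"[^"\n\f\v\r]*")')


def identifier_kind(text: str) -> str:
    """Return the name of the production that matches the whole text."""
    if VALUE_ID.fullmatch(text):
        return "value-id"
    if SYMBOL_REF_ID.fullmatch(text):
        return "symbol-ref-id"
    if BARE_ID.fullmatch(text):
        return "bare-id"
    return "no match"


for text in ["%batch_size", "%23", "%2abc", "@matmul", '@"quoted name"', "affine.for"]:
    print(f"{text}: {identifier_kind(text)}")
```

`%2abc` is rejected because a `suffix-id` is either all digits or starts with a letter or punctuation character.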

Value identifiers are only [in scope](#value-scoping) for the (nested)
region in which they are defined and cannot be accessed or referenced
outside of that region. Argument identifiers in mapping functions are
in scope for the mapping body. Particular operations may further limit
which identifiers are in scope in their regions. For instance, the
scope of values in a region with [SSA control flow
semantics](#control-flow-and-ssacfg-regions) is constrained according
to the standard definition of [SSA
dominance](https://en.wikipedia.org/wiki/Dominator_\(graph_theory\)). Another
example is the [IsolatedFromAbove trait](Traits.md#isolatedfromabove),
which restricts directly accessing values defined in containing
regions.

Function identifiers and mapping identifiers are associated with
[Symbols](SymbolsAndSymbolTables.md) and have scoping rules dependent on
symbol attributes.

## Dialects

Dialects are the mechanism by which to engage with and extend the MLIR
ecosystem. They allow for defining new [operations](#operations), as well as
[attributes](#attributes) and [types](#type-system). Each dialect is given a
unique `namespace` that is prefixed to each defined attribute/operation/type.
For example, the [Affine dialect](Dialects/Affine.md) defines the namespace:
`affine`.
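
A small sketch of the prefix convention: splitting a name at its first `.` recovers the dialect namespace. The helper `split_operation_name` is hypothetical, and returning `None` for names without a prefix is an assumption made for this illustration rather than a rule stated here.

```python
from typing import Optional, Tuple


def split_operation_name(name: str) -> Tuple[Optional[str], str]:
    """Split 'namespace.rest' at the first '.'; unprefixed names get no namespace."""
    namespace, dot, rest = name.partition(".")
    if not dot:
        return None, name
    return namespace, rest


for name in ["affine.for", "llvm.sadd.with.overflow.i16", "dim"]:
    print(name, "->", split_operation_name(name))
```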

MLIR allows for multiple dialects, even those outside of the main tree, to
co-exist together within one module. Dialects are produced and consumed by
certain passes. MLIR provides a [framework](DialectConversion.md) to convert
between, and within, different dialects.

A few of the dialects supported by MLIR:

*   [Affine dialect](Dialects/Affine.md)
*   [GPU dialect](Dialects/GPU.md)
*   [LLVM dialect](Dialects/LLVM.md)
*   [SPIR-V dialect](Dialects/SPIR-V.md)
*   [Standard dialect](Dialects/Standard.md)
*   [Vector dialect](Dialects/Vector.md)

### Target specific operations

Dialects provide a modular way in which targets can expose target-specific
operations directly through to MLIR. As an example, some targets go through
LLVM. LLVM has a rich set of intrinsics for certain target-independent
operations (e.g. addition with overflow check) as well as providing access to
target-specific operations for the targets it supports (e.g. vector
permutation operations). LLVM intrinsics in MLIR are represented via
operations that start with an "llvm." name.

Example:

```mlir
// LLVM: %x = call {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
%x:2 = "llvm.sadd.with.overflow.i16"(%a, %b) : (i16, i16) -> (i16, i1)
```

These operations only work when targeting LLVM as a backend (e.g. for CPUs and
GPUs), and are required to align with the LLVM definition of these intrinsics.

## Operations

Syntax:

```
operation         ::= op-result-list? (generic-operation | custom-operation)
                      trailing-location?
generic-operation ::= string-literal `(` value-use-list? `)`  successor-list?
                      (`(` region-list `)`)? attribute-dict? `:` function-type
custom-operation  ::= bare-id custom-operation-format
op-result-list    ::= op-result (`,` op-result)* `=`
op-result         ::= value-id (`:` integer-literal)?
successor-list    ::= `[` successor (`,` successor)* `]`
successor         ::= caret-id (`:` bb-arg-list)?
region-list       ::= region (`,` region)*
trailing-location ::= (`loc` `(` location `)`)?
```

MLIR introduces a uniform concept called _operations_ to enable describing
many different levels of abstractions and computations. Operations in MLIR are
fully extensible (there is no fixed list of operations) and have
application-specific semantics. For example, MLIR supports [target-independent
operations](Dialects/Standard.md#memory-operations), [affine
operations](Dialects/Affine.md), and [target-specific machine
operations](#target-specific-operations).

The internal representation of an operation is simple: an operation is
identified by a unique string (e.g. `dim`, `tf.Conv2d`, `x86.repmovsb`,
`ppc.eieio`, etc.), can return zero or more results, can take zero or more
operands, and may have zero or more attributes, zero or more successors,
and zero or more enclosed [regions](#regions). The generic printing form
includes all these elements literally, with a function type to indicate the
types of the results and operands.

Example:

```mlir
// An operation that produces two results.
// The results of %result can be accessed via the <name> `#` <opNo> syntax.
%result:2 = "foo_div"() : () -> (f32, i32)

// Pretty form that defines a unique name for each result.
%foo, %bar = "foo_div"() : () -> (f32, i32)

// Invoke a TensorFlow function called tf.scramble with two inputs
// and an attribute "fruit".
%2 = "tf.scramble"(%result#0, %bar) {fruit = "banana"} : (f32, i32) -> f32
```
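
To make the generic form concrete, the following sketch assembles it from the components listed above. The `GenericOp` class and `print_generic` function are hypothetical, successors and regions are omitted for brevity, and attributes are printed as `name = value`, following the function attribute examples in this document. It is an illustration, not MLIR's printer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GenericOp:
    name: str
    operands: List[str] = field(default_factory=list)
    operand_types: List[str] = field(default_factory=list)
    result_types: List[str] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)


def print_generic(op: GenericOp, result_name: str = "") -> str:
    """Render an operation as: results = "name"(operands) {attrs} : function-type."""
    text = ""
    if result_name:
        count = len(op.result_types)
        suffix = f":{count}" if count > 1 else ""
        text += f"{result_name}{suffix} = "
    text += f'"{op.name}"({", ".join(op.operands)})'
    if op.attributes:
        attrs = ", ".join(f"{key} = {value}" for key, value in op.attributes.items())
        text += f" {{{attrs}}}"
    results = ", ".join(op.result_types)
    if len(op.result_types) != 1:
        results = f"({results})"  # zero or several results are parenthesized
    text += f" : ({', '.join(op.operand_types)}) -> {results}"
    return text


foo_div = GenericOp("foo_div", result_types=["f32", "i32"])
scramble = GenericOp("tf.scramble", ["%result#0", "%bar"], ["f32", "i32"], ["f32"],
                     {"fruit": '"banana"'})
print(print_generic(foo_div, "%result"))
print(print_generic(scramble, "%2"))
```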

In addition to the basic syntax above, dialects may register known operations.
This allows those dialects to support _custom assembly form_ for parsing and
printing operations. In the operation sets listed below, we show both forms.

### Terminator Operations

These are a special category of operations that *must* terminate a block, e.g.
[branches](Dialects/Standard.md#terminator-operations). These operations may
also have a list of successors ([blocks](#blocks) and their arguments).

Example:

```mlir
// Branch to ^bb1 or ^bb2 depending on the condition %cond.
// Pass value %v to ^bb2, but not to ^bb1.
"cond_br"(%cond)[^bb1, ^bb2(%v : index)] : (i1) -> ()
```

### Module

```
module ::= `module` symbol-ref-id? (`attributes` attribute-dict)? region
```

An MLIR Module represents a top-level container operation. It contains a single
[SSACFG region](#control-flow-and-ssacfg-regions) containing a single block
which can contain any operations. Operations within this region cannot
implicitly capture values defined outside the module, i.e. Modules are
[IsolatedFromAbove](Traits.md#isolatedfromabove). Modules have an optional
[symbol name](SymbolsAndSymbolTables.md) which can be used to refer to them in
operations.

### Functions

An MLIR Function is an operation with a name containing a single [SSACFG
region](#control-flow-and-ssacfg-regions).  Operations within this region
cannot implicitly capture values defined outside of the function,
i.e. Functions are [IsolatedFromAbove](Traits.md#isolatedfromabove).  All
external references must use function arguments or attributes that establish a
symbolic connection (e.g. symbols referenced by name via a string attribute
like [SymbolRefAttr](#symbol-reference-attribute)):

```
function ::= `func` function-signature function-attributes? function-body?

function-signature ::= symbol-ref-id `(` argument-list `)`
                       (`->` function-result-list)?

argument-list ::= (named-argument (`,` named-argument)*)
                | (type attribute-dict? (`,` type attribute-dict?)*)
                | /*empty*/
named-argument ::= value-id `:` type attribute-dict?

function-result-list ::= function-result-list-parens
                       | non-function-type
function-result-list-parens ::= `(` `)`
                              | `(` function-result-list-no-parens `)`
function-result-list-no-parens ::= function-result (`,` function-result)*
function-result ::= type attribute-dict?

function-attributes ::= `attributes` attribute-dict
function-body ::= region
```

An external function declaration (used when referring to a function declared
in some other module) has no body. While the MLIR textual form provides a nice
inline syntax for function arguments, they are internally represented as
"block arguments" to the first block in the region.

Only dialect attribute names may be specified in the attribute dictionaries
for function arguments, results, or the function itself.

Examples:

```mlir
// External function declarations.
func @abort()
func @scribble(i32, i64, memref<? x 128 x f32, #layout_map0>) -> f64

// A function that returns its argument twice:
func @count(%x: i64) -> (i64, i64)
  attributes {fruit = "banana"} {
  return %x, %x : i64, i64
}

// A function with an argument attribute
func @example_fn_arg(%x: i32 {swift.self = unit})

// A function with a result attribute
func @example_fn_result() -> (f64 {dialectName.attrName = 0 : i64})

// A function with an attribute
func @example_fn_attr() attributes {dialectName.attrName = false}
```

## Blocks

Syntax:

```
block           ::= block-label operation+
block-label     ::= block-id block-arg-list? `:`
block-id        ::= caret-id
caret-id        ::= `^` suffix-id
value-id-and-type ::= value-id `:` type

// Non-empty list of names and types.
value-id-and-type-list ::= value-id-and-type (`,` value-id-and-type)*

block-arg-list ::= `(` value-id-and-type-list? `)`
```

A *Block* is an ordered list of operations, concluding with a single
[terminator operation](#terminator-operations). In [SSACFG
regions](#control-flow-and-ssacfg-regions), each block represents a compiler
[basic block](https://en.wikipedia.org/wiki/Basic_block) where instructions
inside the block are executed in order and terminator operations implement
control flow branches between basic blocks.

Blocks in MLIR take a list of block arguments, notated in a function-like
way. Block arguments are bound to values specified by the semantics of
individual operations. Block arguments of the entry block of a region are also
arguments to the region and the values bound to these arguments are determined
by the semantics of the containing operation. Block arguments of other blocks
are determined by the semantics of terminator operations, e.g. Branches, which
have the block as a successor. In regions with [control
flow](#control-flow-and-ssacfg-regions), MLIR leverages this structure to
implicitly represent the passage of control-flow dependent values without the
complex nuances of PHI nodes in traditional SSA representations. Note that
values which are not control-flow dependent can be referenced directly and do
not need to be passed through block arguments.

Here is a simple example function showing branches, returns, and block
arguments:

```mlir
func @simple(i64, i1) -> i64 {
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
  cond_br %cond, ^bb1, ^bb2

^bb1:
  br ^bb3(%a: i64)    // Branch passes %a as the argument

^bb2:
  %b = addi %a, %a : i64
  br ^bb3(%b: i64)    // Branch passes %b as the argument

// ^bb3 receives an argument, named %c, from predecessors
// and passes it on to ^bb4 along with %a. %a is referenced
// directly from its defining operation and is not passed through
// an argument of ^bb3.
^bb3(%c: i64):
  br ^bb4(%c, %a : i64, i64)

^bb4(%d : i64, %e : i64):
  %0 = addi %d, %e : i64
  return %0 : i64   // Return is also a terminator.
}
```
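
The way branches bind block arguments can be sketched with a tiny interpreter for the `@simple` function above. The table encoding, the `run` function, and the name `r` (standing in for `%0`) are assumptions made for this illustration; dominance is not checked, and a single flat environment is enough only because every SSA name is defined once.

```python
# Each block: (argument names, body, terminator). A body entry is
# (result name, function, operand names). This encoding is hypothetical.
BLOCKS = {
    "bb0": (["a", "cond"], [], ("cond_br", "cond", ("bb1", []), ("bb2", []))),
    "bb1": ([], [], ("br", ("bb3", ["a"]))),
    "bb2": ([], [("b", lambda x, y: x + y, ["a", "a"])], ("br", ("bb3", ["b"]))),
    "bb3": (["c"], [], ("br", ("bb4", ["c", "a"]))),
    "bb4": (["d", "e"], [("r", lambda x, y: x + y, ["d", "e"])], ("return", "r")),
}


def run(entry_args):
    """Execute the CFG starting at bb0 and return the value of the `return`."""
    env = {}
    block, incoming = "bb0", list(entry_args)
    while True:
        params, body, term = BLOCKS[block]
        # Entering a block binds its arguments to the values the branch passed.
        env.update(zip(params, incoming))
        for result, fn, operands in body:
            env[result] = fn(*(env[name] for name in operands))
        if term[0] == "return":
            return env[term[1]]
        if term[0] == "cond_br":
            target, passed = term[2] if env[term[1]] else term[3]
        else:  # "br"
            target, passed = term[1]
        # Branch operands are read in the source block before control moves on.
        block, incoming = target, [env[name] for name in passed]


print(run([3, True]), run([3, False]))
```

With `%cond` true, `%c` is bound to `%a`; with `%cond` false, it is bound to `%b`. No PHI-like operation is needed in `^bb3` to select between them.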

**Context:** The "block argument" representation eliminates a number
of special cases from the IR compared to traditional "PHI nodes are
operations" SSA IRs (like LLVM). For example, the [parallel copy
semantics](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.524.5461&rep=rep1&type=pdf)
of SSA is immediately apparent, and function arguments are no longer a
special case: they become arguments to the entry block [[more
rationale](Rationale/Rationale.md#block-arguments-vs-phi-nodes)]. Blocks
are also a fundamental concept that cannot be represented by
operations because values defined in an operation cannot be accessed
outside the operation.

## Regions

### Definition

A region is an ordered list of MLIR [Blocks](#blocks). The semantics within a
region is not imposed by the IR. Instead, the containing operation defines the
semantics of the regions it contains. MLIR currently defines two kinds of
regions: [SSACFG regions](#control-flow-and-ssacfg-regions), which describe
control flow between blocks, and [Graph regions](#graph-regions), which do not
require control flow between blocks. The kinds of regions within an operation
are described using the
[RegionKindInterface](Interfaces.md#regionkindinterfaces).

Regions do not have a name or an address, only the blocks contained in a
region do. Regions must be contained within operations and have no type or
attributes. The first block in the region is a special block called the 'entry
block'. The arguments to the entry block are also the arguments of the region
itself. The entry block cannot be listed as a successor of any other
block. The syntax for a region is as follows:

```
region ::= `{` block* `}`
```

A function body is an example of a region: it consists of a CFG of blocks and
has additional semantic restrictions that other types of regions may not have.
For example, in a function body, block terminators must either branch to a
different block, or return from a function where the types of the `return`
arguments must match the result types of the function signature.  Similarly,
the function arguments must match the types and count of the region arguments.
In general, operations with regions can define these correspondences
arbitrarily.

### Value Scoping

Regions provide hierarchical encapsulation of programs: it is impossible to
reference, i.e. branch to, a block which is not in the same region as the
source of the reference, i.e. a terminator operation. Similarly, regions
provide a natural scoping for value visibility: values defined in a region
don't escape to the enclosing region, if any. By default, operations inside a
region can reference values defined outside of the region whenever it would
have been legal for operands of the enclosing operation to reference those
values, but this can be restricted using traits, such as
[OpTrait::IsolatedFromAbove](Traits.md#isolatedfromabove), or a custom
verifier.

Example:

```mlir
  "any_op"(%a) ({ // if %a is in-scope in the containing region...
    // then %a is in-scope here too.
    %new_value = "another_op"(%a) : (i64) -> (i64)
  }) : (i64) -> (i64)
```

MLIR defines a generalized 'hierarchical dominance' concept that operates
across hierarchy and defines whether a value is 'in scope' and can be used by
a particular operation. Whether a value can be used by another operation in
the same region is defined by the kind of region. A value defined in a region
can be used by an operation which has a parent in the same region, if and only
if the parent could use the value. A value defined by an argument to a region
can always be used by any operation deeply contained in the region. A value
defined in a region can never be used outside of the region.
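
The hierarchical part of this rule can be sketched as a walk outward through enclosing regions. The classes and the `may_reference` function below are hypothetical. The sketch deliberately ignores ordering within a region (which depends on the kind of region) and models only nesting plus an `IsolatedFromAbove`-style restriction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Region:
    parent_op: Optional["Op"] = None  # None for the outermost region


@dataclass
class Op:
    region: Region  # the region that directly contains this operation
    isolated_from_above: bool = False


def may_reference(use: Op, defining_region: Region) -> bool:
    """Walk outward from `use` until the defining region is reached or blocked."""
    region = use.region
    while region is not defining_region:
        owner = region.parent_op
        if owner is None or owner.isolated_from_above:
            return False  # ran out of enclosing regions, or hit an isolation barrier
        region = owner.region
    return True


outer = Region()
any_op = Op(outer)
inner = Region(parent_op=any_op)
another_op = Op(inner)

isolated = Op(outer, isolated_from_above=True)
isolated_body = Region(parent_op=isolated)
nested_use = Op(isolated_body)

print(may_reference(another_op, outer))   # nested op using an outer value
print(may_reference(any_op, inner))       # outer op using an inner value
print(may_reference(nested_use, outer))   # use below an isolated operation
```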

### Control Flow and SSACFG Regions

In MLIR, control flow semantics of a region is indicated by
[RegionKind::SSACFG](Interfaces.md#regionkindinterfaces).  Informally, these
regions support semantics where operations in a region 'execute
sequentially'. Before an operation executes, its operands have well-defined
values. After an operation executes, the operands have the same values and
results also have well-defined values. After an operation executes, the next
operation in the block executes until the operation is the terminator operation
at the end of a block, in which case some other operation will execute. The
determination of the next instruction to execute is the 'passing of control
flow'.

In general, when control flow is passed to an operation, MLIR does not
restrict when control flow enters or exits the regions contained in that
operation. However, when control flow enters a region, it always begins in the
first block of the region, called the *entry* block.  Terminator operations
ending each block represent control flow by explicitly specifying the
successor blocks of the block. Control flow can only pass to one of the
specified successor blocks as in a `branch` operation, or back to the
containing operation as in a `return` operation. Terminator operations without
successors can only pass control back to the containing operation. Within
these restrictions, the particular semantics of terminator operations is
determined by the specific dialect operations involved. Blocks (other than the
entry block) that are not listed as a successor of a terminator operation are
defined to be unreachable and can be removed without affecting the semantics
of the containing operation.
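
A sketch of how such blocks could be found: compute the set of blocks reachable from the entry block by following successor lists, then drop the rest. A block that no terminator lists as a successor is never reached by this walk. The function names and the example region are hypothetical, and this is not an MLIR pass.

```python
def reachable_blocks(entry, successors):
    """Return the set of blocks reachable from `entry` by following successors."""
    seen, worklist = {entry}, [entry]
    while worklist:
        block = worklist.pop()
        for succ in successors.get(block, []):
            if succ not in seen:
                seen.add(succ)
                worklist.append(succ)
    return seen


def remove_unreachable(entry, successors):
    """Keep only the blocks that control flow can reach from the entry block."""
    live = reachable_blocks(entry, successors)
    return {block: succs for block, succs in successors.items() if block in live}


# Hypothetical region: no terminator lists bb3 as a successor.
cfg = {"bb0": ["bb1", "bb2"], "bb1": ["bb2"], "bb2": [], "bb3": ["bb2"]}
pruned = remove_unreachable("bb0", cfg)
print(sorted(pruned))
```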

Although control flow always enters a region through the entry block, control
flow may exit a region through any block with an appropriate terminator. The
standard dialect leverages this capability to define operations with
Single-Entry-Multiple-Exit (SEME) regions, possibly flowing through different
blocks in the region and exiting through any block with a `return`
operation. This behavior is similar to that of a function body in most
programming languages. In addition, control flow may also not reach the end of
a block or region, for example if a function call does not return.

Example:

```mlir
func @accelerator_compute(i64, i1) -> i64 { // An SSACFG region
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
  cond_br %cond, ^bb1, ^bb2

^bb1:
  // This def for %value does not dominate ^bb2
  %value = "op.convert"(%a) : (i64) -> i64
  br ^bb3(%a: i64)    // Branch passes %a as the argument

^bb2:
  accelerator.launch() { // An SSACFG region
    ^bb0:
      // Region of code nested under "accelerator.launch", it can reference %a but
      // not %value.
      %new_value = "accelerator.do_something"(%a) : (i64) -> (i64)
  }
  // %new_value cannot be referenced outside of the region

^bb3(%c: i64):
  ...
}
```
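
The dominance claim in the comments can be checked with the classic iterative dominator computation. The sketch below is a textbook algorithm, not MLIR's dominance analysis. The edge from `^bb2` to `^bb3` is an assumption, since the example elides the terminator of `^bb2`.

```python
def dominators(entry, successors):
    """Compute, for each block, the set of blocks that dominate it."""
    blocks = list(successors)
    preds = {b: [p for p in blocks if b in successors[p]] for b in blocks}
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            incoming = [dom[p] for p in preds[b]]
            # A block is dominated by itself plus whatever dominates all predecessors.
            new = {b} | (set.intersection(*incoming) if incoming else set())
            if new != dom[b]:
                dom[b], changed = new, True
    return dom


# Top-level CFG of the example; the bb2 -> bb3 edge is assumed.
cfg = {"bb0": ["bb1", "bb2"], "bb1": ["bb3"], "bb2": ["bb3"], "bb3": []}
dom = dominators("bb0", cfg)
for block in cfg:
    print(block, "is dominated by", sorted(dom[block]))
```

Because `bb1` is not in the dominator set of `bb2`, `%value` (defined in `^bb1`) is not in scope in `^bb2` or in the region nested under it, whereas `%a` (an argument of `^bb0`) is.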

#### Operations with Multiple Regions

An operation containing multiple regions also completely determines the
semantics of those regions. In particular, when control flow is passed to an
operation, it may transfer control flow to any contained region. When control
flow exits a region and is returned to the containing operation, the
containing operation may pass control flow to any region in the same
operation. An operation may also pass control flow to multiple contained
regions concurrently. An operation may also pass control flow into regions
that were specified in other operations, in particular those that defined the
values or symbols the given operation uses as in a call operation. This
passage of control is generally independent of passage of control flow through
the basic blocks of the containing region.

#### Closure

Regions allow defining an operation that creates a closure, for example by
“boxing” the body of the region into a value they produce. It remains up to the
operation to define its semantics. Note that if an operation triggers
asynchronous execution of the region, it is the responsibility of the
operation's caller to wait for the region to finish executing, guaranteeing
that any directly used values remain live.

### Graph Regions

In MLIR, graph-like semantics in a region is indicated by
[RegionKind::Graph](Interfaces.md#regionkindinterfaces). Graph regions are
appropriate for concurrent semantics without control flow, or for modeling
generic directed graph data structures. Graph regions are appropriate for
representing cyclic relationships between coupled values where there is no
fundamental order to the relationships. For instance, operations in a graph
region may represent independent threads of control with values representing
streams of data. As usual in MLIR, the particular semantics of a region is
completely determined by its containing operation. Graph regions may only
contain a single basic block (the entry block).

**Rationale:** Currently graph regions are arbitrarily limited to a single
basic block, although there is no particular semantic reason for this
limitation. This limitation has been added to make it easier to stabilize the
pass infrastructure and commonly used passes for processing graph regions to
properly handle feedback loops. Multi-block regions may be allowed in the
future if use cases that require it arise.

In graph regions, MLIR operations naturally represent nodes, while each MLIR
value represents a multi-edge connecting a single source node and multiple
destination nodes. All values defined in the region as results of operations
are in scope within the region and can be accessed by any other operation in
the region. In graph regions, the order of operations within a block and the
order of blocks in a region is not semantically meaningful and non-terminator
operations may be freely reordered, for instance, by canonicalization. Other
kinds of graphs, such as graphs with multiple source nodes and multiple
destination nodes, can also be represented by representing graph edges as MLIR
operations.

Note that cycles can occur within a single block in a graph region, or between
basic blocks.

```mlir
"test.graph_region"() ({ // A Graph region
  %1 = "op1"(%1, %3) : (i32, i32) -> (i32)  // OK: %1, %3 allowed here
  %2 = "test.ssacfg_region"() ({
    // OK: %1, %2, %3, %4 all defined in the containing region
    %5 = "op2"(%1, %2, %3, %4) : (i32, i32, i32, i32) -> (i32)
  }) : () -> (i32)
  %3 = "op2"(%1, %4) : (i32, i32) -> (i32)  // OK: %4 allowed here
  %4 = "op3"(%1) : (i32) -> (i32)
}) : () -> ()
```
691
692### Arguments and Results
693
694The arguments of the first block of a region are treated as arguments of the
695region. The source of these arguments is defined by the semantics of the parent
696operation. They may correspond to some of the values the operation itself uses.
697
698Regions produce a (possibly empty) list of values. The operation semantics
699defines the relation between the region results and the operation results.
700
701## Type System
702
703Each value in MLIR has a type defined by the type system below. There are a
704number of primitive types (like integers) and also aggregate types for tensors
705and memory buffers. MLIR [builtin types](#builtin-types) do not include
706structures, arrays, or dictionaries.
707
708MLIR has an open type system (i.e. there is no fixed list of types), and types
709may have application-specific semantics. For example, MLIR supports a set of
710[dialect types](#dialect-types).
711
712```
713type ::= type-alias | dialect-type | builtin-type
714
715type-list-no-parens ::=  type (`,` type)*
716type-list-parens ::= `(` `)`
717                   | `(` type-list-no-parens `)`
718
719// This is a common way to refer to a value with a specified type.
720ssa-use-and-type ::= ssa-use `:` type
721
722// Non-empty list of names and types.
723ssa-use-and-type-list ::= ssa-use-and-type (`,` ssa-use-and-type)*
724```
725
726### Type Aliases
727
728```
729type-alias-def ::= '!' alias-name '=' 'type' type
730type-alias ::= '!' alias-name
731```
732
733MLIR supports defining named aliases for types. A type alias is an identifier
734that can be used in the place of the type that it defines. These aliases *must*
735be defined before their uses. Alias names may not contain a '.', since those
736names are reserved for [dialect types](#dialect-types).
737
738Example:
739
740```mlir
741!avx_m128 = type vector<4 x f32>
742
// Using the original type.
"foo"(%x) : (vector<4 x f32>) -> ()

// Using the type alias.
"foo"(%x) : (!avx_m128) -> ()
748```
749
750### Dialect Types
751
752Similarly to operations, dialects may define custom extensions to the type
753system.
754
755```
756dialect-namespace ::= bare-id
757
758opaque-dialect-item ::= dialect-namespace '<' string-literal '>'
759
760pretty-dialect-item ::= dialect-namespace '.' pretty-dialect-item-lead-ident
761                                              pretty-dialect-item-body?
762
763pretty-dialect-item-lead-ident ::= '[A-Za-z][A-Za-z0-9._]*'
764pretty-dialect-item-body ::= '<' pretty-dialect-item-contents+ '>'
765pretty-dialect-item-contents ::= pretty-dialect-item-body
766                              | '(' pretty-dialect-item-contents+ ')'
767                              | '[' pretty-dialect-item-contents+ ']'
768                              | '{' pretty-dialect-item-contents+ '}'
769                              | '[^[<({>\])}\0]+'
770
771dialect-type ::= '!' opaque-dialect-item
772dialect-type ::= '!' pretty-dialect-item
773```
774
775Dialect types can be specified in a verbose form, e.g. like this:
776
777```mlir
// LLVM type that wraps around LLVM IR types.
779!llvm<"i32*">
780
// TensorFlow string type.
!tf<"string">
783
784// Complex type
785!foo<"something<abcd>">
786
787// Even more complex type
788!foo<"something<a%%123^^^>>>">
789```
790
791Dialect types that are simple enough can use the pretty format, which is a
792lighter weight syntax that is equivalent to the above forms:
793
794```mlir
// TensorFlow string type.
796!tf.string
797
798// Complex type
799!foo.something<abcd>
800```
801
802Sufficiently complex dialect types are required to use the verbose form for
803generality. For example, the more complex type shown above wouldn't be valid in
804the lighter syntax: `!foo.something<a%%123^^^>>>` because it contains characters
805that are not allowed in the lighter syntax, as well as unbalanced `<>`
806characters.
807
808See [here](Tutorials/DefiningAttributesAndTypes.md) to learn how to define dialect types.
809
810### Builtin Types
811
812Builtin types are a core set of [dialect types](#dialect-types) that are defined
813in a builtin dialect and thus available to all users of MLIR.
814
815```
816builtin-type ::=      complex-type
817                    | float-type
818                    | function-type
819                    | index-type
820                    | integer-type
821                    | memref-type
822                    | none-type
823                    | tensor-type
824                    | tuple-type
825                    | vector-type
826```
827
828#### Complex Type
829
830Syntax:
831
832```
833complex-type ::= `complex` `<` type `>`
834```
835
836The value of `complex` type represents a complex number with a parameterized
837element type, which is composed of a real and imaginary value of that element
838type. The element must be a floating point or integer scalar type.
839
840Examples:
841
842```mlir
843complex<f32>
844complex<i32>
845```
846
847#### Floating Point Types
848
849Syntax:
850
851```
852// Floating point.
853float-type ::= `f16` | `bf16` | `f32` | `f64`
854```
855
MLIR supports floating point types of the widely used widths listed above.
858
859#### Function Type
860
861Syntax:
862
863```
864// MLIR functions can return multiple values.
865function-result-type ::= type-list-parens
866                       | non-function-type
867
868function-type ::= type-list-parens `->` function-result-type
869```
870
871MLIR supports first-class functions: for example, the
872[`constant` operation](Dialects/Standard.md#stdconstant-constantop) produces the
873address of a function as a value. This value may be passed to and
874returned from functions, merged across control flow boundaries with
875[block arguments](#blocks), and called with the
876[`call_indirect` operation](Dialects/Standard.md#call-indirect-operation).
877
878Function types are also used to indicate the arguments and results of
879[operations](#operations).
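
Examples:

The following sketch illustrates the grammar above. It is illustrative only:
the symbol `@callee` and the values `%f`, `%arg` and `%r` are hypothetical
names, and the enclosing function that would define `%arg` is omitted.

```mlir
// No arguments and no results.
() -> ()

// Two arguments and a single result; the parentheses around a single
// non-function result type are optional.
(i32, f32) -> i64

// Multiple results must be parenthesized.
(memref<?xf32>) -> (i32, index)

// A function type used as a result must be parenthesized as well.
(i1) -> ((i32) -> i32)

// Taking the address of a (hypothetical) function @callee and calling it
// indirectly.
%f = constant @callee : (i32) -> i32
%r = call_indirect %f(%arg) : (i32) -> i32
```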
880
881#### Index Type
882
883Syntax:
884
885```
886// Target word-sized integer.
887index-type ::= `index`
888```
889
890The `index` type is a signless integer whose size is equal to the natural
891machine word of the target
892([rationale](Rationale/Rationale.md#integer-signedness-semantics)) and is used
by the affine constructs in MLIR. Unlike fixed-size integers, it cannot be used
as the element type of a vector
895([rationale](Rationale/Rationale.md#index-type-disallowed-in-vector-types)).
896
897**Rationale:** integers of platform-specific bit widths are practical to express
898sizes, dimensionalities and subscripts.
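
Example:

A minimal sketch of `index` used as a subscript, assuming a hypothetical value
`%A` of type `memref<16x32xf32>` defined elsewhere:

```mlir
// An `index` constant, used here to subscript a memref.
%c0 = constant 0 : index
%v = load %A[%c0, %c0] : memref<16x32xf32>

// Not valid: `index` cannot be the element type of a vector.
//   vector<4 x index>
```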
899
900#### Integer Type
901
902Syntax:
903
904```
905// Sized integers like i1, i4, i8, i16, i32.
906signed-integer-type ::= `si` [1-9][0-9]*
907unsigned-integer-type ::= `ui` [1-9][0-9]*
908signless-integer-type ::= `i` [1-9][0-9]*
909integer-type ::= signed-integer-type |
910                 unsigned-integer-type |
911                 signless-integer-type
912```
913
914MLIR supports arbitrary precision integer types. Integer types have a designated
915width and may have signedness semantics.
916
917**Rationale:** low precision integers (like `i2`, `i4` etc) are useful for
918low-precision inference chips, and arbitrary precision integers are useful for
919hardware synthesis (where a 13 bit multiplier is a lot cheaper/smaller than a 16
920bit one).
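
Examples:

The types below are a small illustrative selection; any positive width that
matches the grammar above is written the same way.

```mlir
// Signless integers of various widths.
i1
i32
i64

// Integers with explicit signedness semantics.
si8
ui16

// Arbitrary widths are allowed, e.g. for hardware synthesis.
i13
```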
921
922TODO: Need to decide on a representation for quantized integers
923([initial thoughts](Rationale/Rationale.md#quantized-integer-operations)).
924
925#### Memref Type
926
927Syntax:
928
929```
930memref-type ::= ranked-memref-type | unranked-memref-type
931
932ranked-memref-type ::= `memref` `<` dimension-list-ranked tensor-memref-element-type
933                      (`,` layout-specification)? (`,` memory-space)? `>`
934
935unranked-memref-type ::= `memref` `<*x` tensor-memref-element-type
936                         (`,` memory-space)? `>`
937
938stride-list ::= `[` (dimension (`,` dimension)*)? `]`
939strided-layout ::= `offset:` dimension `,` `strides: ` stride-list
940layout-specification ::= semi-affine-map | strided-layout
941memory-space ::= integer-literal /* | TODO: address-space-id */
942```
943
944A `memref` type is a reference to a region of memory (similar to a buffer
945pointer, but more powerful). The buffer pointed to by a memref can be allocated,
946aliased and deallocated. A memref can be used to read and write data from/to the
947memory region which it references. Memref types use the same shape specifier as
948tensor types. Note that `memref<f32>`, `memref<0 x f32>`, `memref<1 x 0 x f32>`,
949and `memref<0 x 1 x f32>` are all different types.
950
951A `memref` is allowed to have an unknown rank (e.g. `memref<*xf32>`). The
952purpose of unranked memrefs is to allow external library functions to receive
953memref arguments of any rank without versioning the functions based on the rank.
954Other uses of this type are disallowed or will have undefined behavior.
955
956##### Codegen of Unranked Memref
957
Using unranked memrefs in codegen, other than in the case mentioned above, is
highly discouraged. Codegen is concerned with generating loop nests and
specialized instructions for high performance, whereas an unranked memref hides
the rank and thus the number of enclosing loops required to iterate over the
data. However, if there is a need to generate code for unranked memrefs, one
possible path is to cast into a static ranked type based on the dynamic rank.
Another possible path is to emit a single while loop conditioned on a linear
index and to delinearize that linear index into a dynamic array containing the
(unranked) indices. While this is possible, doing so during codegen is not
expected to be a good idea: the cost of the translations is expected to be
prohibitive, and optimizations at this level are not expected to be worthwhile.
If expressiveness is the main concern, irrespective of performance, passing
unranked memrefs to an external C++ library and implementing rank-agnostic logic
there is expected to be significantly simpler.
972
Unranked memrefs may provide expressiveness gains in the future and help bridge
the gap with unranked tensors. Unranked memrefs are not expected to be exposed
to codegen, but one may query the rank of an unranked memref (a special op will
be needed for this purpose) and perform a switch and cast to a ranked memref as
a prerequisite to codegen.
978
979Example:
980
981```mlir
982// With static ranks, we need a function for each possible argument type
983%A = alloc() : memref<16x32xf32>
984%B = alloc() : memref<16x32x64xf32>
985call @helper_2D(%A) : (memref<16x32xf32>)->()
986call @helper_3D(%B) : (memref<16x32x64xf32>)->()
987
988// With unknown rank, the functions can be unified under one unranked type
989%A = alloc() : memref<16x32xf32>
990%B = alloc() : memref<16x32x64xf32>
991// Remove rank info
992%A_u = memref_cast %A : memref<16x32xf32> -> memref<*xf32>
993%B_u = memref_cast %B : memref<16x32x64xf32> -> memref<*xf32>
994// call same function with dynamic ranks
995call @helper(%A_u) : (memref<*xf32>)->()
996call @helper(%B_u) : (memref<*xf32>)->()
997```
998
999The core syntax and representation of a layout specification is a
1000[semi-affine map](Dialects/Affine.md#semi-affine-maps). Additionally, syntactic
1001sugar is supported to make certain layout specifications more intuitive to read.
1002For the moment, a `memref` supports parsing a strided form which is converted to
1003a semi-affine map automatically.
1004
1005The memory space of a memref is specified by a target-specific integer index. If
1006no memory space is specified, then the default memory space (0) is used. The
1007default space is target specific but always at index 0.
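
Examples:

The sketch below shows where the memory space appears in a memref type. The
space numbers `1` and `2` carry no meaning here beyond illustration, and the
alias `#col_major_2d` is a hypothetical name.

```mlir
// Default memory space (0), left implicit.
memref<1024xf32>

// Memory space 2; its meaning is target specific.
memref<1024xf32, 2>

// A layout map followed by a memory space.
#col_major_2d = affine_map<(d0, d1) -> (d1, d0)>
memref<16x32xf32, #col_major_2d, 1>
```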
1008
1009TODO: MLIR will eventually have target-dialects which allow symbolic use of
1010memory hierarchy names (e.g. L3, L2, L1, ...) but we have not spec'd the details
1011of that mechanism yet. Until then, this document pretends that it is valid to
1012refer to these memories by `bare-id`.
1013
1014The notionally dynamic value of a memref value includes the address of the
1015buffer allocated, as well as the symbols referred to by the shape, layout map,
1016and index maps.
1017
Examples of memref static types:
1019
1020```mlir
1021// Identity index/layout map
1022#identity = affine_map<(d0, d1) -> (d0, d1)>
1023
1024// Column major layout.
1025#col_major = affine_map<(d0, d1, d2) -> (d2, d1, d0)>
1026
1027// A 2-d tiled layout with tiles of size 128 x 256.
#tiled_2d_128x256 = affine_map<(d0, d1) -> (d0 floordiv 128, d1 floordiv 256, d0 mod 128, d1 mod 256)>
1029
1030// A tiled data layout with non-constant tile sizes.
1031#tiled_dynamic = affine_map<(d0, d1)[s0, s1] -> (d0 floordiv s0, d1 floordiv s1,
1032                             d0 mod s0, d1 mod s1)>
1033
// A layout that yields a padding of two at either end of the minor dimension.
1035#padded = affine_map<(d0, d1) -> (d0, (d1 + 2) floordiv 2, (d1 + 2) mod 2)>
1036
1037
1038// The dimension list "16x32" defines the following 2D index space:
1039//
1040//   { (i, j) : 0 <= i < 16, 0 <= j < 32 }
1041//
1042memref<16x32xf32, #identity>
1043
1044// The dimension list "16x4x?" defines the following 3D index space:
1045//
1046//   { (i, j, k) : 0 <= i < 16, 0 <= j < 4, 0 <= k < N }
1047//
1048// where N is a symbol which represents the runtime value of the size of
1049// the third dimension.
1050//
1051// %N here binds to the size of the third dimension.
1052%A = alloc(%N) : memref<16x4x?xf32, #col_major>
1053
1054// A 2-d dynamic shaped memref that also has a dynamically sized tiled layout.
1055// The memref index space is of size %M x %N, while %B1 and %B2 bind to the
1056// symbols s0, s1 respectively of the layout map #tiled_dynamic. Data tiles of
1057// size %B1 x %B2 in the logical space will be stored contiguously in memory.
1058// The allocation size will be (%M ceildiv %B1) * %B1 * (%N ceildiv %B2) * %B2
1059// f32 elements.
1060%T = alloc(%M, %N) [%B1, %B2] : memref<?x?xf32, #tiled_dynamic>
1061
1062// A memref that has a two-element padding at either end. The allocation size
1063// will fit 16 * 64 float elements of data.
1064%P = alloc() : memref<16x64xf32, #padded>
1065
1066// Affine map with symbol 's0' used as offset for the first dimension.
1067#imapS = affine_map<(d0, d1) [s0] -> (d0 + s0, d1)>
1068// Allocate memref and bind the following symbols:
1069// '%n' is bound to the dynamic second dimension of the memref type.
1070// '%o' is bound to the symbol 's0' in the affine map of the memref type.
1071%n = ...
1072%o = ...
%A = alloc(%n)[%o] : memref<16x?xf32, #imapS>
1074```
1075
1076##### Index Space
1077
1078A memref dimension list defines an index space within which the memref can be
1079indexed to access data.
1080
1081##### Index
1082
1083Data is accessed through a memref type using a multidimensional index into the
1084multidimensional index space defined by the memref's dimension list.
1085
1086Examples
1087
1088```mlir
1089// Allocates a memref with 2D index space:
1090//   { (i, j) : 0 <= i < 16, 0 <= j < 32 }
1091%A = alloc() : memref<16x32xf32, #imapA>
1092
1093// Loads data from memref '%A' using a 2D index: (%i, %j)
1094%v = load %A[%i, %j] : memref<16x32xf32, #imapA>
1095```
1096
1097##### Index Map
1098
1099An index map is a one-to-one
1100[semi-affine map](Dialects/Affine.md#semi-affine-maps) that transforms a
1101multidimensional index from one index space to another. For example, the
1102following figure shows an index map which maps a 2-dimensional index from a 2x2
1103index space to a 3x3 index space, using symbols `S0` and `S1` as offsets.
1104
1105![Index Map Example](/includes/img/index-map.svg)
1106
1107The number of domain dimensions and range dimensions of an index map can be
1108different, but must match the number of dimensions of the input and output index
1109spaces on which the map operates. The index space is always non-negative and
1110integral. In addition, an index map must specify the size of each of its range
1111dimensions onto which it maps. Index map symbols must be listed in order with
1112symbols for dynamic dimension sizes first, followed by other required symbols.
1113
1114##### Layout Map
1115
1116A layout map is a [semi-affine map](Dialects/Affine.md#semi-affine-maps) which
1117encodes logical to physical index space mapping, by mapping input dimensions to
1118their ordering from most-major (slowest varying) to most-minor (fastest
1119varying). Therefore, an identity layout map corresponds to a row-major layout.
1120Identity layout maps do not contribute to the MemRef type identification and are
discarded on construction. That is, a type with an explicit identity map,
`memref<?x?xf32, (i,j)->(i,j)>`, is strictly the same as the one without a layout
map, `memref<?x?xf32>`.
1124
1125Layout map examples:
1126
1127```mlir
// MxN matrix stored in row major layout in memory:
#layout_map_row_major = affine_map<(i, j) -> (i, j)>

// MxN matrix stored in column major layout in memory:
#layout_map_col_major = affine_map<(i, j) -> (j, i)>

// MxN matrix stored in a 2-d blocked/tiled layout with 64x64 tiles.
#layout_tiled = affine_map<(i, j) -> (i floordiv 64, j floordiv 64, i mod 64, j mod 64)>
1136```
1137
1138##### Affine Map Composition
1139
1140A memref specifies a semi-affine map composition as part of its type. A
1141semi-affine map composition is a composition of semi-affine maps beginning with
1142zero or more index maps, and ending with a layout map. The composition must be
conformant: the number of dimensions of the range of one map must match the
number of dimensions of the domain of the next map in the composition.

The semi-affine map composition specified in the memref type maps from accesses
1147used to index the memref in load/store operations to other index spaces (i.e.
1148logical to physical index mapping). Each of the
1149[semi-affine maps](Dialects/Affine.md) and thus its composition is required to
1150be one-to-one.
1151
1152The semi-affine map composition can be used in dependence analysis, memory
1153access pattern analysis, and for performance optimizations like vectorization,
1154copy elision and in-place updates. If an affine map composition is not specified
1155for the memref, the identity affine map is assumed.
1156
1157##### Strided MemRef
1158
1159A memref may specify strides as part of its type. A stride specification is a
1160list of integer values that are either static or `?` (dynamic case). Strides
1161encode the distance, in number of elements, in (linear) memory between
1162successive entries along a particular dimension. A stride specification is
1163syntactic sugar for an equivalent strided memref representation using
1164semi-affine maps. For example, `memref<42x16xf32, offset: 33, strides: [1, 64]>`
1165specifies a non-contiguous memory region of `42` by `16` `f32` elements such
1166that:
1167
1.  the minimal size of the enclosing memory region must be `33 + (42 - 1) * 1 +
    (16 - 1) * 64 + 1 = 1035` elements, since the last element `(41, 15)` lies
    at linear offset `1034`;
11702.  the address calculation for accessing element `(i, j)` computes `33 + i +
1171    64 * j`
3.  the distance between two consecutive elements along the outer dimension is
    `1` element and the distance between two consecutive elements along the
    inner dimension is `64` elements.
1175
1176This corresponds to a column major view of the memory region and is internally
1177represented as the type `memref<42x16xf32, (i, j) -> (33 + i + 64 * j)>`.
1178
1179The specification of strides must not alias: given an n-D strided memref,
1180indices `(i1, ..., in)` and `(j1, ..., jn)` may not refer to the same memory
1181address unless `i1 == j1, ..., in == jn`.
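
As a small worked illustration of this requirement, consider two hypothetical
`4x4` strided types. The sketch only shows the arithmetic behind the rule; it
makes no claim about whether or where a violation is diagnosed.

```mlir
// Violates the requirement: with strides [1, 1], indices (0, 1) and (1, 0)
// both map to linear offset 0 + 0 * 1 + 1 * 1 = 1.
memref<4x4xf32, offset: 0, strides: [1, 1]>

// Satisfies the requirement: with strides [4, 1], index (i, j) maps to
// 4 * i + j, which is distinct for every 0 <= i, j < 4.
memref<4x4xf32, offset: 0, strides: [4, 1]>
```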
1182
1183Strided memrefs represent a view abstraction over preallocated data. They are
1184constructed with special ops, yet to be introduced. Strided memrefs are a
1185special subclass of memrefs with generic semi-affine map and correspond to a
1186normalized memref descriptor when lowering to LLVM.
1187
1188#### None Type
1189
1190Syntax:
1191
1192```
1193none-type ::= `none`
1194```
1195
1196The `none` type is a unit type, i.e. a type with exactly one possible value,
1197where its value does not have a defined dynamic representation.
1198
1199#### Tensor Type
1200
1201Syntax:
1202
1203```
1204tensor-type ::= `tensor` `<` dimension-list tensor-memref-element-type `>`
1205tensor-memref-element-type ::= vector-element-type | vector-type | complex-type
1206
1207// memref requires a known rank, but tensor does not.
1208dimension-list ::= dimension-list-ranked | (`*` `x`)
1209dimension-list-ranked ::= (dimension `x`)*
1210dimension ::= `?` | decimal-literal
1211```
1212
Values with tensor type represent aggregate N-dimensional data values, and
have a known element type. A tensor may have an unknown rank (indicated by `*`) or may
1215have a fixed rank with a list of dimensions. Each dimension may be a static
1216non-negative decimal constant or be dynamically determined (indicated by `?`).
1217
1218The runtime representation of the MLIR tensor type is intentionally abstracted -
1219you cannot control layout or get a pointer to the data. For low level buffer
1220access, MLIR has a [`memref` type](#memref-type). This abstracted runtime
1221representation holds both the tensor data values as well as information about
1222the (potentially dynamic) shape of the tensor. The
1223[`dim` operation](Dialects/Standard.md#dim-operation) returns the size of a
1224dimension from a value of tensor type.
1225
1226Note: hexadecimal integer literals are not allowed in tensor type declarations
1227to avoid confusion between `0xf32` and `0 x f32`. Zero sizes are allowed in
1228tensors and treated as other sizes, e.g., `tensor<0 x 1 x i32>` and `tensor<1 x
12290 x i32>` are different types. Since zero sizes are not allowed in some other
1230types, such tensors should be optimized away before lowering tensors to vectors.
1231
1232Examples:
1233
1234```mlir
1235// Tensor with unknown rank.
1236tensor<* x f32>
1237
1238// Known rank but unknown dimensions.
1239tensor<? x ? x ? x ? x f32>
1240
1241// Partially known dimensions.
1242tensor<? x ? x 13 x ? x f32>
1243
1244// Full static shape.
1245tensor<17 x 4 x 13 x 4 x f32>
1246
1247// Tensor with rank zero. Represents a scalar.
1248tensor<f32>
1249
1250// Zero-element dimensions are allowed.
1251tensor<0 x 42 x f32>
1252
1253// Zero-element tensor of f32 type (hexadecimal literals not allowed here).
1254tensor<0xf32>
1255```
1256
1257#### Tuple Type
1258
1259Syntax:
1260
1261```
1262tuple-type ::= `tuple` `<` (type ( `,` type)*)? `>`
1263```
1264
1265The value of `tuple` type represents a fixed-size collection of elements, where
1266each element may be of a different type.
1267
1268**Rationale:** Though this type is first class in the type system, MLIR provides
1269no standard operations for operating on `tuple` types
1270([rationale](Rationale/Rationale.md#tuple-types)).
1271
1272Examples:
1273
1274```mlir
1275// Empty tuple.
1276tuple<>
1277
1278// Single element
1279tuple<f32>
1280
1281// Many elements.
1282tuple<i32, f32, tensor<i1>, i5>
1283```
1284
1285#### Vector Type
1286
1287Syntax:
1288
1289```
1290vector-type ::= `vector` `<` static-dimension-list vector-element-type `>`
1291vector-element-type ::= float-type | integer-type
1292
1293static-dimension-list ::= (decimal-literal `x`)+
1294```
1295
1296The vector type represents a SIMD style vector, used by target-specific
operation sets like AVX. While the most common use is for 1D vectors (e.g.
`vector<16 x f32>`) we also support multidimensional registers on targets that
1299support them (like TPUs).
1300
1301Vector shapes must be positive decimal integers.
1302
Note: hexadecimal integer literals are not allowed in vector type declarations;
1304`vector<0x42xi32>` is invalid because it is interpreted as a 2D vector with
1305shape `(0, 42)` and zero shapes are not allowed.
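
Examples:

The types below are an illustrative sketch of the grammar above; the rejected
forms are listed in comments only.

```mlir
// 1-D vector of 16 f32 elements.
vector<16 x f32>

// 2-D vector of 4 x 8 i32 elements.
vector<4 x 8 x i32>

// Not valid vector types under the grammar above:
//   vector<? x f32>   (dynamic dimensions are not allowed)
//   vector<f32>       (at least one dimension is required)
//   vector<0 x f32>   (zero-sized dimensions are not allowed)
```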
1306
1307## Attributes
1308
1309Syntax:
1310
1311```
1312attribute-dict ::= `{` `}`
1313                 | `{` attribute-entry (`,` attribute-entry)* `}`
1314attribute-entry ::= dialect-attribute-entry | dependent-attribute-entry
1315dialect-attribute-entry ::= dialect-namespace `.` bare-id `=` attribute-value
1316dependent-attribute-entry ::= dependent-attribute-name `=` attribute-value
1317dependent-attribute-name ::= ((letter|[_]) (letter|digit|[_$])*)
1318                           | string-literal
1319```
1320
1321Attributes are the mechanism for specifying constant data on operations in
1322places where a variable is never allowed - e.g. the index of a
1323[`dim` operation](Dialects/Standard.md#stddim-dimop), or the stride of a
1324convolution. They consist of a name and a concrete attribute value. The set of
1325expected attributes, their structure, and their interpretation are all
1326contextually dependent on what they are attached to.
1327
1328There are two main classes of attributes: dependent and dialect. Dependent
1329attributes derive their structure and meaning from what they are attached to;
1330e.g., the meaning of the `index` attribute on a `dim` operation is defined by
1331the `dim` operation. Dialect attributes, on the other hand, derive their context
1332and meaning from a specific dialect. An example of a dialect attribute may be a
1333`swift.self` function argument attribute that indicates an argument is the
1334self/context parameter. The context of this attribute is defined by the `swift`
1335dialect and not the function argument.
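
The distinction can be sketched on a single operation in generic form. The
operation `foo.conv`, the dialect `bar`, and both attribute names are
hypothetical and chosen only for illustration.

```mlir
// `stride` is a dependent attribute: its meaning is defined by the
// (hypothetical) operation "foo.conv" it is attached to.
// `bar.fast_path` is a dialect attribute: its meaning is defined by the
// (hypothetical) dialect `bar`, regardless of the operation carrying it.
"foo.conv"(%input) {stride = 2 : i64, bar.fast_path = true} : (tensor<16xf32>) -> tensor<8xf32>
```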
1336
1337Attribute values are represented by the following forms:
1338
1339```
1340attribute-value ::= attribute-alias | dialect-attribute | builtin-attribute
1341```
1342
1343### Attribute Value Aliases
1344
1345```
attribute-alias-def ::= '#' alias-name '=' attribute-value
1347attribute-alias ::= '#' alias-name
1348```
1349
1350MLIR supports defining named aliases for attribute values. An attribute alias is
1351an identifier that can be used in the place of the attribute that it defines.
1352These aliases *must* be defined before their uses. Alias names may not contain a
1353'.', since those names are reserved for
1354[dialect attributes](#dialect-attribute-values).
1355
1356Example:
1357
1358```mlir
1359#map = affine_map<(d0) -> (d0 + 10)>
1360
1361// Using the original attribute.
1362%b = affine.apply affine_map<(d0) -> (d0 + 10)> (%a)
1363
1364// Using the attribute alias.
1365%b = affine.apply #map(%a)
1366```
1367
1368### Dialect Attribute Values
1369
1370Similarly to operations, dialects may define custom attribute values. The
1371syntactic structure of these values is identical to custom dialect type values,
1372except that dialect attribute values are distinguished with a leading '#', while
1373dialect types are distinguished with a leading '!'.
1374
1375```
dialect-attribute ::= '#' opaque-dialect-item
dialect-attribute ::= '#' pretty-dialect-item
1378```
1379
1380Dialect attribute values can be specified in a verbose form, e.g. like this:
1381
1382```mlir
1383// Complex attribute value.
1384#foo<"something<abcd>">
1385
1386// Even more complex attribute value.
1387#foo<"something<a%%123^^^>>>">
1388```
1389
1390Dialect attribute values that are simple enough can use the pretty format, which
1391is a lighter weight syntax that is equivalent to the above forms:
1392
1393```mlir
1394// Complex attribute
1395#foo.something<abcd>
1396```
1397
Sufficiently complex dialect attribute values are required to use the verbose
form for generality. For example, the more complex attribute value shown above
would not be valid in the lighter syntax: `#foo.something<a%%123^^^>>>` because
it contains characters that are not allowed in the lighter syntax, as well as
unbalanced `<>` characters.
1403
See [here](Tutorials/DefiningAttributesAndTypes.md) to learn how to define
dialect attribute values.
1406
1407### Builtin Attribute Values
1408
1409Builtin attributes are a core set of
1410[dialect attributes](#dialect-attribute-values) that are defined in a builtin
1411dialect and thus available to all users of MLIR.
1412
1413```
1414builtin-attribute ::=    affine-map-attribute
1415                       | array-attribute
1416                       | bool-attribute
1417                       | dictionary-attribute
1418                       | elements-attribute
1419                       | float-attribute
1420                       | integer-attribute
1421                       | integer-set-attribute
1422                       | string-attribute
1423                       | symbol-ref-attribute
1424                       | type-attribute
1425                       | unit-attribute
1426```
1427
1428#### AffineMap Attribute
1429
1430Syntax:
1431
1432```
1433affine-map-attribute ::= `affine_map` `<` affine-map `>`
1434```
1435
1436An affine-map attribute is an attribute that represents an affine-map object.
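
Examples:

A brief sketch of the syntax; the operation `foo.op` and the attribute name
`map` are hypothetical.

```mlir
// A permutation map given inline as an attribute value.
affine_map<(d0, d1) -> (d1, d0)>

// A map with one dimension and one symbol.
affine_map<(d0)[s0] -> (d0 + s0)>

// Attached to a (hypothetical) operation under the assumed name `map`.
"foo.op"() {map = affine_map<(d0) -> (d0 * 2)>} : () -> ()
```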
1437
1438#### Array Attribute
1439
1440Syntax:
1441
1442```
1443array-attribute ::= `[` (attribute-value (`,` attribute-value)*)? `]`
1444```
1445
1446An array attribute is an attribute that represents a collection of attribute
1447values.
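
Examples:

The values below are an illustrative sketch of the grammar above.

```mlir
// Empty array.
[]

// Array of integer attributes.
[1 : i32, 2 : i32, 3 : i32]

// Elements may be of different kinds, including nested arrays.
[42, "a string", 1.5 : f32, [true, false]]
```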
1448
1449#### Boolean Attribute
1450
1451Syntax:
1452
1453```
1454bool-attribute ::= bool-literal
1455```
1456
1457A boolean attribute is a literal attribute that represents a one-bit boolean
1458value, true or false.
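
Examples:

A brief sketch; the operation `foo.op` and the attribute name `enabled` are
hypothetical.

```mlir
true
false

// Attached to a (hypothetical) operation under the assumed name `enabled`.
"foo.op"() {enabled = true} : () -> ()
```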
1459
1460#### Dictionary Attribute
1461
1462Syntax:
1463
1464```
1465dictionary-attribute ::= `{` (attribute-entry (`,` attribute-entry)*)? `}`
1466```
1467
1468A dictionary attribute is an attribute that represents a sorted collection of
1469named attribute values. The elements are sorted by name, and each name must be
1470unique within the collection.
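
Examples:

The entry names below (`count`, `label`, `outer`, `inner`) are hypothetical and
serve only to illustrate the syntax.

```mlir
// Empty dictionary.
{}

// Entries are named; each name appears at most once.
{count = 3 : i32, label = "tile"}

// Dictionaries may nest, since a dictionary is itself an attribute value.
{outer = {inner = 1 : i64}}
```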
1471
1472#### Elements Attributes
1473
1474Syntax:
1475
1476```
1477elements-attribute ::= dense-elements-attribute
1478                     | opaque-elements-attribute
1479                     | sparse-elements-attribute
1480```
1481
1482An elements attribute is a literal attribute that represents a constant
1483[vector](#vector-type) or [tensor](#tensor-type) value.
1484
1485##### Dense Elements Attribute
1486
1487Syntax:
1488
1489```
1490dense-elements-attribute ::= `dense` `<` attribute-value `>` `:`
1491                             ( tensor-type | vector-type )
1492```
1493
1494A dense elements attribute is an elements attribute where the storage for the
1495constant vector or tensor value has been densely packed. The attribute supports
1496storing integer or floating point elements, with integer/index/floating element
types. It also supports storing string elements with a custom dialect string
1498element type.
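
Examples:

The values below are an illustrative sketch. The nested list mirrors the shape
of the type that follows the colon.

```mlir
// A 2x2 tensor of i32 elements.
dense<[[1, 2], [3, 4]]> : tensor<2x2xi32>

// A vector of four f32 elements.
dense<[1.0, 2.0, 3.0, 4.0]> : vector<4xf32>

// A single value is splatted across every element.
dense<0.0> : tensor<8x8xf32>
```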
1499
1500##### Opaque Elements Attribute
1501
1502Syntax:
1503
1504```
1505opaque-elements-attribute ::= `opaque` `<` dialect-namespace  `,`
1506                              hex-string-literal `>` `:`
1507                              ( tensor-type | vector-type )
1508```
1509
1510An opaque elements attribute is an elements attribute where the content of the
1511value is opaque. The representation of the constant stored by this elements
1512attribute is only understood, and thus decodable, by the dialect that created
1513it.
1514
1515Note: The parsed string literal must be in hexadecimal form.
1516
1517##### Sparse Elements Attribute
1518
1519Syntax:
1520
1521```
1522sparse-elements-attribute ::= `sparse` `<` attribute-value `,` attribute-value
1523                              `>` `:` ( tensor-type | vector-type )
1524```
1525
A sparse elements attribute is an elements attribute that represents a sparse
vector or tensor object, i.e. one in which very few of the elements are non-zero.
1528
1529The attribute uses COO (coordinate list) encoding to represent the sparse
1530elements of the elements attribute. The indices are stored via a 2-D tensor of
64-bit integer elements with shape [N, ndims], which specifies the indices of
the elements in the sparse tensor that contain non-zero values. The element
1533values are stored via a 1-D tensor with shape [N], that supplies the
1534corresponding values for the indices.
1535
1536Example:
1537
1538```mlir
1539  sparse<[[0, 0], [1, 2]], [1, 5]> : tensor<3x4xi32>
1540
1541// This represents the following tensor:
1542///  [[1, 0, 0, 0],
1543///   [0, 0, 5, 0],
1544///   [0, 0, 0, 0]]
1545```
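
To make the COO encoding concrete, the following Python sketch expands an
indices/values pair into a dense nested list. The helper name `coo_to_dense` is
hypothetical and is not part of MLIR; it only illustrates the encoding described
above, assuming a shape of rank one or higher and zero as the implicit value of
every element that is not listed.

```python
from typing import Sequence


def coo_to_dense(
    indices: Sequence[Sequence[int]],
    values: Sequence[int],
    shape: Sequence[int],
) -> list:
    """Expand COO (indices, values) into a nested list of the given shape."""
    if len(indices) != len(values):
        raise ValueError("indices and values must both have N entries")

    def zeros(dims):
        # Build an all-zero nested list, one nesting level per dimension.
        if len(dims) == 1:
            return [0] * dims[0]
        return [zeros(dims[1:]) for _ in range(dims[0])]

    dense = zeros(list(shape))
    for index, value in zip(indices, values):
        if len(index) != len(shape):
            raise ValueError("each index must have ndims coordinates")
        # Walk down to the innermost list, then store the value.
        cell = dense
        for coord in index[:-1]:
            cell = cell[coord]
        cell[index[-1]] = value
    return dense


# The example above: indices of shape [2, 2], values of shape [2].
dense = coo_to_dense([[0, 0], [1, 2]], [1, 5], [3, 4])
for row in dense:
    print(row)
```

Applied to the example above, the sketch places 1 at index [0, 0] and 5 at
index [1, 2], and leaves every other element of the 3x4 tensor at zero.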

#### Float Attribute

Syntax:

```
float-attribute ::= (float-literal (`:` float-type)?)
                  | (hexadecimal-literal `:` float-type)
```

A float attribute is a literal attribute that represents a floating point value
of the specified [float type](#floating-point-types). It can also be written in
hexadecimal form, in which case the hexadecimal value is interpreted as the bits
of the underlying binary representation. This form is useful for representing
infinity and NaN floating point values. To avoid confusion with integer
attributes, hexadecimal literals _must_ be followed by a float type to define a
float attribute.

Examples:

```
42.0         // float attribute defaults to f64 type
42.0 : f32   // float attribute of f32 type
0x7C00 : f16 // positive infinity
0x7CFF : f16 // NaN (one of the possible values)
42 : f32     // Error: expected integer type
```
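
As a rough illustration of how a hexadecimal literal is interpreted as bits, the
Python sketch below reinterprets a 16-bit pattern as a half-precision value. It
assumes `f16` follows the IEEE 754 half-precision layout, and `f16_from_bits`
is a hypothetical helper written for this example, not an MLIR API.

```python
import math
import struct


def f16_from_bits(bits: int) -> float:
    """Reinterpret a 16-bit pattern as an IEEE 754 half-precision value."""
    if not 0 <= bits <= 0xFFFF:
        raise ValueError("an f16 bit pattern must fit in 16 bits")
    # Pack as an unsigned 16-bit integer, then unpack the same bytes as a
    # half-precision float (struct format 'e').
    return struct.unpack("<e", struct.pack("<H", bits))[0]


# The two hexadecimal examples above.
print(math.isinf(f16_from_bits(0x7C00)))  # all exponent bits set, zero mantissa
print(math.isnan(f16_from_bits(0x7CFF)))  # all exponent bits set, non-zero mantissa
```

Any pattern with all exponent bits set and a non-zero mantissa decodes to NaN,
which is why `0x7CFF` is only one of the possible NaN encodings.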

#### Integer Attribute

Syntax:

```
integer-attribute ::= integer-literal ( `:` (index-type | integer-type) )?
```

An integer attribute is a literal attribute that represents an integral value of
the specified integer or index type. The default type for this attribute, if one
is not specified, is a 64-bit integer.
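
The defaulting rule can be sketched with a small Python parser. This is only an
illustration, not MLIR's parser: `parse_integer_attribute` and the regular
expression are hypothetical, the set of accepted type names (`index`, `i32`,
`si8`, `ui16`, and similar) is an assumption, and negative literals are not
handled.

```python
import re

# Hypothetical pattern for this sketch: a decimal or hexadecimal literal,
# optionally followed by `: <type>`, where the type is `index` or an
# integer type name such as `i32`.
_INTEGER_ATTR = re.compile(
    r"\s*(0x[0-9a-fA-F]+|[0-9]+)\s*(?::\s*(index|[su]?i[1-9][0-9]*))?\s*"
)


def parse_integer_attribute(text: str) -> tuple:
    """Return (value, type_name), defaulting the type to i64 when omitted."""
    match = _INTEGER_ATTR.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer attribute: {text!r}")
    literal, type_name = match.groups()
    value = int(literal, 16) if literal.startswith("0x") else int(literal)
    # No explicit type means a 64-bit integer.
    return value, type_name or "i64"


print(parse_integer_attribute("42"))
print(parse_integer_attribute("10 : index"))
```

In this sketch, a literal followed by a float type such as `42 : f32` is
rejected, since only index and integer types may follow an integer literal.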

#### Integer Set Attribute

Syntax:

```
integer-set-attribute ::= `affine_set` `<` integer-set `>`
```

An integer-set attribute is an attribute that represents an integer-set object.

#### String Attribute

Syntax:

```
string-attribute ::= string-literal (`:` type)?
```

A string attribute is an attribute that represents a string literal value.

#### Symbol Reference Attribute

Syntax:

```
symbol-ref-attribute ::= symbol-ref-id (`::` symbol-ref-id)*
```

A symbol reference attribute is a literal attribute that represents a named
reference to an operation that is nested within an operation with the
`OpTrait::SymbolTable` trait. As such, this reference is given meaning by the
nearest parent operation containing the `OpTrait::SymbolTable` trait. It may
optionally contain a set of nested references that further resolve to a symbol
nested within a different symbol table.

This attribute can only be held internally by
[array attributes](#array-attribute) and
[dictionary attributes](#dictionary-attribute) (including the top-level
operation attribute dictionary); it cannot be held by other attribute kinds,
such as Locations or extended attribute kinds.

**Rationale:** Identifying accesses to global data is critical to
enabling efficient multi-threaded compilation. Restricting global
data access to occur through symbols and limiting the places that can
legally hold a symbol reference simplifies reasoning about these data
accesses.

See [`Symbols And SymbolTables`](SymbolsAndSymbolTables.md) for more
information.
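
The way nested references resolve through successive symbol tables can be
sketched in Python. The `Op` class and `resolve` function below are hypothetical
stand-ins invented for this illustration; they model a symbol table as a plain
dictionary and do not reflect MLIR's actual data structures.

```python
from typing import Dict, List, Optional


class Op:
    """Hypothetical stand-in for an operation that may hold a symbol table."""

    def __init__(self, name: str, symbols: Optional[Dict[str, "Op"]] = None):
        self.name = name
        # `None` means the operation is not a symbol table.
        self.symbols = symbols


def resolve(nearest_table: Op, path: List[str]) -> Op:
    """Resolve `@a::@b::...` starting from the nearest symbol table op."""
    current = nearest_table
    for symbol in path:
        if current.symbols is None:
            raise LookupError(f"@{current.name} is not a symbol table")
        if symbol not in current.symbols:
            raise LookupError(f"@{symbol} not found in @{current.name}")
        # Each nested reference is looked up in the previously resolved op.
        current = current.symbols[symbol]
    return current


inner = Op("callee")
nested = Op("nested_module", {"callee": inner})
top = Op("module", {"nested_module": nested, "main": Op("main")})

# Models a reference written as `@nested_module::@callee`.
print(resolve(top, ["nested_module", "callee"]).name)
```

The first component is looked up in the nearest enclosing symbol table, and each
further component is looked up in the symbol table of the operation found so
far, so a reference through an operation that is not a symbol table fails.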

#### Type Attribute

Syntax:

```
type-attribute ::= type
```

A type attribute is an attribute that represents a [type object](#type-system).

#### Unit Attribute

Syntax:

```
unit-attribute ::= `unit`
```

A unit attribute is an attribute that represents a value of `unit` type. The
`unit` type allows only one value, forming a singleton set. This attribute value
is used to represent attributes whose meaning comes solely from their presence.

One example of such an attribute could be the `swift.self` attribute. This
attribute indicates that a function parameter is the self/context parameter. It
could be represented as a [boolean attribute](#boolean-attribute) (true or
false), but a value of false adds no information: the parameter either is the
self/context or it isn't.

```mlir
// A unit attribute defined with the `unit` value specifier.
func @verbose_form(i1) attributes {dialectName.unitAttr = unit}

// A unit attribute can also be defined without the value specifier.
func @simple_form(i1) attributes {dialectName.unitAttr}
```
