1# Chapter 1: Toy Language and AST 2 3[TOC] 4 5## The Language 6 7This tutorial will be illustrated with a toy language that we’ll call “Toy” 8(naming is hard...). Toy is a tensor-based language that allows you to define 9functions, perform some math computation, and print results. 10 11Given that we want to keep things simple, the codegen will be limited to tensors 12of rank <= 2, and the only datatype in Toy is a 64-bit floating point type (aka 13‘double’ in C parlance). As such, all values are implicitly double precision, 14`Values` are immutable (i.e. every operation returns a newly allocated value), 15and deallocation is automatically managed. But enough with the long description; 16nothing is better than walking through an example to get a better understanding: 17 18```toy 19def main() { 20 # Define a variable `a` with shape <2, 3>, initialized with the literal value. 21 # The shape is inferred from the supplied literal. 22 var a = [[1, 2, 3], [4, 5, 6]]; 23 24 # b is identical to a, the literal tensor is implicitly reshaped: defining new 25 # variables is the way to reshape tensors (element count must match). 26 var b<2, 3> = [1, 2, 3, 4, 5, 6]; 27 28 # transpose() and print() are the only builtin, the following will transpose 29 # a and b and perform an element-wise multiplication before printing the result. 30 print(transpose(a) * transpose(b)); 31} 32``` 33 34Type checking is statically performed through type inference; the language only 35requires type declarations to specify tensor shapes when needed. Functions are 36generic: their parameters are unranked (in other words, we know these are 37tensors, but we don't know their dimensions). They are specialized for every 38newly discovered signature at call sites. Let's revisit the previous example by 39adding a user-defined function: 40 41```toy 42# User defined generic function that operates on unknown shaped arguments. 43def multiply_transpose(a, b) { 44 return transpose(a) * transpose(b); 45} 46 47def main() { 48 # Define a variable `a` with shape <2, 3>, initialized with the literal value. 49 var a = [[1, 2, 3], [4, 5, 6]]; 50 var b<2, 3> = [1, 2, 3, 4, 5, 6]; 51 52 # This call will specialize `multiply_transpose` with <2, 3> for both 53 # arguments and deduce a return type of <3, 2> in initialization of `c`. 54 var c = multiply_transpose(a, b); 55 56 # A second call to `multiply_transpose` with <2, 3> for both arguments will 57 # reuse the previously specialized and inferred version and return <3, 2>. 58 var d = multiply_transpose(b, a); 59 60 # A new call with <3, 2> (instead of <2, 3>) for both dimensions will 61 # trigger another specialization of `multiply_transpose`. 62 var e = multiply_transpose(c, d); 63 64 # Finally, calling into `multiply_transpose` with incompatible shape will 65 # trigger a shape inference error. 66 var f = multiply_transpose(transpose(a), c); 67} 68``` 69 70## The AST 71 72The AST from the above code is fairly straightforward; here is a dump of it: 73 74``` 75Module: 76 Function 77 Proto 'multiply_transpose' @test/Examples/Toy/Ch1/ast.toy:4:1' 78 Params: [a, b] 79 Block { 80 Return 81 BinOp: * @test/Examples/Toy/Ch1/ast.toy:5:25 82 Call 'transpose' [ @test/Examples/Toy/Ch1/ast.toy:5:10 83 var: a @test/Examples/Toy/Ch1/ast.toy:5:20 84 ] 85 Call 'transpose' [ @test/Examples/Toy/Ch1/ast.toy:5:25 86 var: b @test/Examples/Toy/Ch1/ast.toy:5:35 87 ] 88 } // Block 89 Function 90 Proto 'main' @test/Examples/Toy/Ch1/ast.toy:8:1' 91 Params: [] 92 Block { 93 VarDecl a<> @test/Examples/Toy/Ch1/ast.toy:11:3 94 Literal: <2, 3>[ <3>[ 1.000000e+00, 2.000000e+00, 3.000000e+00], <3>[ 4.000000e+00, 5.000000e+00, 6.000000e+00]] @test/Examples/Toy/Ch1/ast.toy:11:11 95 VarDecl b<2, 3> @test/Examples/Toy/Ch1/ast.toy:15:3 96 Literal: <6>[ 1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00] @test/Examples/Toy/Ch1/ast.toy:15:17 97 VarDecl c<> @test/Examples/Toy/Ch1/ast.toy:19:3 98 Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:19:11 99 var: a @test/Examples/Toy/Ch1/ast.toy:19:30 100 var: b @test/Examples/Toy/Ch1/ast.toy:19:33 101 ] 102 VarDecl d<> @test/Examples/Toy/Ch1/ast.toy:22:3 103 Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:22:11 104 var: b @test/Examples/Toy/Ch1/ast.toy:22:30 105 var: a @test/Examples/Toy/Ch1/ast.toy:22:33 106 ] 107 VarDecl e<> @test/Examples/Toy/Ch1/ast.toy:25:3 108 Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:25:11 109 var: b @test/Examples/Toy/Ch1/ast.toy:25:30 110 var: c @test/Examples/Toy/Ch1/ast.toy:25:33 111 ] 112 VarDecl f<> @test/Examples/Toy/Ch1/ast.toy:28:3 113 Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:28:11 114 Call 'transpose' [ @test/Examples/Toy/Ch1/ast.toy:28:30 115 var: a @test/Examples/Toy/Ch1/ast.toy:28:40 116 ] 117 var: c @test/Examples/Toy/Ch1/ast.toy:28:44 118 ] 119 } // Block 120``` 121 122You can reproduce this result and play with the example in the 123`examples/toy/Ch1/` directory; try running `path/to/BUILD/bin/toyc-ch1 124test/Examples/Toy/Ch1/ast.toy -emit=ast`. 125 126The code for the lexer is fairly straightforward; it is all in a single header: 127`examples/toy/Ch1/include/toy/Lexer.h`. The parser can be found in 128`examples/toy/Ch1/include/toy/Parser.h`; it is a recursive descent parser. If 129you are not familiar with such a Lexer/Parser, these are very similar to the 130LLVM Kaleidoscope equivalent that are detailed in the first two chapters of the 131[Kaleidoscope Tutorial](https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl02.html). 132 133The [next chapter](Ch-2.md) will demonstrate how to convert this AST into MLIR. 134