==============================
TableGen Language Introduction
==============================

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvm-dev.

Introduction
============

This document is not meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). For
the formal language specification, see :doc:`LangRef`.

TableGen syntax
===============

TableGen doesn't care about the meaning of data (that is up to the backend to
define), but it does care about syntax, and it enforces a simple type system.
This section describes the syntax and the constructs allowed in a TableGen file.

TableGen primitives
-------------------

TableGen comments
^^^^^^^^^^^^^^^^^

TableGen supports C++ style "``//``" comments, which run to the end of the
line, and it also supports **nestable** "``/* */``" comments.

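As a small illustration of both comment styles, here is a sketch in which the
record name ``Commented`` is made up for the example:

.. code-block:: text

  // A line comment runs to the end of the line.
  /* A block comment /* may contain a nested block comment */
     and only ends at the outermost terminator. */
  def Commented;
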
.. _TableGen type:

The TableGen type system
^^^^^^^^^^^^^^^^^^^^^^^^

TableGen files are strongly typed, in a simple (but complete) type system.
These types are used to perform automatic conversions, check for errors, and to
help interface designers constrain the input that they allow.  Every `value
definition`_ is required to have an associated type.

TableGen supports a mixture of very low-level types (such as ``bit``) and very
high-level types (such as ``dag``).  This flexibility is what allows it to
describe a wide range of information conveniently and compactly.  The TableGen
types are:

``bit``
    A 'bit' is a boolean value that can hold either 0 or 1.

``int``
    The 'int' type represents a simple 64-bit integer value, such as 5.

``string``
    The 'string' type represents an ordered sequence of characters of arbitrary
    length.

``code``
    The ``code`` type represents a code fragment, which can be a single-line or
    multi-line string literal.

``bits<n>``
    A 'bits' type is an arbitrary, but fixed, size integer that is broken up
    into individual bits.  This type is useful because it can handle some bits
    being defined while others are undefined.

``list<ty>``
    This type represents a list whose elements are some other type.  The
    contained type is arbitrary: it can even be another list type.

Class type
    Specifying a class name in a type context means that the defined value must
    be a subclass of the specified class.  This is useful in conjunction with
    the ``list`` type, for example, to constrain the elements of the list to a
    common base class (e.g., a ``list<Register>`` can only contain definitions
    derived from the "``Register``" class).

``dag``
    This type represents a nestable directed graph of elements.

To date, these types have been sufficient for describing things that TableGen
has been used for, but it is straightforward to extend this list if needed.

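The following sketch declares one value of each type listed above.  All of the
names in it (``Register``, ``R0``, ``R1``, ``op``, ``TypeDemo`` and the field
names) are invented for illustration and do not come from any real backend:

.. code-block:: text

  // Hypothetical helper records used as operands.
  class Register;
  def R0 : Register;
  def R1 : Register;
  def op;

  def TypeDemo {
    bit            Flag     = 1;
    int            Count    = 5;
    string         Name     = "demo";
    code           Body     = [{ return 0; }];
    bits<4>        Encoding = 0b1010;
    list<int>      Sizes    = [8, 16, 32];
    list<Register> Regs     = [R0, R1];   // elements must derive from Register
    Register       First    = R0;         // class type
    dag            Pattern  = (op R0:$a, R1:$b);
  }
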
.. _TableGen expressions:

TableGen values and expressions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TableGen supports a variety of expression forms for building up values.  These
forms allow the TableGen file to be written in a natural syntax and flavor for
the application.  The expression forms currently supported include:

``?``
    uninitialized field

``0b1001011``
    binary integer value.
    Note that this is sized by the number of bits given and will not be
    silently extended/truncated.

``7``
    decimal integer value

``0x7F``
    hexadecimal integer value

``"foo"``
    a single-line string value; it can be assigned to a ``string`` or ``code``
    variable.

``[{ ... }]``
    usually called a "code fragment", but is just a multiline string literal

``[ X, Y, Z ]<type>``
    list value.  <type> is the type of the list element and is usually optional.
    In rare cases, TableGen is unable to deduce the element type, in which case
    the user must specify it explicitly.

``{ a, b, 0b10 }``
    initializer for a "bits<4>" value.
    1 bit from "a", 1 bit from "b", 2 bits from 0b10.

``value``
    value reference

``value{17}``
    access to one bit of a value

``value{15-17}``
    access to an ordered sequence of bits of a value; in particular,
    ``value{15-17}`` produces an order that is the reverse of ``value{17-15}``.

``DEF``
    reference to a record definition

``CLASS<val list>``
    reference to a new anonymous definition of CLASS with the specified template
    arguments.

``X.Y``
    reference to the subfield of a value

``list[4-7,17,2-3]``
    A slice of the 'list' list, including elements 4,5,6,7,17,2, and 3 from it.
    Elements may be included multiple times.

``foreach <var> = [ <list> ] in { <body> }``

``foreach <var> = [ <list> ] in <def>``
    Replicate <body> or <def>, replacing instances of <var> with each value
    in <list>.  <var> is scoped at the level of the ``foreach`` loop and must
    not conflict with any other object introduced in <body> or <def>.  Only
    ``def``\s and ``defm``\s are expanded within <body>.

``foreach <var> = 0-15 in ...``

``foreach <var> = {0-15,32-47} in ...``
    Loop over ranges of integers. The braces are required for multiple ranges.

``(DEF a, b)``
    a dag value.  The first element is required to be a record definition; the
    remaining elements in the list may be arbitrary other values, including
    nested ``dag`` values.

``!con(a, b, ...)``
    Concatenate two or more DAG nodes. Their operators must be equal.

    Example: ``!con((op a1:$name1, a2:$name2), (op b1:$name3))`` results in
    the DAG node ``(op a1:$name1, a2:$name2, b1:$name3)``.

``!dag(op, children, names)``
    Generate a DAG node programmatically. 'children' and 'names' must be lists
    of equal length or unset ('?'). 'names' must be a 'list<string>'.

    Due to limitations of the type system, 'children' must be a list of items
    of a common type. In practice, this means that they should either have the
    same type or be records with a common superclass. Mixing dag and non-dag
    items is not possible. However, '?' can be used.

    Example: ``!dag(op, [a1, a2, ?], ["name1", "name2", "name3"])`` results in
    ``(op a1:$name1, a2:$name2, ?:$name3)``.

``!listconcat(a, b, ...)``
    A list value that is the result of concatenating the 'a' and 'b' lists.
    The lists must have the same element type.
    More than two arguments are accepted with the result being the concatenation
    of all the lists given.

``!strconcat(a, b, ...)``
    A string value that is the result of concatenating the 'a' and 'b' strings.
    More than two arguments are accepted with the result being the concatenation
    of all the strings given.

``str1#str2``
    "#" (paste) is a shorthand for !strconcat.  It may concatenate things that
    are not quoted strings, in which case an implicit !cast<string> is done on
    the operand of the paste.

``!cast<type>(a)``
    If 'a' is a string, the result is a record of type *type* obtained by
    looking up the string 'a' in the list of all records defined by the time
    that all template arguments in 'a' are fully resolved.

    For example, if !cast<type>(a) appears in a multiclass definition, or in a
    class instantiated inside of a multiclass definition, and 'a' does not
    reference any template arguments of the multiclass, then a record of name
    'a' must be instantiated earlier in the source file. If 'a' does reference
    a template argument, then the lookup is delayed until defm statements
    instantiating the multiclass (or later, if the defm occurs in another
    multiclass and template arguments of the inner multiclass that are
    referenced by 'a' are substituted by values that themselves contain
    references to template arguments of the outer multiclass).

    If the type of 'a' does not match *type*, TableGen aborts with an error.

    Otherwise, perform a normal type cast, e.g. between an int and a bit, or
    between record types. This allows casting a record to a subclass, though if
    the types do not match, constant folding will be inhibited. !cast<string>
    is a special case in that the argument can be an int or a record. In the
    latter case, the record's name is returned.

``!isa<type>(a)``
    Returns an integer: 1 if 'a' is dynamically of the given type, 0 otherwise.

``!subst(a, b, c)``
    If 'a' and 'b' are of string type or are symbol references, substitute 'b'
    for 'a' in 'c'.  This operation is analogous to $(subst) in GNU make.

``!foreach(a, b, c)``
    For each member of dag or list 'b' apply operator 'c'. 'a' is the name
    of a variable that will be substituted by members of 'b' in 'c'.
    This operation is analogous to $(foreach) in GNU make.

``!foldl(start, lst, a, b, expr)``
    Perform a left-fold over 'lst' with the given starting value. 'a' and 'b'
    are variable names which will be substituted in 'expr'. If you think of
    expr as a function f(a,b), the fold will compute
    'f(...f(f(start, lst[0]), lst[1]), ...), lst[n-1])' for a list of length n.
    As usual, 'a' will be of the type of 'start', and 'b' will be of the type
    of elements of 'lst'. These types need not be the same, but 'expr' must be
    of the same type as 'start'.

``!head(a)``
    The first element of list 'a'.

``!tail(a)``
    All elements of list 'a' except the first.

``!empty(a)``
    An integer {0,1} indicating whether list 'a' is empty.

``!size(a)``
    An integer indicating the number of elements in list 'a'.

``!if(a,b,c)``
    'b' if the result of 'int' or 'bit' operator 'a' is nonzero, 'c' otherwise.

``!eq(a,b)``
    'bit 1' if 'a' is equal to 'b', 0 otherwise.  This only operates on string,
    int and bit objects.  Use !cast<string> to compare other types of objects.

``!ne(a,b)``
    The negation of ``!eq(a,b)``.

``!le(a,b), !lt(a,b), !ge(a,b), !gt(a,b)``
    (Signed) comparison of integer values that returns bit 1 or 0 depending on
    the result of the comparison.

``!shl(a,b)`` ``!srl(a,b)`` ``!sra(a,b)``
    The usual shift operators. Operations are on 64-bit integers; the result
    is undefined for shift counts outside [0, 63].

``!add(a,b,...)`` ``!and(a,b,...)`` ``!or(a,b,...)``
    The usual arithmetic and bitwise operators.

Note that all of the values have rules specifying how they convert to values
for different types.  These rules allow you to assign a value like "``7``"
to a "``bits<4>``" value, for example.

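To make a few of these forms concrete, here is a minimal sketch that combines
bit slicing, string and list concatenation, ``!if``/``!eq``, ``!foldl`` and
``!con`` in one class.  The class ``Enc``, its fields, and the helper records
``op``, ``a1``, ``a2`` and ``b1`` are hypothetical names chosen only for this
example:

.. code-block:: text

  // Hypothetical records used as dag operator and operands.
  def op;
  def a1;
  def a2;
  def b1;

  class Enc<bits<8> val, string base> {
    bits<8>   Value      = val;
    bits<4>   HighNibble = Value{7-4};   // bits 7 down to 4, in that order
    bit       LowBit     = Value{0};     // a single bit
    string    Name       = !strconcat(base, "_enc");
    int       IsAdd      = !if(!eq(base, "add"), 1, 0);
    list<int> Widths     = !listconcat([8, 16], [32]);
    int       NumWidths  = !size(Widths);
    // Left fold: adds every element of the list to the running total.
    int       Total      = !foldl(0, [8, 16, 32], acc, w, !add(acc, w));
    dag       Operands   = !con((op a1:$x, a2:$y), (op b1:$z));
  }

  // The integer 0xA5 is converted to the bits<8> template argument.
  def AddEnc : Enc<0xA5, "add">;

Running ``llvm-tblgen`` on a file like this prints the fully resolved
``AddEnc`` record, which is a convenient way to check how each expression is
folded.
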
Classes and definitions
-----------------------

As mentioned in the :doc:`introduction <index>`, classes and definitions
(collectively known as 'records') in TableGen are the main high-level unit of
information that TableGen collects.  Records are defined with a ``def`` or
``class`` keyword, the record name, and an optional list of
"`template arguments`_".  If the record has superclasses, they are specified as
a comma-separated list that starts with a colon character ("``:``").  If
`value definitions`_ or `let expressions`_ are needed for the record, they are
enclosed in curly braces ("``{}``"); otherwise, the record ends with a
semicolon.

Here is a simple TableGen file:

.. code-block:: text

  class C { bit V = 1; }
  def X : C;
  def Y : C {
    string Greeting = "hello";
  }

This example creates two definitions, ``X`` and ``Y``, both of which derive from
the ``C`` class.  Because of this, they both get the ``V`` bit value.  The ``Y``
definition also gets the ``Greeting`` member.

In general, classes are useful for collecting together the commonality between a
group of records and isolating it in a single place.  Also, classes permit the
specification of default values for their subclasses, allowing the subclasses to
override them as they wish.

.. _value definition:
.. _value definitions:

Value definitions
^^^^^^^^^^^^^^^^^

Value definitions define named entries in records.  A value must be defined
before it can be referred to as the operand for another value definition or
before the value is reset with a `let expression`_.  A value is defined by
specifying a `TableGen type`_ and a name.  If an initial value is available, it
may be specified after the name with an equals sign.  Value definitions require
terminating semicolons.

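A minimal sketch of these rules follows; the record ``Example`` and its fields
are hypothetical names used only for illustration:

.. code-block:: text

  def Example {
    int    Width   = 32;                  // type, name, initial value
    int    Doubled = !add(Width, Width);  // refers to the earlier Width
    string Note;                          // no initial value given
  }

Here ``Doubled`` can refer to ``Width`` only because ``Width`` is defined
first, and ``Note`` is left uninitialized.
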
.. _let expression:
.. _let expressions:
.. _"let" expressions within a record:

'let' expressions
^^^^^^^^^^^^^^^^^

A record-level let expression is used to change the value of a value definition
in a record.  This is primarily useful when a superclass defines a value that a
derived class or definition wants to override.  Let expressions consist of the
'``let``' keyword followed by a value name, an equal sign ("``=``"), and a new
value.  For example, a new class could be added to the example above, redefining
the ``V`` field for all of its subclasses:

.. code-block:: text

  class D : C { let V = 0; }
  def Z : D;

In this case, the ``Z`` definition will have a zero value for its ``V`` value,
despite the fact that it derives (indirectly) from the ``C`` class, because the
``D`` class overrode its value.

References between variables in a record are substituted late, which gives
``let`` expressions unusual power. Consider this admittedly silly example:

.. code-block:: text

  class A<int x> {
    int Y = x;
    int Yplus1 = !add(Y, 1);
    int xplus1 = !add(x, 1);
  }
  def Z : A<5> {
    let Y = 10;
  }

The value of ``Z.xplus1`` will be 6, but the value of ``Z.Yplus1`` is 11. Use
this power wisely.

.. _template arguments:

Class template arguments
^^^^^^^^^^^^^^^^^^^^^^^^

TableGen permits the definition of parameterized classes as well as normal
concrete classes.  Parameterized TableGen classes specify a list of variable
bindings (which may optionally have defaults) that are bound when used.  Here is
a simple example:

.. code-block:: text

  class FPFormat<bits<3> val> {
    bits<3> Value = val;
  }
  def NotFP      : FPFormat<0>;
  def ZeroArgFP  : FPFormat<1>;
  def OneArgFP   : FPFormat<2>;
  def OneArgFPRW : FPFormat<3>;
  def TwoArgFP   : FPFormat<4>;
  def CompareFP  : FPFormat<5>;
  def CondMovFP  : FPFormat<6>;
  def SpecialFP  : FPFormat<7>;

In this case, template arguments are used as a space-efficient way to specify a
list of "enumeration values", each with a "``Value``" field set to the specified
integer.

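Template arguments may also carry defaults, as mentioned above.  The following
sketch uses a hypothetical ``Reg`` class (not taken from any real target) whose
second argument can be omitted:

.. code-block:: text

  class Reg<string n, int width = 32> {
    string Name  = n;
    int    Width = width;
  }

  def R0 : Reg<"r0">;      // width takes its default
  def D0 : Reg<"d0", 64>;  // width given explicitly
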
The more esoteric forms of `TableGen expressions`_ are useful in conjunction
with template arguments.  As an example:

.. code-block:: text

  class ModRefVal<bits<2> val> {
    bits<2> Value = val;
  }

  def None   : ModRefVal<0>;
  def Mod    : ModRefVal<1>;
  def Ref    : ModRefVal<2>;
  def ModRef : ModRefVal<3>;

  class Value<ModRefVal MR> {
    // Decode some information into a more convenient format, while providing
    // a nice interface to the user of the "Value" class.
    bit isMod = MR.Value{0};
    bit isRef = MR.Value{1};

    // other stuff...
  }

  // Example uses
  def bork : Value<Mod>;
  def zork : Value<Ref>;
  def hork : Value<ModRef>;

This is obviously a contrived example, but it shows how template arguments can
be used to decouple the interface provided to the user of the class from the
actual internal data representation expected by the class.  In this case,
running ``llvm-tblgen`` on the example prints the following definitions:

.. code-block:: text

  def bork {      // Value
    bit isMod = 1;
    bit isRef = 0;
  }
  def hork {      // Value
    bit isMod = 1;
    bit isRef = 1;
  }
  def zork {      // Value
    bit isMod = 0;
    bit isRef = 1;
  }

This shows that TableGen was able to dig into the argument and extract a piece
of information that was requested by the designer of the "Value" class.  For
more realistic examples, please see existing users of TableGen, such as the X86
backend.

Multiclass definitions and instances
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

While classes with template arguments are a good way to factor commonality
between two instances of a definition, multiclasses allow a convenient notation
for defining multiple definitions at once (instances of implicitly constructed
classes).  For example, consider a 3-address instruction set whose instructions
come in two forms: "``reg = reg op reg``" and "``reg = reg op imm``"
(e.g. SPARC). In this case, you'd like to specify in one place that this
commonality exists, then in a separate place indicate what all the ops are.

Here is an example TableGen fragment that shows this idea:

.. code-block:: text

  def ops;
  def GPR;
  def Imm;
  class inst<int opc, string asmstr, dag operandlist>;

  multiclass ri_inst<int opc, string asmstr> {
    def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
                   (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
    def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
                   (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
  }

  // Instantiations of the ri_inst multiclass.
  defm ADD : ri_inst<0b111, "add">;
  defm SUB : ri_inst<0b101, "sub">;
  defm MUL : ri_inst<0b100, "mul">;
  ...

The names of the resulting definitions are formed by appending the names of the
multiclass's ``def`` fragments to the ``defm`` name, so this defines ``ADD_rr``,
``ADD_ri``, ``SUB_rr``, etc.  A defm may inherit from multiple multiclasses,
instantiating definitions from each multiclass.  Using a multiclass this way is
exactly equivalent to instantiating the classes multiple times yourself, e.g.
by writing:

.. code-block:: text

  def ops;
  def GPR;
  def Imm;
  class inst<int opc, string asmstr, dag operandlist>;

  class rrinst<int opc, string asmstr>
    : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
           (ops GPR:$dst, GPR:$src1, GPR:$src2)>;

  class riinst<int opc, string asmstr>
    : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
           (ops GPR:$dst, GPR:$src1, Imm:$src2)>;

  // Explicit definitions equivalent to the ri_inst instantiations.
  def ADD_rr : rrinst<0b111, "add">;
  def ADD_ri : riinst<0b111, "add">;
  def SUB_rr : rrinst<0b101, "sub">;
  def SUB_ri : riinst<0b101, "sub">;
  def MUL_rr : rrinst<0b100, "mul">;
  def MUL_ri : riinst<0b100, "mul">;
  ...

A ``defm`` can also be used inside a multiclass, providing several levels of
multiclass instantiations.

.. code-block:: text

  class Instruction<bits<4> opc, string Name> {
    bits<4> opcode = opc;
    string name = Name;
  }

  multiclass basic_r<bits<4> opc> {
    def rr : Instruction<opc, "rr">;
    def rm : Instruction<opc, "rm">;
  }

  multiclass basic_s<bits<4> opc> {
    defm SS : basic_r<opc>;
    defm SD : basic_r<opc>;
    def X : Instruction<opc, "x">;
  }

  multiclass basic_p<bits<4> opc> {
    defm PS : basic_r<opc>;
    defm PD : basic_r<opc>;
    def Y : Instruction<opc, "y">;
  }

  defm ADD : basic_s<0xf>, basic_p<0xf>;
  ...

  // Results
  def ADDPDrm { ...
  def ADDPDrr { ...
  def ADDPSrm { ...
  def ADDPSrr { ...
  def ADDSDrm { ...
  def ADDSDrr { ...
  def ADDSSrm { ...
  def ADDSSrr { ...
  def ADDX { ...
  def ADDY { ...

``defm`` declarations can inherit from classes too.  The rule to follow is that
the class list must start after the last multiclass, and there must be at least
one multiclass before it.

.. code-block:: text

  class XD { bits<4> Prefix = 11; }
  class XS { bits<4> Prefix = 12; }

  class I<bits<4> op> {
    bits<4> opcode = op;
  }

  multiclass R {
    def rr : I<4>;
    def rm : I<2>;
  }

  multiclass Y {
    defm SS : R, XD;
    defm SD : R, XS;
  }

  defm Instr : Y;

  // Results
  def InstrSDrm {
    bits<4> opcode = { 0, 0, 1, 0 };
    bits<4> Prefix = { 1, 1, 0, 0 };
  }
  ...
  def InstrSSrr {
    bits<4> opcode = { 0, 1, 0, 0 };
    bits<4> Prefix = { 1, 0, 1, 1 };
  }

File scope entities
-------------------

File inclusion
^^^^^^^^^^^^^^

TableGen supports the '``include``' token, which textually substitutes the
specified file in place of the include directive.  The filename should be
specified as a double quoted string immediately after the '``include``' keyword.
Example:

.. code-block:: text

  include "foo.td"

'let' expressions
^^^^^^^^^^^^^^^^^

"Let" expressions at file scope are similar to `"let" expressions within a
record`_, except they can specify a value binding for multiple records at a
time, and may be useful in certain other cases.  File-scope let expressions are
really just another way that TableGen allows the end-user to factor out
commonality from the records.

File-scope "let" expressions take a comma-separated list of bindings to apply,
and one or more records to bind the values in.  Here are some examples:

.. code-block:: text

  let isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 in
    def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>;

  let isCall = 1 in
    // All calls clobber the non-callee saved registers...
    let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0,
                MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
                XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in {
      def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops),
                             "call\t${dst:call}", []>;
      def CALL32r     : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
                          "call\t{*}$dst", [(X86call GR32:$dst)]>;
      def CALL32m     : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops),
                          "call\t{*}$dst", []>;
    }

File-scope "let" expressions are often useful when a couple of definitions need
to be added to several records, and the records do not otherwise need to be
opened, as in the case with the ``CALL*`` instructions above.

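The X86 fragments above depend on classes defined elsewhere in that backend.
For experimenting with ``llvm-tblgen`` directly, a self-contained sketch of the
same idea might look like the following, where the ``Inst`` class and the
``JMP``, ``JEQ`` and ``NOP`` records are hypothetical:

.. code-block:: text

  class Inst<string asm> {
    string AsmString    = asm;
    bit    isBranch     = 0;
    bit    isTerminator = 0;
  }

  // Both bindings apply to every record inside the braces.
  let isBranch = 1, isTerminator = 1 in {
    def JMP : Inst<"jmp">;
    def JEQ : Inst<"jeq">;
  }

  // Outside the let block the class defaults are kept.
  def NOP : Inst<"nop">;
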
It's also possible to use "let" expressions inside multiclasses, providing more
ways to factor out commonality from the records, especially when using several
levels of multiclass instantiations. This also avoids the need to use "let"
expressions within subsequent records inside a multiclass.

.. code-block:: text

  multiclass basic_r<bits<4> opc> {
    let Predicates = [HasSSE2] in {
      def rr : Instruction<opc, "rr">;
      def rm : Instruction<opc, "rm">;
    }
    let Predicates = [HasSSE3] in
      def rx : Instruction<opc, "rx">;
  }

  multiclass basic_ss<bits<4> opc> {
    let IsDouble = 0 in
      defm SS : basic_r<opc>;

    let IsDouble = 1 in
      defm SD : basic_r<opc>;
  }

  defm ADD : basic_ss<0xf>;

Looping
^^^^^^^

TableGen supports the '``foreach``' block, which textually replicates the loop
body, substituting iterator values for iterator references in the body.
Example:

.. code-block:: text

  foreach i = [0, 1, 2, 3] in {
    def R#i : Register<...>;
    def F#i : Register<...>;
  }

This will create the objects ``R0`` through ``R3`` and ``F0`` through ``F3``.
``foreach`` blocks may be nested. If there is only one item in the body the
braces may be elided:

.. code-block:: text

  foreach i = [0, 1, 2, 3] in
    def R#i : Register<...>;

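The integer-range forms of ``foreach`` listed under `TableGen expressions`_ can
be used in the same way.  The sketch below assumes a hypothetical ``Register``
class that takes the register number as a template argument:

.. code-block:: text

  class Register<int n> {
    int Num = n;
  }

  // A single range needs no braces.
  foreach i = 0-3 in
    def R#i : Register<i>;

  // Several ranges must be wrapped in braces.
  foreach i = {0-1,4-5} in
    def F#i : Register<i>;

Under these assumptions the first loop should define ``R0`` through ``R3`` and
the second ``F0``, ``F1``, ``F4`` and ``F5``.
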
Code Generator backend info
===========================

Expressions used by the code generator to describe instructions and isel
patterns:

``(implicit a)``
    an implicitly defined physical register.  This tells the dag instruction
    selection emitter that the input pattern's extra definitions match implicit
    physical register definitions.