1=================
2TableGen BackEnds
3=================
4
5.. contents::
6   :local:
7
8Introduction
9============
10
TableGen backends are at the core of TableGen's functionality. The source files
provide the semantics to a generated (in-memory) structure, but it's up to the
backend to print this out in a way that is meaningful to the user (normally
C code to be included in another file, or a textual list of warnings, options
and error messages).
16
17TableGen is used by both LLVM and Clang with very different goals. LLVM uses it
18as a way to automate the generation of massive amounts of information regarding
instructions, schedules, cores and architecture features. Some backends generate
output that is consumed by more than one source file, so the output needs to be
structured in a way that makes preprocessor tricks easy to use. Some backends
can also print C code structures, so that they can be directly included as-is.
23
Clang, on the other hand, uses it mainly for diagnostic messages (errors,
warnings, tips) and attributes, so its output sits more on the textual end of
the scale.
26
27LLVM BackEnds
28=============
29
30.. warning::
31   This document is raw. Each section below needs three sub-sections: description
32   of its purpose with a list of users, output generated from generic input, and
33   finally why it needed a new backend (in case there's something similar).
34
Overall, each backend will take the same TableGen file type and transform it
into similar output for different targets/uses. There is an implicit contract
between the TableGen files, the back-ends and their users.
38
39For instance, a global contract is that each back-end produces macro-guarded
40sections. Based on whether the file is included by a header or a source file,
41or even in which context of each file the include is being used, you have
to define a macro just before including it, to get the right output:
43
44.. code-block:: c++
45
46  #define GET_REGINFO_TARGET_DESC
47  #include "ARMGenRegisterInfo.inc"
48
Only the matching part of the generated file is then included. This is useful if
you need the same information in multiple formats (instantiation, initialization,
getter/setter functions, etc.) from the same source TableGen file without having
to re-compile the TableGen file multiple times.
53
54Sometimes, multiple macros might be defined before the same include file to
55output multiple blocks:
56
57.. code-block:: c++
58
59  #define GET_REGISTER_MATCHER
60  #define GET_SUBTARGET_FEATURE_NAME
61  #define GET_MATCHER_IMPLEMENTATION
62  #include "ARMGenAsmMatcher.inc"
63
The macros are undefined automatically in the include file as they are used.
65
For all LLVM back-ends, the ``llvm-tblgen`` binary is executed on the root
TableGen file ``<Target>.td``, which should include all others. This guarantees
that all information needed is accessible, and that no duplication is needed
in the TableGen files.
70
71CodeEmitter
72-----------
73
74**Purpose**: CodeEmitterGen uses the descriptions of instructions and their fields to
75construct an automated code emitter: a function that, given a MachineInstr,
76returns the (currently, 32-bit unsigned) value of the instruction.
77
78**Output**: C++ code, implementing the target's CodeEmitter
79class by overriding the virtual functions as ``<Target>CodeEmitter::function()``.
80
**Usage**: Included directly at the end of ``<Target>MCCodeEmitter.cpp``.
82
83RegisterInfo
84------------
85
86**Purpose**: This tablegen backend is responsible for emitting a description of a target
87register file for a code generator.  It uses instances of the Register,
88RegisterAliases, and RegisterClass classes to gather this information.
89
90**Output**: C++ code with enums and structures representing the register mappings,
91properties, masks, etc.
92
**Usage**: Used in both ``<Target>BaseRegisterInfo`` and ``<Target>MCTargetDesc``
(headers and source files), with macros selecting which part is included
(declarations vs. initialization).
96
97InstrInfo
98---------
99
100**Purpose**: This tablegen backend is responsible for emitting a description of the target
101instruction set for the code generator. (what are the differences from CodeEmitter?)
102
103**Output**: C++ code with enums and structures representing the instruction mappings,
104properties, masks, etc.
105
**Usage**: Used in both ``<Target>BaseInstrInfo`` and ``<Target>MCTargetDesc``
(headers and source files), with macros selecting which part is included
(declarations vs. initialization).
109
110AsmWriter
111---------
112
113**Purpose**: Emits an assembly printer for the current target.
114
115**Output**: Implementation of ``<Target>InstPrinter::printInstruction()``, among
116other things.
117
118**Usage**: Included directly into ``InstPrinter/<Target>InstPrinter.cpp``.
119
120AsmMatcher
121----------
122
**Purpose**: Emits a target specifier matcher for
converting parsed assembly operands into MCInst structures. It also
emits a matcher for custom operand parsing. Extensive documentation is
available in the ``AsmMatcherEmitter.cpp`` file.
127
128**Output**: Assembler parsers' matcher functions, declarations, etc.
129
130**Usage**: Used in back-ends' ``AsmParser/<Target>AsmParser.cpp`` for
131building the AsmParser class.
132
133Disassembler
134------------
135
**Purpose**: Contains disassembler table emitters for various
architectures. Extensive documentation is available in the
``DisassemblerEmitter.cpp`` file.
139
140**Output**: Decoding tables, static decoding functions, etc.
141
142**Usage**: Directly included in ``Disassembler/<Target>Disassembler.cpp``
143to cater for all default decodings, after all hand-made ones.
144
145PseudoLowering
146--------------
147
148**Purpose**: Generate pseudo instruction lowering.
149
150**Output**: Implements ``<Target>AsmPrinter::emitPseudoExpansionLowering()``.
151
152**Usage**: Included directly into ``<Target>AsmPrinter.cpp``.
153
154CallingConv
155-----------
156
157**Purpose**: Responsible for emitting descriptions of the calling
158conventions supported by this target.
159
**Output**: Implements static functions to deal with calling conventions
chained by matching styles, returning true on no match.
162
**Usage**: Used in ISelLowering and FastISel as function pointers to
the implementation returned by a CC selection function.
165
166DAGISel
167-------
168
169**Purpose**: Generate a DAG instruction selector.
170
171**Output**: Creates huge functions for automating DAG selection.
172
173**Usage**: Included in ``<Target>ISelDAGToDAG.cpp`` inside the target's
174implementation of ``SelectionDAGISel``.
175
176DFAPacketizer
177-------------
178
179**Purpose**: This class parses the Schedule.td file and produces an API that
180can be used to reason about whether an instruction can be added to a packet
181on a VLIW architecture. The class internally generates a deterministic finite
182automaton (DFA) that models all possible mappings of machine instructions
183to functional units as instructions are added to a packet.
184
**Output**: Scheduling tables for VLIW back-ends (Hexagon, AMD).
186
**Usage**: Included directly in ``<Target>InstrInfo.cpp``.
188
189FastISel
190--------
191
192**Purpose**: This tablegen backend emits code for use by the "fast"
193instruction selection algorithm. See the comments at the top of
194lib/CodeGen/SelectionDAG/FastISel.cpp for background. This file
195scans through the target's tablegen instruction-info files
196and extracts instructions with obvious-looking patterns, and it emits
197code to look up these instructions by type and operator.
198
199**Output**: Generates ``Predicate`` and ``FastEmit`` methods.
200
**Usage**: Implements private methods of the targets' implementation
of the ``FastISel`` class.
203
204Subtarget
205---------
206
207**Purpose**: Generate subtarget enumerations.
208
209**Output**: Enums, globals, local tables for sub-target information.
210
211**Usage**: Populates ``<Target>Subtarget`` and
212``MCTargetDesc/<Target>MCTargetDesc`` files (both headers and source).
213
214Intrinsic
215---------
216
217**Purpose**: Generate (target) intrinsic information.
218
219OptParserDefs
220-------------
221
222**Purpose**: Print enum values for a class.
223
224SearchableTables
225----------------
226
227**Purpose**: Generate custom searchable tables.
228
229**Output**: Enums, global tables and lookup helper functions.
230
231**Usage**: This backend allows generating free-form, target-specific tables
232from TableGen records. The ARM and AArch64 targets use this backend to generate
233tables of system registers; the AMDGPU target uses it to generate meta-data
234about complex image and memory buffer instructions.
235
236More documentation is available in ``include/llvm/TableGen/SearchableTable.td``,
237which also contains the definitions of TableGen classes which must be
238instantiated in order to define the enums and tables emitted by this backend.
239
240CTags
241-----
242
243**Purpose**: This tablegen backend emits an index of definitions in ctags(1)
244format. A helper script, utils/TableGen/tdtags, provides an easier-to-use
245interface; run 'tdtags -H' for documentation.
246
247X86EVEX2VEX
248-----------
249
**Purpose**: This X86-specific tablegen backend emits tables that map
EVEX-encoded instructions to their identical VEX-encoded instructions.
252
253Clang BackEnds
254==============
255
256ClangAttrClasses
257----------------
258
259**Purpose**: Creates Attrs.inc, which contains semantic attribute class
260declarations for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``.
261This file is included as part of ``Attr.h``.
262
263ClangAttrParserStringSwitches
264-----------------------------
265
266**Purpose**: Creates AttrParserStringSwitches.inc, which contains
267StringSwitch::Case statements for parser-related string switches. Each switch
268is given its own macro (such as ``CLANG_ATTR_ARG_CONTEXT_LIST``, or
269``CLANG_ATTR_IDENTIFIER_ARG_LIST``), which is expected to be defined before
270including AttrParserStringSwitches.inc, and undefined after.
271
272ClangAttrImpl
273-------------
274
275**Purpose**: Creates AttrImpl.inc, which contains semantic attribute class
276definitions for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``.
277This file is included as part of ``AttrImpl.cpp``.
278
279ClangAttrList
280-------------
281
282**Purpose**: Creates AttrList.inc, which is used when a list of semantic
283attribute identifiers is required. For instance, ``AttrKinds.h`` includes this
284file to generate the list of ``attr::Kind`` enumeration values. This list is
285separated out into multiple categories: attributes, inheritable attributes, and
286inheritable parameter attributes. This categorization happens automatically
287based on information in ``Attr.td`` and is used to implement the ``classof``
288functionality required for ``dyn_cast`` and similar APIs.
289
290ClangAttrPCHRead
291----------------
292
293**Purpose**: Creates AttrPCHRead.inc, which is used to deserialize attributes
294in the ``ASTReader::ReadAttributes`` function.
295
296ClangAttrPCHWrite
297-----------------
298
299**Purpose**: Creates AttrPCHWrite.inc, which is used to serialize attributes in
300the ``ASTWriter::WriteAttributes`` function.
301
302ClangAttrSpellings
303---------------------
304
305**Purpose**: Creates AttrSpellings.inc, which is used to implement the
306``__has_attribute`` feature test macro.
307
308ClangAttrSpellingListIndex
309--------------------------
310
311**Purpose**: Creates AttrSpellingListIndex.inc, which is used to map parsed
312attribute spellings (including which syntax or scope was used) to an attribute
313spelling list index. These spelling list index values are internal
314implementation details exposed via
315``AttributeList::getAttributeSpellingListIndex``.
316
317ClangAttrVisitor
318-------------------
319
320**Purpose**: Creates AttrVisitor.inc, which is used when implementing
321recursive AST visitors.
322
323ClangAttrTemplateInstantiate
324----------------------------
325
326**Purpose**: Creates AttrTemplateInstantiate.inc, which implements the
327``instantiateTemplateAttribute`` function, used when instantiating a template
328that requires an attribute to be cloned.
329
330ClangAttrParsedAttrList
331-----------------------
332
333**Purpose**: Creates AttrParsedAttrList.inc, which is used to generate the
334``AttributeList::Kind`` parsed attribute enumeration.
335
336ClangAttrParsedAttrImpl
337-----------------------
338
339**Purpose**: Creates AttrParsedAttrImpl.inc, which is used by
340``AttributeList.cpp`` to implement several functions on the ``AttributeList``
341class. This functionality is implemented via the ``AttrInfoMap ParsedAttrInfo``
342array, which contains one element per parsed attribute object.
343
344ClangAttrParsedAttrKinds
345------------------------
346
347**Purpose**: Creates AttrParsedAttrKinds.inc, which is used to implement the
348``AttributeList::getKind`` function, mapping a string (and syntax) to a parsed
349attribute ``AttributeList::Kind`` enumeration.
350
351ClangAttrDump
352-------------
353
354**Purpose**: Creates AttrDump.inc, which dumps information about an attribute.
355It is used to implement ``ASTDumper::dumpAttr``.
356
357ClangDiagsDefs
358--------------
359
360Generate Clang diagnostics definitions.
361
362ClangDiagGroups
363---------------
364
365Generate Clang diagnostic groups.
366
367ClangDiagsIndexName
368-------------------
369
370Generate Clang diagnostic name index.
371
372ClangCommentNodes
373-----------------
374
375Generate Clang AST comment nodes.
376
377ClangDeclNodes
378--------------
379
380Generate Clang AST declaration nodes.
381
382ClangStmtNodes
383--------------
384
385Generate Clang AST statement nodes.
386
387ClangSACheckers
388---------------
389
390Generate Clang Static Analyzer checkers.
391
392ClangCommentHTMLTags
393--------------------
394
395Generate efficient matchers for HTML tag names that are used in documentation comments.
396
397ClangCommentHTMLTagsProperties
398------------------------------
399
400Generate efficient matchers for HTML tag properties.
401
402ClangCommentHTMLNamedCharacterReferences
403----------------------------------------
404
Generate a function to translate named character references to UTF-8 sequences.
406
407ClangCommentCommandInfo
408-----------------------
409
410Generate command properties for commands that are used in documentation comments.
411
412ClangCommentCommandList
413-----------------------
414
415Generate list of commands that are used in documentation comments.
416
417ArmNeon
418-------
419
420Generate arm_neon.h for clang.
421
422ArmNeonSema
423-----------
424
425Generate ARM NEON sema support for clang.
426
427ArmNeonTest
428-----------
429
430Generate ARM NEON tests for clang.
431
432AttrDocs
433--------
434
435**Purpose**: Creates ``AttributeReference.rst`` from ``AttrDocs.td``, and is
436used for documenting user-facing attributes.
437
438General BackEnds
439================
440
441JSON
442----
443
444**Purpose**: Output all the values in every ``def``, as a JSON data
445structure that can be easily parsed by a variety of languages. Useful
446for writing custom backends without having to modify TableGen itself,
447or for performing auxiliary analysis on the same TableGen data passed
448to a built-in backend.
449
450**Output**:
451
452The root of the output file is a JSON object (i.e. dictionary),
453containing the following fixed keys:
454
455* ``!tablegen_json_version``: a numeric version field that will
456  increase if an incompatible change is ever made to the structure of
457  this data. The format described here corresponds to version 1.
458
459* ``!instanceof``: a dictionary whose keys are the class names defined
460  in the TableGen input. For each key, the corresponding value is an
461  array of strings giving the names of ``def`` records that derive
462  from that class. So ``root["!instanceof"]["Instruction"]``, for
463  example, would list the names of all the records deriving from the
464  class ``Instruction``.
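
As a rough illustration of how a consumer might use these two keys, the
following Python sketch checks the version field and then looks up the records
of one class. The sample document is a made-up, heavily trimmed stand-in for
real output, and the helper name ``names_of_class`` is hypothetical:

.. code-block:: python

  import json

  # Hypothetical, heavily trimmed stand-in for real backend output.
  SAMPLE = """
  {
    "!tablegen_json_version": 1,
    "!instanceof": {
      "Instruction": ["ADD", "SUB"],
      "Register": []
    }
  }
  """


  def names_of_class(root, class_name):
      """Return the names of the def records deriving from class_name."""
      if root["!tablegen_json_version"] != 1:
          raise ValueError("unexpected TableGen JSON version")
      # .get() tolerates a class name that is absent from the dictionary.
      return root["!instanceof"].get(class_name, [])


  root = json.loads(SAMPLE)
  instructions = names_of_class(root, "Instruction")
  print(instructions)

Checking the version before reading anything else is a design choice of this
sketch, so that a script fails loudly if the format ever changes.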
465
466For each ``def`` record, the root object also has a key for the record
467name. The corresponding value is a subsidiary object containing the
468following fixed keys:
469
470* ``!superclasses``: an array of strings giving the names of all the
471  classes that this record derives from.
472
473* ``!fields``: an array of strings giving the names of all the variables
474  in this record that were defined with the ``field`` keyword.
475
476* ``!name``: a string giving the name of the record. This is always
477  identical to the key in the JSON root object corresponding to this
478  record's dictionary. (If the record is anonymous, the name is
479  arbitrary.)
480
481* ``!anonymous``: a boolean indicating whether the record's name was
482  specified by the TableGen input (if it is ``false``), or invented by
483  TableGen itself (if ``true``).
484
485For each variable defined in a record, the ``def`` object for that
486record also has a key for the variable name. The corresponding value
487is a translation into JSON of the variable's value, using the
488conventions described below.
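
The layout described so far can be walked with a short Python sketch like the
one below. It assumes that a key is one of the fixed keys exactly when it
starts with ``!`` (TableGen variable names are not expected to begin with that
character). The sample data and the helper names ``named_records`` and
``user_variables`` are invented for illustration:

.. code-block:: python

  import json

  # Hypothetical sample: one named record and one anonymous record.
  SAMPLE = """
  {
    "!tablegen_json_version": 1,
    "!instanceof": {"Instruction": ["ADD", "anonymous_0"]},
    "ADD": {
      "!name": "ADD", "!anonymous": false,
      "!superclasses": ["Instruction"], "!fields": [],
      "Mnemonic": "add", "Size": 4
    },
    "anonymous_0": {
      "!name": "anonymous_0", "!anonymous": true,
      "!superclasses": ["Instruction"], "!fields": [],
      "Mnemonic": "nop", "Size": 4
    }
  }
  """


  def named_records(root):
      """Yield the objects of records whose names came from the input."""
      for key, value in root.items():
          if key.startswith("!"):
              continue  # root-level fixed keys, not records
          if not value["!anonymous"]:
              yield value


  def user_variables(record):
      """Map variable name to JSON value, dropping the fixed '!' keys."""
      return {k: v for k, v in record.items() if not k.startswith("!")}


  root = json.loads(SAMPLE)
  for record in named_records(root):
      print(record["!name"], user_variables(record))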
489
490Some TableGen data types are translated directly into the
491corresponding JSON type:
492
493* A completely undefined value (e.g. for a variable declared without
494  initializer in some superclass of this record, and never initialized
495  by the record itself or any other superclass) is emitted as the JSON
496  ``null`` value.
497
498* ``int`` and ``bit`` values are emitted as numbers. Note that
499  TableGen ``int`` values are capable of holding integers too large to
500  be exactly representable in IEEE double precision. The integer
501  literal in the JSON output will show the full exact integer value.
502  So if you need to retrieve large integers with full precision, you
503  should use a JSON reader capable of translating such literals back
504  into 64-bit integers without losing precision, such as Python's
505  standard ``json`` module.
506
507* ``string`` and ``code`` values are emitted as JSON strings.
508
509* ``list<T>`` values, for any element type ``T``, are emitted as JSON
510  arrays. Each element of the array is represented in turn using these
511  same conventions.
512
513* ``bits`` values are also emitted as arrays. A ``bits`` array is
514  ordered from least-significant bit to most-significant. So the
515  element with index ``i`` corresponds to the bit described as
516  ``x{i}`` in TableGen source. However, note that this means that
517  scripting languages are likely to *display* the array in the
518  opposite order from the way it appears in the TableGen source or in
519  the diagnostic ``-print-records`` output.
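
The two points above that most often trip up scripts, large integers and the
ordering of ``bits`` arrays, can be sketched in Python as follows. The variable
names ``Inst`` and ``Big`` and the helper functions are hypothetical, and
``bits_to_int`` assumes every element of the array is a concrete 0 or 1:

.. code-block:: python

  import json

  # Hypothetical variables from one record: an 8-bit field and a large int.
  SAMPLE = '{"Inst": [1, 0, 1, 1, 0, 0, 0, 0], "Big": 9007199254740993}'


  def bits_to_int(bits):
      """Combine a bits array (index i is bit x{i}) into one integer."""
      value = 0
      for i, bit in enumerate(bits):
          if bit not in (0, 1):
              raise ValueError(f"bit {i} is not a concrete 0/1 value")
          value |= bit << i
      return value


  def bits_in_source_order(bits):
      """Render the bits most-significant first, as TableGen source does."""
      return "".join(str(b) for b in reversed(bits))


  data = json.loads(SAMPLE)
  encoding = bits_to_int(data["Inst"])
  text = bits_in_source_order(data["Inst"])

  # Python's json module keeps integer literals exact; a double would not.
  big = data["Big"]
  survives_double = int(float(big)) == big

Here ``survives_double`` ends up ``False``, because 2\ :sup:`53` + 1 cannot be
represented exactly in IEEE double precision, while ``big`` itself still holds
the exact value.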
520
521All other TableGen value types are emitted as a JSON object,
522containing two standard fields: ``kind`` is a discriminator describing
523which kind of value the object represents, and ``printable`` is a
524string giving the same representation of the value that would appear
525in ``-print-records``.
526
527* A reference to a ``def`` object has ``kind=="def"``, and has an
528  extra field ``def`` giving the name of the object referred to.
529
530* A reference to another variable in the same record has
531  ``kind=="var"``, and has an extra field ``var`` giving the name of
532  the variable referred to.
533
534* A reference to a specific bit of a ``bits``-typed variable in the
535  same record has ``kind=="varbit"``, and has two extra fields:
536  ``var`` gives the name of the variable referred to, and ``index``
537  gives the index of the bit.
538
539* A value of type ``dag`` has ``kind=="dag"``, and has two extra
540  fields. ``operator`` gives the initial value after the opening
541  parenthesis of the dag initializer; ``args`` is an array giving the
542  following arguments. The elements of ``args`` are arrays of length
543  2, giving the value of each argument followed by its colon-suffixed
544  name (if any). For example, in the JSON representation of the dag
545  value ``(Op 22, "hello":$foo)`` (assuming that ``Op`` is the name of
546  a record defined elsewhere with a ``def`` statement):
547
548  * ``operator`` will be an object in which ``kind=="def"`` and
549    ``def=="Op"``
550
551  * ``args`` will be the array ``[[22, null], ["hello", "foo"]]``.
552
553* If any other kind of value or complicated expression appears in the
554  output, it will have ``kind=="complex"``, and no additional fields.
555  These values are not expected to be needed by backends. The standard
556  ``printable`` field can be used to extract a representation of them
557  in TableGen source syntax if necessary.
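
A consumer can dispatch on ``kind`` to unpack these objects. The Python sketch
below handles the ``dag`` example from the list above. The value is written as
the Python structure that ``json.loads`` would return, the ``printable``
strings are illustrative guesses rather than captured output, and the helper
name ``dag_parts`` is hypothetical:

.. code-block:: python

  # The dag value (Op 22, "hello":$foo), as json.loads would return it.
  SAMPLE_DAG = {
      "kind": "dag",
      "printable": '(Op 22, "hello":$foo)',
      "operator": {"kind": "def", "def": "Op", "printable": "Op"},
      "args": [[22, None], ["hello", "foo"]],
  }


  def dag_parts(value):
      """Split a dag object into its operator name and (value, name) pairs."""
      if not isinstance(value, dict) or value.get("kind") != "dag":
          raise TypeError("not a dag value")
      operator = value["operator"]
      # Fall back to the printable form if the operator is not a plain def.
      if operator["kind"] == "def":
          op_name = operator["def"]
      else:
          op_name = operator["printable"]
      return op_name, [(arg, name) for arg, name in value["args"]]


  op_name, args = dag_parts(SAMPLE_DAG)
  for arg, name in args:
      label = "(unnamed)" if name is None else "$" + name
      print(op_name, label, arg)

Falling back to ``printable`` for anything unrecognised mirrors the role the
text gives that field for ``complex`` values.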
558
559How to write a back-end
560=======================
561
562TODO.
563
564Until we get a step-by-step HowTo for writing TableGen backends, you can at
565least grab the boilerplate (build system, new files, etc.) from Clang's
566r173931.
567
568TODO: How they work, how to write one.  This section should not contain details
569about any particular backend, except maybe ``-print-enums`` as an example.  This
570should highlight the APIs in ``TableGen/Record.h``.
571
572