=======================
Writing an LLVM Backend
=======================

.. toctree::
   :hidden:

   HowToUseInstrMappings

.. contents::
   :local:

Introduction
============

This document describes techniques for writing compiler backends that convert
the LLVM Intermediate Representation (IR) to code for a specified machine or
other languages. Code intended for a specific machine can take the form of
either assembly code or binary code (usable for a JIT compiler).

The backend of LLVM features a target-independent code generator that may
create output for several types of target CPUs --- including X86, PowerPC,
ARM, and SPARC. The backend may also be used to generate code targeted at SPUs
of the Cell processor or GPUs to support the execution of compute kernels.

The document focuses on existing examples found in subdirectories of
``llvm/lib/Target`` in a downloaded LLVM release. In particular, this document
focuses on the example of creating a static compiler (one that emits text
assembly) for a SPARC target, because SPARC has fairly standard
characteristics, such as a RISC instruction set and straightforward calling
conventions.

Audience
--------

The audience for this document is anyone who needs to write an LLVM backend to
generate code for a specific hardware or software target.

Prerequisite Reading
--------------------

These essential documents must be read before reading this document:

* `LLVM Language Reference Manual <LangRef.html>`_ --- a reference manual for
  the LLVM assembly language.

* :doc:`CodeGenerator` --- a guide to the components (classes and code
  generation algorithms) for translating the LLVM internal representation into
  machine code for a specified target.
  Pay particular attention to the
  descriptions of code generation stages: Instruction Selection, Scheduling and
  Formation, SSA-based Optimization, Register Allocation, Prolog/Epilog Code
  Insertion, Late Machine Code Optimizations, and Code Emission.

* :doc:`TableGen/index` --- a document that describes the TableGen
  (``tblgen``) application that manages domain-specific information to support
  LLVM code generation. TableGen processes input from a target description
  file (``.td`` suffix) and generates C++ code that can be used for code
  generation.

* :doc:`WritingAnLLVMPass` --- The assembly printer is a ``FunctionPass``, as
  are several ``SelectionDAG`` processing steps.

To follow the SPARC examples in this document, have a copy of `The SPARC
Architecture Manual, Version 8 <http://www.sparc.org/standards/V8.pdf>`_ for
reference. For details about the ARM instruction set, refer to the `ARM
Architecture Reference Manual <http://infocenter.arm.com/>`_. For more about
the GNU Assembler format (``GAS``), see `Using As
<http://sourceware.org/binutils/docs/as/index.html>`_, especially for the
assembly printer. "Using As" contains a list of target machine dependent
features.

Basic Steps
-----------

To write a compiler backend for LLVM that converts the LLVM IR to code for a
specified target (machine or other language), follow these steps:

* Create a subclass of the ``TargetMachine`` class that describes
  characteristics of your target machine. Copy existing examples of specific
  ``TargetMachine`` class and header files; for example, start with
  ``SparcTargetMachine.cpp`` and ``SparcTargetMachine.h``, but change the file
  names for your target. Similarly, change code that references "``Sparc``" to
  reference your target.

* Describe the register set of the target.
  Use TableGen to generate code for
  register definition, register aliases, and register classes from a
  target-specific ``RegisterInfo.td`` input file. You should also write
  additional code for a subclass of the ``TargetRegisterInfo`` class that
  represents the register file data used for register allocation and also
  describes the interactions between registers.

* Describe the instruction set of the target. Use TableGen to generate code
  for target-specific instructions from target-specific versions of
  ``TargetInstrFormats.td`` and ``TargetInstrInfo.td``. You should write
  additional code for a subclass of the ``TargetInstrInfo`` class to represent
  machine instructions supported by the target machine.

* Describe the selection and conversion of the LLVM IR from a Directed Acyclic
  Graph (DAG) representation of instructions to native target-specific
  instructions. Use TableGen to generate code that matches patterns and
  selects instructions based on additional information in a target-specific
  version of ``TargetInstrInfo.td``. Write code for ``XXXISelDAGToDAG.cpp``,
  where ``XXX`` identifies the specific target, to perform pattern matching and
  DAG-to-DAG instruction selection. Also write code in ``XXXISelLowering.cpp``
  to replace or remove operations and data types that are not supported
  natively in a SelectionDAG.

* Write code for an assembly printer that converts LLVM IR to a GAS format for
  your target machine. You should add assembly strings to the instructions
  defined in your target-specific version of ``TargetInstrInfo.td``. You
  should also write code for a subclass of ``AsmPrinter`` that performs the
  LLVM-to-assembly conversion and a trivial subclass of ``TargetAsmInfo``.

* Optionally, add support for subtargets (i.e., variants with different
  capabilities).
  You should also write code for a subclass of the
  ``TargetSubtarget`` class, which allows you to use the ``-mcpu=`` and
  ``-mattr=`` command-line options.

* Optionally, add JIT support and create a machine code emitter (subclass of
  ``TargetJITInfo``) that is used to emit binary code directly into memory.

In the ``.cpp`` and ``.h`` files, initially stub up these methods and then
implement them later. Initially, you may not know which private members the
class will need and which components will need to be subclassed.

Preliminaries
-------------

To actually create your compiler backend, you need to create and modify a few
files. The absolute minimum is discussed here. But to actually use the LLVM
target-independent code generator, you must perform the steps described in the
:doc:`LLVM Target-Independent Code Generator <CodeGenerator>` document.

First, you should create a subdirectory under ``lib/Target`` to hold all the
files related to your target. If your target is called "Dummy", create the
directory ``lib/Target/Dummy``.

In this new directory, create a ``Makefile``. It is easiest to copy a
``Makefile`` of another target and modify it. It should at least contain the
``LEVEL``, ``LIBRARYNAME`` and ``TARGET`` variables, and then include
``$(LEVEL)/Makefile.common``. The library can be named ``LLVMDummy`` (for
example, see the MIPS target). Alternatively, you can split the library into
``LLVMDummyCodeGen`` and ``LLVMDummyAsmPrinter``, the latter of which should be
implemented in a subdirectory below ``lib/Target/Dummy`` (for example, see the
PowerPC target).

Note that these two naming schemes are hardcoded into ``llvm-config``. Using
any other naming scheme will confuse ``llvm-config`` and produce a lot of
(seemingly unrelated) linker errors when linking ``llc``.

To make your target actually do something, you need to implement a subclass of
``TargetMachine``. This implementation should typically be in the file
``lib/Target/DummyTargetMachine.cpp``, but any file in the ``lib/Target``
directory will be built and should work. To use LLVM's target-independent code
generator, you should do what all current machine backends do: create a
subclass of ``LLVMTargetMachine``. (To create a target from scratch, create a
subclass of ``TargetMachine``.)

To get LLVM to actually build and link your target, you need to add it to the
``TARGETS_TO_BUILD`` variable. To do this, modify the ``configure`` script so
that it knows about your target when parsing the ``--enable-targets`` option.
Search the ``configure`` script for ``TARGETS_TO_BUILD``, add your target to
the lists there (some creativity required), and then reconfigure.
Alternatively, you can change ``autoconf/configure.ac`` and regenerate
``configure`` by running ``./autoconf/AutoRegen.sh``.

Target Machine
==============

``LLVMTargetMachine`` is designed as a base class for targets implemented with
the LLVM target-independent code generator. The ``LLVMTargetMachine`` class
should be specialized by a concrete target class that implements the various
virtual methods. ``LLVMTargetMachine`` is defined as a subclass of
``TargetMachine`` in ``include/llvm/Target/TargetMachine.h``. The
``TargetMachine`` class implementation (``TargetMachine.cpp``) also processes
numerous command-line options.

To create a concrete target-specific subclass of ``LLVMTargetMachine``, start
by copying an existing ``TargetMachine`` class and header. You should name the
files that you create to reflect your specific target. For instance, for the
SPARC target, name the files ``SparcTargetMachine.h`` and
``SparcTargetMachine.cpp``.

For a target machine ``XXX``, the implementation of ``XXXTargetMachine`` must
have access methods to obtain objects that represent target components. These
methods are named ``get*Info``, and are intended to obtain the instruction set
(``getInstrInfo``), register set (``getRegisterInfo``), stack frame layout
(``getFrameInfo``), and similar information. ``XXXTargetMachine`` must also
implement the ``getDataLayout`` method to access an object with target-specific
data characteristics, such as data type size and alignment requirements.

For instance, for the SPARC target, the header file ``SparcTargetMachine.h``
declares prototypes for several ``get*Info`` and ``getDataLayout`` methods that
simply return a class member.

.. code-block:: c++

  namespace llvm {

  class Module;

  class SparcTargetMachine : public LLVMTargetMachine {
    const DataLayout DataLayout;       // Calculates type size & alignment
    SparcSubtarget Subtarget;
    SparcInstrInfo InstrInfo;
    TargetFrameInfo FrameInfo;

  protected:
    virtual const TargetAsmInfo *createTargetAsmInfo() const;

  public:
    SparcTargetMachine(const Module &M, const std::string &FS);

    virtual const SparcInstrInfo *getInstrInfo() const { return &InstrInfo; }
    virtual const TargetFrameInfo *getFrameInfo() const { return &FrameInfo; }
    virtual const TargetSubtarget *getSubtargetImpl() const { return &Subtarget; }
    virtual const TargetRegisterInfo *getRegisterInfo() const {
      return &InstrInfo.getRegisterInfo();
    }
    virtual const DataLayout *getDataLayout() const { return &DataLayout; }
    static unsigned getModuleMatchQuality(const Module &M);

    // Pass Pipeline Configuration
    virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
    virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
  };

  } // end namespace llvm

The header declares the following accessor methods:

* ``getInstrInfo()``
* ``getRegisterInfo()``
* ``getFrameInfo()``
* ``getDataLayout()``
* ``getSubtargetImpl()``

For some targets, you also need to support the following methods:

* ``getTargetLowering()``
* ``getJITInfo()``

Some architectures, such as GPUs, do not support jumping to an arbitrary
program location; they implement branching using masked execution and looping
using special instructions around the loop body. In order to avoid CFG
modifications that introduce irreducible control flow not handled by such
hardware, a target must call ``setRequiresStructuredCFG(true)`` when being
initialized.

In addition, the ``XXXTargetMachine`` constructor should specify a
``TargetDescription`` string that determines the data layout for the target
machine, including characteristics such as pointer size, alignment, and
endianness. For example, the constructor for ``SparcTargetMachine`` contains
the following:

.. code-block:: c++

  SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
    : DataLayout("E-p:32:32-f128:128:128"),
      Subtarget(M, FS), InstrInfo(Subtarget),
      FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
  }

Hyphens separate portions of the ``TargetDescription`` string.

* An upper-case "``E``" in the string indicates a big-endian target data model.
  A lower-case "``e``" indicates little-endian.

* "``p:``" is followed by pointer information: size, ABI alignment, and
  preferred alignment. If only two figures follow "``p:``", then the first
  value is pointer size, and the second value is both ABI and preferred
  alignment.

* Then a letter for numeric type alignment: "``i``", "``f``", "``v``", or
  "``a``" (corresponding to integer, floating point, vector, or aggregate).
  "``i``", "``v``", or "``a``" are followed by ABI alignment and preferred
  alignment. "``f``" is followed by three values: the first indicates the size
  of a long double, then ABI alignment, and then preferred alignment.

Target Registration
===================

You must also register your target with the ``TargetRegistry``, which is what
other LLVM tools use to look up and use your target at runtime. The
``TargetRegistry`` can be used directly, but for most targets there are helper
templates which should take care of the work for you.

All targets should declare a global ``Target`` object which is used to
represent the target during registration. Then, in the target's ``TargetInfo``
library, the target should define that object and use the ``RegisterTarget``
template to register the target. For example, the Sparc registration code
looks like this:

.. code-block:: c++

  Target llvm::TheSparcTarget;

  extern "C" void LLVMInitializeSparcTargetInfo() {
    RegisterTarget<Triple::sparc, /*HasJIT=*/false>
      X(TheSparcTarget, "sparc", "Sparc");
  }

This allows the ``TargetRegistry`` to look up the target by name or by target
triple. In addition, most targets will also register additional features which
are available in separate libraries. These registration steps are separate,
because some clients may wish to only link in some parts of the target --- the
JIT code generator does not require the use of the assembly printer, for
example. Here is an example of registering the Sparc assembly printer:

.. code-block:: c++

  extern "C" void LLVMInitializeSparcAsmPrinter() {
    RegisterAsmPrinter<SparcAsmPrinter> X(TheSparcTarget);
  }

For more information, see "`llvm/Target/TargetRegistry.h
</doxygen/TargetRegistry_8h-source.html>`_".

Register Set and Register Classes
=================================

You should describe a concrete target-specific class that represents the
register file of a target machine.
This class is called ``XXXRegisterInfo``
(where ``XXX`` identifies the target) and represents the register file
data that is used for register allocation. It also describes the interactions
between registers.

You also need to define register classes to categorize related registers. A
register class should be added for groups of registers that are all treated the
same way for some instruction. Typical examples are register classes for
integer, floating-point, or vector registers. A register allocator allows an
instruction to use any register in a specified register class to perform the
instruction in a similar manner. Register classes allocate virtual registers
to instructions from these sets, and register classes let the
target-independent register allocator automatically choose the actual
registers.

Much of the code for registers, including register definition, register
aliases, and register classes, is generated by TableGen from
``XXXRegisterInfo.td`` input files and placed in ``XXXGenRegisterInfo.h.inc``
and ``XXXGenRegisterInfo.inc`` output files. Some of the code in the
implementation of ``XXXRegisterInfo`` requires hand-coding.

Defining a Register
-------------------

The ``XXXRegisterInfo.td`` file typically starts with register definitions for
a target machine. The ``Register`` class (specified in ``Target.td``) is used
to define an object for each register. The specified string ``n`` becomes the
``Name`` of the register. The basic ``Register`` object does not have any
subregisters and does not specify any aliases.

.. code-block:: llvm

  class Register<string n> {
    string Namespace = "";
    string AsmName = n;
    string Name = n;
    int SpillSize = 0;
    int SpillAlignment = 0;
    list<Register> Aliases = [];
    list<Register> SubRegs = [];
    list<int> DwarfNumbers = [];
  }

For example, in the ``X86RegisterInfo.td`` file, there are register definitions
that utilize the ``Register`` class, such as:

.. code-block:: llvm

  def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;

This defines the register ``AL`` and assigns it values (with ``DwarfRegNum``)
that are used by ``gcc``, ``gdb``, or a debug information writer to identify a
register. For register ``AL``, ``DwarfRegNum`` takes an array of 3 values
representing 3 different modes: the first element is for X86-64, the second for
exception handling (EH) on X86-32, and the third is generic. -1 is a special
Dwarf number that indicates the gcc number is undefined, and -2 indicates the
register number is invalid for this mode.

From the previously described line in the ``X86RegisterInfo.td`` file, TableGen
generates this code in the ``X86GenRegisterInfo.inc`` file:

.. code-block:: c++

  static const unsigned GR8[] = { X86::AL, ... };

  const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };

  const TargetRegisterDesc RegisterDescriptors[] = {
    ...
  { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...

From the register info file, TableGen generates a ``TargetRegisterDesc`` object
for each register. ``TargetRegisterDesc`` is defined in
``include/llvm/Target/TargetRegisterInfo.h`` with the following fields:

.. code-block:: c++

  struct TargetRegisterDesc {
    const char     *AsmName;      // Assembly language name for the register
    const char     *Name;         // Printable name for the reg (for debugging)
    const unsigned *AliasSet;     // Register Alias Set
    const unsigned *SubRegs;      // Sub-register set
    const unsigned *ImmSubRegs;   // Immediate sub-register set
    const unsigned *SuperRegs;    // Super-register set
  };

TableGen uses the entire target description file (``.td``) to determine text
names for the register (in the ``AsmName`` and ``Name`` fields of
``TargetRegisterDesc``) and the relationships of other registers to the defined
register (in the other ``TargetRegisterDesc`` fields). In this example, other
definitions establish the registers "``AX``", "``EAX``", and "``RAX``" as
aliases for one another, so TableGen generates a null-terminated array
(``AL_AliasSet``) for this register alias set.

The ``Register`` class is commonly used as a base class for more complex
classes. In ``Target.td``, the ``Register`` class is the base for the
``RegisterWithSubRegs`` class that is used to define registers that need to
specify subregisters in the ``SubRegs`` list, as shown here:

.. code-block:: llvm

  class RegisterWithSubRegs<string n, list<Register> subregs> : Register<n> {
    let SubRegs = subregs;
  }

In ``SparcRegisterInfo.td``, additional register classes are defined for SPARC:
a ``Register`` subclass, ``SparcReg``, and further subclasses: ``Ri``, ``Rf``,
and ``Rd``. SPARC registers are identified by 5-bit ID numbers, which is a
feature common to these subclasses. Note the use of "``let``" expressions to
override values that are initially defined in a superclass (such as the
``SubRegs`` field in the ``Rd`` class).

.. code-block:: llvm

  class SparcReg<string n> : Register<n> {
    field bits<5> Num;
    let Namespace = "SP";
  }
  // Ri - 32-bit integer registers
  class Ri<bits<5> num, string n> :
  SparcReg<n> {
    let Num = num;
  }
  // Rf - 32-bit floating-point registers
  class Rf<bits<5> num, string n> :
  SparcReg<n> {
    let Num = num;
  }
  // Rd - Slots in the FP register file for 64-bit floating-point values.
  class Rd<bits<5> num, string n, list<Register> subregs> : SparcReg<n> {
    let Num = num;
    let SubRegs = subregs;
  }

In the ``SparcRegisterInfo.td`` file, there are register definitions that
utilize these subclasses of ``Register``, such as:

.. code-block:: llvm

  def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
  def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
  ...
  def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
  def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
  ...
  def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
  def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;

The last two registers shown above (``D0`` and ``D1``) are double-precision
floating-point registers that are aliases for pairs of single-precision
floating-point sub-registers. In addition to aliases, the sub-register and
super-register relationships of the defined register are in fields of a
register's ``TargetRegisterDesc``.

Defining a Register Class
-------------------------

The ``RegisterClass`` class (specified in ``Target.td``) is used to define an
object that represents a group of related registers and also defines the
default allocation order of the registers. A target description file
``XXXRegisterInfo.td`` that uses ``Target.td`` can construct register classes
using the following class:

.. code-block:: llvm

  class RegisterClass<string namespace,
  list<ValueType> regTypes, int alignment, dag regList> {
    string Namespace = namespace;
    list<ValueType> RegTypes = regTypes;
    int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
    int Alignment = alignment;

    // CopyCost is the cost of copying a value between two registers
    // default value 1 means a single instruction
    // A negative value means copying is extremely expensive or impossible
    int CopyCost = 1;
    dag MemberList = regList;

    // for register classes that are subregisters of this class
    list<RegisterClass> SubRegClassList = [];

    code MethodProtos = [{}];  // to insert arbitrary code
    code MethodBodies = [{}];
  }

To define a ``RegisterClass``, use the following 4 arguments:

* The first argument of the definition is the name of the namespace.

* The second argument is a list of ``ValueType`` register type values that are
  defined in ``include/llvm/CodeGen/ValueTypes.td``. Defined values include
  integer types (such as ``i16``, ``i32``, and ``i1`` for Boolean),
  floating-point types (``f32``, ``f64``), and vector types (for example,
  ``v8i16`` for an ``8 x i16`` vector). All registers in a ``RegisterClass``
  must have the same ``ValueType``, but some registers may store vector data in
  different configurations. For example, a register that can process a 128-bit
  vector may be able to handle 16 8-bit integer elements, 8 16-bit integers, 4
  32-bit integers, and so on.

* The third argument of the ``RegisterClass`` definition specifies the
  alignment required of the registers when they are stored or loaded to
  memory.

* The final argument, ``regList``, specifies which registers are in this class.
  If an alternative allocation order method is not specified, then ``regList``
  also defines the order of allocation used by the register allocator.
  Besides
  simply listing registers with ``(add R0, R1, ...)``, more advanced set
  operators are available. See ``include/llvm/Target/Target.td`` for more
  information.

In ``SparcRegisterInfo.td``, three ``RegisterClass`` objects are defined:
``FPRegs``, ``DFPRegs``, and ``IntRegs``. For all three register classes, the
first argument defines the namespace with the string "``SP``". ``FPRegs``
defines a group of 32 single-precision floating-point registers (``F0`` to
``F31``); ``DFPRegs`` defines a group of 16 double-precision registers
(``D0-D15``).

.. code-block:: llvm

  // F0, F1, F2, ..., F31
  def FPRegs : RegisterClass<"SP", [f32], 32, (sequence "F%u", 0, 31)>;

  def DFPRegs : RegisterClass<"SP", [f64], 64,
                              (add D0, D1, D2, D3, D4, D5, D6, D7, D8,
                                   D9, D10, D11, D12, D13, D14, D15)>;

  def IntRegs : RegisterClass<"SP", [i32], 32,
      (add L0, L1, L2, L3, L4, L5, L6, L7,
           I0, I1, I2, I3, I4, I5,
           O0, O1, O2, O3, O4, O5, O7,
           G1,
           // Non-allocatable regs:
           G2, G3, G4,
           O6,        // stack ptr
           I6,        // frame ptr
           I7,        // return address
           G0,        // constant zero
           G5, G6, G7 // reserved for kernel
      )>;

Using ``SparcRegisterInfo.td`` with TableGen generates several output files
that are intended for inclusion in other source code that you write.
``SparcRegisterInfo.td`` generates ``SparcGenRegisterInfo.h.inc``, which should
be included in the header file for the SPARC register implementation that you
write (``SparcRegisterInfo.h``). In ``SparcGenRegisterInfo.h.inc`` a new
structure is defined called ``SparcGenRegisterInfo`` that uses
``TargetRegisterInfo`` as its base. It also specifies types, based upon the
defined register classes: ``DFPRegsClass``, ``FPRegsClass``, and
``IntRegsClass``.
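
The effect of member order in a register class such as ``IntRegs`` can be
pictured with a small standalone C++ sketch. It is not LLVM code and does not
reflect how the LLVM register allocators are implemented; ``RegClass`` and
``pickRegister`` are hypothetical names used only for illustration. The sketch
assumes that a class is an ordered list of register names plus a set of
reserved registers, and it returns the first register that is neither reserved
nor already in use.

.. code-block:: c++

  #include <set>
  #include <string>
  #include <vector>

  // Hypothetical stand-in for a register class (not an LLVM type). Members
  // are kept in the default allocation order, like the (add ...) list.
  struct RegClass {
    std::vector<std::string> Order;  // members, in allocation order
    std::set<std::string> Reserved;  // never handed out (e.g. stack pointer)
  };

  // Returns the first register in allocation order that is neither reserved
  // nor in use, or an empty string if no register of the class is available.
  std::string pickRegister(const RegClass &RC,
                           const std::set<std::string> &InUse) {
    for (const std::string &R : RC.Order) {
      if (RC.Reserved.count(R) || InUse.count(R))
        continue;  // skip reserved or occupied registers
      return R;
    }
    return std::string();
  }

With the members ``L0``, ``L1``, ``O6``, ``G1`` and ``O6`` marked as reserved,
this sketch hands out ``L0`` first and falls back to ``G1`` once ``L0`` and
``L1`` are in use.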

``SparcRegisterInfo.td`` also generates ``SparcGenRegisterInfo.inc``, which is
included at the bottom of ``SparcRegisterInfo.cpp``, the SPARC register
implementation. The code below shows only the generated integer registers and
associated register classes. The order of registers in ``IntRegs`` reflects
the order in the definition of ``IntRegs`` in the target description file.

.. code-block:: c++

  // IntRegs Register Class...
  static const unsigned IntRegs[] = {
    SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
    SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
    SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
    SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
    SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
    SP::G6, SP::G7,
  };

  // IntRegsVTs Register Class Value Types...
  static const MVT::ValueType IntRegsVTs[] = {
    MVT::i32, MVT::Other
  };

  namespace SP {   // Register class instances
    DFPRegsClass    DFPRegsRegClass;
    FPRegsClass     FPRegsRegClass;
    IntRegsClass    IntRegsRegClass;
  ...
    // IntRegs Sub-register Classess...
    static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Super-register Classess...
    static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class sub-classes...
    static const TargetRegisterClass* const IntRegsSubclasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class super-classes...
    static const TargetRegisterClass* const IntRegsSuperclasses [] = {
      NULL
    };

    IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
      IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
      IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
  }

The register allocators will avoid using reserved registers, and callee-saved
registers are not used until all the volatile registers have been used. That
is usually good enough, but in some cases it may be necessary to provide custom
allocation orders.

Implement a subclass of ``TargetRegisterInfo``
----------------------------------------------

The final step is to hand-code portions of ``XXXRegisterInfo``, which
implements the interface described in ``TargetRegisterInfo.h`` (see
:ref:`TargetRegisterInfo`). These functions return ``0``, ``NULL``, or
``false``, unless overridden. Here is a list of functions that are overridden
for the SPARC implementation in ``SparcRegisterInfo.cpp``:

* ``getCalleeSavedRegs`` --- Returns a list of callee-saved registers in the
  order of the desired callee-save stack frame offset.

* ``getReservedRegs`` --- Returns a bitset indexed by physical register
  numbers, indicating if a particular register is unavailable.

* ``hasFP`` --- Returns a Boolean indicating if a function should have a
  dedicated frame pointer register.

* ``eliminateCallFramePseudoInstr`` --- If call frame setup or destroy pseudo
  instructions are used, this can be called to eliminate them.

* ``eliminateFrameIndex`` --- Eliminates abstract frame indices from
  instructions that may use them.

* ``emitPrologue`` --- Inserts prologue code into the function.

* ``emitEpilogue`` --- Inserts epilogue code into the function.

.. _instruction-set:

Instruction Set
===============

During the early stages of code generation, the LLVM IR code is converted to a
``SelectionDAG`` with nodes that are instances of the ``SDNode`` class
containing target instructions. An ``SDNode`` has an opcode, operands, type
requirements, and operation properties. The operation properties record, for
example, whether an operation is commutative or whether it loads from memory.
The various operation node types are described in the
``include/llvm/CodeGen/SelectionDAGNodes.h`` file (values of the ``NodeType``
enum in the ``ISD`` namespace).

TableGen uses the following target description (``.td``) input files to
generate much of the code for instruction definition:

* ``Target.td`` --- Where the ``Instruction``, ``Operand``, ``InstrInfo``, and
  other fundamental classes are defined.

* ``TargetSelectionDAG.td`` --- Used by ``SelectionDAG`` instruction selection
  generators, contains ``SDTC*`` classes (selection DAG type constraint),
  definitions of ``SelectionDAG`` nodes (such as ``imm``, ``cond``, ``bb``,
  ``add``, ``fadd``, ``sub``), and pattern support (``Pattern``, ``Pat``,
  ``PatFrag``, ``PatLeaf``, ``ComplexPattern``).

* ``XXXInstrFormats.td`` --- Patterns for definitions of target-specific
  instructions.

* ``XXXInstrInfo.td`` --- Target-specific definitions of instruction templates,
  condition codes, and instructions of an instruction set. For architecture
  modifications, a different file name may be used. For example, for Pentium
  with SSE instructions, this file is ``X86InstrSSE.td``, and for Pentium with
  MMX, this file is ``X86InstrMMX.td``.

There is also a target-specific ``XXX.td`` file, where ``XXX`` is the name of
the target. The ``XXX.td`` file includes the other ``.td`` input files, but
its contents are only directly important for subtargets.

You should describe a concrete target-specific class ``XXXInstrInfo`` that
represents machine instructions supported by a target machine.
``XXXInstrInfo`` contains an array of ``XXXInstrDescriptor`` objects, each of
which describes one instruction. An instruction descriptor defines:

* Opcode mnemonic
* Number of operands
* List of implicit register definitions and uses
* Target-independent properties (such as whether the instruction accesses
  memory or is commutable)
* Target-specific flags

The ``Instruction`` class (defined in ``Target.td``) is mostly used as a base
for more complex instruction classes.

.. code-block:: llvm

  class Instruction {
    string Namespace = "";
    dag OutOperandList;    // A dag containing the MI def operand list.
    dag InOperandList;     // A dag containing the MI use operand list.
    string AsmString = ""; // The .s format to print the instruction with.
    list<dag> Pattern;     // Set to the DAG pattern for this instruction.
    list<Register> Uses = [];
    list<Register> Defs = [];
    list<Predicate> Predicates = [];  // predicates turned into isel match code
    ... remainder not shown for space ...
  }

A ``SelectionDAG`` node (``SDNode``) should contain an object representing a
target-specific instruction that is defined in ``XXXInstrInfo.td``. The
instruction objects should represent instructions from the architecture manual
of the target machine (such as the SPARC Architecture Manual for the SPARC
target).

A single instruction from the architecture manual is often modeled as multiple
target instructions, depending upon its operands. For example, a manual might
describe an add instruction that takes a register or an immediate operand. An
LLVM target could model this with two instructions named ``ADDri`` and
``ADDrr``.
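
The split of one manual instruction into ``ADDrr`` and ``ADDri`` can be
illustrated with a minimal standalone C++ sketch. It is not LLVM code:
``Operand``, ``fitsSimm13``, and ``selectAdd`` are hypothetical names, and the
choice is made here by hand rather than by the TableGen-generated instruction
selector. The sketch assumes the SPARC V8 rule that an immediate operand of a
format 3 instruction is a 13-bit signed value.

.. code-block:: c++

  #include <cstdint>
  #include <string>

  // Hypothetical operand: either a register name or an immediate value.
  struct Operand {
    bool IsImm;       // true if the operand is an immediate
    std::string Reg;  // register name, meaningful when !IsImm
    int64_t Imm;      // immediate value, meaningful when IsImm
  };

  // Checks whether a value fits in a 13-bit signed immediate field.
  bool fitsSimm13(int64_t V) { return V >= -4096 && V <= 4095; }

  // Picks the target instruction that models "add" for the given second
  // operand. Returns an empty string when the immediate does not fit; in
  // that case the value would first have to be placed in a register.
  std::string selectAdd(const Operand &RHS) {
    if (!RHS.IsImm)
      return "ADDrr";
    return fitsSimm13(RHS.Imm) ? "ADDri" : "";
  }

In this sketch a register operand selects ``ADDrr``, a small constant such as
``42`` selects ``ADDri``, and a constant such as ``5000`` selects neither form
directly.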

You should define a class for each instruction category and define each opcode
as a subclass of the category with appropriate parameters such as the fixed
binary encoding of opcodes and extended opcodes. You should map the register
bits to the bits of the instruction in which they are encoded (for the JIT).
Also you should specify how the instruction should be printed when the
automatic assembly printer is used.

As is described in the SPARC Architecture Manual, Version 8, there are three
major 32-bit formats for instructions. Format 1 is only for the ``CALL``
instruction. Format 2 is for branch on condition codes and ``SETHI`` (set high
bits of a register) instructions. Format 3 is for other instructions.

Each of these formats has corresponding classes in ``SparcInstrFormats.td``.
``InstSP`` is a base class for other instruction classes. Additional base
classes are specified for more precise formats: for example in
``SparcInstrFormats.td``, ``F2_1`` is for ``SETHI``, and ``F2_2`` is for
branches. There are three other base classes: ``F3_1`` for register/register
operations, ``F3_2`` for register/immediate operations, and ``F3_3`` for
floating-point operations. ``SparcInstrInfo.td`` also adds the base class
``Pseudo`` for synthetic SPARC instructions.

``SparcInstrInfo.td`` largely consists of operand and instruction definitions
for the SPARC target. In ``SparcInstrInfo.td``, the following target
description file entry, ``LDrr``, defines the Load Integer instruction for a
Word (the ``LD`` SPARC opcode) from a memory address to a register. The first
parameter, the value 3 (``11``\ :sub:`2`), is the operation value for this
category of operation. The second parameter (``000000``\ :sub:`2`) is the
specific operation value for ``LD``/Load Word.
The third parameter is the
output destination, which is a register operand and defined in the ``Register``
target description file (``IntRegs``).

.. code-block:: llvm

  def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
                   "ld [$addr], $dst",
                   [(set i32:$dst, (load ADDRrr:$addr))]>;

The fourth parameter is the input source, which uses the address operand
``MEMrr`` that is defined earlier in ``SparcInstrInfo.td``:

.. code-block:: llvm

  def MEMrr : Operand<i32> {
    let PrintMethod = "printMemOperand";
    let MIOperandInfo = (ops IntRegs, IntRegs);
  }

The fifth parameter is a string that is used by the assembly printer and can be
left as an empty string until the assembly printer interface is implemented.
The sixth and final parameter is the pattern used to match the instruction
during the SelectionDAG Select Phase described in :doc:`CodeGenerator`.
This parameter is detailed in the next section, :ref:`instruction-selector`.

Instruction class definitions are not overloaded for different operand types,
so separate versions of instructions are needed for register, memory, or
immediate value operands. For example, to perform a Load Integer instruction
for a Word from an immediate operand to a register, the following instruction
class is defined:

.. code-block:: llvm

  def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
                   "ld [$addr], $dst",
                   [(set i32:$dst, (load ADDRri:$addr))]>;

Writing these definitions for so many similar instructions can involve a lot of
cut and paste. In ``.td`` files, the ``multiclass`` directive enables the
creation of templates to define several instruction classes at once (using the
``defm`` directive). For example in ``SparcInstrInfo.td``, the ``multiclass``
pattern ``F3_12`` is defined to create two instruction classes each time
``F3_12`` is invoked:

.. code-block:: llvm

  multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
    def rr  : F3_1 <2, Op3Val,
                   (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                   !strconcat(OpcStr, " $b, $c, $dst"),
                   [(set i32:$dst, (OpNode i32:$b, i32:$c))]>;
    def ri  : F3_2 <2, Op3Val,
                   (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
                   !strconcat(OpcStr, " $b, $c, $dst"),
                   [(set i32:$dst, (OpNode i32:$b, simm13:$c))]>;
  }

So when the ``defm`` directive is used for the ``XOR`` and ``ADD``
instructions, as seen below, it creates four instruction objects: ``XORrr``,
``XORri``, ``ADDrr``, and ``ADDri``.

.. code-block:: llvm

  defm XOR   : F3_12<"xor", 0b000011, xor>;
  defm ADD   : F3_12<"add", 0b000000, add>;

``SparcInstrInfo.td`` also includes definitions for condition codes that are
referenced by branch instructions. The following definitions in
``SparcInstrInfo.td`` indicate the bit location of the SPARC condition code.
For example, the 10\ :sup:`th` bit represents the "greater than" condition for
integers, and the 22\ :sup:`nd` bit represents the "greater than" condition for
floats.

.. code-block:: llvm

  def ICC_NE  : ICC_VAL< 9>;  // Not Equal
  def ICC_E   : ICC_VAL< 1>;  // Equal
  def ICC_G   : ICC_VAL<10>;  // Greater
  ...
  def FCC_U   : FCC_VAL<23>;  // Unordered
  def FCC_G   : FCC_VAL<22>;  // Greater
  def FCC_UG  : FCC_VAL<21>;  // Unordered or Greater
  ...

(Note that ``Sparc.h`` also defines enums that correspond to the same SPARC
condition codes. Care must be taken to ensure the values in ``Sparc.h``
correspond to the values in ``SparcInstrInfo.td``. I.e., ``SPCC::ICC_NE = 9``,
``SPCC::FCC_U = 23`` and so on.)

Instruction Operand Mapping
---------------------------

The code generator backend maps instruction operands to fields in the
instruction.
Operands are assigned to unbound fields in the instruction in the
order they are defined. Fields are bound when they are assigned a value. For
example, the Sparc target defines the ``XNORrr`` instruction as an ``F3_1``
format instruction having three operands.

.. code-block:: llvm

  def XNORrr  : F3_1<2, 0b000111,
                     (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                     "xnor $b, $c, $dst",
                     [(set i32:$dst, (not (xor i32:$b, i32:$c)))]>;

The instruction templates in ``SparcInstrFormats.td`` show the base class for
``F3_1`` is ``InstSP``.

.. code-block:: llvm

  class InstSP<dag outs, dag ins, string asmstr, list<dag> pattern> : Instruction {
    field bits<32> Inst;
    let Namespace = "SP";
    bits<2> op;
    let Inst{31-30} = op;
    dag OutOperandList = outs;
    dag InOperandList = ins;
    let AsmString   = asmstr;
    let Pattern = pattern;
  }

``InstSP`` leaves the ``op`` field unbound.

.. code-block:: llvm

  class F3<dag outs, dag ins, string asmstr, list<dag> pattern>
      : InstSP<outs, ins, asmstr, pattern> {
    bits<5> rd;
    bits<6> op3;
    bits<5> rs1;
    let op{1} = 1;   // Op = 2 or 3
    let Inst{29-25} = rd;
    let Inst{24-19} = op3;
    let Inst{18-14} = rs1;
  }

``F3`` binds bit 1 of the ``op`` field and defines the ``rd``, ``op3``, and
``rs1`` fields. ``F3`` format instructions will bind operands to the ``rd``,
``op3``, and ``rs1`` fields.

.. code-block:: llvm

  class F3_1<bits<2> opVal, bits<6> op3val, dag outs, dag ins,
             string asmstr, list<dag> pattern> : F3<outs, ins, asmstr, pattern> {
    bits<8> asi = 0; // asi not currently used
    bits<5> rs2;
    let op         = opVal;
    let op3        = op3val;
    let Inst{13}   = 0;     // i field = 0
    let Inst{12-5} = asi;   // address space identifier
    let Inst{4-0}  = rs2;
  }

``F3_1`` binds the ``op`` and ``op3`` fields and defines the ``rs2`` field.
``F3_1``
format instructions will bind the operands to the ``rd``, ``rs1``, and ``rs2``
fields. This results in the ``XNORrr`` instruction binding ``$dst``, ``$b``,
and ``$c`` operands to the ``rd``, ``rs1``, and ``rs2`` fields respectively.

Instruction Operand Name Mapping
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TableGen will also generate a function called ``getNamedOperandIdx()`` which
can be used to look up an operand's index in a ``MachineInstr`` based on its
TableGen name. Setting the ``UseNamedOperandTable`` bit in an instruction's
TableGen definition will add all of its operands to an enumeration in the
``llvm::XXX::OpName`` namespace and also add an entry for it into the
``OperandMap`` table, which can be queried using ``getNamedOperandIdx()``.

.. code-block:: c++

  int DstIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::dst); // => 0
  int BIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::b);     // => 1
  int CIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::c);     // => 2
  int DIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::d);     // => -1

  ...

The entries in the ``OpName`` enum are taken verbatim from the TableGen
definitions, so operands with lowercase names will have lowercase entries in
the enum.

To include the ``getNamedOperandIdx()`` function in your backend, you will need
to define a few preprocessor macros in ``XXXInstrInfo.cpp`` and
``XXXInstrInfo.h``. For example:

``XXXInstrInfo.cpp``:

.. code-block:: c++

  #define GET_INSTRINFO_NAMED_OPS // For getNamedOperandIdx() function
  #include "XXXGenInstrInfo.inc"

``XXXInstrInfo.h``:

.. code-block:: c++

  #define GET_INSTRINFO_OPERAND_ENUM // For OpName enum
  #include "XXXGenInstrInfo.inc"

  namespace XXX {
    int16_t getNamedOperandIdx(uint16_t Opcode, uint16_t NamedIndex);
  } // End namespace XXX

Instruction Operand Types
^^^^^^^^^^^^^^^^^^^^^^^^^

TableGen will also generate an enumeration consisting of all named Operand
types defined in the backend, in the ``llvm::XXX::OpTypes`` namespace.
Some common immediate Operand types (for instance ``i8``, ``i32``, ``i64``,
``f32``, ``f64``) are defined for all targets in
``include/llvm/Target/Target.td``, and are available in each Target's
``OpTypes`` enum. Also, only named Operand types appear in the enumeration:
anonymous types are ignored.
For example, the X86 backend defines ``brtarget`` and ``brtarget8``, both
instances of the TableGen ``Operand`` class, which represent branch target
operands:

.. code-block:: llvm

  def brtarget : Operand<OtherVT>;
  def brtarget8 : Operand<OtherVT>;

This results in:

.. code-block:: c++

  namespace X86 {
  namespace OpTypes {
  enum OperandType {
    ...
    brtarget,
    brtarget8,
    ...
    i32imm,
    i64imm,
    ...
    OPERAND_TYPE_LIST_END
  };
  } // End namespace OpTypes
  } // End namespace X86

In typical TableGen fashion, to use the enum, you will need to define a
preprocessor macro:

.. code-block:: c++

  #define GET_INSTRINFO_OPERAND_TYPES_ENUM // For OpTypes enum
  #include "XXXGenInstrInfo.inc"


Instruction Scheduling
----------------------

Instruction itineraries can be queried using ``MCInstrDesc::getSchedClass()``.
The value can be named by an enumeration in the ``llvm::XXX::Sched`` namespace
generated by TableGen in ``XXXGenInstrInfo.inc``. The names of the schedule
classes are the same as provided in ``XXXSchedule.td`` plus a default
``NoItinerary`` class.

Instruction Relation Mapping
----------------------------

This TableGen feature is used to relate instructions with each other. It is
particularly useful when you have multiple instruction formats and need to
switch between them after instruction selection. This entire feature is driven
by relation models which can be defined in ``XXXInstrInfo.td`` files
according to the target-specific instruction set. Relation models are defined
using the ``InstrMapping`` class as a base. TableGen parses all the models
and generates instruction relation maps using the specified information.
Relation maps are emitted as tables in the ``XXXGenInstrInfo.inc`` file
along with the functions to query them. For details on how to use this
feature, please refer to :doc:`HowToUseInstrMappings`.

Implement a subclass of ``TargetInstrInfo``
-------------------------------------------

The final step is to hand-code portions of ``XXXInstrInfo``, which implements
the interface described in ``TargetInstrInfo.h`` (see :ref:`TargetInstrInfo`).
These functions return ``0`` or a Boolean or they assert, unless overridden.
Here's a list of functions that are overridden for the SPARC implementation in
``SparcInstrInfo.cpp``:

* ``isLoadFromStackSlot`` --- If the specified machine instruction is a direct
  load from a stack slot, return the register number of the destination and the
  ``FrameIndex`` of the stack slot.

* ``isStoreToStackSlot`` --- If the specified machine instruction is a direct
  store to a stack slot, return the register number of the source and the
  ``FrameIndex`` of the stack slot.

* ``copyPhysReg`` --- Copy values between a pair of physical registers.

* ``storeRegToStackSlot`` --- Store a register value to a stack slot.

* ``loadRegFromStackSlot`` --- Load a register value from a stack slot.

* ``storeRegToAddr`` --- Store a register value to memory.

* ``loadRegFromAddr`` --- Load a register value from memory.

* ``foldMemoryOperand`` --- Attempt to fold a load or store for the specified
  operand(s) into the instruction that uses it.

Branch Folding and If Conversion
--------------------------------

Performance can be improved by combining instructions or by eliminating
instructions that are never reached. The ``AnalyzeBranch`` method in
``XXXInstrInfo`` may be implemented to examine conditional instructions and
remove unnecessary instructions. ``AnalyzeBranch`` looks at the end of a
machine basic block (MBB) for opportunities for improvement, such as branch
folding and if conversion. The ``BranchFolder`` and ``IfConverter`` machine
function passes (see the source files ``BranchFolding.cpp`` and
``IfConversion.cpp`` in the ``lib/CodeGen`` directory) call ``AnalyzeBranch``
to improve the control flow graph that represents the instructions.

Several implementations of ``AnalyzeBranch`` (for ARM, Alpha, and X86) can be
examined as models for your own ``AnalyzeBranch`` implementation. Since SPARC
does not implement a useful ``AnalyzeBranch``, the ARM target implementation is
shown below.

``AnalyzeBranch`` returns a Boolean value and takes four parameters:

* ``MachineBasicBlock &MBB`` --- The incoming block to be examined.

* ``MachineBasicBlock *&TBB`` --- A destination block that is returned. For a
  conditional branch that evaluates to true, ``TBB`` is the destination.

* ``MachineBasicBlock *&FBB`` --- For a conditional branch that evaluates to
  false, ``FBB`` is returned as the destination.

* ``std::vector<MachineOperand> &Cond`` --- List of operands to evaluate a
  condition for a conditional branch.
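
The shape of this interface can be tried out in isolation with toy types. The
following is a minimal sketch and not LLVM code: ``ToyBlock``,
``ToyTerminator``, and ``analyzeToyBranch`` are hypothetical names, the
condition is reduced to a single ``int`` instead of ``MachineOperand`` values,
and the sketch assumes that a return value of ``false`` means the block was
analyzed successfully.

.. code-block:: c++

  #include <string>
  #include <vector>

  // Hypothetical toy types; none of these are LLVM classes.
  struct ToyBlock;

  struct ToyTerminator {
    bool IsConditional; // true for a conditional branch
    ToyBlock *Target;   // branch destination
    int CondCode;       // stand-in for the condition operands
  };

  struct ToyBlock {
    std::string Name;
    std::vector<ToyTerminator> Terminators; // trailing branches, in order
  };

  // Returns false when the block was understood, true otherwise (assumed
  // convention for this sketch). TBB, FBB, and Cond are output parameters.
  bool analyzeToyBranch(const ToyBlock &MBB, ToyBlock *&TBB, ToyBlock *&FBB,
                        std::vector<int> &Cond) {
    TBB = nullptr;
    FBB = nullptr;
    Cond.clear();
    const std::vector<ToyTerminator> &T = MBB.Terminators;

    if (T.empty()) // No branch: the block falls through.
      return false;

    if (T.size() == 1) { // A single branch, conditional or not.
      TBB = T[0].Target;
      if (T[0].IsConditional)
        Cond.push_back(T[0].CondCode);
      return false;
    }

    // A conditional branch followed by an unconditional branch.
    if (T.size() == 2 && T[0].IsConditional && !T[1].IsConditional) {
      TBB = T[0].Target;
      Cond.push_back(T[0].CondCode);
      FBB = T[1].Target;
      return false;
    }

    return true; // Anything else is not handled by this sketch.
  }

A caller passes in empty ``TBB``, ``FBB``, and ``Cond`` values and inspects
them after the call; a non-empty ``Cond`` signals that ``TBB`` is reached only
when the condition holds.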

In the simplest case, if a block ends without a branch, then it falls through
to the successor block. No destination blocks are specified for either ``TBB``
or ``FBB``, so both parameters return ``NULL``. The start of the
``AnalyzeBranch`` (see code below for the ARM target) shows the function
parameters and the code for the simplest case.

.. code-block:: c++

  bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
                                   MachineBasicBlock *&TBB,
                                   MachineBasicBlock *&FBB,
                                   std::vector<MachineOperand> &Cond) const
  {
    MachineBasicBlock::iterator I = MBB.end();
    if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
      return false;

If a block ends with a single unconditional branch instruction, then
``AnalyzeBranch`` (shown below) should return the destination of that branch in
the ``TBB`` parameter.

.. code-block:: c++

  if (LastOpc == ARM::B || LastOpc == ARM::tB) {
    TBB = LastInst->getOperand(0).getMBB();
    return false;
  }

If a block ends with two unconditional branches, then the second branch is
never reached. In that situation, as shown below, remove the last branch
instruction and return the penultimate branch in the ``TBB`` parameter.

.. code-block:: c++

  if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
      (LastOpc == ARM::B || LastOpc == ARM::tB)) {
    TBB = SecondLastInst->getOperand(0).getMBB();
    I = LastInst;
    I->eraseFromParent();
    return false;
  }

A block may end with a single conditional branch instruction that falls through
to the successor block if the condition evaluates to false. In that case,
``AnalyzeBranch`` (shown below) should return the destination of that
conditional branch in the ``TBB`` parameter and a list of operands in the
``Cond`` parameter to evaluate the condition.

.. code-block:: c++

  if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
    // Block ends with fall-through condbranch.
    TBB = LastInst->getOperand(0).getMBB();
    Cond.push_back(LastInst->getOperand(1));
    Cond.push_back(LastInst->getOperand(2));
    return false;
  }

If a block ends with both a conditional branch and an ensuing unconditional
branch, then ``AnalyzeBranch`` (shown below) should return the conditional
branch destination (assuming it corresponds to a conditional evaluation of
"``true``") in the ``TBB`` parameter and the unconditional branch destination
in the ``FBB`` (corresponding to a conditional evaluation of "``false``"). A
list of operands to evaluate the condition should be returned in the ``Cond``
parameter.

.. code-block:: c++

  unsigned SecondLastOpc = SecondLastInst->getOpcode();

  if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
      (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
    TBB =  SecondLastInst->getOperand(0).getMBB();
    Cond.push_back(SecondLastInst->getOperand(1));
    Cond.push_back(SecondLastInst->getOperand(2));
    FBB = LastInst->getOperand(0).getMBB();
    return false;
  }

For the last two cases (ending with a single conditional branch or ending with
one conditional and one unconditional branch), the operands returned in the
``Cond`` parameter can be passed to methods of other instructions to create new
branches or perform other operations. An implementation of ``AnalyzeBranch``
requires the helper methods ``RemoveBranch`` and ``InsertBranch`` to manage
subsequent operations.

``AnalyzeBranch`` should return false indicating success in most circumstances.
``AnalyzeBranch`` should only return true when the method is stumped about what
to do, for example, if a block has three terminating branches.
``AnalyzeBranch`` may return true if it encounters a terminator it cannot
handle, such as an indirect branch.

.. _instruction-selector:

Instruction Selector
====================

LLVM uses a ``SelectionDAG`` to represent LLVM IR instructions, and nodes of
the ``SelectionDAG`` ideally represent native target instructions. During code
generation, instruction selection passes are performed to convert non-native
DAG instructions into native target-specific instructions. The pass described
in ``XXXISelDAGToDAG.cpp`` is used to match patterns and perform DAG-to-DAG
instruction selection. Optionally, a pass may be defined (in
``XXXBranchSelector.cpp``) to perform similar DAG-to-DAG operations for branch
instructions. Later, the code in ``XXXISelLowering.cpp`` replaces or removes
operations and data types not supported natively (legalizes) in a
``SelectionDAG``.

TableGen generates code for instruction selection using the following target
description input files:

* ``XXXInstrInfo.td`` --- Contains definitions of instructions in a
  target-specific instruction set, generates ``XXXGenDAGISel.inc``, which is
  included in ``XXXISelDAGToDAG.cpp``.

* ``XXXCallingConv.td`` --- Contains the calling and return value conventions
  for the target architecture, and it generates ``XXXGenCallingConv.inc``,
  which is included in ``XXXISelLowering.cpp``.

The implementation of an instruction selection pass must include a header that
declares the ``FunctionPass`` class or a subclass of ``FunctionPass``. In
``XXXTargetMachine.cpp``, a Pass Manager (PM) should add each instruction
selection pass into the queue of passes to run.

The LLVM static compiler (``llc``) is an excellent tool for visualizing the
contents of DAGs.
To display the ``SelectionDAG`` before or after specific
processing phases, use the command line options for ``llc``, described at
:ref:`SelectionDAG-Process`.

To describe instruction selector behavior, you should add patterns for lowering
LLVM code into a ``SelectionDAG`` as the last parameter of the instruction
definitions in ``XXXInstrInfo.td``. For example, in ``SparcInstrInfo.td``,
this entry defines a register store operation, and the last parameter describes
a pattern with the store DAG operator.

.. code-block:: llvm

  def STrr  : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
                   "st $src, [$addr]", [(store i32:$src, ADDRrr:$addr)]>;

``ADDRrr`` is a memory mode that is also defined in ``SparcInstrInfo.td``:

.. code-block:: llvm

  def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;

The definition of ``ADDRrr`` refers to ``SelectADDRrr``, which is a function
defined in an implementation of the Instruction Selector (such as
``SparcISelDAGToDAG.cpp``).

In ``lib/Target/TargetSelectionDAG.td``, the DAG operator for store is defined
below:

.. code-block:: llvm

  def store : PatFrag<(ops node:$val, node:$ptr),
                      (st node:$val, node:$ptr), [{
    if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
      return !ST->isTruncatingStore() &&
             ST->getAddressingMode() == ISD::UNINDEXED;
    return false;
  }]>;

``XXXInstrInfo.td`` also generates (in ``XXXGenDAGISel.inc``) the
``SelectCode`` method that is used to call the appropriate processing method
for an instruction. In this example, ``SelectCode`` calls ``Select_ISD_STORE``
for the ``ISD::STORE`` opcode.

.. code-block:: c++

  SDNode *SelectCode(SDValue N) {
    ...
    MVT::ValueType NVT = N.getNode()->getValueType(0);
    switch (N.getOpcode()) {
    case ISD::STORE: {
      switch (NVT) {
      default:
        return Select_ISD_STORE(N);
        break;
      }
      break;
    }
    ...

The pattern for ``STrr`` is matched, so elsewhere in ``XXXGenDAGISel.inc``,
code for ``STrr`` is created for ``Select_ISD_STORE``. The ``Emit_22`` method
is also generated in ``XXXGenDAGISel.inc`` to complete the processing of this
instruction.

.. code-block:: c++

  SDNode *Select_ISD_STORE(const SDValue &N) {
    SDValue Chain = N.getOperand(0);
    if (Predicate_store(N.getNode())) {
      SDValue N1 = N.getOperand(1);
      SDValue N2 = N.getOperand(2);
      SDValue CPTmp0;
      SDValue CPTmp1;

      // Pattern: (st:void i32:i32:$src,
      //           ADDRrr:i32:$addr)<<P:Predicate_store>>
      // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
      // Pattern complexity = 13  cost = 1  size = 0
      if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
          N1.getNode()->getValueType(0) == MVT::i32 &&
          N2.getNode()->getValueType(0) == MVT::i32) {
        return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
      }
  ...

The SelectionDAG Legalize Phase
-------------------------------

The Legalize phase converts a DAG to use types and operations that are natively
supported by the target. For natively unsupported types and operations, you
need to add code to the target-specific ``XXXTargetLowering`` implementation to
convert unsupported types and operations to supported ones.

In the constructor for the ``XXXTargetLowering`` class, first use the
``addRegisterClass`` method to specify which types are supported and which
register classes are associated with them. The code for the register classes
is generated by TableGen from ``XXXRegisterInfo.td`` and placed in
``XXXGenRegisterInfo.h.inc``.
For example, the implementation of the
constructor for the ``SparcTargetLowering`` class (in
``SparcISelLowering.cpp``) starts with the following code:

.. code-block:: c++

  addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
  addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
  addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);

You should examine the node types in the ``ISD`` namespace
(``include/llvm/CodeGen/SelectionDAGNodes.h``) and determine which operations
the target natively supports. For operations that do **not** have native
support, add a callback to the constructor for the ``XXXTargetLowering`` class,
so the instruction selection process knows what to do. The ``TargetLowering``
class callback methods (declared in ``llvm/Target/TargetLowering.h``) are:

* ``setOperationAction`` --- General operation.
* ``setLoadExtAction`` --- Load with extension.
* ``setTruncStoreAction`` --- Truncating store.
* ``setIndexedLoadAction`` --- Indexed load.
* ``setIndexedStoreAction`` --- Indexed store.
* ``setConvertAction`` --- Type conversion.
* ``setCondCodeAction`` --- Support for a given condition code.

Note: on older releases, ``setLoadXAction`` is used instead of
``setLoadExtAction``. Also, on older releases, ``setCondCodeAction`` may not
be supported. Examine your release to see what methods are specifically
supported.

These callbacks are used to determine whether an operation works with a
specified type (or types). In all cases, the third parameter is a
``LegalizeAction`` enum value: ``Promote``, ``Expand``, ``Custom``, or
``Legal``. ``SparcISelLowering.cpp`` contains examples of all four
``LegalizeAction`` values.

Promote
^^^^^^^

For an operation without native support for a given type, the specified type
may be promoted to a larger type that is supported.
For example, SPARC does
not support a sign-extending load for Boolean values (``i1`` type), so in
``SparcISelLowering.cpp`` the third parameter below, ``Promote``, changes
``i1`` type values to a larger type before loading.

.. code-block:: c++

  setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);

Expand
^^^^^^

For a type without native support, a value may need to be broken down further,
rather than promoted. For an operation without native support, a combination
of other operations may be used to similar effect. In SPARC, the
floating-point sine and cosine trig operations are supported by expansion to
other operations, as indicated by the third parameter, ``Expand``, to
``setOperationAction``:

.. code-block:: c++

  setOperationAction(ISD::FSIN, MVT::f32, Expand);
  setOperationAction(ISD::FCOS, MVT::f32, Expand);

Custom
^^^^^^

For some operations, simple type promotion or operation expansion may be
insufficient. In some cases, a special intrinsic function must be implemented.

For example, a constant value may require special treatment, or an operation
may require spilling and restoring registers in the stack and working with
register allocators.

As seen in ``SparcISelLowering.cpp`` code below, to perform a type conversion
from a floating point value to a signed integer, first the
``setOperationAction`` should be called with ``Custom`` as the third parameter:

.. code-block:: c++

  setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);

In the ``LowerOperation`` method, for each ``Custom`` operation, a case
statement should be added to indicate what function to call. In the following
code, an ``FP_TO_SINT`` opcode will call the ``LowerFP_TO_SINT`` method:

.. code-block:: c++

  SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
    switch (Op.getOpcode()) {
    case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
    ...
    }
  }

Finally, the ``LowerFP_TO_SINT`` method is implemented, using an FP register to
convert the floating-point value to an integer.

.. code-block:: c++

  static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &DAG) {
    assert(Op.getValueType() == MVT::i32);
    Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
    return DAG.getNode(ISD::BITCAST, MVT::i32, Op);
  }

Legal
^^^^^

The ``Legal`` ``LegalizeAction`` enum value simply indicates that an operation
**is** natively supported. ``Legal`` represents the default condition, so it
is rarely used. In ``SparcISelLowering.cpp``, the action for ``CTPOP`` (an
operation to count the bits set in an integer) is natively supported only for
SPARC v9. The following code enables the ``Expand`` conversion technique for
non-v9 SPARC implementations.

.. code-block:: c++

  setOperationAction(ISD::CTPOP, MVT::i32, Expand);
  ...
  if (TM.getSubtarget<SparcSubtarget>().isV9())
    setOperationAction(ISD::CTPOP, MVT::i32, Legal);

Calling Conventions
-------------------

To support target-specific calling conventions, ``XXXCallingConv.td`` uses
interfaces (such as ``CCIfType`` and ``CCAssignToReg``) that are defined in
``lib/Target/TargetCallingConv.td``. TableGen can take the target descriptor
file ``XXXCallingConv.td`` and generate the header file
``XXXGenCallingConv.inc``, which is typically included in
``XXXISelLowering.cpp``. You can use the interfaces in
``TargetCallingConv.td`` to specify:

* The order of parameter allocation.

* Where parameters and return values are placed (that is, on the stack or in
  registers).
1452 1453* Which registers may be used. 1454 1455* Whether the caller or callee unwinds the stack. 1456 1457The following example demonstrates the use of the ``CCIfType`` and 1458``CCAssignToReg`` interfaces. If the ``CCIfType`` predicate is true (that is, 1459if the current argument is of type ``f32`` or ``f64``), then the action is 1460performed. In this case, the ``CCAssignToReg`` action assigns the argument 1461value to the first available register: either ``R0`` or ``R1``. 1462 1463.. code-block:: llvm 1464 1465 CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>> 1466 1467``SparcCallingConv.td`` contains definitions for a target-specific return-value 1468calling convention (``RetCC_Sparc32``) and a basic 32-bit C calling convention 1469(``CC_Sparc32``). The definition of ``RetCC_Sparc32`` (shown below) indicates 1470which registers are used for specified scalar return types. A single-precision 1471float is returned to register ``F0``, and a double-precision float goes to 1472register ``D0``. A 32-bit integer is returned in register ``I0`` or ``I1``. 1473 1474.. code-block:: llvm 1475 1476 def RetCC_Sparc32 : CallingConv<[ 1477 CCIfType<[i32], CCAssignToReg<[I0, I1]>>, 1478 CCIfType<[f32], CCAssignToReg<[F0]>>, 1479 CCIfType<[f64], CCAssignToReg<[D0]>> 1480 ]>; 1481 1482The definition of ``CC_Sparc32`` in ``SparcCallingConv.td`` introduces 1483``CCAssignToStack``, which assigns the value to a stack slot with the specified 1484size and alignment. In the example below, the first parameter, 4, indicates 1485the size of the slot, and the second parameter, also 4, indicates the stack 1486alignment along 4-byte units. (Special cases: if size is zero, then the ABI 1487size is used; if alignment is zero, then the ABI alignment is used.) 1488 1489.. code-block:: llvm 1490 1491 def CC_Sparc32 : CallingConv<[ 1492 // All arguments get passed in integer registers if there is space. 
    CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
    CCAssignToStack<4, 4>
  ]>;

``CCDelegateTo`` is another commonly used interface, which tries to find a
specified sub-calling convention, and, if a match is found, it is invoked. In
the following example (in ``X86CallingConv.td``), the definition of
``RetCC_X86_32_C`` ends with ``CCDelegateTo``. After the current value is
assigned to the register ``ST0`` or ``ST1``, the ``RetCC_X86Common`` is
invoked.

.. code-block:: llvm

  def RetCC_X86_32_C : CallingConv<[
    CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
    CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
    CCDelegateTo<RetCC_X86Common>
  ]>;

``CCIfCC`` is an interface that attempts to match the given name to the current
calling convention. If the name identifies the current calling convention,
then a specified action is invoked. In the following example (in
``X86CallingConv.td``), if the ``Fast`` calling convention is in use, then
``RetCC_X86_32_Fast`` is invoked. If the ``SSECall`` calling convention is in
use, then ``RetCC_X86_32_SSE`` is invoked.

.. code-block:: llvm

  def RetCC_X86_32 : CallingConv<[
    CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
    CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
    CCDelegateTo<RetCC_X86_32_C>
  ]>;

Other calling convention interfaces include:

* ``CCIf <predicate, action>`` --- If the predicate matches, apply the action.

* ``CCIfInReg <action>`` --- If the argument is marked with the "``inreg``"
  attribute, then apply the action.

* ``CCIfNest <action>`` --- If the argument is marked with the "``nest``"
  attribute, then apply the action.

* ``CCIfNotVarArg <action>`` --- If the current function does not take a
  variable number of arguments, apply the action.

* ``CCAssignToRegWithShadow <registerList, shadowList>`` --- similar to
  ``CCAssignToReg``, but with a shadow list of registers.

* ``CCPassByVal <size, align>`` --- Assign value to a stack slot with the
  minimum specified size and alignment.

* ``CCPromoteToType <type>`` --- Promote the current value to the specified
  type.

* ``CallingConv <[actions]>`` --- Define each calling convention that is
  supported.

Assembly Printer
================

During the code emission stage, the code generator may utilize an LLVM pass to
produce assembly output. To do this, you want to implement the code for a
printer that converts LLVM IR to a GAS-format assembly language for your target
machine, using the following steps:

* Define all the assembly strings for your target, adding them to the
  instructions defined in the ``XXXInstrInfo.td`` file. (See
  :ref:`instruction-set`.) TableGen will produce an output file
  (``XXXGenAsmWriter.inc``) with an implementation of the ``printInstruction``
  method for the ``XXXAsmPrinter`` class.

* Write ``XXXTargetAsmInfo.h``, which contains the bare-bones declaration of
  the ``XXXTargetAsmInfo`` class (a subclass of ``TargetAsmInfo``).

* Write ``XXXTargetAsmInfo.cpp``, which contains target-specific values for
  ``TargetAsmInfo`` properties and sometimes new implementations for methods.

* Write ``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that
  performs the LLVM-to-assembly conversion.

The code in ``XXXTargetAsmInfo.h`` is usually a trivial declaration of the
``XXXTargetAsmInfo`` class for use in ``XXXTargetAsmInfo.cpp``. Similarly,
``XXXTargetAsmInfo.cpp`` usually has a few declarations of ``XXXTargetAsmInfo``
replacement values that override the default values in ``TargetAsmInfo.cpp``.
For example in ``SparcTargetAsmInfo.cpp``:

..
code-block:: c++

  SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
    Data16bitsDirective = "\t.half\t";
    Data32bitsDirective = "\t.word\t";
    Data64bitsDirective = 0;  // .xword is only supported by V9.
    ZeroDirective = "\t.skip\t";
    CommentString = "!";
    ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
  }

The X86 assembly printer implementation (``X86TargetAsmInfo``) is an example
where the target-specific ``TargetAsmInfo`` class uses an overridden method:
``ExpandInlineAsm``.

A target-specific implementation of ``AsmPrinter`` is written in
``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that converts
LLVM IR to printable assembly. The implementation must include the following
headers, which declare the ``AsmPrinter`` and ``MachineFunctionPass`` classes.
``MachineFunctionPass`` is a subclass of ``FunctionPass``.

.. code-block:: c++

  #include "llvm/CodeGen/AsmPrinter.h"
  #include "llvm/CodeGen/MachineFunctionPass.h"

As a ``FunctionPass``, ``AsmPrinter`` first calls ``doInitialization`` to set
up the ``AsmPrinter``. In ``SparcAsmPrinter``, a ``Mangler`` object is
instantiated to process variable names.

In ``XXXAsmPrinter.cpp``, the ``runOnMachineFunction`` method (declared in
``MachineFunctionPass``) must be implemented for ``XXXAsmPrinter``. In
``MachineFunctionPass``, the ``runOnFunction`` method invokes
``runOnMachineFunction``. Target-specific implementations of
``runOnMachineFunction`` differ, but generally do the following to process each
machine function:

* Call ``SetupMachineFunction`` to perform initialization.

* Call ``EmitConstantPool`` to print out (to the output stream) constants which
  have been spilled to memory.

* Call ``EmitJumpTableInfo`` to print out jump tables used by the current
  function.

* Print out the label for the current function.

* Print out the code for the function, including basic block labels and the
  assembly for each instruction (using ``printInstruction``).

The ``XXXAsmPrinter`` implementation must also include the code generated by
TableGen that is output in the ``XXXGenAsmWriter.inc`` file. The code in
``XXXGenAsmWriter.inc`` contains an implementation of the ``printInstruction``
method that may call these methods:

* ``printOperand``
* ``printMemOperand``
* ``printCCOperand`` (for conditional statements)
* ``printDataDirective``
* ``printDeclare``
* ``printImplicitDef``
* ``printInlineAsm``

The implementations of ``printDeclare``, ``printImplicitDef``,
``printInlineAsm``, and ``printLabel`` in ``AsmPrinter.cpp`` are generally
adequate for printing assembly and do not need to be overridden.

The ``printOperand`` method is implemented with a long ``switch``/``case``
statement for the type of operand: register, immediate, basic block, external
symbol, global address, constant pool index, or jump table index. For an
instruction with a memory address operand, the ``printMemOperand`` method
should be implemented to generate the proper output. Similarly,
``printCCOperand`` should be used to print a conditional operand.

``doFinalization`` should be overridden in ``XXXAsmPrinter``, and it should be
called to shut down the assembly printer. During ``doFinalization``, global
variables and constants are printed to output.

Subtarget Support
=================

Subtarget support is used to inform the code generation process of instruction
set variations for a given chip set.
For example, the LLVM SPARC
implementation provided covers three major versions of the SPARC microprocessor
architecture: Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a
64-bit architecture), and the UltraSPARC architecture. V8 has 16
double-precision floating-point registers that are also usable as either 32
single-precision or 8 quad-precision registers. V8 is also purely big-endian.
V9 has 32 double-precision floating-point registers that are also usable as 16
quad-precision registers, but cannot be used as single-precision registers.
The UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
extensions.

If subtarget support is needed, you should implement a target-specific
``XXXSubtarget`` class for your architecture. This class should process the
command-line options ``-mcpu=`` and ``-mattr=``.

TableGen uses definitions in the ``Target.td`` and ``Sparc.td`` files to
generate code in ``SparcGenSubtarget.inc``. In ``Target.td``, shown below, the
``SubtargetFeature`` interface is defined. The first 4 string parameters of
the ``SubtargetFeature`` interface are a feature name, an attribute set by the
feature, the value of the attribute, and a description of the feature. (The
fifth parameter is a list of features whose presence is implied, and its
default value is an empty array.)

.. code-block:: llvm

  class SubtargetFeature<string n, string a, string v, string d,
                         list<SubtargetFeature> i = []> {
    string Name = n;
    string Attribute = a;
    string Value = v;
    string Desc = d;
    list<SubtargetFeature> Implies = i;
  }

In the ``Sparc.td`` file, the ``SubtargetFeature`` is used to define the
following features.

..
code-block:: llvm

  def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
                       "Enable SPARC-V9 instructions">;
  def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
                       "V8DeprecatedInsts", "true",
                       "Enable deprecated V8 instructions in V9 mode">;
  def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
                       "Enable UltraSPARC Visual Instruction Set extensions">;

Elsewhere in ``Sparc.td``, the ``Proc`` class is defined and then is used to
define particular SPARC processor subtypes that may have the previously
described features.

.. code-block:: llvm

  class Proc<string Name, list<SubtargetFeature> Features>
    : Processor<Name, NoItineraries, Features>;

  def : Proc<"generic", []>;
  def : Proc<"v8", []>;
  def : Proc<"supersparc", []>;
  def : Proc<"sparclite", []>;
  def : Proc<"f934", []>;
  def : Proc<"hypersparc", []>;
  def : Proc<"sparclite86x", []>;
  def : Proc<"sparclet", []>;
  def : Proc<"tsc701", []>;
  def : Proc<"v9", [FeatureV9]>;
  def : Proc<"ultrasparc", [FeatureV9, FeatureV8Deprecated]>;
  def : Proc<"ultrasparc3", [FeatureV9, FeatureV8Deprecated]>;
  def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;

From ``Target.td`` and ``Sparc.td`` files, the resulting
``SparcGenSubtarget.inc`` specifies enum values to identify the features,
arrays of constants to represent the CPU features and CPU subtypes, and the
``ParseSubtargetFeatures`` method that parses the features string that sets
specified subtarget options. The generated ``SparcGenSubtarget.inc`` file
should be included in the ``SparcSubtarget.cpp``. The target-specific
implementation of the ``XXXSubtarget`` method should follow this pseudocode:

..
code-block:: c++

  XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
    // Set the default features
    // Determine default and user specified characteristics of the CPU
    // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
    // Perform any additional operations
  }

JIT Support
===========

The implementation of a target machine optionally includes a Just-In-Time (JIT)
code generator that emits machine code and auxiliary structures as binary
output that can be written directly to memory. To do this, implement JIT code
generation by performing the following steps:

* Write an ``XXXCodeEmitter.cpp`` file that contains a machine function pass
  that transforms target-machine instructions into relocatable machine
  code.

* Write an ``XXXJITInfo.cpp`` file that implements the JIT interfaces for
  target-specific code-generation activities, such as emitting machine code and
  stubs.

* Modify ``XXXTargetMachine`` so that it provides a ``TargetJITInfo`` object
  through its ``getJITInfo`` method.

There are several different approaches to writing the JIT support code. For
instance, TableGen and target descriptor files may be used for creating a JIT
code generator, but are not mandatory. For the Alpha and PowerPC target
machines, TableGen is used to generate ``XXXGenCodeEmitter.inc``, which
contains the binary coding of machine instructions and the
``getBinaryCodeForInstr`` method to access those codes. Other JIT
implementations do not.

Both ``XXXJITInfo.cpp`` and ``XXXCodeEmitter.cpp`` must include the
``llvm/CodeGen/MachineCodeEmitter.h`` header file that defines the
``MachineCodeEmitter`` class containing code for several callback functions
that write data (in bytes, words, strings, etc.) to the output stream.
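
To illustrate the kind of callbacks described above, the following is a
minimal, self-contained sketch of a byte-oriented emitter. The class name
``ToyCodeEmitter`` and all of its methods are hypothetical stand-ins and are
not the actual ``MachineCodeEmitter`` interface. The sketch only shows how
byte, word, and string writers can be layered on one output buffer, and it
assumes little-endian word order.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>
  #include <string>
  #include <vector>

  // Hypothetical stand-in for a code emitter; NOT the LLVM MachineCodeEmitter.
  class ToyCodeEmitter {
    std::vector<uint8_t> Buffer; // bytes emitted so far

  public:
    // Append a single byte to the output.
    void emitByte(uint8_t B) { Buffer.push_back(B); }

    // Append a 32-bit word, least significant byte first (assumed order).
    void emitWordLE(uint32_t W) {
      for (int Shift = 0; Shift < 32; Shift += 8)
        emitByte(static_cast<uint8_t>((W >> Shift) & 0xFF));
    }

    // Append the characters of a string followed by a terminating zero byte.
    void emitString(const std::string &S) {
      for (char C : S)
        emitByte(static_cast<uint8_t>(C));
      emitByte(0);
    }

    // Offset at which the next byte will be written.
    std::size_t getCurrentOffset() const { return Buffer.size(); }

    // Read-only view of everything emitted so far.
    const std::vector<uint8_t> &bytes() const { return Buffer; }
  };

Building the wider writers on top of ``emitByte`` keeps the byte order decision
in one place, which is one possible design and not necessarily the one LLVM
uses.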

Machine Code Emitter
--------------------

In ``XXXCodeEmitter.cpp``, a target-specific implementation of the ``Emitter``
class is written as a function pass (a subclass of ``MachineFunctionPass``).
The target-specific implementation of ``runOnMachineFunction`` (invoked by
``runOnFunction`` in ``MachineFunctionPass``) iterates through each
``MachineBasicBlock`` and calls ``emitInstruction`` to process each instruction
and emit binary code. ``emitInstruction`` is largely implemented with case
statements on the instruction types defined in ``XXXInstrInfo.h``. For
example, in ``X86CodeEmitter.cpp``, the ``emitInstruction`` method is built
around the following ``switch``/``case`` statements:

.. code-block:: c++

  switch (Desc->TSFlags & X86II::FormMask) {
  case X86II::Pseudo:  // for not yet implemented instructions
    ...                // or pseudo-instructions
    break;
  case X86II::RawFrm:  // for instructions with a fixed opcode value
    ...
    break;
  case X86II::AddRegFrm: // for instructions that have one register operand
    ...                  // added to their opcode
    break;
  case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
    ...                  // to specify a destination (register)
    break;
  case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
    ...                  // to specify a destination (memory)
    break;
  case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
    ...                  // to specify a source (register)
    break;
  case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
    ...                  // to specify a source (memory)
    break;
  case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
  case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
  case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
  case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
    ...
    break;
  case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
  case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
  case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
  case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
    ...
    break;
  case X86II::MRMInitReg: // for instructions whose source and
    ...                   // destination are the same register
    break;
  }

The implementations of these case statements often first emit the opcode and
then get the operand(s). Then depending upon the operand, helper methods may
be called to process the operand(s). For example, in ``X86CodeEmitter.cpp``,
for the ``X86II::AddRegFrm`` case, the first data emitted (by ``emitByte``) is
the opcode added to the register operand. Then an object representing the
machine operand, ``MO1``, is extracted. The helper methods such as
``isImmediate``, ``isGlobalAddress``, ``isExternalSymbol``,
``isConstantPoolIndex``, and ``isJumpTableIndex`` determine the operand type.
(``X86CodeEmitter.cpp`` also has private methods such as ``emitConstant``,
``emitGlobalAddress``, ``emitExternalSymbolAddress``, ``emitConstPoolAddress``,
and ``emitJumpTableAddress`` that emit the data into the output stream.)

.. code-block:: c++

  case X86II::AddRegFrm:
    MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

    if (CurOp != NumOps) {
      const MachineOperand &MO1 = MI.getOperand(CurOp++);
      unsigned Size = X86InstrInfo::sizeOfImm(Desc);
      if (MO1.isImmediate())
        emitConstant(MO1.getImm(), Size);
      else {
        unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
          : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
        if (Opcode == X86::MOV64ri)
          rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
        if (MO1.isGlobalAddress()) {
          bool NeedStub = isa<Function>(MO1.getGlobal());
          bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
          emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                            NeedStub, isLazy);
        } else if (MO1.isExternalSymbol())
          emitExternalSymbolAddress(MO1.getSymbolName(), rt);
        else if (MO1.isConstantPoolIndex())
          emitConstPoolAddress(MO1.getIndex(), rt);
        else if (MO1.isJumpTableIndex())
          emitJumpTableAddress(MO1.getIndex(), rt);
      }
    }
    break;

In the previous example, ``XXXCodeEmitter.cpp`` uses the variable ``rt``, which
is a ``RelocationType`` enum that may be used to relocate addresses (for
example, a global address with a PIC base offset). The ``RelocationType`` enum
for that target is defined in the short target-specific ``XXXRelocations.h``
file. The ``RelocationType`` is used by the ``relocate`` method defined in
``XXXJITInfo.cpp`` to rewrite addresses for referenced global symbols.

For example, ``X86Relocations.h`` specifies the following relocation types for
the X86 addresses. In all four cases, the relocated value is added to the
value already in memory. For ``reloc_pcrel_word`` and ``reloc_picrel_word``,
there is an additional initial adjustment.

.. code-block:: c++

  enum RelocationType {
    reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
    reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
    reloc_absolute_word = 2, // absolute relocation; no additional adjustment
    reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
  };

Target JIT Info
---------------

``XXXJITInfo.cpp`` implements the JIT interfaces for target-specific
code-generation activities, such as emitting machine code and stubs.
At minimum, a target-specific version of ``XXXJITInfo`` implements the
following:

* ``getLazyResolverFunction`` --- Initializes the JIT and gives the target a
  function that is used for compilation.

* ``emitFunctionStub`` --- Returns a native function with a specified address
  for a callback function.

* ``relocate`` --- Changes the addresses of referenced globals, based on
  relocation types.

* Callback functions that are wrappers for a function stub, used when the
  real target is not initially known.

``getLazyResolverFunction`` is generally trivial to implement. It stores the
incoming parameter in the global ``JITCompilerFunction`` and returns the
callback function that will be used as a function wrapper. For the Alpha
target (in ``AlphaJITInfo.cpp``), the ``getLazyResolverFunction``
implementation is simply:

.. code-block:: c++

  TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
                                              JITCompilerFn F) {
    JITCompilerFunction = F;
    return AlphaCompilationCallback;
  }

For the X86 target, the ``getLazyResolverFunction`` implementation is a little
more complicated, because it returns a different callback function for
processors with SSE instructions and XMM registers.

The callback function initially saves and later restores the callee register
values, incoming arguments, and frame and return address. The callback
function needs low-level access to the registers or stack, so it is typically
implemented in assembly.
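
The handoff that ``getLazyResolverFunction`` performs can be sketched in
isolation. The following is a simplified, portable model and not the real
``TargetJITInfo`` interface: the type aliases, ``ToyJITInfo``,
``toyCompilationCallback``, ``toyCompiler``, and the helper globals are all
hypothetical names chosen for illustration. The callback here is plain C++
rather than assembly, so it does not save or restore any registers.

.. code-block:: c++

  // Hypothetical simplified signatures; the real LLVM typedefs differ.
  using JITCompilerFn = void *(*)(void *Stub);
  using LazyResolverFn = void (*)();

  // Global set by getLazyResolverFunction, mirroring the pattern in the text.
  static JITCompilerFn JITCompilerFunction = nullptr;

  // Stands in for the stub whose call triggered lazy compilation.
  static int StubToken = 0;
  // Records the address returned by the compiler function.
  static void *LastResolved = nullptr;

  // Stand-in for the compilation callback.  A real callback would save
  // registers, locate the calling stub, invoke the compiler, and then
  // restore state before jumping to the compiled code.
  static void toyCompilationCallback() {
    LastResolved = JITCompilerFunction(&StubToken);
  }

  struct ToyJITInfo {
    // Remember the compiler entry point and hand back the callback wrapper.
    LazyResolverFn getLazyResolverFunction(JITCompilerFn F) {
      JITCompilerFunction = F;
      return toyCompilationCallback;
    }
  };

  // Hypothetical "compiler": pretends to compile and returns a code address.
  static int CompiledBody = 0;
  static void *toyCompiler(void *Stub) {
    (void)Stub; // the stub address is unused in this toy model
    return &CompiledBody;
  }

In this model the JIT would call ``getLazyResolverFunction(toyCompiler)`` once
and later invoke the returned callback whenever an unresolved stub is hit.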