==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language.
There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a named value, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
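
As a rough illustration of the identifier formats listed above, the
following sketch uses made-up names (``@counter``, ``@"odd name"`` and
``@inc`` are assumptions for this example, not names from any real module):

.. code-block:: llvm

    ; Named global values carry the '@' prefix.
    @counter = global i32 0
    ; A name with characters outside the regular expression must be quoted.
    @"odd name" = global i32 1

    define i32 @inc(i32 %x) {
      ; %x is a named local value, %1 is an unnamed one, and the literal 1
      ; is a constant. The unlabeled entry block implicitly takes number 0.
      %1 = add i32 %x, 1
      ret i32 %1
    }
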

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document.
When demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8*...
      %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded.
    Note that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule", or "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a function *declaration* to have any linkage type
other than ``external`` or ``extern_weak``.

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers).
    This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32* only supports up to 4 bit type parameters. No
       floating-point types are supported.
    -  On *X86-64* only supports up to 10 bit type parameters and 6
       floating-point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible.
    This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply to values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore, `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too.
    The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply to values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time the function is called. The entry and
    exit blocks can access a few TLS IR variables; each access will be lowered
    to a platform-specific sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use the AAPCS-VFP calling convention.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object.
    Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a DLL interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models may
be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

For platforms without linker support for the ELF TLS model, the
``-femulated-tls`` flag can be used to generate GCC compatible emulated TLS
code.

.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.

.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.
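
The three kinds of structure type mentioned above can be sketched side by
side as follows. All names here (``%pair``, ``%handle``, ``@origin``,
``@unit`` and ``@close``) are hypothetical and chosen only for illustration:

.. code-block:: llvm

    ; Identified (named) structure type; never uniqued.
    %pair = type { i32, i32 }
    ; Opaque structure type: a forward declaration whose body is not yet known.
    %handle = type opaque

    ; Literal structure type written inline; uniqued structurally.
    @origin = global { i32, i32 } { i32 0, i32 0 }
    ; The same layout, but through the identified type.
    @unit = global %pair { i32 1, i32 1 }

    ; An opaque type can still be referred to through a pointer.
    declare void @close(%handle*)
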

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.

.. _nointptrtype:

Non-Integral Pointer Type
-------------------------

Note: non-integral pointer types are a work in progress, and they should be
considered experimental at this time.

LLVM IR optionally allows the frontend to denote pointers in certain address
spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
Non-integral pointer types represent pointers that have an *unspecified* bitwise
representation; that is, the integral representation may be target dependent or
unstable (not backed by a fixed integer).

``inttoptr`` instructions converting integers to non-integral pointer types are
ill-typed, and so are ``ptrtoint`` instructions converting values of
non-integral pointer types to integers. Vector versions of said instructions
are ill-typed as well.

.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

Either global variable definitions or declarations may have an explicit section
to be placed in and may have an optional explicit alignment specified. If there
is a mismatch between the explicit or inferred section information for the
variable declaration and its definition the resulting behavior is undefined.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
(i.e. they dominate) all basic blocks in the program. Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

A global variable may be declared to reside in a target-specific
numbered address space. For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.
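
A minimal sketch of the markers discussed so far, using hypothetical
names (``@.greeting``, ``@scratch``, ``@limit``) and an arbitrarily chosen
address space number:

.. code-block:: llvm

    ; Read-only data whose address is not significant; it may be merged
    ; with another constant that has the same initializer.
    @.greeting = private unnamed_addr constant [3 x i8] c"hi\00"

    ; Mutable global in address space 1 (the number is only an example).
    @scratch = addrspace(1) global i32 0

    ; Declaration marked constant; the definition lives in another module.
    @limit = external constant i32
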

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.

External declarations may have an explicit section specified. Section
information is retained in LLVM IR for targets that make use of this
information. Attaching section information to an external declaration is an
assertion that its definition is located in the specified section. If the
definition is located in a different section, the behavior is undefined.

By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their
initial values before the start of the global initializer. This is
true even for variables potentially accessible from outside the
module, including those with external linkage or appearing in
``@llvm.used`` or dllexported variables. This assumption may be suppressed
by marking the variable with ``externally_initialized``.

An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it finds
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
iteration. The maximum alignment is ``1 << 29``.
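
For instance, the section, alignment and ``externally_initialized``
properties described above might be written as in the following sketch.
The names ``@table`` and ``@host_flag`` and the section name ``"mydata"``
are assumptions made for this example:

.. code-block:: llvm

    ; Explicit section and a power-of-two alignment. Because a section is
    ; assigned, the global must not be over-aligned beyond 16.
    @table = global [4 x i32] zeroinitializer, section "mydata", align 16

    ; The optimizer may not assume this global still holds 0 when the
    ; global initializers start running.
    @host_flag = externally_initialized global i32 0
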

Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
an optional list of attached :ref:`metadata <metadata>`.

Variables and aliases can have a
:ref:`Thread Local Storage Model <tls_model>`.

Syntax::

      @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
                         [DLLStorageClass] [ThreadLocal]
                         [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
                         [ExternallyInitialized]
                         <global | constant> <Type> [<InitializerConstant>]
                         [, section "name"] [, comdat [($name)]]
                         [, align <Alignment>] (, !name !N)*

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example just declares a global variable:

.. code-block:: llvm

   @G = external global i32

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4

.. _functionstructure:

Functions
---------

LLVM function definitions consist of the "``define``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
specifier <runtime_preemption_model>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a (possibly empty) argument list (each with optional :ref:`parameter
attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
an optional section, an optional alignment,
an optional :ref:`comdat <langref_comdats>`,
an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
an optional :ref:`prologue <prologuedata>`,
an optional :ref:`personality <personalityfn>`,
an optional list of attached :ref:`metadata <metadata>`,
an opening curly brace, a list of basic blocks, and a closing curly brace.

LLVM function declarations consist of the "``declare``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
or ``local_unnamed_addr`` attribute, a return type, an optional :ref:`parameter
attribute <paramattrs>` for the return type, a function name, a possibly
empty list of arguments, an optional alignment, an optional :ref:`garbage
collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
:ref:`prologue <prologuedata>`.

A function definition contains a list of basic blocks, forming the CFG (Control
Flow Graph) for the function.
Each basic block may optionally start with a label
(giving the basic block a symbol table entry), contains a list of instructions,
and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
function return). If an explicit label is not provided, a block is assigned an
implicit numbered label, using the next value from the same counter as used for
unnamed temporaries (:ref:`see above<identifiers>`). For example, if a function
entry block does not have an explicit label, it will be assigned label "%0",
then the first unnamed temporary in that block will be "%1", etc.

The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.

LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.
Additionally, the function can be placed in a COMDAT.

An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it finds convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.

If the ``unnamed_addr`` attribute is given, the address is known to not
be significant and two identical functions can be merged.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.
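
The following sketch illustrates the implicit block numbering and the
optional markers described above. The function names ``@twice`` and
``@sum`` and the section name ``.text.hot`` are hypothetical:

.. code-block:: llvm

    ; The entry block has no explicit label, so it is implicitly %0 and
    ; the first unnamed temporary in it is %1.
    define i32 @twice(i32 %x) {
      %1 = add i32 %x, %x
      ret i32 %1
    }

    ; An explicitly labeled entry block, with unnamed_addr, an explicit
    ; section and a minimum alignment of 16 bytes.
    define internal i32 @sum(i32 %a, i32 %b) unnamed_addr section ".text.hot" align 16 {
    entry:
      %r = add i32 %a, %b
      ret i32 %r
    }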

Syntax::

    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [(unnamed_addr|local_unnamed_addr)] [fn Attrs] [section "name"]
           [comdat [($name)]] [align N] [gc] [prefix Constant]
           [prologue Constant] [personality Constant] (!name !N)* { ... }

The argument list is a comma separated sequence of arguments where each
argument is of the following form:

Syntax::

   <type> [parameter Attrs] [name]


.. _langref_aliases:

Aliases
-------

Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.

Aliases have a name and an aliasee that is either a global value or a
constant expression.

Aliases may have an optional :ref:`linkage type <linkage>`, an optional
:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.

Syntax::

    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>

The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
might not correctly handle dropping a weak symbol that is aliased.

Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
to the same content.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.
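
A minimal sketch of the alias syntax above, using the typed-pointer form
shown in this document. All symbol names here are hypothetical:

.. code-block:: llvm

    @target = global i32 42

    define void @impl() {
      ret void
    }

    ; Not unnamed_addr: guaranteed to have the same address as @target.
    @target_alias = alias i32, i32* @target

    ; unnamed_addr: only guaranteed to point to the same content as @impl.
    @impl_alias = internal unnamed_addr alias void (), void ()* @impl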

Since aliases are only a second name, some restrictions apply, of which
some can only be checked when producing an object file:

* The expression defining the aliasee must be computable at assembly
  time. Since it is just a name, no relocations can be used.

* No alias in the expression can be weak as the possibility of the
  intermediate alias being overridden cannot be represented in an
  object file.

* No global value in the expression can be a declaration, since that
  would require a relocation, which is not possible.

.. _langref_ifunc:

IFuncs
-------

IFuncs, like aliases, don't create any new data or functions. They are just
a new symbol that the dynamic linker resolves at runtime by calling a
resolver function.

IFuncs have a name and a resolver, which is a function called by the dynamic
linker that returns the address of another function associated with the name.

IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.

Syntax::

    @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>


.. _langref_comdats:

Comdats
-------

Comdat IR provides access to COFF and ELF object file COMDAT functionality.

Comdats have a name which represents the COMDAT key. All global objects that
specify this key will only end up in the final object file if the linker chooses
that key over some other key. Aliases are placed in the same COMDAT that their
aliasee computes to, if any.

Comdats have a selection kind to provide input on how the linker should
choose between keys in two different object files.

Syntax::

    $<Name> = comdat SelectionKind

The selection kind must be one of the following:

``any``
    The linker may choose any COMDAT key, the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``noduplicates``
    The linker requires that only one section with this COMDAT key exist.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.

Note that the Mach-O platform doesn't support COMDATs, and ELF and WebAssembly
only support ``any`` as a selection kind.

Here is an example of a COMDAT group where a function will only be selected if
the COMDAT key's section is the largest:

.. code-block:: text

   $foo = comdat largest
   @foo = global i32 2, comdat($foo)

   define void @bar() comdat($foo) {
     ret void
   }

As a syntactic sugar the ``$name`` can be omitted if the name is the same as
the global name:

.. code-block:: text

  $foo = comdat any
  @foo = global i32 2, comdat


In a COFF object file, the ``comdat largest`` example above will create a
COMDAT section with selection kind ``IMAGE_COMDAT_SELECT_LARGEST`` containing
the contents of the ``@foo`` symbol and another COMDAT section with selection
kind ``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first
COMDAT section and contains the contents of the ``@bar`` symbol.

There are some restrictions on the properties of the global object.
It, or an alias to it, must have the same name as the COMDAT group when
targeting COFF.
The contents and size of this object may be used during link-time to determine
which COMDAT groups get selected depending on the selection kind.
Because the name of the object must match the name of the COMDAT group, the
linkage of the global object must not be local; local symbols can get renamed
if a collision occurs in the symbol table.

The combined use of COMDATs and section attributes may yield surprising results.
For example:

.. code-block:: text

   $foo = comdat any
   $bar = comdat any
   @g1 = global i32 42, section "sec", comdat($foo)
   @g2 = global i32 42, section "sec", comdat($bar)

From the object file perspective, this requires the creation of two sections
with the same name. This is necessary because both globals belong to different
COMDAT groups and COMDATs, at the object file level, are represented by
sections.

Note that certain IR constructs like global variables and functions may
create COMDATs in the object file in addition to any which are specified using
COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).

.. _namedmetadatastructure:

Named Metadata
--------------

Named metadata is a collection of metadata. :ref:`Metadata
nodes <metadata>` (but not metadata strings) are the only valid
operands for a named metadata.

#. Named metadata are represented as a string of characters with the
   metadata prefix. The rules for metadata names are the same as for
   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
   are still valid, which allows any character to be part of a name.

Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = !{!"zero"}
    !1 = !{!"one"}
    !2 = !{!"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}

.. _paramattrs:

Parameter Attributes
--------------------

The return type and each parameter of a function type may have a set of
*parameter attributes* associated with them. Parameter attributes are
used to communicate additional information about the result or
parameters of a function.
Parameter attributes are considered to be part
of the function, not of the function type, so functions with different
parameter attributes can have the same function type.

Parameter attributes are simple keywords that follow the type specified.
If multiple parameter attributes are needed, they are space separated.
For example:

.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()

Note that function attributes such as ``nounwind`` and ``readonly`` apply
to the function as a whole and come immediately after the argument list.

Currently, only the following parameter attributes are defined:

``zeroext``
    This indicates to the code generator that the parameter or return
    value should be zero-extended to the extent required by the target's
    ABI by the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
``inreg``
    This indicates that this parameter or return value should be treated
    in a special target-dependent fashion while emitting code for
    a function call or return (usually, by putting it in a register as
    opposed to memory, though some targets use it to distinguish between
    two different kinds of registers). Use of this attribute is
    target-specific.
``byval``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments.
    It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The byval attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

.. _attr_inalloca:

``inalloca``

    The ``inalloca`` argument attribute allows the caller to take the
    address of outgoing stack arguments. An ``inalloca`` argument must
    be a pointer to stack memory produced by an ``alloca`` instruction.
    The alloca, or argument allocation, must also be tagged with the
    inalloca keyword. Only the last argument may have the ``inalloca``
    attribute, and that argument is guaranteed to be passed in memory.

    An argument allocation may be used by a call at most once because
    the call may deallocate it. The ``inalloca`` attribute cannot be
    used in conjunction with other attributes that affect argument
    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
    ``inalloca`` attribute also disables LLVM's implicit lowering of
    large aggregate return values, which means that frontend authors
    must lower them with ``sret`` pointers.

    When the call site is reached, the argument allocation must have
    been the most recent stack allocation that is still live, or the
    behavior is undefined. It is possible to allocate additional stack
    space after an argument allocation and before its call site, but it
    must be cleared off with :ref:`llvm.stackrestore
    <int_stackrestore>`.

    See :doc:`InAlloca` for more information on how to use this
    attribute.

``sret``
    This indicates that the pointer parameter specifies the address of a
    structure that is the return value of the function in the source
    program. This pointer must be guaranteed by the caller to be valid:
    loads and stores to the structure may be assumed by the callee not
    to trap and to be properly aligned. This is not a valid attribute
    for return values.

.. _attr_align:

``align <n>``
    This indicates that the pointer value may be assumed by the optimizer to
    have the specified alignment.

    Note that this attribute has additional semantics when combined with the
    ``byval`` attribute.

.. _noalias:

``noalias``
    This indicates that objects accessed via pointer values
    :ref:`based <pointeraliasing>` on the argument or return value are not also
    accessed, during the execution of the function, via pointer values not
    *based* on the argument or return value. The attribute on a return value
    also has additional semantics described below. The caller shares the
    responsibility with the callee for ensuring that these requirements are met.
    For further details, please see the discussion of the NoAlias response in
    :ref:`alias analysis <Must, May, or No>`.

    Note that this definition of ``noalias`` is intentionally similar
    to the definition of ``restrict`` in C99 for function arguments.

    For function return values, C99's ``restrict`` is not meaningful,
    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
    attribute on return values are stronger than the semantics of the attribute
    when used on function arguments.
    On function return values, the ``noalias``
    attribute indicates that the function acts like a system memory allocation
    function, returning a pointer to allocated storage disjoint from the
    storage for any other object accessible to the caller.

``nocapture``
    This indicates that the callee does not make any copies of the
    pointer that outlive the callee itself. This is not a valid
    attribute for return values. Addresses used in volatile operations
    are considered to be captured.

.. _nest:

``nest``
    This indicates that the pointer parameter can be excised using the
    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
    attribute for return values and can only be applied to one parameter.

``returned``
    This indicates that the function always returns the argument as its return
    value. This is a hint to the optimizer and code generator used when
    generating the caller, allowing value propagation, tail call optimization,
    and omission of register saves and restores in some cases; it is not
    checked or enforced when generating the callee. The parameter and the
    function return type must be valid operands for the
    :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
    return values and can only be applied to one parameter.

``nonnull``
    This indicates that the parameter or return pointer is not null. This
    attribute may only be applied to pointer typed parameters. This is not
    checked or enforced by LLVM; if the parameter or return pointer is null,
    the behavior is undefined.

``dereferenceable(<n>)``
    This indicates that the parameter or return pointer is dereferenceable. This
    attribute may only be applied to pointer typed parameters. A pointer that
    is dereferenceable can be loaded from speculatively without a risk of
    trapping.
    The number of bytes known to be dereferenceable must be provided
    in parentheses. It is legal for the number of bytes to be less than the
    size of the pointee type. The ``nonnull`` attribute does not imply
    dereferenceability (consider a pointer to one element past the end of an
    array); however, ``dereferenceable(<n>)`` does imply ``nonnull`` in
    ``addrspace(0)`` (which is the default address space).

``dereferenceable_or_null(<n>)``
    This indicates that the parameter or return value isn't both
    non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
    time. All non-null pointers tagged with
    ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
    For address space 0 ``dereferenceable_or_null(<n>)`` implies that
    a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
    and in other address spaces ``dereferenceable_or_null(<n>)``
    implies that a pointer is at least one of ``dereferenceable(<n>)``
    or ``null`` (i.e. it may be both ``null`` and
    ``dereferenceable(<n>)``). This attribute may only be applied to
    pointer typed parameters.

``swiftself``
    This indicates that the parameter is the self/context parameter. This is not
    a valid attribute for return values and can only be applied to one
    parameter.

``swifterror``
    This attribute is motivated by the need to model and optimize Swift error
    handling. It can be applied to a parameter with pointer to pointer type or
    a pointer-sized alloca. At the call site, the actual argument that
    corresponds to a ``swifterror`` parameter has to come from a ``swifterror``
    alloca or the ``swifterror`` parameter of the caller. A ``swifterror``
    value (either the parameter or the alloca) can only be loaded from and
    stored to, or used as a ``swifterror`` argument. This is not a valid
    attribute for return values and can only be applied to one parameter.

    These constraints allow the calling convention to optimize access to
    ``swifterror`` variables by associating them with a specific register at
    call boundaries rather than placing them in memory. Since this does change
    the calling convention, a function which uses the ``swifterror`` attribute
    on a parameter is not ABI-compatible with one which does not.

    These constraints also allow LLVM to assume that a ``swifterror`` argument
    does not alias any other memory visible within a function and that a
    ``swifterror`` alloca passed as an argument does not escape.

.. _gc:

Garbage Collector Strategy Names
--------------------------------

Each function may specify a garbage collector strategy name, which is simply a
string:

.. code-block:: llvm

    define void @f() gc "name" { ... }

The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.

.. _prefixdata:

Prefix Data
-----------

Prefix data is data associated with a function which the code
generator will emit immediately before the function's entrypoint.
The purpose of this feature is to allow frontends to associate
language-specific runtime metadata with specific functions and make it
available through the function pointer while still allowing the
function pointer to be called.

To access the data for a given function, a program may bitcast the
function pointer to a pointer to the constant's type and dereference
index -1.
This implies that the IR symbol points just past the end of
the prefix data. For instance, take the example of a function annotated
with a single ``i32``,

.. code-block:: llvm

    define void @f() prefix i32 123 { ... }

The prefix data can be referenced as,

.. code-block:: llvm

    %0 = bitcast void ()* @f to i32*
    %a = getelementptr inbounds i32, i32* %0, i32 -1
    %b = load i32, i32* %a

Prefix data is laid out as if it were an initializer for a global variable
of the prefix data's type. The function will be placed such that the
beginning of the prefix data is aligned. This means that if the size
of the prefix data is not a multiple of the alignment size, the
function's entrypoint will not be aligned. If alignment of the
function's entrypoint is desired, padding must be added to the prefix
data.

A function may have prefix data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _prologuedata:

Prologue Data
-------------

The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
be inserted prior to the function body. This can be used for enabling
function hot-patching and instrumentation.

To maintain the semantics of ordinary function calls, the prologue data must
have a particular format. Specifically, it must begin with a sequence of
bytes which decode to a sequence of machine instructions, valid for the
module's target, which transfer control to the point immediately succeeding
the prologue data, without performing any other visible action. This allows
the inliner and other passes to reason about the semantics of the function
definition without needing to reason about the prologue data.
Obviously this
makes the format of the prologue data highly target dependent.

A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
which encodes the ``nop`` instruction:

.. code-block:: text

    define void @f() prologue i8 144 { ... }

Generally prologue data can be formed by encoding a relative branch instruction
which skips the metadata, as in this example of valid prologue data for the
x86_64 architecture, where the first two bytes encode ``jmp .+10``:

.. code-block:: text

    %0 = type <{ i8, i8, i8* }>

    define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }

A function may have prologue data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _personalityfn:

Personality Function
--------------------

The ``personality`` attribute permits functions to specify what function
to use for exception handling.

.. _attrgrp:

Attribute Groups
----------------

Attribute groups are groups of attributes that are referenced by objects within
the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
``.ll`` file that corresponds to a single ``.c`` file, the single attribute
group will capture the important command line flags used to build that file.

An attribute group is a module-level object. To use an attribute group, an
object references the attribute group's ID (e.g. ``#37``). An object may refer
to more than one attribute group. In that situation, the attributes from the
different groups are merged.

Here is an example of attribute groups for a function that should always be
inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:

.. code-block:: llvm

   ; Target-independent attributes:
   attributes #0 = { alwaysinline alignstack=4 }

   ; Target-dependent attributes:
   attributes #1 = { "no-sse" }

   ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
   define void @f() #0 #1 { ... }

.. _fnattrs:

Function Attributes
-------------------

Function attributes are set to communicate additional information about
a function. Function attributes are considered to be part of the
function, not of the function type, so functions with different function
attributes can have the same function type.

Function attributes are simple keywords that follow the argument list.
If multiple attributes are needed, they are space separated. For
example:

.. code-block:: llvm

    define void @f() noinline { ... }
    define void @f() alwaysinline { ... }
    define void @f() alwaysinline optsize { ... }
    define void @f() optsize { ... }

``alignstack(<n>)``
    This attribute indicates that, when emitting the prologue and
    epilogue, the backend should forcibly align the stack pointer.
    Specify the desired alignment, which must be a power of two, in
    parentheses.
``allocsize(<EltSizeParam>[, <NumEltsParam>])``
    This attribute indicates that the annotated function will always return at
    least a given number of bytes (or null). Its arguments are zero-indexed
    parameter numbers; if one argument is provided, then it's assumed that at
    least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
    returned pointer. If two are provided, then it's assumed that
    ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
    available.
    The referenced parameters must be integer types. No assumptions
    are made about the contents of the returned block of memory.
``alwaysinline``
    This attribute indicates that the inliner should attempt to inline
    this function into callers whenever possible, ignoring any active
    inlining size threshold for this caller.
``builtin``
    This indicates that the callee function at a call site should be
    recognized as a built-in function, even though the function's declaration
    uses the ``nobuiltin`` attribute. This is only valid at call sites for
    direct calls to functions that are declared with the ``nobuiltin``
    attribute.
``cold``
    This attribute indicates that this function is rarely called. When
    computing edge weights, basic blocks post-dominated by a cold
    function call are also considered to be cold; and, thus, given low
    weight.
``convergent``
    In some parallel execution models, there exist operations that cannot be
    made control-dependent on any additional values. We call such operations
    ``convergent``, and mark them with this attribute.

    The ``convergent`` attribute may appear on functions or call/invoke
    instructions. When it appears on a function, it indicates that calls to
    this function should not be made control-dependent on additional values.
    For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
    calls to this intrinsic cannot be made control-dependent on additional
    values.

    When it appears on a call/invoke, the ``convergent`` attribute indicates
    that we should treat the call as though we're calling a convergent
    function. This is particularly useful on indirect calls; without this we
    may treat such calls as though the target is non-convergent.

    The optimizer may remove the ``convergent`` attribute on functions when it
    can prove that the function does not execute any convergent operations.
    Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
    can prove that the call/invoke cannot call a convergent function.
``inaccessiblememonly``
    This attribute indicates that the function may only access memory that
    is not accessible by the module being compiled. This is a weaker form
    of ``readnone``. If the function reads or writes other memory, the
    behavior is undefined.
``inaccessiblemem_or_argmemonly``
    This attribute indicates that the function may only access memory that is
    either not accessible by the module being compiled, or is pointed to
    by its pointer arguments. This is a weaker form of ``argmemonly``. If the
    function reads or writes other memory, the behavior is undefined.
``inlinehint``
    This attribute indicates that the source code contained a hint that
    inlining this function is desirable (such as the "inline" keyword in
    C/C++). It is just a hint; it imposes no requirements on the
    inliner.
``jumptable``
    This attribute indicates that the function should be added to a
    jump-instruction table at code-generation time, and that all address-taken
    references to this function should be replaced with a reference to the
    appropriate jump-instruction-table function pointer. Note that this creates
    a new pointer for the original function, which means that code that depends
    on function-pointer identity can break. So, any function annotated with
    ``jumptable`` must also be ``unnamed_addr``.
``minsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function as small
    as possible and perform optimizations that may sacrifice runtime
    performance in order to minimize the size of the generated code.
``naked``
    This attribute disables prologue / epilogue emission for the
    function. This can have very system-specific consequences.
``no-jump-tables``
    When this attribute is set to true, the jump tables and lookup tables that
    can be generated from a switch case lowering are disabled.
``nobuiltin``
    This indicates that the callee function at a call site is not recognized as
    a built-in function. LLVM will retain the original call and not replace it
    with equivalent code based on the semantics of the built-in function, unless
    the call site uses the ``builtin`` attribute. This is valid at call sites
    and on function declarations and definitions.
``noduplicate``
    This attribute indicates that calls to the function cannot be
    duplicated. A call to a ``noduplicate`` function may be moved
    within its parent function, but may not be duplicated within
    its parent function.

    A function containing a ``noduplicate`` call may still
    be an inlining candidate, provided that the call is not
    duplicated by inlining. That implies that the function has
    internal linkage and only has one call site, so the original
    call is dead after inlining.
``noimplicitfloat``
    This attribute disables implicit floating-point instructions.
``noinline``
    This attribute indicates that the inliner should never inline this
    function in any situation. This attribute may not be used together
    with the ``alwaysinline`` attribute.
``nonlazybind``
    This attribute suppresses lazy symbol binding for the function. This
    may make calls to the function faster, at the cost of extra program
    startup time if the function is not called during program startup.
``noredzone``
    This attribute indicates that the code generator should not use a
    red zone, even if the target-specific ABI normally permits it.
``noreturn``
    This function attribute indicates that the function never returns
    normally. This produces undefined behavior at runtime if the
    function ever does dynamically return.
``norecurse``
    This function attribute indicates that the function does not call itself
    either directly or indirectly down any possible call path. This produces
    undefined behavior at runtime if the function ever does recurse.
``nounwind``
    This function attribute indicates that the function never raises an
    exception. If the function does raise an exception, its runtime
    behavior is undefined. However, functions marked nounwind may still
    trap or generate asynchronous exceptions. Exception handling schemes
    that are recognized by LLVM to handle asynchronous exceptions, such
    as SEH, will still provide their implementation defined semantics.
``"null-pointer-is-valid"``
    If ``"null-pointer-is-valid"`` is set to ``"true"``, then the ``null``
    address in address-space 0 is considered to be a valid address for memory
    loads and stores. Any analysis or optimization should not treat
    dereferencing a pointer to ``null`` as undefined behavior in this function.
    Note: Comparing the address of a global variable to ``null`` may still
    evaluate to false because of a limitation in querying this attribute inside
    constant expressions.
``optforfuzzing``
    This attribute indicates that this function should be optimized
    for maximum fuzzing signal.
``optnone``
    This function attribute indicates that most optimization passes will skip
    this function, with the exception of interprocedural optimization passes.
    Code generation defaults to the "fast" instruction selector.
    This attribute cannot be used together with the ``alwaysinline``
    attribute; this attribute is also incompatible
    with the ``minsize`` attribute and the ``optsize`` attribute.

    This attribute requires the ``noinline`` attribute to be specified on
    the function as well, so the function is never inlined into any caller.
    Only functions with the ``alwaysinline`` attribute are valid
    candidates for inlining into the body of this function.
``optsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function low,
    and otherwise do optimizations specifically to reduce code size as
    long as they do not significantly impact runtime performance.
``"patchable-function"``
    This attribute tells the code generator that the code
    generated for this function needs to follow certain conventions that
    make it possible for a runtime function to patch over it later.
    The exact effect of this attribute depends on its string value,
    for which there currently is one legal possibility:

    * ``"prologue-short-redirect"`` - This style of patchable
      function is intended to support patching a function prologue to
      redirect control away from the function in a thread safe
      manner. It guarantees that the first instruction of the
      function will be large enough to accommodate a short jump
      instruction, and will be sufficiently aligned to allow being
      fully changed via an atomic compare-and-swap instruction.
      While the first requirement can be satisfied by inserting a large
      enough NOP, LLVM can and will try to re-purpose an existing
      instruction (i.e. one that would have to be emitted anyway) as
      the patchable instruction larger than a short jump.

      ``"prologue-short-redirect"`` is currently only supported on
      x86-64.

    This attribute by itself does not imply restrictions on
    inter-procedural optimizations. Any semantic effects the
    patching may have must be conveyed separately via the linkage type.
``"probe-stack"``
    This attribute indicates that the function will trigger a guard region
    at the end of the stack.
    It ensures that accesses to the stack must be
    no further apart than the size of the guard region to a previous
    access of the stack. It takes one required string value, the name of
    the stack probing function that will be called.

    If a function that has a ``"probe-stack"`` attribute is inlined into
    a function with another ``"probe-stack"`` attribute, the resulting
    function has the ``"probe-stack"`` attribute of the caller. If a
    function that has a ``"probe-stack"`` attribute is inlined into a
    function that has no ``"probe-stack"`` attribute at all, the resulting
    function has the ``"probe-stack"`` attribute of the callee.
``readnone``
    On a function, this attribute indicates that the function computes its
    result (or decides to unwind an exception) based strictly on its arguments,
    without dereferencing any pointer arguments or otherwise accessing
    any mutable state (e.g. memory, control registers, etc) visible to
    caller functions. It does not write through any pointer arguments
    (including ``byval`` arguments) and never changes any state visible
    to callers. This means while it cannot unwind exceptions by calling
    the ``C++`` exception throwing methods (since they write to memory), there may
    be non-``C++`` mechanisms that throw exceptions without writing to LLVM
    visible memory.

    On an argument, this attribute indicates that the function does not
    dereference that pointer argument, even though it may read or write the
    memory that the pointer points to if accessed through other pointers.

    If a readnone function reads or writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads from
    or writes to a readnone pointer argument, the behavior is undefined.
``readonly``
    On a function, this attribute indicates that the function does not write
    through any pointer arguments (including ``byval`` arguments) or otherwise
    modify any state (e.g. memory, control registers, etc) visible to
    caller functions. It may dereference pointer arguments and read
    state that may be set in the caller. A readonly function always
    returns the same value (or unwinds an exception identically) when
    called with the same set of arguments and global state. This means while it
    cannot unwind exceptions by calling the ``C++`` exception throwing methods
    (since they write to memory), there may be non-``C++`` mechanisms that throw
    exceptions without writing to LLVM visible memory.

    On an argument, this attribute indicates that the function does not write
    through this pointer argument, even though it may write to the memory that
    the pointer points to.

    If a readonly function writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function writes to
    a readonly pointer argument, the behavior is undefined.
``"stack-probe-size"``
    This attribute controls the behavior of stack probes: either
    the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
    It defines the size of the guard region. It ensures that if the function
    may use more stack space than the size of the guard region, a stack probing
    sequence will be emitted. It takes one required integer value, which
    is 4096 by default.

    If a function that has a ``"stack-probe-size"`` attribute is inlined into
    a function with another ``"stack-probe-size"`` attribute, the resulting
    function has the ``"stack-probe-size"`` attribute that has the lower
    numeric value.
    If a function that has a ``"stack-probe-size"`` attribute is
    inlined into a function that has no ``"stack-probe-size"`` attribute
    at all, the resulting function has the ``"stack-probe-size"`` attribute
    of the callee.
``"no-stack-arg-probe"``
    This attribute disables ABI-required stack probes, if any.
``writeonly``
    On a function, this attribute indicates that the function may write to but
    does not read from memory.

    On an argument, this attribute indicates that the function may write to but
    does not read through this pointer argument (even though it may read from
    the memory that the pointer points to).

    If a writeonly function reads memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads
    from a writeonly pointer argument, the behavior is undefined.
``argmemonly``
    This attribute indicates that the only memory accesses inside the function
    are loads and stores from objects pointed to by its pointer-typed arguments,
    with arbitrary offsets. Or in other words, all memory operations in the
    function can refer to memory only using pointers based on its function
    arguments.

    Note that ``argmemonly`` can be used together with the ``readonly``
    attribute in order to specify that the function reads only from its
    arguments.

    If an argmemonly function reads or writes memory other than the pointer
    arguments, or has other side-effects, the behavior is undefined.
``returns_twice``
    This attribute indicates that this function can return twice. The C
    ``setjmp`` is an example of such a function. The compiler disables
    some optimizations (like tail calls) in the caller of these
    functions.
``safestack``
    This attribute indicates that
    `SafeStack <http://clang.llvm.org/docs/SafeStack.html>`_
    protection is enabled for this function.

    If a function that has a ``safestack`` attribute is inlined into a
    function that doesn't have a ``safestack`` attribute or which has an
    ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
    function will have a ``safestack`` attribute.
``sanitize_address``
    This attribute indicates that AddressSanitizer checks
    (dynamic address safety analysis) are enabled for this function.
``sanitize_memory``
    This attribute indicates that MemorySanitizer checks (dynamic detection
    of accesses to uninitialized memory) are enabled for this function.
``sanitize_thread``
    This attribute indicates that ThreadSanitizer checks
    (dynamic thread safety analysis) are enabled for this function.
``sanitize_hwaddress``
    This attribute indicates that HWAddressSanitizer checks
    (dynamic address safety analysis based on tagged pointers) are enabled for
    this function.
``speculatable``
    This function attribute indicates that the function does not have any
    effects besides calculating its result and does not have undefined behavior.
    Note that ``speculatable`` is not enough to conclude that along any
    particular execution path the number of calls to this function will not be
    externally observable. This attribute is only valid on functions
    and declarations, not on individual call sites. If a function is
    incorrectly marked as speculatable and really does exhibit
    undefined behavior, the undefined behavior may be observed even
    if the call site is dead code.

``ssp``
    This attribute indicates that the function should emit a stack
    smashing protector. It is in the form of a "canary" --- a random value
    placed on the stack before the local variables that's checked upon
    return from the function to see if it has been overwritten. A
    heuristic is used to determine if a function needs stack protectors
    or not.
    The heuristic used will enable protectors for functions with:

    - Character arrays larger than ``ssp-buffer-size`` (default 8).
    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
    - Calls to alloca() with variable sizes or constant sizes greater than
      ``ssp-buffer-size``.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.

    If a function that has an ``ssp`` attribute is inlined into a
    function that doesn't have an ``ssp`` attribute, then the resulting
    function will have an ``ssp`` attribute.
``sspreq``
    This attribute indicates that the function should *always* emit a
    stack smashing protector. This overrides the ``ssp`` function
    attribute.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    If a function that has an ``sspreq`` attribute is inlined into a
    function that doesn't have an ``sspreq`` attribute or which has an
    ``ssp`` or ``sspstrong`` attribute, then the resulting function will have
    an ``sspreq`` attribute.
``sspstrong``
    This attribute indicates that the function should emit a stack smashing
    protector. This attribute causes a strong heuristic to be used when
    determining if a function needs stack protectors.
    The strong heuristic
    will enable protectors for functions with:

    - Arrays of any size and type.
    - Aggregates containing an array of any size and type.
    - Calls to alloca().
    - Local variables that have had their address taken.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    This overrides the ``ssp`` function attribute.

    If a function that has an ``sspstrong`` attribute is inlined into a
    function that doesn't have an ``sspstrong`` attribute, then the
    resulting function will have an ``sspstrong`` attribute.
``strictfp``
    This attribute indicates that the function was called from a scope that
    requires strict floating-point semantics. LLVM will not attempt any
    optimizations that require assumptions about the floating-point rounding
    mode or that might alter the state of floating-point status flags that
    might otherwise be set or cleared by calling this function.
``"thunk"``
    This attribute indicates that the function will delegate to some other
    function with a tail call. The prototype of a thunk should not be used for
    optimization purposes. The caller is expected to cast the thunk prototype to
    match the thunk target prototype.
``uwtable``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exception passes through it.
    This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units.
``nocf_check``
    This attribute indicates that no control-flow check will be performed on
    the attributed entity. It disables ``-fcf-protection=<>`` for a specific
    entity, allowing fine-grained control over the hardware control-flow
    protection mechanism. The flag is target independent and currently
    appertains to a function or function pointer.
``shadowcallstack``
    This attribute indicates that the ShadowCallStack checks are enabled for
    the function. The instrumentation checks that the return address for the
    function has not changed between the function prolog and epilog. It is
    currently x86_64-specific.

.. _glattrs:

Global Attributes
-----------------

Attributes may be set to communicate additional information about a global variable.
Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
are grouped into a single :ref:`attribute group <attrgrp>`.

.. _opbundles:

Operand Bundles
---------------

Operand bundles are tagged sets of SSA values that can be associated
with certain LLVM instructions (currently only ``call`` s and
``invoke`` s). In a way they are like metadata, but dropping them is
incorrect and will change program semantics.

Syntax::

    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
    bundle operand ::= SSA value
    tag ::= string constant

Operand bundles are **not** part of a function's signature, and a
given function may be called from multiple places with different kinds
of operand bundles. This reflects the fact that the operand bundles
are conceptually a part of the ``call`` (or ``invoke``), not the
callee being dispatched to.
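
As a minimal sketch of the syntax above, the following calls dispatch to
the same callee with no bundle, one bundle, and two bundles. The function
names and the ``"foo"`` and ``"bar"`` tags are hypothetical and have no
meaning to LLVM beyond the "unknown" operand bundle semantics.

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(i32 %a, i32 %b) {
      ; Plain call: no operand bundle.
      call void @callee(i32 %a)
      ; Same callee, one bundle with the hypothetical tag "foo".
      call void @callee(i32 %a) [ "foo"(i32 %b) ]
      ; Same callee, two bundles; a bundle may have zero or more operands.
      call void @callee(i32 %a) [ "foo"(i32 %a, i32 %b), "bar"() ]
      ret void
    }

The declaration of ``@callee`` is identical in all three cases; only the
call sites differ.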

Operand bundles are a generic mechanism intended to support
runtime-introspection-like functionality for managed languages. While
the exact semantics of an operand bundle depend on the bundle tag,
there are certain limitations to how much the presence of an operand
bundle can influence the semantics of a program. These restrictions
are described as the semantics of an "unknown" operand bundle. As
long as the behavior of an operand bundle is describable within these
restrictions, LLVM does not need to have special knowledge of the
operand bundle to not miscompile programs containing it.

- The bundle operands for an unknown operand bundle escape in unknown
  ways before control is transferred to the callee or invokee.
- Calls and invokes with operand bundles have unknown read / write
  effect on the heap on entry and exit (even if the call target is
  ``readnone`` or ``readonly``), unless they're overridden with
  callsite specific attributes.
- An operand bundle at a call site cannot change the implementation
  of the called function. Inter-procedural optimizations work as
  usual as long as they take into account the first two properties.

More specific types of operand bundles are described below.

.. _deopt_opbundles:

Deoptimization Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Deoptimization operand bundles are characterized by the ``"deopt"``
operand bundle tag. These operand bundles represent an alternate
"safe" continuation for the call site they're attached to, and can be
used by a suitable runtime to deoptimize the compiled frame at the
specified call site. There can be at most one ``"deopt"`` operand
bundle attached to a call site. Exact details of deoptimization are
out of scope for the language reference, but it usually involves
rewriting a compiled frame into a set of interpreted frames.
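
For illustration only, a frontend might attach interpreter state to a call
as sketched here. The encoding (a bytecode index followed by the live
locals) and the function names are assumptions made for this example; LLVM
does not prescribe how the deoptimization state is laid out.

.. code-block:: llvm

    declare void @runtime_call()

    define void @compiled(i32 %local0, i32 %local1) {
      ; Hypothetical encoding chosen by the frontend: bytecode index 42,
      ; then the values of the two live interpreter locals.
      call void @runtime_call() [ "deopt"(i32 42, i32 %local0, i32 %local1) ]
      ret void
    }

A runtime that understands this encoding could rebuild an interpreter
frame from the three bundle operands if it decides to deoptimize at this
call site.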

From the compiler's perspective, deoptimization operand bundles make
the call sites they're attached to at least ``readonly``. They read
through all of their pointer typed operands (even if they're not
otherwise escaped) and the entire visible heap. Deoptimization
operand bundles do not capture their operands except during
deoptimization, in which case control will not be returned to the
compiled frame.

The inliner knows how to inline through calls that have deoptimization
operand bundles. Just like inlining through a normal call site
involves composing the normal and exceptional continuations, inlining
through a call site with a deoptimization operand bundle needs to
appropriately compose the "safe" deoptimization continuation. The
inliner does this by prepending the parent's deoptimization
continuation to every deoptimization continuation in the inlined body.
E.g. inlining ``@f`` into ``@g`` in the following example

.. code-block:: llvm

    define void @f() {
      call void @x()  ;; no deopt state
      call void @y() [ "deopt"(i32 10) ]
      call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
      ret void
    }

    define void @g() {
      call void @f() [ "deopt"(i32 20) ]
      ret void
    }

will result in

.. code-block:: llvm

    define void @g() {
      call void @x()  ;; still no deopt state
      call void @y() [ "deopt"(i32 20, i32 10) ]
      call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
      ret void
    }

It is the frontend's responsibility to structure or encode the
deoptimization state in a way that syntactically prepending the
caller's deoptimization state to the callee's deoptimization state is
semantically equivalent to composing the caller's deoptimization
continuation after the callee's deoptimization continuation.

.. _ob_funclet:

Funclet Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

Funclet operand bundles are characterized by the ``"funclet"``
operand bundle tag. These operand bundles indicate that a call site
is within a particular funclet. There can be at most one
``"funclet"`` operand bundle attached to a call site and it must have
exactly one bundle operand.

If any funclet EH pads have been "entered" but not "exited" (per the
`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a ``call`` or ``invoke`` which:

* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
  intrinsic, or
* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
  not-yet-exited funclet EH pad.

Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.

GC Transition Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

GC transition operand bundles are characterized by the
``"gc-transition"`` operand bundle tag. These operand bundles mark a
call as a transition between a function with one GC strategy to a
function with a different GC strategy. If coordinating the transition
between GC strategies requires additional code generation at the call
site, these bundles may contain any values that are needed by the
generated code. For more details, see :ref:`GC Transitions
<gc_transition_args>`.

.. _moduleasm:

Module-Level Inline Assembly
----------------------------

Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:

.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"

The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two digit hex code for the number.

Note that the assembly string *must* be parseable by LLVM's integrated assembler
(unless it is disabled), even when emitting a ``.s`` file.

.. _langref_datalayout:

Data Layout
-----------

A module may specify a target specific data layout string that specifies
how data is to be laid out in memory. The syntax for the data layout is
simply:

.. code-block:: llvm

    target datalayout = "layout specification"

The *layout specification* consists of a list of specifications
separated by the minus sign character ('-'). Each specification starts
with a letter and may include other information after the letter to
define some aspect of the data layout. The specifications accepted are
as follows:

``E``
    Specifies that the target lays out data in big-endian form. That is,
    the bits with the most significance have the lowest address
    location.
``e``
    Specifies that the target lays out data in little-endian form. That
    is, the bits with the least significance have the lowest address
    location.
``S<size>``
    Specifies the natural alignment of the stack in bits. Alignment
    promotion of stack variables is limited to the natural stack
    alignment to avoid dynamic stack realignment. The stack alignment
    must be a multiple of 8-bits. If omitted, the natural stack
    alignment defaults to "unspecified", which does not prevent any
    alignment promotions.
``P<address space>``
    Specifies the address space that corresponds to program memory.
    Harvard architectures can use this to specify what space LLVM
    should place things such as functions into. If omitted, the
    program memory space defaults to the default address space of 0,
    which corresponds to a Von Neumann architecture that has code
    and data in the same space.
``A<address space>``
    Specifies the address space of objects created by '``alloca``'.
    Defaults to the default address space of 0.
``p[n]:<size>:<abi>:<pref>:<idx>``
    This specifies the *size* of a pointer and its ``<abi>`` and
    ``<pref>``\erred alignments for address space ``n``. The fourth parameter
    ``<idx>`` is the size of the index used for address calculation. If not
    specified, the default index size is equal to the pointer size. All sizes
    are in bits. The address space, ``n``, is optional, and if not specified,
    denotes the default address space 0. The value of ``n`` must be
    in the range [1,2^23).
``i<size>:<abi>:<pref>``
    This specifies the alignment for an integer type of a given bit
    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
``v<size>:<abi>:<pref>``
    This specifies the alignment for a vector type of a given bit
    ``<size>``.
``f<size>:<abi>:<pref>``
    This specifies the alignment for a floating-point type of a given bit
    ``<size>``. Only values of ``<size>`` that are supported by the target
    will work. 32 (float) and 64 (double) are supported on all targets; 80
    or 128 (different flavors of long double) are also supported on some
    targets.
``a:<abi>:<pref>``
    This specifies the alignment for an object of aggregate type.
``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. Symbols
    prefixed with the mangling escape character ``\01`` are passed through
    directly to the assembler without the escape character.
    The mangling style options are:

    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
      symbols get a ``_`` prefix.
    * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
      Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
      ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
      ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
      starting with ``?`` are not mangled in any way.
    * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
      symbols do not receive a ``_`` prefix.
``n<size1>:<size2>:<size3>...``
    This specifies a set of native integer widths for the target CPU in
    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
    this set are considered to support most general arithmetic operations
    efficiently.
``ni:<address space0>:<address space1>:<address space2>...``
    This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Type <nointptrtype>` s. The ``0``
    address space cannot be specified as non-integral.

On every specification that takes a ``<abi>:<pref>``, specifying the
``<pref>`` alignment is optional. If omitted, the preceding ``:``
should be omitted too and ``<pref>`` will be equal to ``<abi>``.

When constructing the data layout for a given target, LLVM starts with a
default set of specifications which are then (possibly) overridden by
the specifications in the ``datalayout`` keyword. The default
specifications are given in this list:

- ``E`` - big endian
- ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
- ``p[n]:64:64:64`` - Other address spaces are assumed to be the
  same as the default address space.
- ``S0`` - natural stack alignment is unspecified
- ``i1:8:8`` - i1 is 8-bit (byte) aligned
- ``i8:8:8`` - i8 is 8-bit (byte) aligned
- ``i16:16:16`` - i16 is 16-bit aligned
- ``i32:32:32`` - i32 is 32-bit aligned
- ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
  alignment of 64-bits
- ``f16:16:16`` - half is 16-bit aligned
- ``f32:32:32`` - float is 32-bit aligned
- ``f64:64:64`` - double is 64-bit aligned
- ``f128:128:128`` - quad is 128-bit aligned
- ``v64:64:64`` - 64-bit vector is 64-bit aligned
- ``v128:128:128`` - 128-bit vector is 128-bit aligned
- ``a:0:64`` - aggregates are 64-bit aligned

When LLVM is determining the alignment for a given type, it uses the
following rules:

#. If the type sought is an exact match for one of the specifications,
   that specification is used.
#. If no match is found, and the type sought is an integer type, then
   the smallest integer type that is larger than the bitwidth of the
   sought type is used. If none of the specifications are larger than
   the bitwidth then the largest integer type is used. For example,
   given the default specifications above, the i7 type will use the
   alignment of i8 (next largest) while both i65 and i256 will use the
   alignment of i64 (largest specified).
#. If no match is found, and the type sought is a vector type, then the
   largest vector type that is smaller than the sought vector type will
   be used as a fall back. This happens because <128 x double> can be
   implemented in terms of 64 <2 x double>, for example.

The function of the data layout string may not be what you expect.
Notably, this is not a specification from the frontend of what alignment
the code generator should use.
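
As a rough illustration of the specifications and rules above, the
following string resembles what an x86-64 ELF target might use. Treat it
as an assumed example rather than a normative value; the authoritative
string is whatever the chosen code generator expects.

.. code-block:: llvm

    ; Little-endian, ELF mangling, i64 with 64-bit ABI alignment,
    ; f80 with 128-bit alignment, native integer widths of 8, 16, 32
    ; and 64 bits, and a 128-bit natural stack alignment.
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

With this string, ``i64:64`` replaces the default ``i64:32:64``, and because
``<pref>`` is omitted it equals ``<abi>``, so both alignments of ``i64`` are
64 bits. A type such as ``i7`` still falls back to the alignment of ``i8``
under the integer rule.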

Instead, if specified, the target data layout is required to match what
the ultimate *code generator* expects. This string is used by the
mid-level optimizers to improve code, and this only works if it matches
what the ultimate code generator uses. There is no way to generate IR
that does not embed this target-specific detail into the IR. If you
don't specify the string, the default specifications will be used to
generate a Data Layout and the optimization phases will operate
accordingly and introduce target specificity into the IR with respect to
these default specifications.

.. _langref_triple:

Target Triple
-------------

A module may specify a target triple string that describes the target
host. The syntax for the target triple is simply:

.. code-block:: llvm

    target triple = "x86_64-apple-macosx10.7.0"

The *target triple* string consists of a series of identifiers delimited
by the minus sign character ('-'). The canonical forms are:

::

    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT

This information is passed along to the backend so that it generates
code for the proper architecture. It's possible to override this on the
command line with the ``-mtriple`` command line option.

.. _pointeraliasing:

Pointer Aliasing Rules
----------------------

Any memory access must be done through a pointer value associated with
an address range of the memory access, otherwise the behavior is
undefined. Pointer values are associated with address ranges according
to the following rules:

- A pointer value is associated with the addresses associated with any
  value it is *based* on.
- An address of a global variable is associated with the address range
  of the variable's storage.
- The result value of an allocation instruction is associated with the
  address range of the allocated storage.
- A null pointer in the default address-space is associated with no
  address.
- An integer constant other than zero or a pointer value returned from
  a function not defined within LLVM may be associated with address
  ranges allocated through mechanisms other than those provided by
  LLVM. Such ranges shall not overlap with any ranges of addresses
  allocated by mechanisms provided by LLVM.

A pointer value is *based* on another pointer value according to the
following rules:

- A pointer value formed from a scalar ``getelementptr`` operation is *based* on
  the pointer-typed operand of the ``getelementptr``.
- The pointer in lane *l* of the result of a vector ``getelementptr`` operation
  is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
  of the ``getelementptr``.
- The result value of a ``bitcast`` is *based* on the operand of the
  ``bitcast``.
- A pointer value formed by an ``inttoptr`` is *based* on all pointer
  values that contribute (directly or indirectly) to the computation of
  the pointer's value.
- The "*based* on" relationship is transitive.

Note that this definition of *"based"* is intentionally similar to the
definition of *"based"* in C99, though it is slightly weaker.

LLVM IR does not associate types with memory. The result type of a
``load`` merely indicates the size and alignment of the memory from
which to load, as well as the interpretation of the value. The first
operand type of a ``store`` similarly only indicates the size and
alignment of the store.

Consequently, type-based alias analysis, aka TBAA, aka
``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
:ref:`Metadata <metadata>` may be used to encode additional information
which specialized optimization passes may use to implement type-based
alias analysis.

.. _volatile:

Volatile Memory Accesses
------------------------

Certain memory accesses, such as :ref:`load <i_load>`'s,
:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
marked ``volatile``. The optimizers must not change the number of
volatile operations or change their order of execution relative to other
volatile operations. The optimizers *may* change the order of volatile
operations relative to non-volatile operations. This is not Java's
"volatile" and has no cross-thread synchronization behavior.

IR-level volatile loads and stores cannot safely be optimized into
``llvm.memcpy`` or ``llvm.memmove`` intrinsics even when those intrinsics
are flagged volatile. Likewise, the backend should never split or merge
target-legal volatile load/store instructions.

.. admonition:: Rationale

   Platforms may rely on volatile loads and stores of natively supported
   data width to be executed as a single instruction. For example, in C
   this holds for an l-value of volatile primitive type with native
   hardware support, but not necessarily for aggregate types. The
   frontend upholds these expectations, which are intentionally
   unspecified in the IR. The rules above ensure that IR transformations
   do not violate the frontend's contract with the language.

.. _memmodel:

Memory Model for Concurrent Operations
--------------------------------------

The LLVM IR does not define any way to start parallel threads of
execution or to register signal handlers. Nonetheless, there are
platform-specific ways to create them, and we define LLVM IR's behavior
in their presence. This model is inspired by the C++0x memory model.

For a more informal introduction to this model, see the :doc:`Atomics`.

We define a *happens-before* partial order as the least partial order
that

- Is a superset of single-thread program order, and
- When ``a`` *synchronizes-with* ``b``, includes an edge from ``a`` to
  ``b``. *Synchronizes-with* pairs are introduced by platform-specific
  techniques, like pthread locks, thread creation, thread joining,
  etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
  Constraints <ordering>`).

Note that program order does not introduce *happens-before* edges
between a thread and signals executing inside that thread.

Every (defined) read operation (load instructions, memcpy, atomic
loads/read-modify-writes, etc.) R reads a series of bytes written by
(defined) write operations (store instructions, atomic
stores/read-modify-writes, memcpy, etc.). For the purposes of this
section, initialized globals are considered to have a write of the
initializer which is atomic and happens before any other read or write
of the memory in question. For each byte of a read R, R\ :sub:`byte`
may see any write to the same byte, except:

- If write\ :sub:`1` happens before write\ :sub:`2`, and
  write\ :sub:`2` happens before R\ :sub:`byte`, then
  R\ :sub:`byte` does not see write\ :sub:`1`.
- If R\ :sub:`byte` happens before write\ :sub:`3`, then
  R\ :sub:`byte` does not see write\ :sub:`3`.

Given that definition, R\ :sub:`byte` is defined as follows:

- If R is volatile, the result is target-dependent. (Volatile is
  supposed to give guarantees which can support ``sig_atomic_t`` in
  C/C++, and may be used for accesses to addresses that do not behave
  like normal memory. It does not generally provide cross-thread
  synchronization.)
- Otherwise, if there is no write to the same byte that happens before
  R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
- Otherwise, if R\ :sub:`byte` may see exactly one write,
  R\ :sub:`byte` returns the value written by that write.
- Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
  see are atomic, it chooses one of the values written. See the :ref:`Atomic
  Memory Ordering Constraints <ordering>` section for additional
  constraints on how the choice is made.
- Otherwise R\ :sub:`byte` returns ``undef``.

R returns the value composed of the series of bytes it read. This
implies that some bytes within the value may be ``undef`` **without**
the entire value being ``undef``. Note that this only defines the
semantics of the operation; it doesn't mean that targets will emit more
than one instruction to read the series of bytes.

Note that in cases where none of the atomic intrinsics are used, this
model places only one restriction on IR transformations on top of what
is required for single-threaded execution: introducing a store to a byte
which might not otherwise be stored is not allowed in general.
(Specifically, in the case where another thread might write to and read
from an address, introducing a store can change a load that may see
exactly one write into a load that may see multiple writes.)

.. _ordering:

Atomic Memory Ordering Constraints
----------------------------------

Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
ordering parameters that determine which other atomic instructions on
the same address they *synchronize with*. These semantics are borrowed
from Java and C++0x, but are somewhat more colloquial.
If these descriptions aren't precise enough, check those specs (see spec
references in the :doc:`atomics guide <Atomics>`).
:ref:`fence <i_fence>` instructions treat these orderings somewhat
differently since they don't take an address. See that instruction's
documentation for details.

For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
    The set of values that can be read is governed by the happens-before
    partial order. A value cannot be read unless some operation wrote
    it. This is intended to provide a guarantee strong enough to model
    Java's non-volatile shared variables. This ordering cannot be
    specified for read-modify-write operations; it is not strong enough
    to make them atomic in any interesting way.
``monotonic``
    In addition to the guarantees of ``unordered``, there is a single
    total order for modifications by ``monotonic`` operations on each
    address. All modification orders must be compatible with the
    happens-before order. There is no guarantee that the modification
    orders can be combined to a global total order for the whole program
    (and this often will not be possible). The read in an atomic
    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
    order immediately before the value it writes. If one atomic read
    happens before another atomic read of the same address, the later
    read must see the same value or a later value in the address's
    modification order. This disallows reordering of ``monotonic`` (or
    stronger) operations on the same address. If an address is written
    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
    read that address repeatedly, the other threads must eventually see
    the write. This corresponds to the C++0x/C1x
    ``memory_order_relaxed``.
``acquire``
    In addition to the guarantees of ``monotonic``, a
    *synchronizes-with* edge may be formed with a ``release`` operation.
    This is intended to model C++'s ``memory_order_acquire``.
``release``
    In addition to the guarantees of ``monotonic``, if this operation
    writes a value which is subsequently read by an ``acquire``
    operation, it *synchronizes-with* that operation. (This isn't a
    complete description; see the C++0x definition of a release
    sequence.) This corresponds to the C++0x/C1x
    ``memory_order_release``.
``acq_rel`` (acquire+release)
    Acts as both an ``acquire`` and ``release`` operation on its
    address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
``seq_cst`` (sequentially consistent)
    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
    operation that only reads, ``release`` for an operation that only
    writes), there is a global total order on all
    sequentially-consistent operations on all addresses, which is
    consistent with the *happens-before* partial order and with the
    modification orders of all the affected addresses. Each
    sequentially-consistent read sees the last preceding write to the
    same address in this global order. This corresponds to the C++0x/C1x
    ``memory_order_seq_cst`` and Java volatile.

.. _syncscope:

If an atomic operation is marked ``syncscope("singlethread")``, it only
*synchronizes with* and only participates in the seq\_cst total orderings of
other operations running in the same thread (for example, in signal handlers).

If an atomic operation is marked ``syncscope("<target-scope>")``, where
``<target-scope>`` is a target specific synchronization scope, then it is target
dependent if it *synchronizes with* and participates in the seq\_cst total
orderings of other operations.

Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
seq\_cst total orderings of other operations that are not marked
``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.

.. _floatenv:

Floating-Point Environment
--------------------------

The default LLVM floating-point environment assumes that floating-point
instructions do not have side effects. Results assume the round-to-nearest
rounding mode. No floating-point exception state is maintained in this
environment. Therefore, there is no attempt to create or preserve invalid
operation (SNaN) or division-by-zero exceptions in these examples:

.. code-block:: llvm

      %A = fdiv double 0x7ff0000000000001, %X  ; 64-bit SNaN hex value
      %B = fdiv double %X, 0.0
    Safe:
      %A = NaN
      %B = NaN

The benefit of this exception-free assumption is that floating-point
operations may be speculated freely without any other fast-math relaxations
to the floating-point model.

Code that requires different behavior than this should use the
:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.

.. _fastmath:

Fast-Math Flags
---------------

LLVM IR floating-point operations (:ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>`
may use the following flags to enable otherwise unsafe
floating-point transformations.

``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
   a :ref:`poison value <poisonvalues>` instead.

``ninf``
   No Infs - Allow optimizations to assume the arguments and result are not
   +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
   produces a :ref:`poison value <poisonvalues>` instead.

``nsz``
   No Signed Zeros - Allow optimizations to treat the sign of a zero
   argument or result as insignificant.

``arcp``
   Allow Reciprocal - Allow optimizations to use the reciprocal of an
   argument rather than perform division.

``contract``
   Allow floating-point contraction (e.g. fusing a multiply followed by an
   addition into a fused multiply-and-add).

``afn``
   Approximate functions - Allow substitution of approximate calculations for
   functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
   for places where this can apply to LLVM's intrinsic math functions.

``reassoc``
   Allow reassociation transformations for floating-point instructions.
   This may dramatically change results in floating-point.

``fast``
   This flag implies all of the others.

.. _uselistorder:

Use-list Order Directives
-------------------------

Use-list directives encode the in-memory order of each use-list, allowing the
order to be recreated. ``<order-indexes>`` is a comma-separated list of
indexes that are assigned to the referenced value's uses. The referenced
value's use-list is immediately sorted by these indexes.

Use-list directives may appear at function scope or global scope. They are not
instructions, and have no effect on the semantics of the IR. When they're at
function scope, they must appear after the terminator of the final basic block.

If basic blocks have their address taken via ``blockaddress()`` expressions,
``uselistorder_bb`` can be used to reorder their use-lists from outside their
function's scope.

:Syntax:

::

    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }

:Examples:

::

    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder i32* @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }

.. _source_filename:

Source Filename
---------------

The *source filename* string is set to the original module identifier,
which will be the name of the compiled source file when compiling from
source through the clang front end, for example. It is then preserved through
the IR and bitcode.

This is currently necessary to generate a consistent unique global
identifier for local functions used in profile data, which prepends the
source file name to the local function name.

The syntax for the source file name is simply:

.. code-block:: text

    source_filename = "/path/to/source.c"

.. _typesystem:

Type System
===========

The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the intermediate representation
directly, without having to do extra analyses on the side before the
transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are
not feasible to perform on normal three address code representations.

.. _t_void:

Void Type
---------

:Overview:

The void type does not represent any value and has no size.

:Syntax:

::

    void

.. _t_function:

Function Type
-------------

:Overview:

The function type can be thought of as a function signature. It consists of a
return type and a list of formal parameter types. The return type of a function
type is a void type or first class type, except for :ref:`label <t_label>`
and :ref:`metadata <t_metadata>` types.

:Syntax:

::

    <returntype> (<parameter list>)

...where '``<parameter list>``' is a comma-separated list of type
specifiers. Optionally, the parameter list may include a type ``...``, which
indicates that the function takes a variable number of arguments. Variable
argument functions can access their arguments with the :ref:`variable argument
handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.

:Examples:

.. list-table::

   * - ``i32 (i32)``
     - A function taking an ``i32``, returning an ``i32``.

   * - ``float (i16, i32 *) *``
     - :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a
       :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.

   * - ``i32 (i8*, ...)``
     - A vararg function that takes at least one :ref:`pointer <t_pointer>`
       to ``i8`` (char in C), which returns an integer. This is the signature
       for ``printf`` in LLVM.

   * - ``{i32, i32} (i32)``
     - A function taking an ``i32``, returning a :ref:`structure <t_struct>`
       containing two ``i32`` values.

.. _t_firstclass:

First Class Types
-----------------

The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.

.. _t_single_value:

Single Value Types
^^^^^^^^^^^^^^^^^^

These are the types that are valid in registers from CodeGen's perspective.

.. _t_integer:

Integer Type
""""""""""""

:Overview:

The integer type is a very simple type that specifies an arbitrary bit
width for the integer type desired. Any bit width from 1
bit to 2\ :sup:`23`\ -1 (about 8 million) can be specified.

:Syntax:

::

    iN

The number of bits the integer will occupy is specified by the ``N``
value.

:Examples:

.. list-table::

   * - ``i1``
     - a single-bit integer.

   * - ``i32``
     - a 32-bit integer.

   * - ``i1942652``
     - a really big integer of over 1 million bits.

.. _t_floating:

Floating-Point Types
""""""""""""""""""""

.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating-point value

   * - ``float``
     - 32-bit floating-point value

   * - ``double``
     - 64-bit floating-point value

   * - ``fp128``
     - 128-bit floating-point value (112-bit mantissa)

   * - ``x86_fp80``
     - 80-bit floating-point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating-point value (two 64-bits)

The binary formats of half, float, double, and fp128 correspond to the
IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
respectively.

X86_mmx Type
""""""""""""

:Overview:

The x86_mmx type represents a value held in an MMX register on an x86
machine. The operations allowed on it are quite limited: parameters and
return values, load and store, and bitcast. User-specified MMX
instructions are represented as intrinsic or asm calls with arguments
and/or results of this type. There are no arrays, vectors or constants
of this type.

:Syntax:

::

    x86_mmx

.. _t_pointer:

Pointer Type
""""""""""""

:Overview:

The pointer type is used to specify memory locations. Pointers are
commonly used to reference objects in memory.

Pointer types may have an optional address space attribute defining the
numbered address space where the pointed-to object resides. The default
address space is number zero.
The semantics of non-zero address spaces are target-specific.

Note that LLVM does not permit pointers to void (``void*``) nor does it
permit pointers to labels (``label*``). Use ``i8*`` instead.

:Syntax:

::

    <type> *

:Examples:

.. list-table::

   * - ``[4 x i32]*``
     - A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four
       ``i32`` values.

   * - ``i32 (i32*) *``
     - A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that
       takes an ``i32*``, returning an ``i32``.

   * - ``i32 addrspace(5)*``
     - A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in
       address space #5.

.. _t_vector:

Vector Type
"""""""""""

:Overview:

A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements) and an underlying primitive data
type. Vector types are considered :ref:`first class <t_firstclass>`.

:Syntax:

::

    < <# elements> x <elementtype> >

The number of elements is a constant integer value larger than 0;
``elementtype`` may be any integer, floating-point or pointer type. Vectors
of size zero are not allowed.

:Examples:

.. list-table::

   * - ``<4 x i32>``
     - Vector of 4 32-bit integer values.

   * - ``<8 x float>``
     - Vector of 8 32-bit floating-point values.

   * - ``<2 x i64>``
     - Vector of 2 64-bit integer values.

   * - ``<4 x i64*>``
     - Vector of 4 pointers to 64-bit integer values.

.. _t_label:

Label Type
^^^^^^^^^^

:Overview:

The label type represents code labels.

:Syntax:

::

    label

.. _t_token:

Token Type
^^^^^^^^^^

:Overview:

The token type is used when a value is associated with an instruction
but all uses of the value must not attempt to introspect or obscure it.
As such, it is not appropriate to have a :ref:`phi <i_phi>` or
:ref:`select <i_select>` of type token.

:Syntax:

::

    token

.. _t_metadata:

Metadata Type
^^^^^^^^^^^^^

:Overview:

The metadata type represents embedded metadata. No derived types may be
created from metadata except for :ref:`function <t_function>` arguments.

:Syntax:

::

    metadata

.. _t_aggregate:

Aggregate Types
^^^^^^^^^^^^^^^

Aggregate Types are a subset of derived types that can contain multiple
member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
aggregate types. :ref:`Vectors <t_vector>` are not considered to be
aggregate types.

.. _t_array:

Array Type
""""""""""

:Overview:

The array type is a very simple derived type that arranges elements
sequentially in memory. The array type requires a size (number of
elements) and an underlying data type.

:Syntax:

::

    [<# elements> x <elementtype>]

The number of elements is a constant integer value; ``elementtype`` may
be any type with a size.

:Examples:

.. list-table::

   * - ``[40 x i32]``
     - Array of 40 32-bit integer values.

   * - ``[41 x i32]``
     - Array of 41 32-bit integer values.

   * - ``[4 x i8]``
     - Array of 4 8-bit integer values.

Here are some examples of multidimensional arrays:

.. list-table::

   * - ``[3 x [4 x i32]]``
     - 3x4 array of 32-bit integer values.

   * - ``[12 x [10 x float]]``
     - 12x10 array of single precision floating-point values.

   * - ``[2 x [3 x [4 x i16]]]``
     - 2x3x4 array of 16-bit integer values.

There is no restriction on indexing beyond the end of the array implied
by a static type (though there are restrictions on indexing beyond the
bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type.
An implementation of 'pascal style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.

.. _t_struct:

Structure Type
""""""""""""""

:Overview:

The structure type is used to represent a collection of data members
together in memory. The elements of a structure may be any type that has
a size.

Structures in memory are accessed using '``load``' and '``store``' by
getting a pointer to a field with the '``getelementptr``' instruction.
Structures in registers are accessed using the '``extractvalue``' and
'``insertvalue``' instructions.

Structures may optionally be "packed" structures, which indicate that
the alignment of the struct is one byte, and that there is no padding
between the elements. In non-packed structs, padding between field types
is inserted as defined by the DataLayout string in the module, which is
required to match what the underlying code generator expects.

Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``{i32, i32}*``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
recursive, can be opaqued, and are never uniqued.

:Syntax:

::

      %T1 = type { <type list> }     ; Identified normal struct type
      %T2 = type <{ <type list> }>   ; Identified packed struct type

:Examples:

+------------------------------+---------------------------------------------------------------+
| ``{ i32, i32, i32 }``        | A triple of three ``i32`` values                              |
+------------------------------+---------------------------------------------------------------+
| ``{ float, i32 (i32) * }``   | A pair, where the first element is a ``float`` and the second |
|                              | element is a :ref:`pointer <t_pointer>` to a                  |
|                              | :ref:`function <t_function>` that takes an ``i32``, returning |
|                              | an ``i32``.                                                   |
+------------------------------+---------------------------------------------------------------+
| ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                  |
+------------------------------+---------------------------------------------------------------+

.. _t_opaque:

Opaque Structure Types
""""""""""""""""""""""

:Overview:

Opaque structure types are used to represent named structure types that
do not have a body specified. This corresponds (for example) to the C
notion of a forward declared structure.

:Syntax:

::

      %X = type opaque
      %52 = type opaque

:Examples:

+--------------+-------------------+
| ``opaque``   | An opaque type.   |
+--------------+-------------------+

.. _constants:

Constants
=========

LLVM has several different basic types of constants. This section
describes them all and their syntax.

Simple Constants
----------------

**Boolean constants**
    The two strings '``true``' and '``false``' are both valid constants
    of the ``i1`` type.
**Integer constants**
    Standard integers (such as '4') are constants of the
    :ref:`integer <t_integer>` type. Negative numbers may be used with
    integer types.
**Floating-point constants**
    Floating-point constants use standard decimal notation (e.g.
    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
    hexadecimal notation (see below). The assembler requires the exact
    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
    decimal in binary. Floating-point constants must have a
    :ref:`floating-point <t_floating>` type.
**Null pointer constants**
    The identifier '``null``' is recognized as a null pointer constant
    and must be of :ref:`pointer type <t_pointer>`.
**Token constants**
    The identifier '``none``' is recognized as an empty token constant
    and must be of :ref:`token type <t_token>`.

The one non-intuitive notation for constants is the hexadecimal form of
floating-point constants. For example, the form
'``double    0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'. The only time hexadecimal floating-point
constants are required (and the only time that they are generated by the
disassembler) is when a floating-point constant must be emitted but it
cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.

When using the hexadecimal form, constants of types half, float, and
double are represented using the 16-digit form shown above (which
matches the IEEE 754 representation for double); half and float values
must, however, be exactly representable as IEEE 754 half and single
precision, respectively. Hexadecimal format is always used for long
double, and there are three forms of long double. The 80-bit format used
by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The
128-bit format used by PowerPC (two adjacent doubles) is represented by
``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is
represented by ``0xL`` followed by 32 hexadecimal digits. Long doubles
will only work if they match the long double format on your target.
The IEEE 16-bit format (half precision) is represented by ``0xH``
followed by 4 hexadecimal digits. All hexadecimal formats are big-endian
(sign bit at the left).

There are no constants of type x86_mmx.

.. _complexconstants:

Complex Constants
-----------------

Complex constants are a (potentially recursive) combination of simple
constants and smaller complex constants.

**Structure constants**
    Structure constants are represented with notation similar to
    structure type definitions (a comma separated list of elements,
    surrounded by braces (``{}``)). For example:
    "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
    "``@G = external global i32``". Structure constants must have
    :ref:`structure type <t_struct>`, and the number and types of elements
    must match those specified by the type.
**Array constants**
    Array constants are represented with notation similar to array type
    definitions (a comma separated list of elements, surrounded by
    square brackets (``[]``)). For example:
    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
    :ref:`array type <t_array>`, and the number and types of elements must
    match those specified by the type. As a special case, character array
    constants may also be represented as a double-quoted string using the ``c``
    prefix. For example: "``c"Hello World\0A\00"``".
**Vector constants**
    Vector constants are represented with notation similar to vector
    type definitions (a comma separated list of elements, surrounded by
    less-than/greater-than's (``<>``)). For example:
    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
    must have :ref:`vector type <t_vector>`, and the number and types of
    elements must match those specified by the type.
**Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
**Metadata node**
    A metadata node is a constant tuple without types. For example:
    "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
    for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
    Unlike other typed constants that are meant to be interpreted as part of
    the instruction stream, metadata is a place to attach additional
    information such as debug info.
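
To see the complex constant forms in context, the sketch below uses each one
as a global initializer. The global names (``@s``, ``@arr``, ``@str``,
``@vec``, ``@big``) and the named type ``%S`` are hypothetical; only ``@G``
and the constant literals are taken from the descriptions above.

.. code-block:: llvm

    @G = external global i32

    %S = type { i32, float, i32* }

    ; Structure constant: element count and types match %S.
    @s   = global %S { i32 4, float 17.0, i32* @G }
    ; Array constant with three i32 elements.
    @arr = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    ; Character array written with the c"..." shorthand: 11 characters
    ; plus the two escaped bytes \0A and \00 give 13 elements.
    @str = constant [13 x i8] c"Hello World\0A\00"
    ; Vector constant with four i32 elements.
    @vec = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >
    ; zeroinitializer stands in for 1024 explicit zero elements.
    @big = global [1024 x i32] zeroinitializer

    ; Metadata node mixing a typed constant, a global address and a string.
    !0 = !{i32 0, i32* @G, !"str"}

Note that the array length in the type of ``@str`` has to count the escaped
bytes as single elements.
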

Global Variable and Function Addresses
--------------------------------------

The addresses of :ref:`global variables <globalvars>` and
:ref:`functions <functionstructure>` are always implicitly valid
(link-time) constants. These constants are explicitly referenced when
the :ref:`identifier for the global <identifiers>` is used and always have
:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
file:

.. code-block:: llvm

    @X = global i32 17
    @Y = global i32 42
    @Z = global [2 x i32*] [ i32* @X, i32* @Y ]

.. _undefvalues:

Undefined Values
----------------

The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type (other than '``label``'
or '``void``') and may be used anywhere a constant is permitted.

Undefined values are useful because they indicate to the compiler that
the program is well defined no matter what value is used. This gives the
compiler more freedom to optimize. Here are some examples of
(potentially surprising) transformations that are valid (in pseudo IR):

.. code-block:: llvm

      %A = add %X, undef
      %B = sub %X, undef
      %C = xor %X, undef
    Safe:
      %A = undef
      %B = undef
      %C = undef

This is safe because all of the output bits are affected by the undef
bits. Any output bit can have a zero or one depending on the input bits.

.. code-block:: llvm

      %A = or %X, undef
      %B = and %X, undef
    Safe:
      %A = -1
      %B = 0
    Safe:
      %A = %X  ;; By choosing undef as 0
      %B = %X  ;; By choosing undef as -1
    Unsafe:
      %A = undef
      %B = undef

These logical operations have bits that are not always affected by the
input. For example, if ``%X`` has a zero bit, then the output of the
'``and``' operation will always be a zero for that bit, no matter what
the corresponding bit from the '``undef``' is. As such, it is unsafe to
optimize or assume that the result of the '``and``' is '``undef``'.
However, it is safe to assume that all bits of the '``undef``' could be
0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
all the bits of the '``undef``' operand to the '``or``' could be set,
allowing the '``or``' to be folded to -1.

.. code-block:: llvm

      %A = select undef, %X, %Y
      %B = select undef, 42, %Y
      %C = select %X, %Y, undef
    Safe:
      %A = %X     (or %Y)
      %B = 42     (or %Y)
      %C = %Y
    Unsafe:
      %A = undef
      %B = undef
      %C = undef

This set of examples shows that undefined '``select``' (and conditional
branch) conditions can go *either way*, but they have to come from one
of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
both known to have a clear low bit, then ``%A`` would have to have a
cleared low bit. However, in the ``%C`` example, the optimizer is
allowed to assume that the '``undef``' operand could be the same as
``%Y``, allowing the whole '``select``' to be eliminated.

.. code-block:: text

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef

This example points out that two '``undef``' operands are not
necessarily the same. This can be surprising to people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (it also matches
C semantics). This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.

.. code-block:: llvm

      %A = sdiv undef, %X
      %B = sdiv %X, undef
    Safe:
      %A = 0
    b: unreachable

These examples show the crucial difference between an *undefined value*
and *undefined behavior*. An undefined value (like '``undef``') is
allowed to have an arbitrary bit-pattern. This means that the ``%A``
operation can be constant folded to '``0``', because the '``undef``'
could be zero, and zero divided by any value is zero.
However, in the second example, we can make a more aggressive
assumption: because the ``undef`` is allowed to be an arbitrary value,
we are allowed to assume that it could be zero. Since a divide by zero
has *undefined behavior*, we are allowed to assume that the operation
does not execute at all. This allows us to delete the divide and all
code after it. Because the undefined operation "can't happen", the
optimizer can assume that it occurs in dead code.

.. code-block:: text

    a:  store undef -> %X
    b:  store %X -> undef
    Safe:
    a: <deleted>
    b: unreachable

A store *of* an undefined value can be assumed to not have any effect;
we can assume that the value is overwritten with bits that happen to
match what was already there. However, a store *to* an undefined
location could clobber arbitrary memory, so it has undefined
behavior.

.. _poisonvalues:

Poison Values
-------------

Poison values are similar to :ref:`undef values <undefvalues>`; however,
they also represent the fact that an instruction or constant expression
that cannot evoke side effects has nevertheless detected a condition
that results in undefined behavior.

There is currently no way of representing a poison value in the IR; they
only exist when produced by operations such as :ref:`add <i_add>` with
the ``nsw`` flag.

Poison value behavior is defined in terms of value *dependence*:

-  Values other than :ref:`phi <i_phi>` nodes depend on their operands.
-  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
   their dynamic predecessor basic block.
-  Function arguments depend on the corresponding actual argument values
   in the dynamic callers of their functions.
-  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
   instructions that dynamically transfer control back to them.
-  :ref:`Invoke <i_invoke>` instructions depend on the
   :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
   call instructions that dynamically transfer control back to them.
-  Non-volatile loads and stores depend on the most recent stores to all
   of the referenced memory addresses, following the order in the IR
   (including loads and stores implied by intrinsics such as
   :ref:`@llvm.memcpy <int_memcpy>`.)
-  An instruction with externally visible side effects depends on the
   most recent preceding instruction with externally visible side
   effects, following the order in the IR. (This includes :ref:`volatile
   operations <volatile>`.)
-  An instruction *control-depends* on a :ref:`terminator
   instruction <terminators>` if the terminator instruction has
   multiple successors and the instruction is always executed when
   control transfers to one of the successors, and may not be executed
   when control is transferred to another.
-  Additionally, an instruction also *control-depends* on a terminator
   instruction if the set of instructions it otherwise depends on would
   be different if the terminator had transferred control to a different
   successor.
-  Dependence is transitive.

Poison values have the same behavior as :ref:`undef values <undefvalues>`,
with the additional effect that any instruction that has a *dependence*
on a poison value has undefined behavior.

Here are some examples:

.. code-block:: llvm

    entry:
      %poison = sub nuw i32 0, 1           ; Results in a poison value.
      %still_poison = and i32 %poison, 0   ; 0, but also poison.
      %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
      store i32 0, i32* %poison_yet_again  ; memory at @h[0] is poisoned

      store i32 %poison, i32* @g           ; Poison value stored to memory.
      %poison2 = load i32, i32* @g         ; Poison value loaded back from memory.

      store volatile i32 %poison, i32* @g  ; External observation; undefined behavior.

      %narrowaddr = bitcast i32* @g to i16*
      %wideaddr = bitcast i32* @g to i64*
      %poison3 = load i16, i16* %narrowaddr ; Returns a poison value.
      %poison4 = load i64, i64* %wideaddr  ; Returns a poison value.

      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
      br i1 %cmp, label %true, label %end  ; Branch to either destination.

    true:
      store volatile i32 0, i32* @g        ; This is control-dependent on %cmp, so
                                           ; it has undefined behavior.
      br label %end

    end:
      %p = phi i32 [ 0, %entry ], [ 1, %true ]
                                           ; Both edges into this PHI are
                                           ; control-dependent on %cmp, so this
                                           ; always results in a poison value.

      store volatile i32 0, i32* @g        ; This would depend on the store in %true
                                           ; if %cmp is true, or the store in %entry
                                           ; otherwise, so this is undefined behavior.

      br i1 %cmp, label %second_true, label %second_end
                                           ; The same branch again, but this time the
                                           ; true block doesn't have side effects.

    second_true:
      ; No side effects!
      ret void

    second_end:
      store volatile i32 0, i32* @g        ; This time, the instruction always depends
                                           ; on the store in %end. Also, it is
                                           ; control-equivalent to %end, so this is
                                           ; well-defined (ignoring earlier undefined
                                           ; behavior in this example).

.. _blockaddress:

Addresses of Basic Blocks
-------------------------

``blockaddress(@function, %block)``

The '``blockaddress``' constant computes the address of the specified
basic block in the specified function, and always has an ``i8*`` type.
Taking the address of the entry block is illegal.

This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons
against null. Pointer equality tests between label addresses result in
undefined behavior, though, again, comparison against null is ok, and
no label is equal to the null pointer. This may be passed around as an
opaque pointer sized value as long as the bits are not inspected. This
allows ``ptrtoint`` and arithmetic to be performed on these values so
long as the original value is reconstituted before the ``indirectbr``
instruction.

Finally, some targets may provide defined semantics when using the value
as the operand to an inline assembly, but that is target specific.

.. _constantexprs:

Constant Expressions
--------------------

Constant expressions are used to allow expressions involving other
constants to be used as constants. Constant expressions may be of any
:ref:`first class <t_firstclass>` type and may involve any LLVM operation
that does not have side effects (e.g. load and call are not supported).
The following is the syntax for constant expressions:

``trunc (CST to TYPE)``
    Perform the :ref:`trunc operation <i_trunc>` on constants.
``zext (CST to TYPE)``
    Perform the :ref:`zext operation <i_zext>` on constants.
``sext (CST to TYPE)``
    Perform the :ref:`sext operation <i_sext>` on constants.
``fptrunc (CST to TYPE)``
    Truncate a floating-point constant to another floating-point type.
    The size of CST must be larger than the size of TYPE. Both types
    must be floating-point.
``fpext (CST to TYPE)``
    Floating-point extend a constant to another type. The size of CST
    must be smaller or equal to the size of TYPE. Both types must be
    floating-point.
``fptoui (CST to TYPE)``
    Convert a floating-point constant to the corresponding unsigned
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating-point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the result is a
    :ref:`poison value <poisonvalues>`.
``fptosi (CST to TYPE)``
    Convert a floating-point constant to the corresponding signed
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating-point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the result is a
    :ref:`poison value <poisonvalues>`.
``uitofp (CST to TYPE)``
    Convert an unsigned integer constant to the corresponding
    floating-point constant. TYPE must be a scalar or vector floating-point
    type. CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements.
``sitofp (CST to TYPE)``
    Convert a signed integer constant to the corresponding floating-point
    constant. TYPE must be a scalar or vector floating-point type.
    CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements.
``ptrtoint (CST to TYPE)``
    Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
``inttoptr (CST to TYPE)``
    Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
    This one is *really* dangerous!
``bitcast (CST to TYPE)``
    Convert a constant, CST, to another TYPE.
    The constraints of the operands are the same as those for the
    :ref:`bitcast instruction <i_bitcast>`.
``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
    Perform the :ref:`getelementptr operation <i_getelementptr>` on
    constants. As with the :ref:`getelementptr <i_getelementptr>`
    instruction, the index list may have one or more indexes, which are
    required to make sense for the type of "pointer to TY".
``select (COND, VAL1, VAL2)``
    Perform the :ref:`select operation <i_select>` on constants.
``icmp COND (VAL1, VAL2)``
    Perform the :ref:`icmp operation <i_icmp>` on constants.
``fcmp COND (VAL1, VAL2)``
    Perform the :ref:`fcmp operation <i_fcmp>` on constants.
``extractelement (VAL, IDX)``
    Perform the :ref:`extractelement operation <i_extractelement>` on
    constants.
``insertelement (VAL, ELT, IDX)``
    Perform the :ref:`insertelement operation <i_insertelement>` on
    constants.
``shufflevector (VEC1, VEC2, IDXMASK)``
    Perform the :ref:`shufflevector operation <i_shufflevector>` on
    constants.
``extractvalue (VAL, IDX0, IDX1, ...)``
    Perform the :ref:`extractvalue operation <i_extractvalue>` on
    constants. The index list is interpreted in a similar manner as
    indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
    least one index value must be specified.
``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
    Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
    The index list is interpreted in a similar manner as indices in a
    ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
    value must be specified.
``OPCODE (LHS, RHS)``
    Perform the specified operation on the LHS and RHS constants. OPCODE
    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
    binary <bitwiseops>` operations. The constraints on operands are
    the same as those for the corresponding instruction (e.g. no bitwise
    operations on floating-point values are allowed).

Other Values
============

.. _inlineasmexprs:

Inline Assembler Expressions
----------------------------

LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
Inline Assembly <moduleasm>`) through the use of a special value. This value
represents the inline assembler as a template string (containing the
instructions to emit), a list of operand constraints (stored as a string), a
flag that indicates whether or not the inline asm expression has side effects,
and a flag indicating whether the function containing the asm needs to align its
stack conservatively.
3387 3388The template string supports argument substitution of the operands using "``$``" 3389followed by a number, to indicate substitution of the given register/memory 3390location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also 3391be used, where ``MODIFIER`` is a target-specific annotation for how to print the 3392operand (See :ref:`inline-asm-modifiers`). 3393 3394A literal "``$``" may be included by using "``$$``" in the template. To include 3395other special characters into the output, the usual "``\XX``" escapes may be 3396used, just as in other strings. Note that after template substitution, the 3397resulting assembly string is parsed by LLVM's integrated assembler unless it is 3398disabled -- even when emitting a ``.s`` file -- and thus must contain assembly 3399syntax known to LLVM. 3400 3401LLVM also supports a few more substitions useful for writing inline assembly: 3402 3403- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob. 3404 This substitution is useful when declaring a local label. Many standard 3405 compiler optimizations, such as inlining, may duplicate an inline asm blob. 3406 Adding a blob-unique identifier ensures that the two labels will not conflict 3407 during assembly. This is used to implement `GCC's %= special format 3408 string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_. 3409- ``${:comment}``: Expands to the comment character of the current target's 3410 assembly dialect. This is usually ``#``, but many targets use other strings, 3411 such as ``;``, ``//``, or ``!``. 3412- ``${:private}``: Expands to the assembler private label prefix. Labels with 3413 this prefix will not appear in the symbol table of the assembled object. 3414 Typically the prefix is ``L``, but targets may use other strings. ``.L`` is 3415 relatively popular. 3416 3417LLVM's support for inline asm is modeled closely on the requirements of Clang's 3418GCC-compatible inline-asm support. 
Thus, the feature-set and the constraint and 3419modifier codes listed here are similar or identical to those in GCC's inline asm 3420support. However, to be clear, the syntax of the template and constraint strings 3421described here is *not* the same as the syntax accepted by GCC and Clang, and, 3422while most constraint letters are passed through as-is by Clang, some get 3423translated to other codes when converting from the C source to the LLVM 3424assembly. 3425 3426An example inline assembler expression is: 3427 3428.. code-block:: llvm 3429 3430 i32 (i32) asm "bswap $0", "=r,r" 3431 3432Inline assembler expressions may **only** be used as the callee operand 3433of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction. 3434Thus, typically we have: 3435 3436.. code-block:: llvm 3437 3438 %X = call i32 asm "bswap $0", "=r,r"(i32 %Y) 3439 3440Inline asms with side effects not visible in the constraint list must be 3441marked as having side effects. This is done through the use of the 3442'``sideeffect``' keyword, like so: 3443 3444.. code-block:: llvm 3445 3446 call void asm sideeffect "eieio", ""() 3447 3448In some cases inline asms will contain code that will not work unless 3449the stack is aligned in some way, such as calls or SSE instructions on 3450x86, yet will not contain code that does that alignment within the asm. 3451The compiler should make conservative assumptions about what the asm 3452might contain and should generate its usual stack alignment code in the 3453prologue if the '``alignstack``' keyword is present: 3454 3455.. code-block:: llvm 3456 3457 call void asm alignstack "eieio", ""() 3458 3459Inline asms also support using non-standard assembly dialects. The 3460assumed dialect is ATT. When the '``inteldialect``' keyword is present, 3461the inline asm is using the Intel dialect. Currently, ATT and Intel are 3462the only supported dialects. An example is: 3463 3464.. 
code-block:: llvm 3465 3466 call void asm inteldialect "eieio", ""() 3467 3468If multiple keywords appear the '``sideeffect``' keyword must come 3469first, the '``alignstack``' keyword second and the '``inteldialect``' 3470keyword last. 3471 3472Inline Asm Constraint String 3473^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 3474 3475The constraint list is a comma-separated string, each element containing one or 3476more constraint codes. 3477 3478For each element in the constraint list an appropriate register or memory 3479operand will be chosen, and it will be made available to assembly template 3480string expansion as ``$0`` for the first constraint in the list, ``$1`` for the 3481second, etc. 3482 3483There are three different types of constraints, which are distinguished by a 3484prefix symbol in front of the constraint code: Output, Input, and Clobber. The 3485constraints must always be given in that order: outputs first, then inputs, then 3486clobbers. They cannot be intermingled. 3487 3488There are also three different categories of constraint codes: 3489 3490- Register constraint. This is either a register class, or a fixed physical 3491 register. This kind of constraint will allocate a register, and if necessary, 3492 bitcast the argument or result to the appropriate type. 3493- Memory constraint. This kind of constraint is for use with an instruction 3494 taking a memory operand. Different constraints allow for different addressing 3495 modes used by the target. 3496- Immediate value constraint. This kind of constraint is for an integer or other 3497 immediate value which can be rendered directly into an instruction. The 3498 various target-specific constraints allow the selection of a value in the 3499 proper range for the instruction you wish to use it with. 3500 3501Output constraints 3502"""""""""""""""""" 3503 3504Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). 
This 3505indicates that the assembly will write to this operand, and the operand will 3506then be made available as a return value of the ``asm`` expression. Output 3507constraints do not consume an argument from the call instruction. (Except, see 3508below about indirect outputs). 3509 3510Normally, it is expected that no output locations are written to by the assembly 3511expression until *all* of the inputs have been read. As such, LLVM may assign 3512the same register to an output and an input. If this is not safe (e.g. if the 3513assembly contains two instructions, where the first writes to one output, and 3514the second reads an input and writes to a second output), then the "``&``" 3515modifier must be used (e.g. "``=&r``") to specify that the output is an 3516"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM 3517will not use the same register for any inputs (other than an input tied to this 3518output). 3519 3520Input constraints 3521""""""""""""""""" 3522 3523Input constraints do not have a prefix -- just the constraint codes. Each input 3524constraint will consume one argument from the call instruction. It is not 3525permitted for the asm to write to any input register or memory location (unless 3526that input is tied to an output). Note also that multiple inputs may all be 3527assigned to the same register, if LLVM can determine that they necessarily all 3528contain the same value. 3529 3530Instead of providing a Constraint Code, input constraints may also "tie" 3531themselves to an output constraint, by providing an integer as the constraint 3532string. Tied inputs still consume an argument from the call instruction, and 3533take up a position in the asm template numbering as is usual -- they will simply 3534be constrained to always use the same register as the output they've been tied 3535to. 
For example, a constraint string of "``=r,0``" says to assign a register for
output, and use that register as an input as well (it being the 0'th
constraint).

It is permitted to tie an input to an "early-clobber" output. In that case, no
*other* input may share the same register as the input tied to the early-clobber
(even when the other input has the same value).

You may only tie an input to an output which has a register constraint, not a
memory constraint. Only a single input may be tied to an output.

There is also an "interesting" feature which deserves a bit of explanation: if a
register class constraint allocates a register which is too small for the value
type operand provided as input, the input value will be split into multiple
registers, and all of them passed to the inline asm.

However, this feature is often not as useful as you might think.

Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (e.g. the 32-bit
SparcV8 has a 64-bit load instruction which takes a single 32-bit register. The
hardware then loads into both the named register and the next register. This
feature of inline asm would not be useful to support that.)

A few of the targets provide a template string modifier allowing explicit access
to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
``D``). On such an architecture, you can actually access the second allocated
register (yet, still, not any subsequent ones). But, in that case, you're still
probably better off simply splitting the value into two separate operands, for
clarity. (e.g.
see the description of the ``A`` constraint on X86, which, 3566despite existing only for use with this feature, is not really a good idea to 3567use) 3568 3569Indirect inputs and outputs 3570""""""""""""""""""""""""""" 3571 3572Indirect output or input constraints can be specified by the "``*``" modifier 3573(which goes after the "``=``" in case of an output). This indicates that the asm 3574will write to or read from the contents of an *address* provided as an input 3575argument. (Note that in this way, indirect outputs act more like an *input* than 3576an output: just like an input, they consume an argument of the call expression, 3577rather than producing a return value. An indirect output constraint is an 3578"output" only in that the asm is expected to write to the contents of the input 3579memory location, instead of just read from it). 3580 3581This is most typically used for memory constraint, e.g. "``=*m``", to pass the 3582address of a variable as a value. 3583 3584It is also possible to use an indirect *register* constraint, but only on output 3585(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output 3586value normally, and then, separately emit a store to the address provided as 3587input, after the provided inline asm. (It's not clear what value this 3588functionality provides, compared to writing the store explicitly after the asm 3589statement, and it can only produce worse code, since it bypasses many 3590optimization passes. I would recommend not using it.) 3591 3592 3593Clobber constraints 3594""""""""""""""""""" 3595 3596A clobber constraint is indicated by a "``~``" prefix. A clobber does not 3597consume an input operand, nor generate an output. Clobbers cannot use any of the 3598general constraint code letters -- they may use only explicit register 3599constraints, e.g. "``~{eax}``". 
The one exception is that a clobber string of
"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
memory locations -- not only the memory pointed to by a declared indirect
output.

Note that clobbering named registers that are also present in output
constraints is not legal.


Constraint Codes
""""""""""""""""

After a potential prefix comes the constraint code, or codes.

A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
(e.g. "``{eax}``").

The one and two letter constraint codes are typically chosen to be the same as
GCC's constraint codes.

A single constraint may include one or more constraint codes, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from clang.

There are two ways to specify alternatives, and either or both may be used in an
inline asm constraint list:

1) Append the codes to each other, making a constraint code set. E.g. "``im``"
   or "``{eax}m``". This means "choose any of the options in the set". The
   choice of constraint is made independently for each constraint in the
   constraint list.

2) Use "``|``" between constraint code sets, creating alternatives. Every
   constraint in the constraint list must have the same number of alternative
   sets. With this syntax, the same alternative in *all* of the items in the
   constraint list will be chosen together.

Putting those together, you might have a two operand constraint string like
``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But operands 0 and 1 cannot both be of type ``m``.
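As a rough illustration of how a constraint code set sits alongside the other
pieces of a constraint string, the sketch below assumes an x86-64 target and
AT&T syntax. The function name ``@add_asm`` and the choice of ``addl`` are
hypothetical and are not taken from this document. Operand 0 is a register
output (``=r``), operand 1 is an input tied to it (``0``), operand 2 uses the
code set ``ri``, and the trailing entries are clobbers:

.. code-block:: llvm

    define i32 @add_asm(i32 %a, i32 %b) {
    entry:
      ; $0 = output register, $1 = %a (tied to $0), $2 = %b (register or immediate)
      %sum = call i32 asm "addl $2, $0", "=r,0,ri,~{dirflag},~{fpsr},~{flags}"(i32 %a, i32 %b)
      ret i32 %sum
    }

Writing the third constraint as plain ``r`` instead of ``ri`` would avoid
leaving the choice to LLVM.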
3640 3641However, the use of either of the alternatives features is *NOT* recommended, as 3642LLVM is not able to make an intelligent choice about which one to use. (At the 3643point it currently needs to choose, not enough information is available to do so 3644in a smart way.) Thus, it simply tries to make a choice that's most likely to 3645compile, not one that will be optimal performance. (e.g., given "``rm``", it'll 3646always choose to use memory, not registers). And, if given multiple registers, 3647or multiple register classes, it will simply choose the first one. (In fact, it 3648doesn't currently even ensure explicitly specified physical registers are 3649unique, so specifying multiple physical registers as alternatives, like 3650``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was 3651intended.) 3652 3653Supported Constraint Code List 3654"""""""""""""""""""""""""""""" 3655 3656The constraint codes are, in general, expected to behave the same way they do in 3657GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C 3658inline asm code which was supported by GCC. A mismatch in behavior between LLVM 3659and GCC likely indicates a bug in LLVM. 3660 3661Some constraint codes are typically supported by all targets: 3662 3663- ``r``: A register in the target's general purpose register class. 3664- ``m``: A memory address operand. It is target-specific what addressing modes 3665 are supported, typical examples are register, or register + register offset, 3666 or register + immediate offset (of some target-specific size). 3667- ``i``: An integer constant (of target-specific width). Allows either a simple 3668 immediate, or a relocatable value. 3669- ``n``: An integer constant -- *not* including relocatable values. 3670- ``s``: An integer constant, but allowing *only* relocatable values. 3671- ``X``: Allows an operand of any kind, no constraint whatsoever. 
Typically 3672 useful to pass a label for an asm branch or call. 3673 3674 .. FIXME: but that surely isn't actually okay to jump out of an asm 3675 block without telling llvm about the control transfer???) 3676 3677- ``{register-name}``: Requires exactly the named physical register. 3678 3679Other constraints are target-specific: 3680 3681AArch64: 3682 3683- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate. 3684- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction, 3685 i.e. 0 to 4095 with optional shift by 12. 3686- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or 3687 ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12. 3688- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a 3689 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register. 3690- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a 3691 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register. 3692- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a 3693 32-bit register. This is a superset of ``K``: in addition to the bitmask 3694 immediate, also allows immediate integers which can be loaded with a single 3695 ``MOVZ`` or ``MOVL`` instruction. 3696- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a 3697 64-bit register. This is a superset of ``L``. 3698- ``Q``: Memory address operand must be in a single register (no 3699 offsets). (However, LLVM currently does this for the ``m`` constraint as 3700 well.) 3701- ``r``: A 32 or 64-bit integer register (W* or X*). 3702- ``w``: A 32, 64, or 128-bit floating-point/SIMD register. 3703- ``x``: A lower 128-bit floating-point/SIMD register (``V0`` to ``V15``). 3704 3705AMDGPU: 3706 3707- ``r``: A 32 or 64-bit integer register. 3708- ``[0-9]v``: The 32-bit VGPR register, number 0-9. 3709- ``[0-9]s``: The 32-bit SGPR register, number 0-9. 


All ARM modes:

- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
  operand. Treated the same as operand ``m``, at the moment.

ARM and ARM's Thumb2 mode:

- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
- ``I``: An immediate integer valid for a data-processing instruction.
- ``J``: An immediate integer between -4095 and 4095.
- ``K``: An immediate integer whose bitwise inverse is valid for a
  data-processing instruction. (Can be used with template modifier "``B``" to
  print the inverted value).
- ``L``: An immediate integer whose negation is valid for a data-processing
  instruction. (Can be used with template modifier "``n``" to print the negated
  value).
- ``M``: A power of two or an integer between 0 and 32.
- ``N``: Invalid immediate constraint.
- ``O``: Invalid immediate constraint.
- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
  as ``r``.
- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
  invalid.
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
  ``d0-d31``, or ``q0-q15``.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
  ``d0-d7``, or ``q0-q3``.
- ``t``: A low floating-point/SIMD register: ``s0-s31``, ``d0-d16``, or
  ``q0-q8``.

ARM's Thumb1 mode:

- ``I``: An immediate integer between 0 and 255.
- ``J``: An immediate integer between -255 and -1.
- ``K``: An immediate integer between 0 and 255, with optional left-shift by
  some amount.
- ``L``: An immediate integer between -7 and 7.
- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
- ``N``: An immediate integer between 0 and 31.
- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
- ``r``: A low 32-bit GPR register (``r0-r7``).
- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
  ``d0-d31``, or ``q0-q15``.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
  ``d0-d7``, or ``q0-q3``.
- ``t``: A low floating-point/SIMD register: ``s0-s31``, ``d0-d16``, or
  ``q0-q8``.


Hexagon:

- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
  at the moment.
- ``r``: A 32 or 64-bit register.

MSP430:

- ``r``: An 8 or 16-bit register.

MIPS:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate integer zero.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
- ``N``: An immediate integer between -65535 and -1.
- ``O``: An immediate signed 15-bit integer.
- ``P``: An immediate integer between 1 and 65535.
- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus 16-bit immediate offset. In MIPS mode, just a base register.
- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
  ``m``.
- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
  ``sc`` instruction on the given subtarget (details vary).
- ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
  argument modifier for compatibility with GCC.
- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
  ``25``).
3796- ``l``: The ``lo`` register, 32 or 64-bit. 3797- ``x``: Invalid. 3798 3799NVPTX: 3800 3801- ``b``: A 1-bit integer register. 3802- ``c`` or ``h``: A 16-bit integer register. 3803- ``r``: A 32-bit integer register. 3804- ``l`` or ``N``: A 64-bit integer register. 3805- ``f``: A 32-bit float register. 3806- ``d``: A 64-bit float register. 3807 3808 3809PowerPC: 3810 3811- ``I``: An immediate signed 16-bit integer. 3812- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits. 3813- ``K``: An immediate unsigned 16-bit integer. 3814- ``L``: An immediate signed 16-bit integer, shifted left 16 bits. 3815- ``M``: An immediate integer greater than 31. 3816- ``N``: An immediate integer that is an exact power of 2. 3817- ``O``: The immediate integer constant 0. 3818- ``P``: An immediate integer constant whose negation is a signed 16-bit 3819 constant. 3820- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently 3821 treated the same as ``m``. 3822- ``r``: A 32 or 64-bit integer register. 3823- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is: 3824 ``R1-R31``). 3825- ``f``: A 32 or 64-bit float register (``F0-F31``), or when QPX is enabled, a 3826 128 or 256-bit QPX register (``Q0-Q31``; aliases the ``F`` registers). 3827- ``v``: For ``4 x f32`` or ``4 x f64`` types, when QPX is enabled, a 3828 128 or 256-bit QPX register (``Q0-Q31``), otherwise a 128-bit 3829 altivec vector register (``V0-V31``). 3830 3831 .. FIXME: is this a bug that v accepts QPX registers? I think this 3832 is supposed to only use the altivec vector registers? 3833 3834- ``y``: Condition register (``CR0-CR7``). 3835- ``wc``: An individual CR bit in a CR register. 3836- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX 3837 register set (overlapping both the floating-point and vector register files). 3838- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register 3839 set. 
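To make the purpose of the ``b`` constraint concrete, here is a minimal sketch
assuming a PowerPC target; the function name ``@inc`` is hypothetical. In
``addi``, a base register field of ``R0`` is read as the literal value zero
rather than as the register's contents, so the input is constrained with ``b``
instead of ``r``:

.. code-block:: llvm

    define i32 @inc(i32 %x) {
    entry:
      ; "=r": any GPR for the result; "b": any GPR except R0 for the base operand
      %r = call i32 asm "addi $0, $1, 1", "=r,b"(i32 %x)
      ret i32 %r
    }
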
3840 3841Sparc: 3842 3843- ``I``: An immediate 13-bit signed integer. 3844- ``r``: A 32-bit integer register. 3845- ``f``: Any floating-point register on SparcV8, or a floating-point 3846 register in the "low" half of the registers on SparcV9. 3847- ``e``: Any floating-point register. (Same as ``f`` on SparcV8.) 3848 3849SystemZ: 3850 3851- ``I``: An immediate unsigned 8-bit integer. 3852- ``J``: An immediate unsigned 12-bit integer. 3853- ``K``: An immediate signed 16-bit integer. 3854- ``L``: An immediate signed 20-bit integer. 3855- ``M``: An immediate integer 0x7fffffff. 3856- ``Q``: A memory address operand with a base address and a 12-bit immediate 3857 unsigned displacement. 3858- ``R``: A memory address operand with a base address, a 12-bit immediate 3859 unsigned displacement, and an index register. 3860- ``S``: A memory address operand with a base address and a 20-bit immediate 3861 signed displacement. 3862- ``T``: A memory address operand with a base address, a 20-bit immediate 3863 signed displacement, and an index register. 3864- ``r`` or ``d``: A 32, 64, or 128-bit integer register. 3865- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an 3866 address context evaluates as zero). 3867- ``h``: A 32-bit value in the high part of a 64bit data register 3868 (LLVM-specific) 3869- ``f``: A 32, 64, or 128-bit floating-point register. 3870 3871X86: 3872 3873- ``I``: An immediate integer between 0 and 31. 3874- ``J``: An immediate integer between 0 and 64. 3875- ``K``: An immediate signed 8-bit integer. 3876- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only) 3877 0xffffffff. 3878- ``M``: An immediate integer between 0 and 3. 3879- ``N``: An immediate unsigned 8-bit integer. 3880- ``O``: An immediate integer between 0 and 127. 3881- ``e``: An immediate 32-bit signed integer. 3882- ``Z``: An immediate 32-bit unsigned integer. 3883- ``o``, ``v``: Treated the same as ``m``, at the moment. 
- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
  registers, and on X86-64, it is all of the integer registers.
- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
  existed since i386, and can be accessed without the REX prefix.
- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
- ``y``: A 64-bit MMX register, if MMX is enabled.
- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
  operand in a SSE register. If AVX is also enabled, can also be a 256-bit
  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX-512 register. Otherwise, an error.
- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
  32-bit mode, a 64-bit integer operand will get split into two registers). It
  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
  operand will get allocated only to RAX -- if two 32-bit operands are needed,
  you're better off splitting it yourself, before passing it to the asm
  statement.

XCore:

- ``r``: A 32-bit integer register.


.. _inline-asm-modifiers:

Asm template argument modifiers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the asm template string, modifiers can be used on the operand reference, like
"``${0:n}``".

The modifiers are, in general, expected to behave the same way they do in
GCC.
LLVM's support is often implemented on an 'as-needed' basis, to support C 3921inline asm code which was supported by GCC. A mismatch in behavior between LLVM 3922and GCC likely indicates a bug in LLVM. 3923 3924Target-independent: 3925 3926- ``c``: Print an immediate integer constant unadorned, without 3927 the target-specific immediate punctuation (e.g. no ``$`` prefix). 3928- ``n``: Negate and print immediate integer constant unadorned, without the 3929 target-specific immediate punctuation (e.g. no ``$`` prefix). 3930- ``l``: Print as an unadorned label, without the target-specific label 3931 punctuation (e.g. no ``$`` prefix). 3932 3933AArch64: 3934 3935- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g., 3936 instead of ``x30``, print ``w30``. 3937- ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow). 3938- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a 3939 ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of 3940 ``v*``. 3941 3942AMDGPU: 3943 3944- ``r``: No effect. 3945 3946ARM: 3947 3948- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a 3949 register). 3950- ``P``: No effect. 3951- ``q``: No effect. 3952- ``y``: Print a VFP single-precision register as an indexed double (e.g. print 3953 as ``d4[1]`` instead of ``s9``) 3954- ``B``: Bitwise invert and print an immediate integer constant without ``#`` 3955 prefix. 3956- ``L``: Print the low 16-bits of an immediate integer constant. 3957- ``M``: Print as a register set suitable for ldm/stm. Also prints *all* 3958 register operands subsequent to the specified one (!), so use carefully. 3959- ``Q``: Print the low-order register of a register-pair, or the low-order 3960 register of a two-register operand. 3961- ``R``: Print the high-order register of a register-pair, or the high-order 3962 register of a two-register operand. 3963- ``H``: Print the second register of a register-pair. 
(On a big-endian system, 3964 ``H`` is equivalent to ``Q``, and on little-endian system, ``H`` is equivalent 3965 to ``R``.) 3966 3967 .. FIXME: H doesn't currently support printing the second register 3968 of a two-register operand. 3969 3970- ``e``: Print the low doubleword register of a NEON quad register. 3971- ``f``: Print the high doubleword register of a NEON quad register. 3972- ``m``: Print the base register of a memory operand without the ``[`` and ``]`` 3973 adornment. 3974 3975Hexagon: 3976 3977- ``L``: Print the second register of a two-register operand. Requires that it 3978 has been allocated consecutively to the first. 3979 3980 .. FIXME: why is it restricted to consecutive ones? And there's 3981 nothing that ensures that happens, is there? 3982 3983- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise 3984 nothing. Used to print 'addi' vs 'add' instructions. 3985 3986MSP430: 3987 3988No additional modifiers. 3989 3990MIPS: 3991 3992- ``X``: Print an immediate integer as hexadecimal 3993- ``x``: Print the low 16 bits of an immediate integer as hexadecimal. 3994- ``d``: Print an immediate integer as decimal. 3995- ``m``: Subtract one and print an immediate integer as decimal. 3996- ``z``: Print $0 if an immediate zero, otherwise print normally. 3997- ``L``: Print the low-order register of a two-register operand, or prints the 3998 address of the low-order word of a double-word memory operand. 3999 4000 .. FIXME: L seems to be missing memory operand support. 4001 4002- ``M``: Print the high-order register of a two-register operand, or prints the 4003 address of the high-order word of a double-word memory operand. 4004 4005 .. FIXME: M seems to be missing memory operand support. 4006 4007- ``D``: Print the second register of a two-register operand, or prints the 4008 second word of a double-word memory operand. 
(On a big-endian system, ``D`` is 4009 equivalent to ``L``, and on little-endian system, ``D`` is equivalent to 4010 ``M``.) 4011- ``w``: No effect. Provided for compatibility with GCC which requires this 4012 modifier in order to print MSA registers (``W0-W31``) with the ``f`` 4013 constraint. 4014 4015NVPTX: 4016 4017- ``r``: No effect. 4018 4019PowerPC: 4020 4021- ``L``: Print the second register of a two-register operand. Requires that it 4022 has been allocated consecutively to the first. 4023 4024 .. FIXME: why is it restricted to consecutive ones? And there's 4025 nothing that ensures that happens, is there? 4026 4027- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise 4028 nothing. Used to print 'addi' vs 'add' instructions. 4029- ``y``: For a memory operand, prints formatter for a two-register X-form 4030 instruction. (Currently always prints ``r0,OPERAND``). 4031- ``U``: Prints 'u' if the memory operand is an update form, and nothing 4032 otherwise. (NOTE: LLVM does not support update form, so this will currently 4033 always print nothing) 4034- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does 4035 not support indexed form, so this will currently always print nothing) 4036 4037Sparc: 4038 4039- ``r``: No effect. 4040 4041SystemZ: 4042 4043SystemZ implements only ``n``, and does *not* support any of the other 4044target-independent modifiers. 4045 4046X86: 4047 4048- ``c``: Print an unadorned integer or symbol name. (The latter is 4049 target-specific behavior for this typically target-independent modifier). 4050- ``A``: Print a register name with a '``*``' before it. 4051- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory 4052 operand. 4053- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a 4054 memory operand. 4055- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory 4056 operand. 4057- ``k``: Print the 32-bit register name (e.g. 
  ``eax``); do nothing on a memory operand.
- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
  available, otherwise the 32-bit register name; do nothing on a memory operand.
- ``n``: Negate and print an unadorned integer, or, for operands other than an
  immediate integer (e.g. a relocatable symbol expression), print a '-' before
  the operand. (The behavior for relocatable symbol expressions is a
  target-specific behavior for this typically target-independent modifier)
- ``H``: Print a memory reference with additional offset +8.
- ``P``: Print a memory reference or operand for use as the argument of a call
  instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)

XCore:

No additional modifiers.


Inline Asm Metadata
^^^^^^^^^^^^^^^^^^^

A call instruction that wraps an inline asm node may have a
"``!srcloc``" MDNode attached to it that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:

.. code-block:: llvm

    call void asm sideeffect "something bad", ""(), !srcloc !42
    ...
    !42 = !{ i32 1234567 }

It is up to the front-end to make sense of the magic numbers it places
in the IR. If the MDNode contains multiple constants, the code generator
will use the one that corresponds to the line of the asm that the error
occurs on.

.. _metadata:

Metadata
========

LLVM IR allows metadata to be attached to instructions in the program
that can convey extra information about the code to the optimizers and
code generator.
One example application of metadata is source-level
debug information. There are two metadata primitives: strings and nodes.

Metadata does not have a type, and is not a value. If referenced from a
``call`` instruction, it uses the ``metadata`` type.

All metadata are identified in syntax by an exclamation point ('``!``').

.. _metadata-string:

Metadata Nodes and Metadata Strings
-----------------------------------

A metadata string is a string surrounded by double quotes. It can
contain any character by escaping non-printable characters with
"``\xx``" where "``xx``" is the two-digit hex code. For example:
"``!"test\00"``".

Metadata nodes are represented with notation similar to structure
constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operand. For example:

.. code-block:: llvm

    !{ !"test\00", i32 10}

Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:

.. code-block:: text

    !0 = distinct !{!"test\00", i32 10}

``distinct`` nodes are useful when nodes shouldn't be merged based on their
content. They can also occur when transformations cause uniquing collisions
when metadata operands change.

A :ref:`named metadata <namedmetadatastructure>` is a collection of
metadata nodes, which can be looked up in the module symbol table. For
example:

.. code-block:: llvm

    !foo = !{!4, !3}

Metadata can be used as function arguments. Here the ``llvm.dbg.value``
intrinsic is using three metadata arguments:

.. code-block:: llvm

    call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)

Metadata can be attached to an instruction.
Here metadata ``!21`` is attached
to the ``add`` instruction using the ``!dbg`` identifier:

.. code-block:: llvm

    %indvar.next = add i64 %indvar, 1, !dbg !21

Metadata can also be attached to a function or a global variable. Here metadata
``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
and ``g2`` using the ``!dbg`` identifier:

.. code-block:: llvm

    declare !dbg !22 void @f1()
    define void @f2() !dbg !22 {
      ret void
    }

    @g1 = global i32 0, !dbg !22
    @g2 = external global i32, !dbg !22

A transformation is required to drop any metadata attachment that it does not
know, or that it knows it can't preserve. Currently there is an exception for
metadata attachment to globals for ``!type`` and ``!absolute_symbol`` which
can't be unconditionally dropped unless the global is itself deleted.

Metadata attached to a module using named metadata may not be dropped, with
the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).

More information about specific metadata nodes recognized by the
optimizers and code generator is found below.

.. _specialized-metadata:

Specialized Metadata Nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^

Specialized metadata nodes are custom data structures in metadata (as opposed
to generic tuples). Their fields are labelled, and can be specified in any
order.

These aren't inherently debug info centric, but currently all the specialized
metadata nodes are related to debug info.

.. _DICompileUnit:

DICompileUnit
"""""""""""""

``DICompileUnit`` nodes represent a compile unit.
The ``enums:``,
``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
containing the debug info to be emitted along with the compile unit, regardless
of code optimizations (some nodes are only emitted if there are references to
them from instructions). The ``debugInfoForProfiling:`` field is a boolean
indicating whether or not line-table discriminators are updated to provide
more-accurate debug info for profiling results.

.. code-block:: text

    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
                        enums: !2, retainedTypes: !3, globals: !4, imports: !5,
                        macros: !6, dwoId: 0x0abcd)

Compile unit descriptors provide the root scope for objects declared in a
specific compilation unit. File descriptors are defined using this scope. These
descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
track of global variables, type information, and imported entities (declarations
and namespaces).

.. _DIFile:

DIFile
""""""

``DIFile`` nodes represent files. The ``filename:`` can include slashes.

.. code-block:: none

    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
                 checksumkind: CSK_MD5,
                 checksum: "000102030405060708090a0b0c0d0e0f")

Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.
Valid values for the ``checksumkind:`` field are ``CSK_None``, ``CSK_MD5`` and
``CSK_SHA1``.

.. _DIBasicType:

DIBasicType
"""""""""""

``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
``float``. ``tag:`` defaults to ``DW_TAG_base_type``.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")

The ``encoding:`` describes the details of the type. Usually it's one of the
following:

.. code-block:: text

    DW_ATE_address       = 1
    DW_ATE_boolean       = 2
    DW_ATE_float         = 4
    DW_ATE_signed        = 5
    DW_ATE_signed_char   = 6
    DW_ATE_unsigned      = 7
    DW_ATE_unsigned_char = 8

.. _DISubroutineType:

DISubroutineType
""""""""""""""""

``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
refers to a tuple; the first operand is the return type, while the rest are the
types of the formal arguments in order. If the first operand is ``null``, that
represents a function with no return value (such as ``void foo() {}`` in C++).

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8, encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)

.. _DIDerivedType:

DIDerivedType
"""""""""""""

``DIDerivedType`` nodes represent types derived from other types, such as
qualified types.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
                        align: 32)

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_member             = 13
    DW_TAG_pointer_type       = 15
    DW_TAG_reference_type     = 16
    DW_TAG_typedef            = 22
    DW_TAG_inheritance        = 28
    DW_TAG_ptr_to_member_type = 31
    DW_TAG_const_type         = 38
    DW_TAG_friend             = 42
    DW_TAG_volatile_type      = 53
    DW_TAG_restrict_type      = 55
    DW_TAG_atomic_type        = 71

.. _DIDerivedTypeMember:

``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.

``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
field of :ref:`composite types <DICompositeType>` to describe parents and
friends.

``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.

``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
``DW_TAG_volatile_type``, ``DW_TAG_restrict_type`` and ``DW_TAG_atomic_type``
are used to qualify the ``baseType:``.

Note that the ``void *`` type is expressed as a type derived from NULL.

.. _DICompositeType:

DICompositeType
"""""""""""""""

``DICompositeType`` nodes represent types composed of other types, like
structures and unions. ``elements:`` points to a tuple of the composed types.

If the source language supports ODR, the ``identifier:`` field gives the unique
identifier used for type merging between modules. When specified,
:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
derived types <DIDerivedTypeMember>` that reference the ODR-type in their
``scope:`` change uniquing rules.

For a given ``identifier:``, there should only be a single composite type that
does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
together will unique such definitions at parse time via the ``identifier:``
field, even if the nodes are ``distinct``.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 7)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
                          elements: !{!0, !1, !2})

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_array_type       = 1
    DW_TAG_class_type       = 2
    DW_TAG_enumeration_type = 4
    DW_TAG_structure_type   = 19
    DW_TAG_union_type       = 23

For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector.

For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
descriptors <DIEnumerator>`, each representing the definition of an enumeration
value for the set. All enumeration type descriptors are collected in the
``enums:`` field of the :ref:`compile unit <DICompileUnit>`.

For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
``isDefinition: false``.

.. _DISubrange:

DISubrange
""""""""""

``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
:ref:`DICompositeType`.

- ``count: -1`` indicates an empty array.
- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.

.. code-block:: llvm

    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
    !2 = !DISubrange(count: -1) ; empty array.

    ; Scopes used in rest of example
    !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
    !7 = distinct !DICompileUnit(language: DW_LANG_C99, ...
    !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5, ...

    ; Use of local variable as count value
    !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
    !11 = !DISubrange(count: !10, lowerBound: 0)

    ; Use of global variable as count value
    !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
    !13 = !DISubrange(count: !12, lowerBound: 0)

.. _DIEnumerator:

DIEnumerator
""""""""""""

``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
variants of :ref:`DICompositeType`.

.. code-block:: llvm

    !0 = !DIEnumerator(name: "SixKind", value: 7)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)

DITemplateTypeParameter
"""""""""""""""""""""""

``DITemplateTypeParameter`` nodes represent type parameters to generic source
language constructs. They are used (optionally) in :ref:`DICompositeType` and
:ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: llvm

    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)

DITemplateValueParameter
""""""""""""""""""""""""

``DITemplateValueParameter`` nodes represent value parameters to generic source
language constructs.
``tag:`` defaults to ``DW_TAG_template_value_parameter``,
but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: llvm

    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)

DINamespace
"""""""""""

``DINamespace`` nodes represent namespaces in the source language.

.. code-block:: llvm

    !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)

.. _DIGlobalVariable:

DIGlobalVariable
""""""""""""""""

``DIGlobalVariable`` nodes represent global variables in the source language.

.. code-block:: llvm

    !0 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !1,
                           file: !2, line: 7, type: !3, isLocal: true,
                           isDefinition: false, variable: i32* @foo,
                           declaration: !4)

All global variables should be referenced by the ``globals:`` field of a
:ref:`compile unit <DICompileUnit>`.

.. _DISubprogram:

DISubprogram
""""""""""""

``DISubprogram`` nodes represent functions from the source language. A
``DISubprogram`` may be attached to a function definition using ``!dbg``
metadata. The ``variables:`` field points at :ref:`variables <DILocalVariable>`
that must be retained, even if their IR counterparts are optimized out of
the IR. The ``type:`` field must point at a :ref:`DISubroutineType`.

.. _DISubprogramDeclaration:

When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function. If the scope is a composite
type with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``,
then the subprogram declaration is uniqued based only on its ``linkageName:``
and ``scope:``.

.. code-block:: text

    define void @_Z3foov() !dbg !0 {
      ...
    }

    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
                                file: !2, line: 7, type: !3, isLocal: true,
                                isDefinition: true, scopeLine: 8,
                                containingType: !4,
                                virtuality: DW_VIRTUALITY_pure_virtual,
                                virtualIndex: 10, flags: DIFlagPrototyped,
                                isOptimized: true, unit: !5, templateParams: !6,
                                declaration: !7, variables: !8, thrownTypes: !9)

.. _DILexicalBlock:

DILexicalBlock
""""""""""""""

``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line number and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.

.. code-block:: text

    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)

Usually lexical blocks are ``distinct`` to prevent node merging based on
operands.

.. _DILexicalBlockFile:

DILexicalBlockFile
""""""""""""""""""

``DILexicalBlockFile`` nodes are used to discriminate between sections of a
:ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
indicate textual inclusion, or the ``discriminator:`` field can be used to
discriminate between control flow within a single block in the source language.

.. code-block:: llvm

    !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
    !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
    !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)

.. _DILocation:

DILocation
""""""""""

``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.

.. code-block:: llvm

    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)

.. _DILocalVariable:

DILocalVariable
"""""""""""""""

``DILocalVariable`` nodes represent local variables in the source language. If
the ``arg:`` field is set to non-zero, then this variable is a subprogram
parameter, and it will be included in the ``variables:`` field of its
:ref:`DISubprogram`.

.. code-block:: text

    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
                          type: !3, flags: DIFlagArtificial)
    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
                          type: !3)
    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)

DIExpression
""""""""""""

``DIExpression`` nodes represent expressions that are inspired by the DWARF
expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
(such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
referenced LLVM variable relates to the source language variable. Debug
intrinsics are interpreted left-to-right: start by pushing the value/address
operand of the intrinsic onto a stack, then repeatedly push and evaluate
opcodes from the DIExpression until the final variable description is produced.

The currently supported opcode vocabulary is limited:

- ``DW_OP_deref`` dereferences the top of the expression stack.
- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
  them together and appends the result to the expression stack.
- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
  the last entry from the second last entry and appends the result to the
  expression stack.
- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
  here, respectively) of the variable fragment from the working expression. Note
  that, contrary to ``DW_OP_bit_piece``, the offset describes the location
  within the described source variable.
- ``DW_OP_swap`` swaps the top two stack entries.
- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the
  top of the stack is treated as an address. The second stack entry is treated
  as an address space identifier.
- ``DW_OP_stack_value`` marks a constant value.

DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions. Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
change over the course of the program. Register and memory location
descriptions describe the *concrete location* of a source variable (in the
sense that a debugger might modify its value), whereas *implicit locations*
describe merely the actual *value* of a source variable which might not exist
in registers or in memory (see ``DW_OP_stack_value``).

An ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect
value (the address) of a source variable. The first operand of the intrinsic
must be an address of some kind. A DIExpression attached to the intrinsic
refines this address to produce a concrete location for the source variable.

An ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
The first operand of the intrinsic may be a direct or indirect value. A
DIExpression attached to the intrinsic refines the first operand to produce a
direct value. For example, if the first operand is an indirect value, it may be
necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
valid debug intrinsic.

.. note::

   A DIExpression is interpreted in the same way regardless of which kind of
   debug intrinsic it's attached to.

.. code-block:: text

    !0 = !DIExpression(DW_OP_deref)
    !1 = !DIExpression(DW_OP_plus_uconst, 3)
    !2 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
    !3 = !DIExpression(DW_OP_bit_piece, 3, 7)
    !4 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
    !5 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
    !6 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)

DIObjCProperty
""""""""""""""

``DIObjCProperty`` nodes represent Objective-C property nodes.

.. code-block:: llvm

    !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
                         getter: "getFoo", attributes: 7, type: !2)

DIImportedEntity
""""""""""""""""

``DIImportedEntity`` nodes represent entities (such as modules) imported into a
compile unit.

.. code-block:: text

   !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
                          entity: !1, line: 7)

DIMacro
"""""""

``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is the
token-string used to expand the macro identifier.

.. code-block:: text

   !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
                 value: "((x) + 1)")
   !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")

DIMacroFile
"""""""""""

``DIMacroFile`` nodes represent inclusion of source files.
The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
appear in the included source file.

.. code-block:: text

   !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !2,
                     nodes: !3)

'``tbaa``' Metadata
^^^^^^^^^^^^^^^^^^^

In LLVM IR, memory does not have types, so LLVM's own type system is not
suitable for doing type based alias analysis (TBAA). Instead, metadata is
added to the IR to describe a type system of a higher level language. This
can be used to implement C/C++ strict type aliasing rules, but it can also
be used to implement custom alias analysis behavior for other languages.

This description of LLVM's TBAA system is broken into two parts:
:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
:ref:`Representation<tbaa_node_representation>` talks about the metadata
encoding of various entities.

It is always possible to trace any TBAA node to a "root" TBAA node (details
in the :ref:`Representation<tbaa_node_representation>` section). TBAA
nodes with different roots have an unknown aliasing relationship, and LLVM
conservatively infers ``MayAlias`` between them. The rules mentioned in
this section only pertain to TBAA nodes living under the same root.

.. _tbaa_node_semantics:

Semantics
"""""""""

The TBAA metadata system, referred to as "struct path TBAA" (not to be
confused with ``tbaa.struct``), consists of the following high level
concepts: *Type Descriptors*, further subdivided into scalar type
descriptors and struct type descriptors; and *Access Tags*.

**Type descriptors** describe the type system of the higher level language
being compiled. **Scalar type descriptors** describe types that do not
contain other types. Each scalar type has a parent type, which must also
be a scalar type or the TBAA root. Via this parent relation, scalar types
within a TBAA root form a tree.
**Struct type descriptors** denote types
that contain a sequence of other type descriptors, at known offsets. These
contained type descriptors can either be struct type descriptors themselves
or scalar type descriptors.

**Access tags** are metadata nodes attached to load and store instructions.
Access tags use type descriptors to describe the *location* being accessed
in terms of the type system of the higher level language. Access tags are
tuples consisting of a base type, an access type and an offset. The base
type is a scalar type descriptor or a struct type descriptor, the access
type is a scalar type descriptor, and the offset is a constant integer.

The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
things:

 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
   or store) of a value of type ``AccessTy`` contained in the struct type
   ``BaseTy`` at offset ``Offset``.

 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
   ``AccessTy`` must be the same; and the access tag describes a scalar
   access with scalar type ``AccessTy``.

We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
tuples this way:

 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
   described in the TBAA metadata. ``ImmediateParent(BaseTy, Offset)`` is
   undefined if ``Offset`` is non-zero.

 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
   to be relative within that inner type.

A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa.

As a concrete example, the type descriptor graph for the following program

.. code-block:: c

    struct Inner {
      int i;    // offset 0
      float f;  // offset 4
    };

    struct Outer {
      float f;  // offset 0
      double d; // offset 4
      struct Inner inner_a;  // offset 12
    };

    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
    }

is (note that in C and C++, ``char`` can be used to access any arbitrary
type):

.. code-block:: text

    Root = "TBAA Root"
    CharScalarTy = ("char", Root, 0)
    FloatScalarTy = ("float", CharScalarTy, 0)
    DoubleScalarTy = ("double", CharScalarTy, 0)
    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
                     (InnerStructTy, 12)}


with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.

.. _tbaa_node_representation:

Representation
""""""""""""""

The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
with exactly one ``MDString`` operand.

Scalar type descriptors are represented as ``MDNode`` s with two
operands.
The first operand is an ``MDString`` denoting the name of the
scalar type. LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``. The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root. Scalar
type descriptors can have an optional third argument, but that must be the
constant integer zero.

Struct type descriptors are represented as ``MDNode`` s with an odd number
of operands greater than 1. The first operand is an ``MDString`` denoting
the name of the struct type. As in scalar type descriptors, the actual
value of this name operand is irrelevant to LLVM. After the name operand,
the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands. With N starting from 1, the 2N - 1 th operand,
an ``MDNode``, denotes a contained field, and the 2N th operand, a
``ConstantInt``, is the offset of the said contained field. The offsets
must be in non-decreasing order.

Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
The first operand is an ``MDNode`` pointing to the node representing the
base type. The second operand is an ``MDNode`` pointing to the node
representing the access type. The third operand is a ``ConstantInt`` that
states the offset of the access. If a fourth field is present, it must be
a ``ConstantInt`` valued at 0 or 1. If it is 1 then the access tag states
that the location being accessed is "constant" (meaning
``pointsToConstantMemory`` should return true; see `other useful
AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). The TBAA root of
the access type and the base type of an access tag must be the same, and
that is the TBAA root of the access tag.
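
As an illustration of these encoding rules, the type descriptor graph and
access tags for the ``struct Outer`` program in the Semantics section could be
written roughly as follows. This is a hand-written sketch rather than compiler
output: the metadata numbering, the function ``@f`` and the parameter
``%outer_f`` (assumed to already point at ``Outer::f``) are chosen here for
illustration only, and the typed-pointer syntax follows the other examples in
this document.

.. code-block:: llvm

    define void @f(float* %outer_f, float* %f) {
    entry:
      ; outer->f = 0, assuming %outer_f already points at Outer::f (tag0)
      store float 0.0, float* %outer_f, align 4, !tbaa !7
      ; *f = 0.0 (tag3)
      store float 0.0, float* %f, align 4, !tbaa !10
      ret void
    }

    ; Root node: a single MDString operand
    !0 = !{!"TBAA Root"}

    ; Scalar type descriptors: name, parent, optional constant zero
    !1 = !{!"char", !0, i64 0}
    !2 = !{!"float", !1, i64 0}
    !3 = !{!"double", !1, i64 0}
    !4 = !{!"int", !1, i64 0}

    ; Struct type descriptors: name, then (field type, offset) pairs
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}

    ; Access tags: base type, access type, offset
    !7 = !{!6, !2, i64 0}    ; tag0: (OuterStructTy, FloatScalarTy, 0)
    !8 = !{!6, !4, i64 12}   ; tag1: (OuterStructTy, IntScalarTy, 12)
    !9 = !{!6, !2, i64 16}   ; tag2: (OuterStructTy, FloatScalarTy, 16)
    !10 = !{!2, !2, i64 0}   ; tag3: (FloatScalarTy, FloatScalarTy, 0)

Under the aliasing rule from the Semantics section, the two stores in ``@f``
may alias, because ``(OuterStructTy, 0)`` reaches ``(FloatScalarTy, 0)`` in one
``ImmediateParent`` step.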

'``tbaa.struct``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^

The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages; however, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding. Also, it doesn't contain any TBAA information about the fields
of the aggregate.

``!tbaa.struct`` metadata can describe which memory subregions in a
memcpy are padding and what the TBAA tags of the struct are.

The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its tbaa tag. For example:

.. code-block:: llvm

    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }

This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes
with size 4 bytes, and has tbaa tag !2.

Note that the fields need not be contiguous. In this example, there is a
4 byte gap between the two fields. This gap represents padding which
does not carry useful data and need not be preserved.

'``noalias``' and '``alias.scope``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be specified not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain.

When evaluating an aliasing query, if for some domain, the set
of scopes with that domain in one instruction's ``alias.scope`` list is a
subset of (or equal to) the set of scopes for that domain in another
instruction's ``noalias`` list, then the two memory accesses are assumed not to
alias.

Because scopes in one domain don't affect scopes in other domains, separate
domains can be used to compose multiple independent noalias sets. This is
used, for example, during inlining. As the noalias function parameters are
turned into noalias scope metadata, a new domain is used every time the
function is inlined.

The metadata identifying each domain is itself a list containing one or two
entries. The first entry is the name of the domain. Note that if the name is a
string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique domain names. A
descriptive string may optionally be provided as a second list entry.

The metadata identifying each scope is also itself a list containing two or
three entries. The first entry is the name of the scope. Note that if the name
is a string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique scope names. A metadata
reference to the scope's domain is the second entry. A descriptive string may
optionally be provided as a third list entry.

For example,

.. code-block:: llvm

    ; Two scope domains:
    !0 = !{!0}
    !1 = !{!1}

    ; Some scopes in these domains:
    !2 = !{!2, !0}
    !3 = !{!3, !0}
    !4 = !{!4, !1}

    ; Some scope lists:
    !5 = !{!4} ; A list containing only scope !4
    !6 = !{!4, !3, !2}
    !7 = !{!3}

    ; These two instructions don't alias:
    %0 = load float, float* %c, align 4, !alias.scope !5
    store float %0, float* %arrayidx.i, align 4, !noalias !5

    ; These two instructions also don't alias (for domain !1, the set of scopes
    ; in the !alias.scope equals that in the !noalias list):
    %1 = load float, float* %c, align 4, !alias.scope !5
    store float %1, float* %arrayidx.i2, align 4, !noalias !6

    ; These two instructions may alias (for domain !0, the set of scopes in
    ; the !noalias list is not a superset of, or equal to, the scopes in the
    ; !alias.scope list):
    %2 = load float, float* %c, align 4, !alias.scope !6
    store float %2, float* %arrayidx.i, align 4, !noalias !7

'``fpmath``' Metadata
^^^^^^^^^^^^^^^^^^^^^

``fpmath`` metadata may be attached to any instruction of floating-point
type. It can be used to express the maximum acceptable error in the
result of that instruction, in ULPs, thus potentially allowing the
compiler to use a more efficient but less accurate method of computing
it. ULP is defined as follows:

    If ``x`` is a real number that lies between two finite consecutive
    floating-point numbers ``a`` and ``b``, without being equal to one
    of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
    distance between the two non-equal finite floating-point numbers
    nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.

The metadata node shall consist of a single positive float type number
representing the maximum relative error, for example:

.. code-block:: llvm

    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs

.. _range-metadata:

'``range``' Metadata
^^^^^^^^^^^^^^^^^^^^

``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges the loaded value or the value
returned by the called function at this call site is in. If the loaded or
returned value is not in the specified range, the behavior is undefined. The
ranges are represented with a flattened list of integers. The loaded value or
the value returned is known to be in the union of the ranges defined by each
consecutive pair. Each pair has the following properties:

-  The type must match the type loaded by the instruction.
-  The pair ``a,b`` represents the range ``[a,b)``.
-  Both ``a`` and ``b`` are constants.
-  The range is allowed to wrap.
-  The range should not represent the full or empty set. That is,
   ``a!=b``.

In addition, the pairs must be in signed order of the lower bound and
they must be non-contiguous.

Examples:

.. code-block:: llvm

      %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
      %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
      %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
      %d = invoke i8 @bar() to label %cont
             unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
    ...
    !0 = !{ i8 0, i8 2 }
    !1 = !{ i8 255, i8 2 }
    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }

'``absolute_symbol``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``absolute_symbol`` metadata may be attached to a global variable
declaration.
It marks the declaration as a reference to an absolute symbol,
which causes the backend to use absolute relocations for the symbol even
in position independent code, and expresses the possible ranges that the
global variable's *address* (not its value) is in, in the same format as
``range`` metadata, with the extension that the pair ``all-ones,all-ones``
may be used to represent the full set.

Example (assuming 64-bit pointers):

.. code-block:: llvm

      @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
      @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)

    ...
    !0 = !{ i64 0, i64 256 }
    !1 = !{ i64 -1, i64 -1 }

'``callees``' Metadata
^^^^^^^^^^^^^^^^^^^^^^

``callees`` metadata may be attached to indirect call sites. If ``callees``
metadata is attached to a call site, and any callee is not among the set of
functions provided by the metadata, the behavior is undefined. The intent of
this metadata is to facilitate optimizations such as indirect-call promotion.
For example, in the code below, the call instruction may only target the
``add`` or ``sub`` functions:

.. code-block:: llvm

    %result = call i64 %binop(i64 %x, i64 %y), !callees !0

    ...
    !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}

'``unpredictable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unpredictable`` metadata may be attached to any branch or switch
instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
optimizations related to compare and branch instructions. The metadata
is treated as a boolean value; if it exists, it signals that the branch
or switch that it is attached to is completely unpredictable.
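
As a minimal sketch of how this might look in practice, the branch below is
marked unpredictable. The value ``%cond`` and the block labels are
hypothetical, and the empty node used as the operand is an assumption based on
what front ends are commonly seen to emit; the description above only requires
that the metadata exist:

.. code-block:: llvm

    br i1 %cond, label %then, label %else, !unpredictable !0
    ...
    !0 = !{}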

'``llvm.loop``'
^^^^^^^^^^^^^^^

It is sometimes useful to attach information to loop constructs. Currently,
loop metadata is implemented as metadata attached to the branch instruction
in the loop latch block. This type of metadata refers to a metadata node that is
guaranteed to be separate for each loop. The loop identifier metadata is
specified with the name ``llvm.loop``.

The loop identifier metadata is implemented using a metadata that refers to
itself to avoid merging it with any other identifier metadata, e.g.,
during module linkage or function inlining. That is, each loop should refer
to its own identification metadata even if the loops reside in separate
functions. The following example contains loop identifier metadata for two
separate loop constructs:

.. code-block:: llvm

    !0 = !{!0}
    !1 = !{!1}

The loop identifier metadata can be used to specify additional
per-loop metadata. Any operands after the first operand can be treated
as user-defined metadata. For example, the ``llvm.loop.unroll.count``
metadata suggests an unroll factor to the loop unroller:

.. code-block:: llvm

      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
    ...
    !0 = !{!0, !1}
    !1 = !{!"llvm.loop.unroll.count", i32 4}

'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
used to control per-loop vectorization and interleaving parameters such as
vectorization width and interleave count. These metadata should be used in
conjunction with ``llvm.loop`` loop identification metadata. The
``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
optimization hints and the optimizer will only interleave and vectorize loops if
it believes it is safe to do so.
The ``llvm.mem.parallel_loop_access`` metadata,
which contains information about loop-carried memory dependencies, can be helpful
in determining the safety of these transformations.

'``llvm.loop.interleave.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an interleave count to the loop interleaver.
The first operand is the string ``llvm.loop.interleave.count`` and the
second operand is an integer specifying the interleave count. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.interleave.count", i32 4}

Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
then the interleave count will be determined automatically.

'``llvm.loop.vectorize.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables vectorization for the loop. The
first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1, vectorization is enabled. A value of
0 disables vectorization:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.enable", i1 0}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}

'``llvm.loop.vectorize.width``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata sets the target width of the vectorizer. The first
operand is the string ``llvm.loop.vectorize.width`` and the second
operand is an integer specifying the width. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.width", i32 4}

Note that setting ``llvm.loop.vectorize.width`` to 1 disables
vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
0, or if the loop does not have this metadata, the width will be
determined automatically.
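
As an illustrative sketch (the block labels, ``%exitcond``, and the chosen
values are hypothetical), the vectorization hints described above could be
combined on a single loop by listing them after the self-referential first
operand of the loop identifier:

.. code-block:: llvm

      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.vectorize.width", i32 4}
    !2 = !{!"llvm.loop.interleave.count", i32 2}

Because these are only hints, the optimizer may still decline to vectorize or
interleave the loop.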

'``llvm.loop.unroll``'
^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
optimization hints such as the unroll factor. ``llvm.loop.unroll``
metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata. The ``llvm.loop.unroll`` metadata are only
optimization hints and the unrolling will only be performed if the
optimizer believes it is safe to do so.

'``llvm.loop.unroll.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll factor to the loop unroller. The
first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.count", i32 4}

If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled.

'``llvm.loop.unroll.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unrolling. The metadata has a single operand
which is the string ``llvm.loop.unroll.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.disable"}

'``llvm.loop.unroll.runtime.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables runtime loop unrolling. The metadata has a single
operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.runtime.disable"}

'``llvm.loop.unroll.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled if the trip count
is known at compile time and partially unrolled if the trip count is not known
at compile time.
The metadata has a single operand which is the string
``llvm.loop.unroll.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.enable"}

'``llvm.loop.unroll.full``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.full"}

'``llvm.loop.unroll_and_jam``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, plus ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).

The metadata for unroll and jam otherwise is the same as for ``unroll``.
``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
``llvm.loop.unroll_and_jam.full`` is not supported. Again, these are only hints
and the normal safety checks will still be performed.

'``llvm.loop.unroll_and_jam.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll and jam factor to use, similarly to
``llvm.loop.unroll.count``. The first operand is the string
``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
specifying the unroll factor. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}

If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled and jammed.
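
A sketch of attaching this hint, assuming it is placed on the latch branch of
the outer loop of a loop nest (the labels and ``%outer.exitcond`` are
hypothetical):

.. code-block:: llvm

      br i1 %outer.exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !0
    ...
    !0 = !{!0, !1}
    !1 = !{!"llvm.loop.unroll_and_jam.count", i32 4}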

'``llvm.loop.unroll_and_jam.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unroll and jamming. The metadata has a single
operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.disable"}

'``llvm.loop.unroll_and_jam.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled and jammed if the
trip count is known at compile time and partially unrolled if the trip count is
not known at compile time. The metadata has a single operand which is the
string ``llvm.loop.unroll_and_jam.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll_and_jam.enable"}

'``llvm.loop.licm_versioning.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that the loop should not be versioned for the purpose
of enabling loop-invariant code motion (LICM). The metadata has a single operand
which is the string ``llvm.loop.licm_versioning.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.licm_versioning.disable"}

'``llvm.loop.distribute.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Loop distribution allows splitting a loop into multiple loops. Currently,
this is only performed if the entire loop cannot be vectorized due to unsafe
memory dependencies. The transformation will attempt to isolate the unsafe
dependencies into their own loop.

This metadata can be used to selectively enable or disable distribution of the
loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
second operand is a bit. If the bit operand value is 1, distribution is
enabled. A value of 0 disables distribution:

..
code-block:: llvm

    !0 = !{!"llvm.loop.distribute.enable", i1 0}
    !1 = !{!"llvm.loop.distribute.enable", i1 1}

This metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata.

'``llvm.mem``'
^^^^^^^^^^^^^^^

Metadata types used to annotate memory accesses with information helpful
for optimizations are prefixed with ``llvm.mem``.

'``llvm.mem.parallel_loop_access``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier,
or metadata containing a list of loop identifiers for nested loops.
The metadata is attached to memory accessing instructions and denotes that
no loop-carried memory dependence exists between it and other instructions
denoted with the same loop identifier. The metadata on memory reads also implies
that if-conversion (i.e. speculative execution within a loop iteration) is safe.

Precisely, given two instructions ``m1`` and ``m2`` that both have the
``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the
set of loops associated with that metadata, respectively, then there is no
loop-carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and
``L2``.

As a special case, if all memory accessing instructions in a loop have
``llvm.mem.parallel_loop_access`` metadata that refers to that loop, then the
loop has no loop-carried memory dependences and is considered to be a parallel
loop.

Note that if not all memory access instructions have such metadata referring to
the loop, then the loop is not considered trivially parallel. Additional
memory dependence analysis is required to make that determination.
As a fail-safe
mechanism, this causes loops that were originally parallel to be considered
sequential (if optimization passes that are unaware of the parallel semantics
insert new memory instructions into the loop body).

Example of a loop that is considered parallel due to its correct use of
both ``llvm.loop`` and ``llvm.mem.parallel_loop_access``
metadata types that refer to the same loop identifier metadata.

.. code-block:: llvm

    for.body:
      ...
      %val0 = load i32, i32* %arrayidx, !llvm.mem.parallel_loop_access !0
      ...
      store i32 %val0, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

    for.end:
    ...
    !0 = !{!0}

It is also possible to have nested parallel loops. In that case the
memory accesses refer to a list of loop identifier metadata nodes instead of
the loop identifier metadata node directly:

.. code-block:: llvm

    outer.for.body:
      ...
      %val1 = load i32, i32* %arrayidx3, !llvm.mem.parallel_loop_access !2
      ...
      br label %inner.for.body

    inner.for.body:
      ...
      %val0 = load i32, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
      ...
      store i32 %val0, i32* %arrayidx2, !llvm.mem.parallel_loop_access !0
      ...
      br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1

    inner.for.end:
      ...
      store i32 %val1, i32* %arrayidx4, !llvm.mem.parallel_loop_access !2
      ...
      br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2

    outer.for.end:                                          ; preds = %for.body
    ...
    !0 = !{!1, !2} ; a list of loop identifiers
    !1 = !{!1} ; an identifier for the inner loop
    !2 = !{!2} ; an identifier for the outer loop

'``irr_loop``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that is an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
terminator instruction of a basic block that is not really an irreducible loop
header, the behavior is undefined. The intent of this metadata is to improve the
accuracy of the block frequency propagation. For example, in the code below, the
block ``header0`` may have a loop header weight (relative to the other headers of
the irreducible loop) of 100:

.. code-block:: llvm

    header0:
    ...
    br i1 %cmp, label %t1, label %t2, !irr_loop !0

    ...
    !0 = !{!"loop_header_weight", i64 100}

Irreducible loop header weights are typically based on profile data.

'``invariant.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The experimental ``invariant.group`` metadata may be attached to
``load``/``store`` instructions referencing a single metadata with no entries.
The existence of the ``invariant.group`` metadata on the instruction tells
the optimizer that every ``load`` and ``store`` to the same pointer operand
can be assumed to load or store the same
value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
when two pointers are considered the same). Pointers returned by bitcast or
getelementptr with only zero indices are considered the same.

Examples:

.. code-block:: llvm

    @unknownPtr = external global i8
    ...
    %ptr = alloca i8
    store i8 42, i8* %ptr, !invariant.group !0
    call void @foo(i8* %ptr)

    %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
    call void @foo(i8* %ptr)

    %newPtr = call i8* @getPointer(i8* %ptr)
    %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr

    %unknownValue = load i8, i8* @unknownPtr
    store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42

    call void @foo(i8* %ptr)
    %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr)
    %d = load i8, i8* %newPtr2, !invariant.group !0 ; Can't step through launder.invariant.group to get value of %ptr

    ...
    declare void @foo(i8*)
    declare i8* @getPointer(i8*)
    declare i8* @llvm.launder.invariant.group(i8*)

    !0 = !{}

The ``invariant.group`` metadata must be dropped when replacing one pointer by
another based on aliasing information. This is because ``invariant.group`` is
tied to the SSA value of the pointer operand.

.. code-block:: llvm

    %v = load i8, i8* %x, !invariant.group !0
    ; if %x mustalias %y then we can replace the above instruction with
    %v = load i8, i8* %y

Note that this is an experimental feature, which means that its semantics might
change in the future.

'``type``' Metadata
^^^^^^^^^^^^^^^^^^^

See :doc:`TypeMetadata`.

'``associated``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``associated`` metadata may be attached to a global object
declaration with a single argument that references another global object.

This metadata prevents discarding of the global object in linker GC
unless the referenced object is also discarded. The linker support for
this feature is spotty.
For best compatibility, globals carrying this
metadata may also:

- Be in a comdat with the referenced global.
- Be in ``@llvm.compiler.used``.
- Have an explicit section with a name which is a valid C identifier.

It does not have any effect on non-ELF targets.

Example:

.. code-block:: text

    $a = comdat any
    @a = global i32 1, comdat $a
    @b = internal global i32 2, comdat $a, section "abc", !associated !0
    !0 = !{i32* @a}


'``prof``' Metadata
^^^^^^^^^^^^^^^^^^^

The ``prof`` metadata is used to record profile data in the IR.
The first operand of the metadata node indicates the profile metadata
type. There are currently 3 types:
:ref:`branch_weights<prof_node_branch_weights>`,
:ref:`function_entry_count<prof_node_function_entry_count>`, and
:ref:`VP<prof_node_VP>`.

.. _prof_node_branch_weights:

branch_weights
""""""""""""""

Branch weight metadata attached to a branch, select, switch or call instruction
represents the likeliness of the associated branch being taken.
For more information, see :doc:`BranchWeightMetadata`.

.. _prof_node_function_entry_count:

function_entry_count
""""""""""""""""""""

Function entry count metadata can be attached to function definitions
to record the number of times the function is called. Used with BFI
information, it is also used to derive the basic block profile count.
For more information, see :doc:`BranchWeightMetadata`.

.. _prof_node_VP:

VP
""

VP (value profile) metadata can be attached to instructions that have
value profile information. Currently this is indirect calls (where it
records the hottest callees) and calls to memory intrinsics such as memcpy,
memmove, and memset (where it records the hottest byte lengths).

Each VP metadata node contains the string ``"VP"``, then a ``uint32_t`` value
for the value profiling kind, a ``uint64_t`` value for the total number of times
the instruction is executed, followed by pairs of a ``uint64_t`` value and its
execution count. The value profiling kind is 0 for indirect call targets and 1
for memory operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.

Note that the value counts do not need to add up to the total count
listed in the third operand (in practice only the top hottest values
are tracked and reported).

Indirect call example:

.. code-block:: llvm

    call void %f(), !prof !1
    !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}

Note that the VP type is 0 (the second operand), which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times that each of the respective prior target
functions was called.

Module Flags Metadata
=====================

Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.

The ``llvm.module.flags`` metadata contains a list of metadata triplets.
Each triplet has the following form:

-  The first element is a *behavior* flag, which specifies the behavior
   when two (or more) modules are merged together, and it encounters two
   (or more) metadata with the same ID. The supported behaviors are
   described below.
-  The second element is a metadata string that is a unique ID for the
   metadata. Each module may only have one flag entry for each unique ID (not
   including entries with the **Require** behavior).
-  The third element is the value of the flag.

When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the *Require* behavior are always preserved.

The following behaviors are supported:

.. list-table::
   :header-rows: 1
   :widths: 10 90

   * - Value
     - Behavior

   * - 1
     - **Error**
           Emits an error if two values disagree, otherwise the resulting value
           is that of the operands.

   * - 2
     - **Warning**
           Emits a warning if two values disagree. The result value will be the
           operand for the flag from the first module being linked.

   * - 3
     - **Require**
           Adds a requirement that another module flag be present and have a
           specified value after linking is performed. The value must be a
           metadata pair, where the first element of the pair is the ID of the
           module flag to be restricted, and the second element of the pair is
           the value the module flag should be restricted to.
           This behavior can
           be used to restrict the allowable results (via triggering of an
           error) of linking IDs with the **Override** behavior.

   * - 4
     - **Override**
           Uses the specified value, regardless of the behavior or value of the
           other module. If both modules specify **Override**, but the values
           differ, an error will be emitted.

   * - 5
     - **Append**
           Appends the two values, which are required to be metadata nodes.

   * - 6
     - **AppendUnique**
           Appends the two values, which are required to be metadata
           nodes. However, duplicate entries in the second list are dropped
           during the append operation.

   * - 7
     - **Max**
           Takes the max of the two values, which are required to be integers.

It is an error for a particular unique flag ID to have multiple behaviors,
except in the case of **Require** (which adds restrictions on another metadata
value) or **Override**.

An example of module flags:

.. code-block:: llvm

    !0 = !{ i32 1, !"foo", i32 1 }
    !1 = !{ i32 4, !"bar", i32 37 }
    !2 = !{ i32 2, !"qux", i32 42 }
    !3 = !{ i32 3, !"qux",
      !{
        !"foo", i32 1
      }
    }
    !llvm.module.flags = !{ !0, !1, !2, !3 }

-  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
   if two or more ``!"foo"`` flags are seen is to emit an error if their
   values are not equal.

-  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
   behavior if two or more ``!"bar"`` flags are seen is to use the value
   '37'.

-  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
   behavior if two or more ``!"qux"`` flags are seen is to emit a
   warning if their values are not equal.

-  Metadata ``!3`` has the ID ``!"qux"`` and the value:

   ::

       !{ !"foo", i32 1 }

   The behavior is to emit an error if the ``llvm.module.flags`` does not
   contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
   performed.

Objective-C Garbage Collection Module Flags Metadata
----------------------------------------------------

On the Mach-O platform, Objective-C stores metadata about garbage
collection in a special section called "image info". The metadata
consists of a version number and a bitmask specifying what types of
garbage collection are supported (if any) by the file. If two or more
modules are linked together their garbage collection metadata needs to
be merged rather than appended together.

The Objective-C garbage collection module flags metadata consists of the
following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - ``Objective-C Version``
     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.

   * - ``Objective-C Image Info Version``
     - **[Required]** --- The version of the image info section. Currently
       always 0.

   * - ``Objective-C Image Info Section``
     - **[Required]** --- The section to place the metadata. Valid values are
       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
       Objective-C ABI version 2.

   * - ``Objective-C Garbage Collection``
     - **[Required]** --- Specifies whether garbage collection is supported or
       not. Valid values are 0, for no garbage collection, and 2, for garbage
       collection supported.

   * - ``Objective-C GC Only``
     - **[Optional]** --- Specifies that only garbage collection is supported.
       If present, its value must be 6.
       This flag requires that the
       ``Objective-C Garbage Collection`` flag have the value 2.

Some important flag interactions:

-  If a module with ``Objective-C Garbage Collection`` set to 0 is
   merged with a module with ``Objective-C Garbage Collection`` set to
   2, then the resulting module has the
   ``Objective-C Garbage Collection`` flag set to 0.
-  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
   merged with a module with ``Objective-C GC Only`` set to 6.

C type width Module Flags Metadata
----------------------------------

The ARM backend emits a section into each generated object file describing the
options that it was compiled with (in a compiler-independent way) to prevent
linking incompatible objects, and to allow automatic library selection. Some
of these options are not visible at the IR level, namely wchar_t width and enum
width.

To pass this information to the backend, these options are encoded in module
flags metadata, using the following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - short_wchar
     - * 0 --- sizeof(wchar_t) == 4
       * 1 --- sizeof(wchar_t) == 2

   * - short_enum
     - * 0 --- Enums are at least as large as an ``int``.
       * 1 --- Enums are stored in the smallest integer type which can
         represent all of its values.

For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
enum is the smallest type which can represent all of its values::

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 0}
    !1 = !{i32 1, !"short_enum", i32 1}

Automatic Linker Flags Named Metadata
=====================================

Some targets support embedding flags to the linker inside individual object
files. Typically this is used in conjunction with language extensions which
allow source files to explicitly declare the libraries they depend on, and have
these automatically be transmitted to the linker via object files.

These flags are encoded in the IR using named metadata with the name
``!llvm.linker.options``. Each operand is expected to be a metadata node
which should be a list of other metadata nodes, each of which should be a
list of metadata strings defining linker options.

For example, the following metadata section specifies two separate sets of
linker options, presumably to link against ``libz`` and the ``Cocoa``
framework::

    !0 = !{ !"-lz" }
    !1 = !{ !"-framework", !"Cocoa" }
    !llvm.linker.options = !{ !0, !1 }

The metadata encoding as lists of lists of options, as opposed to a collapsed
list of options, is chosen so that the IR encoding can use multiple option
strings to specify e.g., a single library, while still having that specifier be
preserved as an atomic element that can be recognized by a target specific
assembly writer or object file emitter.

Each individual option is required to be either a valid option for the target's
linker, or an option that is reserved by the target specific assembly writer or
object file emitter. No other aspect of these options is defined by the IR.

..
_summary:

ThinLTO Summary
===============

Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
causes the building of a compact summary of the module that is emitted into
the bitcode. The summary is emitted into the LLVM assembly and identified
in syntax by a caret ('``^``').

*Note that temporarily the summary entries are skipped when parsing the
assembly, although the parsing support is actively being implemented. The
following describes when the summary entries will be parsed once implemented.*
The summary will be parsed into a ModuleSummaryIndex object under the
same conditions where summary index is currently built from bitcode.
Specifically, this applies to tools that test the Thin Link portion of a
ThinLTO compile (i.e. llvm-lto and llvm-lto2), and to parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag.
Additionally, it will be parsed into a bitcode output, along with the Module
IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
of optimization (e.g. "``clang -x ir``" and "``opt``") will ignore the
summary entries (just as they currently ignore summary entries in a bitcode
input file).

There are currently 3 types of summary entries in the LLVM assembly:
:ref:`module paths<module_path_summary>`,
:ref:`global values<gv_summary>`, and
:ref:`type identifiers<typeid_summary>`.

.. _module_path_summary:

Module Path Summary Entry
-------------------------

Each module path summary entry lists a module containing global values included
in the summary. For a single IR module there will be one such entry, but
in a combined summary index produced during the thin link, there will be
one module path entry per linked module with summary.

Example:

..
code-block:: llvm

    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))

The ``path`` field is a string path to the bitcode file, and the ``hash``
field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
incremental builds and caching.

.. _gv_summary:

Global Value Summary Entry
--------------------------

Each global value summary entry corresponds to a global value defined or
referenced by a summarized module.

Example:

.. code-block:: llvm

    ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831

For declarations, there will not be a summary list. For definitions, a
global value will contain a list of summaries, one per module containing
a definition. There can be multiple entries in a combined summary index
for symbols with weak linkage.

Each ``Summary`` format will depend on whether the global value is a
:ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
:ref:`alias<alias_summary>`.

.. _function_summary:

Function Summary
^^^^^^^^^^^^^^^^

If the global value is a function, the ``Summary`` entry will look like:

.. code-block:: llvm

    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Refs]?)

The ``module`` field includes the summary entry id for the module containing
this definition, and the ``flags`` field contains information such as
the linkage type, a flag indicating whether it is legal to import the
definition, whether it is globally live and whether the linker resolved it
to a local definition (the latter two are populated during the thin link).
The ``insts`` field contains the number of IR instructions in the function.
Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
:ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`, and
:ref:`Refs<refs_summary>`.

.. _variable_summary:

Global Variable Summary
^^^^^^^^^^^^^^^^^^^^^^^

If the global value is a variable, the ``Summary`` entry will look like:

.. code-block:: llvm

    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)

The variable entry contains a subset of the fields in a
:ref:`function summary <function_summary>`; see the descriptions there.

.. _alias_summary:

Alias Summary
^^^^^^^^^^^^^

If the global value is an alias, the ``Summary`` entry will look like:

.. code-block:: llvm

    alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)

The ``module`` and ``flags`` fields are as described for a
:ref:`function summary <function_summary>`. The ``aliasee`` field
contains a reference to the global value summary entry of the aliasee.

.. _funcflags_summary:

Function Flags
^^^^^^^^^^^^^^

The optional ``FuncFlags`` field looks like:

.. code-block:: llvm

    funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0)

If unspecified, flags are assumed to hold the conservative ``false`` value of
``0``.

.. _calls_summary:

Calls
^^^^^

The optional ``Calls`` field looks like:

.. code-block:: llvm

    calls: ((Callee)[, (Callee)]*)

where each ``Callee`` looks like:

.. code-block:: llvm

    callee: ^1[, hotness: None]?[, relbf: 0]?

The ``callee`` refers to the summary entry id of the callee.
At most one of ``hotness`` (which can take the values ``Unknown``, ``Cold``,
``None``, ``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
branch frequency relative to the entry frequency, scaled down by 2^8)
may be specified. The defaults are ``Unknown`` and ``0``, respectively.

.. _refs_summary:

Refs
^^^^

The optional ``Refs`` field looks like:

.. code-block:: llvm

    refs: ((Ref)[, (Ref)]*)

where each ``Ref`` contains a reference to the summary id of the referenced
value (e.g. ``^1``).

.. _typeidinfo_summary:

TypeIdInfo
^^^^^^^^^^

The optional ``TypeIdInfo`` field, used for
`Control Flow Integrity <http://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
looks like:

.. code-block:: llvm

    typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?

These optional fields have the following forms:

TypeTests
"""""""""

.. code-block:: llvm

    typeTests: (TypeIdRef[, TypeIdRef]*)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID``.

TypeTestAssumeVCalls
""""""""""""""""""""

.. code-block:: llvm

    typeTestAssumeVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format:

.. code-block:: llvm

    vFuncId: (TypeIdRef, offset: 16)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID`` preceded by a ``guid:`` tag.

TypeCheckedLoadVCalls
"""""""""""""""""""""

.. code-block:: llvm

    typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.

TypeTestAssumeConstVCalls
"""""""""""""""""""""""""

.. 
code-block:: llvm

    typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format:

.. code-block:: llvm

    VFuncId, args: (Arg[, Arg]*)

and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
and each Arg is an integer argument number.

TypeCheckedLoadConstVCalls
""""""""""""""""""""""""""

.. code-block:: llvm

    typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format described for
``TypeTestAssumeConstVCalls``.

.. _typeid_summary:

Type ID Summary Entry
---------------------

Each type id summary entry corresponds to a type identifier resolution
which is generated during the LTO link portion of the compile when building
with `Control Flow Integrity <http://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
so these are only present in a combined summary index.

Example:

.. code-block:: llvm

    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778

The ``typeTestRes`` gives the type test resolution ``kind`` (which may
be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
the ``size-1`` bit width. It is followed by optional flags, which default to 0,
and an optional WpdResolutions (whole program devirtualization resolution)
field that looks like:

.. code-block:: llvm

    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)

where each entry is a mapping from the given byte offset to the whole-program
devirtualization resolution WpdRes, which has one of the following formats:

.. 
code-block:: llvm

    wpdRes: (kind: branchFunnel)
    wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
    wpdRes: (kind: indir)

Additionally, each wpdRes has an optional ``resByArg`` field, which
describes the resolutions for calls with all constant integer arguments:

.. code-block:: llvm

    resByArg: (ResByArg[, ResByArg]*)

where ResByArg is:

.. code-block:: llvm

    args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])

Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
or ``VirtualConstProp``. The ``info`` field is only used if the kind
is ``UniformRetVal`` (indicates the uniform return value), or
``UniqueRetVal`` (holds the return value associated with the unique vtable
(0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
not support the use of absolute symbols to store constants.

.. _intrinsicglobalvariables:

Intrinsic Global Variables
==========================

LLVM has a number of "magic" global variables that contain data that
affect code generation or other IR semantics. These are documented here.
All globals of this sort should have a section specified as
"``llvm.metadata``". This section and all globals that start with
"``llvm.``" are reserved for use by LLVM.

.. _gv_llvmused:

The '``llvm.used``' Global Variable
-----------------------------------

The ``@llvm.used`` global is an array which has
:ref:`appending linkage <linkage_appending>`. This array contains a list of
pointers to named global variables, functions and aliases which may optionally
have a pointer cast formed of bitcast or getelementptr. For example, a legal
use of it is:

.. 
code-block:: llvm

    @X = global i8 4
    @Y = global i32 123

    @llvm.used = appending global [2 x i8*] [
       i8* @X,
       i8* bitcast (i32* @Y to i8*)
    ], section "llvm.metadata"

If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that it cannot see (which is why they have to be named). For example, if
a variable has internal linkage and no references other than that from the
``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
references from inline asms and other things the compiler cannot "see", and
corresponds to "``__attribute__((used))``" in GNU C.

On some targets, the code generator must emit a directive to the
assembler or object file to prevent the assembler and linker from
modifying or removing the symbol.

.. _gv_llvmcompilerused:

The '``llvm.compiler.used``' Global Variable
--------------------------------------------

The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
directive, except that it only prevents the compiler from touching the
symbol. On targets that support it, this allows an intelligent linker to
optimize references to the symbol without being impeded as it would be
by ``@llvm.used``.

This is a rare construct that should only be used in rare circumstances,
and should not be exposed to source languages.

.. _gv_llvmglobalctors:

The '``llvm.global_ctors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]

The ``@llvm.global_ctors`` array contains a list of constructor
functions, priorities, and an optional associated global or function.
The functions referenced by this array will be called in ascending order
of priority (i.e. lowest first) when the module is loaded. The order of
functions with the same priority is not defined.

If the third field is present, non-null, and points to a global variable
or function, the initializer function will only run if the associated
data from the current module is not discarded.

.. _llvmglobaldtors:

The '``llvm.global_dtors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]

The ``@llvm.global_dtors`` array contains a list of destructor
functions, priorities, and an optional associated global or function.
The functions referenced by this array will be called in descending
order of priority (i.e. highest first) when the module is unloaded. The
order of functions with the same priority is not defined.

If the third field is present, non-null, and points to a global variable
or function, the destructor function will only run if the associated
data from the current module is not discarded.

Instruction Reference
=====================

The LLVM instruction set consists of several different classifications
of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
instructions <binaryops>`, :ref:`bitwise binary
instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
:ref:`other instructions <otherops>`.

.. _terminators:

Terminator Instructions
-----------------------

As mentioned :ref:`previously <functionstructure>`, every basic block in a
program ends with a "Terminator" instruction, which indicates which
block should be executed after the current block is finished.
These terminator instructions typically yield a '``void``' value: they produce
control flow, not values (the one exception being the
':ref:`invoke <i_invoke>`' instruction).

The terminator instructions are: ':ref:`ret <i_ret>`',
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
':ref:`catchret <i_catchret>`',
':ref:`cleanupret <i_cleanupret>`',
and ':ref:`unreachable <i_unreachable>`'.

.. _i_ret:

'``ret``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    ret <type> <value>       ; Return a value from a non-void function
    ret void                 ; Return from void function

Overview:
"""""""""

The '``ret``' instruction is used to return control flow (and optionally
a value) from a function back to the caller.

There are two forms of the '``ret``' instruction: one that returns a
value and then causes control flow, and one that just causes control
flow to occur.

Arguments:
""""""""""

The '``ret``' instruction optionally accepts a single argument, the
return value. The type of the return value must be a ':ref:`first
class <t_firstclass>`' type.

A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value with a type that does not match its type, or if it has a
void return type and contains a '``ret``' instruction with a return
value.

Semantics:
""""""""""

When the '``ret``' instruction is executed, control flow returns back to
the calling function's context. If the caller is a
":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call.
If the caller was an
":ref:`invoke <i_invoke>`" instruction, execution continues at the
beginning of the "normal" destination block. If the instruction returns
a value, that value shall set the call or invoke instruction's return
value.

Example:
""""""""

.. code-block:: llvm

    ret i32 5                       ; Return an integer value of 5
    ret void                        ; Return from a void function
    ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2

.. _i_br:

'``br``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    br i1 <cond>, label <iftrue>, label <iffalse>
    br label <dest>          ; Unconditional branch

Overview:
"""""""""

The '``br``' instruction is used to cause control flow to transfer to a
different basic block in the current function. There are two forms of
this instruction, corresponding to a conditional branch and an
unconditional branch.

Arguments:
""""""""""

The conditional branch form of the '``br``' instruction takes a single
'``i1``' value and two '``label``' values. The unconditional form of the
'``br``' instruction takes a single '``label``' value as a target.

Semantics:
""""""""""

Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If "cond" is ``false``, control flows
to the '``iffalse``' ``label`` argument.

Example:
""""""""

.. code-block:: llvm

    Test:
      %cond = icmp eq i32 %a, %b
      br i1 %cond, label %IfEqual, label %IfUnequal
    IfEqual:
      ret i32 1
    IfUnequal:
      ret i32 0

.. _i_switch:

'``switch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... 
    ]

Overview:
"""""""""

The '``switch``' instruction is used to transfer control flow to one of
several different places. It is a generalization of the '``br``'
instruction, allowing a branch to occur to one of many possible
destinations.

Arguments:
""""""""""

The '``switch``' instruction uses three parameters: an integer
comparison value '``value``', a default '``label``' destination, and an
array of pairs of comparison value constants and '``label``'s. The table
is not allowed to contain duplicate constant entries.

Semantics:
""""""""""

The ``switch`` instruction specifies a table of values and destinations.
When the '``switch``' instruction is executed, this table is searched
for the given value. If the value is found, control flow is transferred
to the corresponding destination; otherwise, control flow is transferred
to the default destination.

Implementation:
"""""""""""""""

Depending on properties of the target machine and the particular
``switch`` instruction, this instruction may be code generated in
different ways. For example, it could be generated as a series of
chained conditional branches or with a lookup table.

Example:
""""""""

.. code-block:: llvm

    ; Emulate a conditional br instruction
    %Val = zext i1 %value to i32
    switch i32 %Val, label %truedest [ i32 0, label %falsedest ]

    ; Emulate an unconditional br instruction
    switch i32 0, label %dest [ ]

    ; Implement a jump table:
    switch i32 %val, label %otherwise [ i32 0, label %onzero
                                        i32 1, label %onone
                                        i32 2, label %ontwo ]

.. _i_indirectbr:

'``indirectbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... 
    ]

Overview:
"""""""""

The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``". Address must be derived from a
:ref:`blockaddress <blockaddress>` constant.

Arguments:
""""""""""

The '``address``' argument is the address of the label to jump to. The
rest of the arguments indicate the full set of possible destinations
that the address may point to. Blocks are allowed to occur multiple
times in the destination list, though this isn't particularly useful.

This destination list is required so that dataflow analysis has an
accurate understanding of the CFG.

Semantics:
""""""""""

Control transfers to the block specified in the address argument. All
possible destination blocks must be listed in the label list, otherwise
this instruction has undefined behavior. This implies that jumps to
labels defined in other functions have undefined behavior as well.

Implementation:
"""""""""""""""

This is typically implemented with a jump through a register.

Example:
""""""""

.. code-block:: llvm

    indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]

.. _i_invoke:

'``invoke``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = invoke [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                  [operand bundles] to label <normal label> unwind label <exception label>

Overview:
"""""""""

The '``invoke``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``normal``' label or the '``exception``' label. If the callee function
returns with the "``ret``" instruction, control flow will return to the
"normal" label.
If the callee (or any indirect callees) returns via the
":ref:`resume <i_resume>`" instruction or other exception handling
mechanism, control is interrupted and continued at the dynamically
nearest "exception" label.

The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being invoked. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be invoked. In most cases, this is a direct function invocation, but
   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
   to function value.
#. 
'``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``normal label``': the label reached when the called function
   executes a '``ret``' instruction.
#. '``exception label``': the label reached when a callee returns via
   the :ref:`resume <i_resume>` instruction or other exception handling
   mechanism.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with a label, which is used by the runtime
library to unwind the stack.

This instruction is used in languages with destructors to ensure that
proper cleanup is performed in the case of either a ``longjmp`` or a
thrown exception. Additionally, this is important for implementation of
'``catch``' clauses in high-level languages that support them.

For the purposes of the SSA form, the definition of the value returned
by the '``invoke``' instruction is deemed to occur on the edge from the
current block to the "normal" label. If the callee unwinds then no
return value is available.

Example:
""""""""

.. code-block:: llvm

    %retval = invoke i32 @Test(i32 15) to label %Continue
                unwind label %TestCleanup              ; i32:retval set
    %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
                unwind label %TestCleanup              ; i32:retval set

.. 
_i_resume:

'``resume``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    resume <type> <value>

Overview:
"""""""""

The '``resume``' instruction is a terminator instruction that has no
successors.

Arguments:
""""""""""

The '``resume``' instruction requires one argument, which must have the
same type as the result of any '``landingpad``' instruction in the same
function.

Semantics:
""""""""""

The '``resume``' instruction resumes propagation of an existing
(in-flight) exception whose unwinding was interrupted with a
:ref:`landingpad <i_landingpad>` instruction.

Example:
""""""""

.. code-block:: llvm

    resume { i8*, i32 } %exn

.. _i_catchswitch:

'``catchswitch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
    <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>

Overview:
"""""""""

The '``catchswitch``' instruction is used by `LLVM's exception handling system
<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
that may be executed by the :ref:`EH personality routine <personalityfn>`.

Arguments:
""""""""""

The ``parent`` argument is the token of the funclet that contains the
``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
this operand may be the token ``none``.

The ``default`` argument is the label of another basic block beginning with
either a ``cleanuppad`` or ``catchswitch`` instruction.
This unwind destination
must be a legal target with respect to the ``parent`` links, as described in
the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

The ``handlers`` are a nonempty list of successor blocks that each begin with a
:ref:`catchpad <i_catchpad>` instruction.

Semantics:
""""""""""

Executing this instruction transfers control to one of the successors in
``handlers``, if appropriate, or continues to unwind via the unwind label if
present.

The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
it must be both the first non-phi instruction and last instruction in the basic
block. Therefore, it must be the only non-phi instruction in the block.

Example:
""""""""

.. code-block:: text

    dispatch1:
      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
    dispatch2:
      %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup

.. _i_catchret:

'``catchret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    catchret from <token> to label <normal>

Overview:
"""""""""

The '``catchret``' instruction is a terminator instruction that has a
single successor.

Arguments:
""""""""""

The first argument to a '``catchret``' indicates which ``catchpad`` it
exits. It must be a :ref:`catchpad <i_catchpad>`.
The second argument to a '``catchret``' specifies where control will
transfer to next.

Semantics:
""""""""""

The '``catchret``' instruction ends an existing (in-flight) exception whose
unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
code to, for example, destroy the active exception.
Control then transfers to
``normal``.

The ``token`` argument must be a token produced by a ``catchpad`` instruction.
If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``catchret``'s behavior is undefined.

Example:
""""""""

.. code-block:: text

    catchret from %catch to label %continue

.. _i_cleanupret:

'``cleanupret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    cleanupret from <value> unwind label <continue>
    cleanupret from <value> unwind to caller

Overview:
"""""""""

The '``cleanupret``' instruction is a terminator instruction that has
an optional successor.

Arguments:
""""""""""

The '``cleanupret``' instruction requires one argument, which indicates
which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``cleanupret``'s behavior is undefined.

The '``cleanupret``' instruction also has an optional successor, ``continue``,
which must be the label of another basic block beginning with either a
``cleanuppad`` or ``catchswitch`` instruction. This unwind destination must
be a legal target with respect to the ``parent`` links, as described in the
`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

Semantics:
""""""""""

The '``cleanupret``' instruction indicates to the
:ref:`personality function <personalityfn>` that one
:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
It transfers control to ``continue`` or unwinds out of the function.
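
As a rough sketch of how a ``cleanupret`` sits inside a complete function,
the module below runs a cleanup while an exception unwinds and then continues
unwinding into the caller. The names ``@f``, ``@may_throw`` and ``@release``
are hypothetical, and the choice of ``@__CxxFrameHandler3`` as the personality
is an assumption made only for illustration:

.. code-block:: llvm

    ; Hypothetical declarations, assumed for illustration only.
    declare void @may_throw()
    declare void @release()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      ; Entered only while unwinding out of @may_throw.
      %cp = cleanuppad within none []
      call void @release() [ "funclet"(token %cp) ]
      ; Exit the cleanup pad and keep unwinding into the caller.
      cleanupret from %cp unwind to caller

    done:
      ret void
    }

Here the ``cleanupret`` names the token produced by the ``cleanuppad`` it
exits; replacing ``unwind to caller`` with ``unwind label <continue>`` would
instead hand control to another pad in the same function.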

Example:
""""""""

.. code-block:: text

    cleanupret from %cleanup unwind to caller
    cleanupret from %cleanup unwind label %continue

.. _i_unreachable:

'``unreachable``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    unreachable

Overview:
"""""""""

The '``unreachable``' instruction has no defined semantics. This
instruction is used to inform the optimizer that a particular portion of
the code is not reachable. This can be used to indicate that the code
after a no-return function cannot be reached, and other facts.

Semantics:
""""""""""

The '``unreachable``' instruction has no defined semantics.

.. _binaryops:

Binary Operations
-----------------

Binary operators are used to do most of the computation in a program.
They require two operands of the same type, execute an operation on
them, and produce a single value. The operands might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operands.

There are several different binary operators:

.. _i_add:

'``add``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = add <ty> <op1>, <op2>          ; yields ty:result
    <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
    <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
    <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``add``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``add``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.
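
The operand rule above can be illustrated with a minimal sketch of a scalar
and a vector ``add``. The function names ``@add_scalar`` and ``@add_vector``
are hypothetical and exist only for this illustration:

.. code-block:: llvm

    ; Scalar form: both operands and the result are i32.
    define i32 @add_scalar(i32 %a, i32 %b) {
      %sum = add i32 %a, %b
      ret i32 %sum
    }

    ; Vector form: the addition is performed element by element, and both
    ; operands share the type <4 x i32>.
    define <4 x i32> @add_vector(<4 x i32> %a, <4 x i32> %b) {
      %sum = add <4 x i32> %a, %b
      ret <4 x i32> %sum
    }

Because a single type is written for both operands, values of different
widths (for example ``i32`` and ``i64``) need an explicit conversion before
they can be added.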

Semantics:
""""""""""

The value produced is the integer sum of the two operands.

If the sum has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = add i32 4, %var          ; yields i32:result = 4 + %var

.. _i_fadd:

'``fadd``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fadd``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``fadd``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point sum of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var

'``sub``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sub <ty> <op1>, <op2>          ; yields ty:result
      <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``sub``' instruction returns the difference of its two operands.

Note that the '``sub``' instruction is used to represent the '``neg``'
instruction present in most other intermediate representations.

Arguments:
""""""""""

The two arguments to the '``sub``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer difference of the two operands.

If the difference has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %val          ; yields i32:result = -%val

.. _i_fsub:

'``fsub``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fsub``' instruction returns the difference of its two operands.

Note that the '``fsub``' instruction is used to represent the '``fneg``'
instruction present in most other intermediate representations.

Arguments:
""""""""""

The two arguments to the '``fsub``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point difference of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %val          ; yields float:result = -%val

'``mul``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = mul <ty> <op1>, <op2>          ; yields ty:result
      <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``mul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``mul``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer product of the two operands.

If the result of the multiplication has unsigned overflow, the result
returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
bit width of the result.

Because LLVM integers use a two's complement representation, and the
result is the same width as the operands, this instruction returns the
correct result for both signed and unsigned integers. If a full product
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
sign-extended or zero-extended as appropriate to the width of the full
product.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = mul i32 4, %var          ; yields i32:result = 4 * %var

.. _i_fmul:

'``fmul``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fmul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``fmul``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point product of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var

'``udiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``udiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``udiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the unsigned integer quotient of the two operands.

Note that unsigned integer division and signed integer division are
distinct operations; for signed integer division, use '``sdiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.

If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if %op1 is not a multiple of %op2 (as
such, "((a udiv exact b) mul b) == a").

Example:
""""""""

.. code-block:: text

      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var

'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``sdiv``' instruction returns the quotient of its two operands.
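
As a hedged illustration of how ``sdiv`` differs from ``udiv``, the sketch
below divides the same pair of ``i32`` operands as unsigned and as signed
values. The function names are hypothetical; the two results can differ
whenever an operand has its most significant bit set.

.. code-block:: llvm

      ; Hypothetical helper: treats both operands as unsigned.
      ; The caller is assumed to guarantee %b != 0.
      define i32 @div_unsigned(i32 %a, i32 %b) {
        %q = udiv i32 %a, %b
        ret i32 %q
      }

      ; Hypothetical helper: treats both operands as signed.
      ; The caller is assumed to guarantee %b != 0 and to avoid
      ; dividing -2147483648 by -1.
      define i32 @div_signed(i32 %a, i32 %b) {
        %q = sdiv i32 %a, %b
        ret i32 %q
      }
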

Arguments:
""""""""""

The two arguments to the '``sdiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the signed integer quotient of the two operands
rounded towards zero.

Note that signed integer division and unsigned integer division are
distinct operations; for unsigned integer division, use '``udiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by doing a 32-bit division of -2147483648 by -1.

If the ``exact`` keyword is present, the result value of the ``sdiv`` is
a :ref:`poison value <poisonvalues>` if the result would be rounded.

Example:
""""""""

.. code-block:: text

      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var

.. _i_fdiv:

'``fdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``fdiv``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var

'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = urem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``urem``' instruction returns the remainder from the unsigned
division of its two arguments.

Arguments:
""""""""""

The two arguments to the '``urem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the unsigned integer *remainder* of a division.
This instruction always performs an unsigned division to get the
remainder.

Note that unsigned integer remainder and signed integer remainder are
distinct operations; for signed integer remainder, use '``srem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.

Example:
""""""""

.. code-block:: text

      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var

'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = srem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``srem``' instruction returns the remainder from the signed
division of its two operands. This instruction can also take
:ref:`vector <t_vector>` versions of the values in which case the elements
must be integers.
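
Because ``srem`` yields a remainder whose sign follows the dividend, code
that wants a non-negative modulo has to adjust the result. The following is
a minimal sketch under the stated assumption that the divisor is known to be
positive; the function name and value names are hypothetical.

.. code-block:: llvm

      ; Hypothetical helper: non-negative modulo, assuming %b > 0.
      define i32 @mod_positive_divisor(i32 %a, i32 %b) {
        %r = srem i32 %a, %b             ; e.g. -7 srem 3 is -1
        %is_neg = icmp slt i32 %r, 0
        %wrapped = add i32 %r, %b        ; moves a negative remainder into [0, %b)
        %m = select i1 %is_neg, i32 %wrapped, i32 %r
        ret i32 %m
      }

With ``%a = -7`` and ``%b = 3``, ``%r`` is ``-1`` and the function returns
``2``.
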

Arguments:
""""""""""

The two arguments to the '``srem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the *remainder* of a division (where the result
is either zero or has the same sign as the dividend, ``op1``), not the
*modulo* operator (where the result is either zero or has the same sign
as the divisor, ``op2``) of a value. For more information about the
difference, see `The Math
Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
table of how this is implemented in various languages, please see
`Wikipedia: modulo
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.

Note that signed integer remainder and unsigned integer remainder are
distinct operations; for unsigned integer remainder, use '``urem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets srem be implemented using instructions that return both the
result of the division and the remainder.)

Example:
""""""""

.. code-block:: text

      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var

.. _i_frem:

'``frem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``frem``' instruction returns the remainder from the division of
its two operands.

Arguments:
""""""""""

The two arguments to the '``frem``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point remainder of the two operands.
This is the same output as a libm '``fmod``' function, but without any
possibility of setting ``errno``. The remainder has the same sign as the
dividend.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var

.. _bitwiseops:

Bitwise Binary Operations
-------------------------

Bitwise binary operators are used to do various forms of bit-twiddling
in a program. They are generally very efficient instructions and can
commonly be strength reduced from other instructions. They require two
operands of the same type, execute an operation on them, and produce a
single value. The resulting value is the same type as its operands.

'``shl``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shl <ty> <op1>, <op2>           ; yields ty:result
      <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``shl``' instruction returns the first operand shifted to the left
a specified number of bits.
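
A short sketch of ``shl`` in use, with hypothetical function names. The
first function shifts by a constant; the second masks a variable shift
amount so that it always stays below the ``i32`` bit width, which is one
possible way to keep the shift in range.

.. code-block:: llvm

      ; Hypothetical helper: shifting left by 3 produces the same bits
      ; as multiplying by 8.
      define i32 @times_eight(i32 %x) {
        %r = shl i32 %x, 3
        ret i32 %r
      }

      ; Hypothetical helper: the mask keeps the shift amount in 0..31.
      define i32 @shift_left_masked(i32 %x, i32 %n) {
        %amt = and i32 %n, 31
        %r = shl i32 %x, %amt
        ret i32 %r
      }
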

Arguments:
""""""""""

Both arguments to the '``shl``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
where ``n`` is the width of the result. If ``op2`` is (statically or
dynamically) equal to or larger than the number of bits in
``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
If the arguments are vectors, each vector element of ``op1`` is shifted
by the corresponding shift amount in ``op2``.

If the ``nuw`` keyword is present, then the shift produces a poison
value if it shifts out any non-zero bits.
If the ``nsw`` keyword is present, then the shift produces a poison
value if it shifts out any bits that disagree with the resultant sign bit.

Example:
""""""""

.. code-block:: text

      <result> = shl i32 4, %var   ; yields i32: 4 << %var
      <result> = shl i32 4, 2      ; yields i32: 16
      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; undefined
      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>

'``lshr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``lshr``' instruction (logical shift right) returns the first
operand shifted to the right a specified number of bits with zero fill.

Arguments:
""""""""""

Both arguments to the '``lshr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs a logical shift right operation. The
most significant bits of the result will be filled with zero bits after
the shift. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``lshr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = lshr i32 4, 1   ; yields i32:result = 2
      <result> = lshr i32 4, 2   ; yields i32:result = 1
      <result> = lshr i8  4, 3   ; yields i8:result = 0
      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; undefined
      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>

'``ashr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``ashr``' instruction (arithmetic shift right) returns the first
operand shifted to the right a specified number of bits with sign
extension.

Arguments:
""""""""""

Both arguments to the '``ashr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
of ``op1``.
If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``ashr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = ashr i32 4, 1   ; yields i32:result = 2
      <result> = ashr i32 4, 2   ; yields i32:result = 1
      <result> = ashr i8  4, 3   ; yields i8:result = 0
      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; undefined
      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>

'``and``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = and <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``and``' instruction returns the bitwise logical and of its two
operands.

Arguments:
""""""""""

The two arguments to the '``and``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``and``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   0 |
+-----+-----+-----+
|   1 |   0 |   0 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
      <result> = and i32 15, 40          ; yields i32:result = 8
      <result> = and i32 4, 8            ; yields i32:result = 0

'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = or <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.

Arguments:
""""""""""

The two arguments to the '``or``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``or``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
      <result> = or i32 15, 40          ; yields i32:result = 47
      <result> = or i32 4, 8            ; yields i32:result = 12

'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = xor <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``xor``' instruction returns the bitwise logical exclusive or of
its two operands. The ``xor`` is used to implement the "one's
complement" operation, which is the "~" operator in C.

Arguments:
""""""""""

The two arguments to the '``xor``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.
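
The one's complement use of ``xor`` can be sketched as follows. The function
names are hypothetical; the technique is simply an ``xor`` with an all-ones
constant, written ``-1`` for scalars and as a vector of ``-1`` elements for
vectors.

.. code-block:: llvm

      ; Hypothetical helper: the scalar equivalent of C's ~v.
      define i32 @bitwise_not(i32 %v) {
        %r = xor i32 %v, -1
        ret i32 %r
      }

      ; Hypothetical helper: the same operation applied to each vector element.
      define <4 x i32> @bitwise_not_vec(<4 x i32> %v) {
        %r = xor <4 x i32> %v, <i32 -1, i32 -1, i32 -1, i32 -1>
        ret <4 x i32> %r
      }
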

Semantics:
""""""""""

The truth table used for the '``xor``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   0 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
      <result> = xor i32 15, 40          ; yields i32:result = 39
      <result> = xor i32 4, 8            ; yields i32:result = 12
      <result> = xor i32 %V, -1          ; yields i32:result = ~%V

Vector Operations
-----------------

LLVM supports several instructions to represent vector operations in a
target-independent manner. These instructions cover the element-access
and vector-specific operations needed to process vectors effectively.
While LLVM does directly support these vector operations, many
sophisticated algorithms will want to use target-specific intrinsics to
take full advantage of a specific target.

.. _i_extractelement:

'``extractelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>

Overview:
"""""""""

The '``extractelement``' instruction extracts a single scalar element
from a vector at a specified index.

Arguments:
""""""""""

The first operand of an '``extractelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is an index indicating
the position from which to extract the element. The index may be a
variable of any integer type.

Semantics:
""""""""""

The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``.
If ``idx``
exceeds the length of ``val``, the result is a
:ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32

.. _i_insertelement:

'``insertelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>

Overview:
"""""""""

The '``insertelement``' instruction inserts a scalar element into a
vector at a specified index.

Arguments:
""""""""""

The first operand of an '``insertelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable of any integer type.

Semantics:
""""""""""

The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` exceeds the length of ``val``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>

.. _i_shufflevector:

'``shufflevector``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>

Overview:
"""""""""

The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the input and length that is the same as the shuffle mask.
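
Two small sketches of ``shufflevector`` follow, using hypothetical function
names. Mask indices 0 through 3 select elements of the first input and
indices 4 through 7 select elements of the second, so the first function
reverses one vector and the second interleaves the low halves of two
vectors.

.. code-block:: llvm

      ; Hypothetical helper: reverses the four elements of %v.
      ; The second input is unused, so it is given as undef.
      define <4 x i32> @reverse(<4 x i32> %v) {
        %r = shufflevector <4 x i32> %v, <4 x i32> undef,
                           <4 x i32> <i32 3, i32 2, i32 1, i32 0>
        ret <4 x i32> %r
      }

      ; Hypothetical helper: yields <a0, b0, a1, b1>.
      define <4 x i32> @interleave_low(<4 x i32> %a, <4 x i32> %b) {
        %r = shufflevector <4 x i32> %a, <4 x i32> %b,
                           <4 x i32> <i32 0, i32 4, i32 1, i32 5>
        ret <4 x i32> %r
      }
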

Arguments:
""""""""""

The first two operands of a '``shufflevector``' instruction are vectors
with the same type. The third argument is a shuffle mask whose element
type is always 'i32'. The result of the instruction is a vector whose
length is the same as the shuffle mask and whose element type is the
same as the element type of the first two operands.

The shuffle mask operand is required to be a constant vector with either
constant integer or undef values.

Semantics:
""""""""""

The elements of the two input vectors are numbered from left to right
across both of the vectors. The shuffle mask operand specifies, for each
element of the result vector, which element of the two input vectors the
result element gets. If the shuffle mask is undef, the result vector is
undef. If any element of the mask operand is undef, that element of the
result is undef. If the shuffle mask selects an undef element from one
of the input vectors, the resulting element is undef.

Example:
""""""""

.. code-block:: text

      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>

Aggregate Operations
--------------------

LLVM supports several instructions for working with
:ref:`aggregate <t_aggregate>` values.

.. _i_extractvalue:

'``extractvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*

Overview:
"""""""""

The '``extractvalue``' instruction extracts the value of a member field
from an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``extractvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
constant indices to specify which value to extract in a similar manner
as indices in a '``getelementptr``' instruction.

The major differences to ``getelementptr`` indexing are:

-  Since the value being indexed is not a pointer, the first index is
   omitted and assumed to be zero.
-  At least one index must be specified.
-  Not only struct indices but also array indices must be in bounds.

Semantics:
""""""""""

The result is the value at the position in the aggregate specified by
the index operands.

Example:
""""""""

.. code-block:: text

      <result> = extractvalue {i32, float} %agg, 0    ; yields i32

.. _i_insertvalue:

'``insertvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>

Overview:
"""""""""

The '``insertvalue``' instruction inserts a value into a member field in
an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert.
The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.

Semantics:
""""""""""

The result is an aggregate of the same type as ``val``. Its value is
that of ``val`` except that the value at the position specified by the
indices is that of ``elt``.

Example:
""""""""

.. code-block:: llvm

    %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
    %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
    %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}

.. _memoryops:

Memory Access and Addressing Operations
---------------------------------------

A key design point of an SSA-based representation is how it represents
memory. In LLVM, no memory locations are in SSA form, which makes things
very simple. This section describes how to read, write, and allocate
memory in LLVM.

.. _i_alloca:

'``alloca``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result

Overview:
"""""""""

The '``alloca``' instruction allocates memory on the stack frame of the
currently executing function, to be automatically released when this
function returns to its caller. The object is always allocated in the
address space for allocas indicated in the datalayout.

Arguments:
""""""""""

The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If ``NumElements`` is specified, it is
the number of elements allocated; otherwise ``NumElements`` defaults
to one. If a constant alignment is specified, the result of the
allocation is guaranteed to be aligned to at least that boundary. The
alignment may not be greater than ``1 << 29``. If not specified, or if
zero, the target can choose to align the allocation on any convenient
boundary compatible with the type.

'``type``' may be any sized type.

Semantics:
""""""""""

Memory is allocated; a pointer is returned. The operation is undefined
if there is insufficient stack space for the allocation. '``alloca``'d
memory is automatically released when the function returns (either with
the ``ret`` or ``resume`` instructions). The '``alloca``' instruction is
commonly used to represent automatic variables that must have an address
available. Allocating zero bytes is legal, but the returned pointer may not
be unique. The order in which memory is allocated (i.e., which way the stack
grows) is not specified.

Example:
""""""""

.. code-block:: llvm

    %ptr = alloca i32                             ; yields i32*:ptr
    %ptr = alloca i32, i32 4                      ; yields i32*:ptr
    %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
    %ptr = alloca i32, align 1024                 ; yields i32*:ptr

.. _i_load:

'``load``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>][, !invariant.group !<index>][, !nonnull !<index>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>]
      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<index>]
      !<index> = !{ i32 1 }
      !<deref_bytes_node> = !{i64 <dereferenceable_bytes>}
      !<align_node> = !{ i64 <value_alignment> }

Overview:
"""""""""

The '``load``' instruction is used to read from memory.

Arguments:
""""""""""

The argument to the ``load`` instruction specifies the memory address from which
to load. The type specified must be a :ref:`first class <t_firstclass>` type of
known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
modify the number or order of execution of this ``load`` with other
:ref:`volatile operations <volatile>`.

If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit.
``align`` must be
explicitly specified on atomic loads, and the load has undefined behavior if the
alignment is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the alignment
may produce less efficient code. An alignment of 1 is always safe. The
maximum possible alignment is ``1 << 29``. An alignment value higher
than the size of the loaded type implies memory up to the alignment
value bytes can be safely loaded without trapping in the default
address space. Accessing the high bytes can interfere with debugging
tools, so they should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.

The optional ``!nontemporal`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with one
``i32`` entry of value 1. The existence of the ``!nontemporal``
metadata on the instruction tells the optimizer and code generator
that this load is not expected to be reused in the cache. The code
generator may select special instructions to save cache bandwidth, such
as the ``MOVNT`` instruction on x86.

The optional ``!invariant.load`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with no
entries.
If a load instruction tagged with the ``!invariant.load``
metadata is executed, the optimizer may assume the memory location
referenced by the load contains the same value at all points in the
program where the memory location is known to be dereferenceable;
otherwise, the behavior is undefined.

The optional ``!invariant.group`` metadata must reference a single metadata name
``<index>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata.

The optional ``!nonnull`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with no
entries. The existence of the ``!nonnull`` metadata on the
instruction tells the optimizer that the value loaded is known to
never be null. If the value is null at runtime, the behavior is undefined.
This is analogous to the ``nonnull`` attribute on parameters and return
values. This metadata can only be applied to loads of a pointer type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry. The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values. This metadata can only be applied
to loads of a pointer type.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry. The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
attribute on parameters and return values. This metadata can only be applied
to loads of a pointer type.

The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type. If the returned
value is not appropriately aligned at runtime, the behavior is undefined.

Semantics:
""""""""""

The location of memory pointed to is loaded. If the value being loaded
is of scalar type then the number of bytes read does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, loading an ``i24`` reads at most three bytes. When loading a
value of a type like ``i20`` with a size that is not an integral number
of bytes, the result is undefined if the value was not originally
written using a store of the same type.

Examples:
"""""""""

.. code-block:: llvm

    %ptr = alloca i32                               ; yields i32*:ptr
    store i32 3, i32* %ptr                          ; yields void
    %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_store:

'``store``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.group !<index>]        ; yields void
      store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<index>] ; yields void

Overview:
"""""""""

The '``store``' instruction is used to write to memory.

Arguments:
""""""""""

There are two arguments to the ``store`` instruction: a value to store and an
address at which to store it. The type of the ``<pointer>`` operand must be a
pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this ``store`` with other
:ref:`volatile operations <volatile>`. Only values of :ref:`first class
<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
structural type <t_opaque>`) can be stored.

If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``acquire`` and ``acq_rel`` orderings are not valid on ``store`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit. ``align`` must be
explicitly specified on atomic stores, and the store has undefined behavior if
the alignment is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
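
As a rough source-level illustration of the ordering restriction on atomic
stores, the following C11 sketch pairs a ``release`` store with an ``acquire``
load. The names ``publish``, ``consume``, ``payload`` and ``ready`` are
hypothetical and chosen only for this example. The correspondence to
``store atomic ... release`` and ``load atomic ... acquire`` is what a C front
end would typically be expected to emit; this document does not specify it.

.. code-block:: c

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Hypothetical shared state, for illustration only. */
    static int payload;
    static atomic_bool ready;

    /* Writer: plain store to the data, then a release store to the flag.
       C11, like the IR instruction, does not permit acquire or acq_rel
       ordering on a store. */
    void publish(int value) {
        payload = value;
        atomic_store_explicit(&ready, true, memory_order_release);
    }

    /* Reader: acquire load of the flag. If it observes true, the earlier
       write to payload is visible. Returns false if nothing was published. */
    bool consume(int *out) {
        if (!atomic_load_explicit(&ready, memory_order_acquire))
            return false;
        *out = payload;
        return true;
    }

A caller would invoke ``publish`` from one thread and poll ``consume`` from
another; the sketch itself does not create threads.
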

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the
alignment may produce less efficient code. An alignment of 1 is always
safe. The maximum possible alignment is ``1 << 29``. An alignment
value higher than the size of the stored type implies memory up to the
alignment value bytes can be stored to without trapping in the default
address space. Storing to the higher bytes however may result in data
races if another thread can access the same address. Introducing a
data race is not allowed. Storing to the extra bytes is not allowed
even in situations where a data race is known to not exist if the
function has the ``sanitize_address`` attribute.

The optional ``!nontemporal`` metadata must reference a single metadata
name ``<index>`` corresponding to a metadata node with one ``i32`` entry of
value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.

The optional ``!invariant.group`` metadata must reference a
single metadata name ``<index>``. See ``invariant.group`` metadata.

Semantics:
""""""""""

The contents of memory are updated to contain ``<value>`` at the
location specified by the ``<pointer>`` operand.
If ``<value>`` is
of scalar type then the number of bytes written does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, storing an ``i24`` writes at most three bytes. When writing a
value of a type like ``i20`` with a size that is not an integral number
of bytes, it is unspecified what happens to the extra bits that do not
belong to the type, but they will typically be overwritten.

Example:
""""""""

.. code-block:: llvm

    %ptr = alloca i32                               ; yields i32*:ptr
    store i32 3, i32* %ptr                          ; yields void
    %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_fence:

'``fence``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      fence [syncscope("<target-scope>")] <ordering>  ; yields void

Overview:
"""""""""

The '``fence``' instruction is used to introduce happens-before edges
between operations.

Arguments:
""""""""""

'``fence``' instructions take an :ref:`ordering <ordering>` argument which
defines what *synchronizes-with* edges they add. They can only be given
``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.

Semantics:
""""""""""

A fence A which has (at least) ``release`` ordering semantics
*synchronizes with* a fence B with (at least) ``acquire`` ordering
semantics if and only if there exist atomic operations X and Y, both
operating on some atomic object M, such that A is sequenced before X, X
modifies M (either directly or through some side effect of a sequence
headed by X), Y is sequenced before B, and Y observes M. This provides a
*happens-before* dependency between A and B. Rather than an explicit
``fence``, one (but not both) of the atomic operations X or Y might
provide a ``release`` or ``acquire`` (resp.)
ordering constraint and
still *synchronize-with* the explicit ``fence`` and establish the
*happens-before* edge.

A ``fence`` which has ``seq_cst`` ordering, in addition to having both
``acquire`` and ``release`` semantics specified above, participates in
the global program order of other ``seq_cst`` operations and/or fences.

A ``fence`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Example:
""""""""

.. code-block:: text

      fence acquire                                        ; yields void
      fence syncscope("singlethread") seq_cst              ; yields void
      fence syncscope("agent") seq_cst                     ; yields void

.. _i_cmpxchg:

'``cmpxchg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering> ; yields  { ty, i1 }

Overview:
"""""""""

The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
equal, it tries to store a new value into the memory.

Arguments:
""""""""""

There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '<cmp>' must be an integer or pointer type whose
bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. '<cmp>' and '<new>' must
have the same type, and the type of '<pointer>' must be a pointer to
that type. If the ``cmpxchg`` is marked as ``volatile``, then the
optimizer is not allowed to modify the number or order of execution of
this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
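
The three operands correspond closely to the arguments of the C11
compare-exchange functions. The following is a minimal sketch, assuming a
hypothetical helper named ``atomic_square``, that retries until it has
replaced the stored value with its square. It requests ``acq_rel`` ordering on
success and ``memory_order_relaxed`` on failure; the latter is the C11
counterpart of the ``monotonic`` ordering in LLVM IR.

.. code-block:: c

    #include <stdatomic.h>

    /* Hypothetical helper: atomically replace *ptr with its square and
       return the value that was replaced. */
    unsigned atomic_square(atomic_uint *ptr) {
        /* Initial guess for the current value (plays the role of <cmp>). */
        unsigned expected = atomic_load_explicit(ptr, memory_order_relaxed);

        /* On failure the call writes the value it actually found into
           'expected', so the next iteration retries with fresh data. The
           weak form may also fail spuriously, which the loop tolerates. */
        while (!atomic_compare_exchange_weak_explicit(
                   ptr, &expected, expected * expected,
                   memory_order_acq_rel, memory_order_relaxed)) {
        }
        return expected;
    }

Unsigned arithmetic is used so that the multiplication wraps instead of
overflowing, which keeps the sketch free of undefined behavior in C.
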

The success and failure :ref:`ordering <ordering>` arguments specify how this
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``, the ordering constraint on failure must be no
stronger than that on success, and the failure ordering cannot be either
``release`` or ``acq_rel``.

A ``cmpxchg`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

The pointer passed into ``cmpxchg`` must have alignment greater than or
equal to the size in memory of the operand.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
written to the location. The original value at the location is returned,
together with a flag indicating success (true) or failure (false).

If the ``cmpxchg`` operation is marked as ``weak`` then a spurious failure is
permitted: the operation may not write ``<new>`` even if the comparison
matched.

If the ``cmpxchg`` operation is strong (the default), the ``i1`` value is 1 if
and only if the value loaded equals ``cmp``.

A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.

Example:
""""""""

.. code-block:: llvm

    entry:
      %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
      br label %loop

    loop:
      %cmp = phi i32 [ %orig, %entry ], [ %value_loaded, %loop ]
      %squared = mul i32 %cmp, %cmp
      %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
      %value_loaded = extractvalue { i32, i1 } %val_success, 0
      %success = extractvalue { i32, i1 } %val_success, 1
      br i1 %success, label %done, label %loop

    done:
      ...

.. _i_atomicrmw:

'``atomicrmw``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>                   ; yields ty

Overview:
"""""""""

The '``atomicrmw``' instruction is used to atomically modify memory.

Arguments:
""""""""""

There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to
the operation. The operation must be one of the following keywords:

-  xchg
-  add
-  sub
-  and
-  nand
-  or
-  xor
-  max
-  min
-  umax
-  umin

The type of '<value>' must be an integer type whose bit width is a power
of two greater than or equal to eight and less than or equal to a
target-specific size limit. The type of the '``<pointer>``' operand must
be a pointer to that type. If the ``atomicrmw`` is marked as
``volatile``, then the optimizer is not allowed to modify the number or
order of execution of this ``atomicrmw`` with other :ref:`volatile
operations <volatile>`.

An ``atomicrmw`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.
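
The signed (``max``, ``min``) and unsigned (``umax``, ``umin``) operations give
different results only when the sign bit of an operand is set. The following C
sketch is a non-atomic model of the value that ``max`` and ``umax`` would
store for a given pair of 32-bit patterns. The helper names ``model_max`` and
``model_umax`` are hypothetical, and the model deliberately ignores atomicity
and ordering.

.. code-block:: c

    #include <stdint.h>

    /* Non-atomic models of the value 'max' and 'umax' would store.
       Operands are raw 32-bit patterns. The conversion to int32_t
       assumes a two's complement target, as is usual in practice. */
    uint32_t model_max(uint32_t old, uint32_t val) {
        /* Signed comparison of the two bit patterns. */
        return (int32_t)old > (int32_t)val ? old : val;
    }

    uint32_t model_umax(uint32_t old, uint32_t val) {
        /* Unsigned comparison of the same bit patterns. */
        return old > val ? old : val;
    }

For the pattern ``0xFFFFFFFF`` (``-1`` when read as signed) and the value 1,
the signed model selects 1 while the unsigned model selects ``0xFFFFFFFF``.
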

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``'
operand are atomically read, modified, and written back. The original
value at the location is returned. The modification is specified by the
operation argument:

-  xchg: ``*ptr = val``
-  add: ``*ptr = *ptr + val``
-  sub: ``*ptr = *ptr - val``
-  and: ``*ptr = *ptr & val``
-  nand: ``*ptr = ~(*ptr & val)``
-  or: ``*ptr = *ptr | val``
-  xor: ``*ptr = *ptr ^ val``
-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned
   comparison)
-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned
   comparison)

Example:
""""""""

.. code-block:: llvm

    %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32

.. _i_getelementptr:

'``getelementptr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
      <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
      <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>

Overview:
"""""""""

The '``getelementptr``' instruction is used to get the address of a
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
address calculation only and does not access memory. The instruction can also
be used to calculate a vector of such addresses.

Arguments:
""""""""""

The first argument is always a type used as the basis for the calculations.
The second argument is always a pointer or a vector of pointers, and is the
base address to start from.
The remaining arguments are indices
that indicate which of the elements of the aggregate object are indexed.
The interpretation of each index is dependent on the type being indexed
into. The first index always indexes the pointer value given as the
second argument, the second index indexes a value of the type pointed to
(not necessarily the value directly pointed to, since the first index
can be non-zero), etc. The first type indexed into must be a pointer
value, subsequent types can be arrays, vectors, and structs. Note that
subsequent types being indexed into can never be pointers, since that
would require loading the pointer before continuing calculation.

The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
**constants** are allowed (when using a vector of indices they must all
be the **same** ``i32`` integer constant). When indexing into an array,
pointer or vector, integers of any width are allowed, and they are not
required to be constant. These integers are treated as signed values
where relevant.

For example, let's consider a C code fragment and how it gets compiled
to LLVM:

.. code-block:: c

    struct RT {
      char A;
      int B[10][20];
      char C;
    };
    struct ST {
      int X;
      double Y;
      struct RT Z;
    };

    int *foo(struct ST *s) {
      return &s[1].Z.B[5][13];
    }

The LLVM code generated by Clang is:

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
    entry:
      %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
      ret i32* %arrayidx
    }

Semantics:
""""""""""

In the example above, the first index is indexing into the
'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
= '``{ i32, double, %struct.RT }``' type, a structure. The second index
indexes into the third element of the structure, yielding a
'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
structure. The third index indexes into the second element of the
structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
dimensions of the array are subscripted into, yielding an '``i32``'
type. The '``getelementptr``' instruction returns a pointer to this
element, thus computing a value of '``i32*``' type.

Note that it is perfectly legal to index partially through a structure,
returning a pointer to an inner element. Because of this, the LLVM code
for the given testcase is equivalent to:

.. code-block:: llvm

    define i32* @foo(%struct.ST* %s) {
      %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
      %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
      %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
      %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
      %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
      ret i32* %t5
    }

If the ``inbounds`` keyword is present, the result value of the
``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base
pointer is not an *in bounds* address of an allocated object, or if any
of the addresses that would be formed by successive addition of the
offsets implied by the indices to the base address with infinitely
precise signed arithmetic are not an *in bounds* address of that
allocated object. The *in bounds* addresses for an allocated object are
all the addresses that point into the object, plus the address one byte
past the end. The only *in bounds* address for a null pointer in the
default address-space is the null pointer itself. In cases where the
base is a vector of pointers the ``inbounds`` keyword applies to each
of the computations element-wise.

If the ``inbounds`` keyword is not present, the offsets are added to the
base address with silently-wrapping two's complement arithmetic. If the
offsets have a different width from the pointer, they are sign-extended
or truncated to the width of the pointer. The result value of the
``getelementptr`` may be outside the object pointed to by the base
pointer. The result value may not necessarily be used to access memory
though, even if it happens to point into allocated storage.
See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
information.

If the ``inrange`` keyword is present before any index, loading from or
storing to any pointer derived from the ``getelementptr`` has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as ``inrange``. The result of a
pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
involving memory) involving a pointer derived from a ``getelementptr`` with
the ``inrange`` keyword is undefined, with the exception of comparisons
in the case where both operands are in the range of the element selected
by the ``inrange`` keyword, inclusive of the address one past the end of
that element. Note that the ``inrange`` keyword is currently only allowed
in constant ``getelementptr`` expressions.

The getelementptr instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.

Example:
""""""""

.. code-block:: llvm

   ; yields [12 x i8]*:aptr
   %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
   ; yields i8*:vptr
   %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
   ; yields i8*:eptr
   %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
   ; yields i32*:iptr
   %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0

Vector of pointers:
"""""""""""""""""""

The ``getelementptr`` returns a vector of pointers, instead of a single address,
when one or more of its arguments is a vector. In such cases, all vector
arguments should have the same number of elements, and every scalar argument
will be effectively broadcast into a vector during address calculation.

.. code-block:: llvm

   ; All arguments are vectors:
   ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
   %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets

   ; Add the same scalar offset to each pointer of a vector:
   ;   A[i] = ptrs[i] + offset*sizeof(i8)
   %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset

   ; Add distinct offsets to the same pointer:
   ;   A[i] = ptr + offsets[i]*sizeof(i8)
   %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets

   ; In all cases described above the type of the result is <4 x i8*>

The two following instructions are equivalent:

.. code-block:: llvm

   getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
     <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
     <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
     <4 x i32> %ind4,
     <4 x i64> <i64 13, i64 13, i64 13, i64 13>

   getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
     i32 2, i32 1, <4 x i32> %ind4, i64 13

Let's look at the C code, where the vector version of ``getelementptr``
makes sense:

.. code-block:: c

   // Let's assume that we vectorize the following loop:
   double *A, *B; int *C;
   for (int i = 0; i < size; ++i) {
     A[i] = B[C[i]];
   }

.. code-block:: llvm

   ; get pointers for 8 elements from array B
   %ptrs = getelementptr double, double* %B, <8 x i32> %C
   ; load 8 elements from array B into A
   %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
        i32 8, <8 x i1> %mask, <8 x double> %passthru)

Conversion Operations
---------------------

The instructions in this category are the conversion instructions
(casting) which all take a single operand and a type. They perform
various bit conversions on the operand.

.. _i_trunc:

'``trunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = trunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``trunc``' instruction truncates its operand to the type ``ty2``.

Arguments:
""""""""""

The '``trunc``' instruction takes a value to trunc, and a type to trunc
it to. Both types must be of :ref:`integer <t_integer>` types, or vectors
of the same number of integers. The bit size of the ``value`` must be
larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.

Semantics:
""""""""""

The '``trunc``' instruction truncates the high order bits in ``value``
and converts the remaining bits to ``ty2``. Since the source size must
be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
It will always truncate bits.

Example:
""""""""

.. code-block:: llvm

   %X = trunc i32 257 to i8                        ; yields i8:1
   %Y = trunc i32 123 to i1                        ; yields i1:true
   %Z = trunc i32 122 to i1                        ; yields i1:false
   %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>

.. _i_zext:

'``zext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = zext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``zext``' instruction zero extends its operand to type ``ty2``.

Arguments:
""""""""""

The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.
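
As a rough illustration of the size rule above, the sketch below widens a
scalar and a two-element vector. The function ``@widen`` and its value names
are hypothetical, and the commented-out line shows a cast that the rule
rejects because the source is not smaller than ``ty2``.

.. code-block:: llvm

   define i32 @widen(i8 %x, <2 x i8> %v) {
     %a = zext i8 %x to i32               ; valid: 8 bits widened to 32 bits
     %b = zext <2 x i8> %v to <2 x i16>   ; valid: same element count, wider elements
     ; %c = zext i32 %a to i32            ; invalid: source and destination have equal size
     ret i32 %a
   }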

Semantics:
""""""""""

The ``zext`` fills the high order bits of the ``value`` with zero bits
until it reaches the size of the destination type, ``ty2``.

When zero extending from i1, the result will always be either 0 or 1.

Example:
""""""""

.. code-block:: llvm

   %X = zext i32 257 to i64                         ; yields i64:257
   %Y = zext i1 true to i32                         ; yields i32:1
   %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32>  ; yields <i32 8, i32 7>

.. _i_sext:

'``sext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = sext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sext``' sign extends ``value`` to the type ``ty2``.

Arguments:
""""""""""

The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The '``sext``' instruction performs a sign extension by copying the sign
bit (highest order bit) of the ``value`` until it reaches the bit size
of the type ``ty2``.

When sign extending from i1, the extension always results in -1 or 0.

Example:
""""""""

.. code-block:: llvm

   %X = sext i8 -1 to i16                           ; yields i16:65535
   %Y = sext i1 true to i32                         ; yields i32:-1
   %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32>  ; yields <i32 8, i32 7>

'``fptrunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
The size of ``value`` must be larger than the size of ``ty2``. This
implies that ``fptrunc`` cannot be used to make a *no-op cast*.

Semantics:
""""""""""

The '``fptrunc``' instruction casts a ``value`` from a larger
:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
<t_floating>` type.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.

Example:
""""""""

.. code-block:: llvm

   %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
   %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity

'``fpext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fpext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fpext``' extends a floating-point ``value`` to a larger floating-point
value.

Arguments:
""""""""""

The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
to. The source type must be smaller than the destination type.

Semantics:
""""""""""

The '``fpext``' instruction extends the ``value`` from a smaller
:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
<t_floating>` type. The ``fpext`` cannot be used to make a
*no-op cast* because it always changes bits. Use ``bitcast`` to make a
*no-op cast* for a floating-point cast.

Example:
""""""""

.. code-block:: llvm

   %X = fpext float 3.125 to double         ; yields double:3.125000e+00
   %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000

'``fptoui .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptoui <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptoui``' converts a floating-point ``value`` to its unsigned
integer equivalent of type ``ty2``.

Arguments:
""""""""""

The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptoui``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
unsigned integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

   %X = fptoui double 123.0 to i32      ; yields i32:123
   %Y = fptoui float 1.0E+300 to i1     ; yields poison (value does not fit in i1)
   %Z = fptoui float 1.04E+17 to i8     ; yields poison (value does not fit in i8)

'``fptosi .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptosi <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptosi``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
signed integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

   %X = fptosi double -123.0 to i32      ; yields i32:-123
   %Y = fptosi float 1.0E-247 to i1      ; yields i1:0 (rounds towards zero)
   %Z = fptosi float 1.04E+17 to i8      ; yields poison (value does not fit in i8)

'``uitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = uitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``uitofp``' instruction regards ``value`` as an unsigned integer
and converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``uitofp``' instruction interprets its operand as an unsigned
integer quantity and converts it to the corresponding floating-point
value. If the value cannot be exactly represented, it is rounded using
the default rounding mode.

Example:
""""""""

.. code-block:: llvm

   %X = uitofp i32 257 to float         ; yields float:257.0
   %Y = uitofp i8 -1 to double          ; yields double:255.0

'``sitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = sitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sitofp``' instruction regards ``value`` as a signed integer and
converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``sitofp``' instruction interprets its operand as a signed integer
quantity and converts it to the corresponding floating-point value. If the
value cannot be exactly represented, it is rounded using the default rounding
mode.

Example:
""""""""

.. code-block:: llvm

   %X = sitofp i32 257 to float         ; yields float:257.0
   %Y = sitofp i8 -1 to double          ; yields double:-1.0

.. _i_ptrtoint:

'``ptrtoint .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``ptrtoint``' instruction converts the pointer or a vector of
pointers ``value`` to the integer (or vector of integers) type ``ty2``.

Arguments:
""""""""""

The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to ``ty2``, which must be an :ref:`integer <t_integer>` or
a vector of integers type.

Semantics:
""""""""""

The '``ptrtoint``' instruction converts ``value`` to integer type
``ty2`` by interpreting the pointer value as an integer and either
truncating or zero extending that value to the size of the integer type.
If ``value`` is smaller than ``ty2`` then a zero extension is done. If
``value`` is larger than ``ty2`` then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.

Example:
""""""""

.. code-block:: llvm

   %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
   %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
   %Z = ptrtoint <4 x i32*> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture

.. _i_inttoptr:

'``inttoptr .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = inttoptr <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``inttoptr``' instruction converts an integer ``value`` to a
pointer type, ``ty2``.

Arguments:
""""""""""

The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
type.

Semantics:
""""""""""

The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
applying either a zero extension or a truncation depending on the size
of the integer ``value``. If ``value`` is larger than the size of a
pointer then a truncation is done.
If ``value`` is smaller than the size
of a pointer then a zero extension is done. If they are the same size,
nothing is done (*no-op cast*).

Example:
""""""""

.. code-block:: llvm

   %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
   %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
   %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
   %Z = inttoptr <4 x i32> %G to <4 x i8*>; yields truncation of vector G to four pointers

.. _i_bitcast:

'``bitcast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = bitcast <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
changing any bits.

Arguments:
""""""""""

The '``bitcast``' instruction takes a value to cast, which must be a
non-aggregate first class value, and a type to cast it to, which must
also be a non-aggregate :ref:`first class <t_firstclass>` type. The
bit sizes of ``value`` and the destination type, ``ty2``, must be
identical. If the source type is a pointer, the destination type must
also be a pointer of the same size. This instruction supports bitwise
conversion of vectors to integers and to vectors of other types (as
long as they have the same size).

Semantics:
""""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
is always a *no-op cast* because no bits change with this
conversion. The conversion is done as if the ``value`` had been stored
to memory and read back as type ``ty2``. Pointer (or vector of
pointers) types may only be converted to other pointer (or vector of
pointers) types with the same address space through this instruction.
To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
or :ref:`ptrtoint <i_ptrtoint>` instructions first.

Example:
""""""""

.. code-block:: text

   %X = bitcast i8 255 to i8                ; yields i8 :-1
   %Y = bitcast i32* %x to i16*             ; yields i16*:%x
   %Z = bitcast <2 x i32> %V to i64         ; yields i64: %V
   %Z = bitcast <2 x i32*> %V to <2 x i64*> ; yields <2 x i64*>

.. _i_addrspacecast:

'``addrspacecast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2

Overview:
"""""""""

The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
address space ``n`` to type ``pty2`` in address space ``m``.

Arguments:
""""""""""

The '``addrspacecast``' instruction takes a pointer or vector of pointer value
to cast and a pointer type to cast it to, which must have a different
address space.

Semantics:
""""""""""

The '``addrspacecast``' instruction converts the pointer value
``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
value modification, depending on the target and the address space
pair. Pointer conversions within the same address space must be
performed with the ``bitcast`` instruction. Note that if the address space
conversion is legal then both result and operand refer to the same memory
location.

Example:
""""""""

.. code-block:: llvm

   %X = addrspacecast i32* %x to i32 addrspace(1)*    ; yields i32 addrspace(1)*:%x
   %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
   %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z

.. _otherops:

Other Operations
----------------

The instructions in this category are the "miscellaneous" instructions,
which defy better classification.

.. _i_icmp:

'``icmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``icmp``' instruction returns a boolean value or a vector of
boolean values based on comparison of its two integer, integer vector,
pointer, or pointer vector operands.

Arguments:
""""""""""

The '``icmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``eq``: equal
#. ``ne``: not equal
#. ``ugt``: unsigned greater than
#. ``uge``: unsigned greater or equal
#. ``ult``: unsigned less than
#. ``ule``: unsigned less or equal
#. ``sgt``: signed greater than
#. ``sge``: signed greater or equal
#. ``slt``: signed less than
#. ``sle``: signed less or equal

The remaining two arguments must be :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` or integer :ref:`vector <t_vector>` typed. They
must also be identical types.

Semantics:
""""""""""

The '``icmp``' compares ``op1`` and ``op2`` according to the condition
code given as ``cond``. The comparison performed always yields either an
:ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:

#. ``eq``: yields ``true`` if the operands are equal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ne``: yields ``true`` if the operands are unequal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ugt``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than ``op2``.
#. ``uge``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than or equal to ``op2``.
#. ``ult``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than ``op2``.
#. ``ule``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than or equal to ``op2``.
#. ``sgt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than ``op2``.
#. ``sge``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than or equal to ``op2``.
#. ``slt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than ``op2``.
#. ``sle``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than or equal to ``op2``.

If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
are compared as if they were integers.

If the operands are integer vectors, then they are compared element by
element. The result is an ``i1`` vector with the same number of elements
as the values being compared. Otherwise, the result is an ``i1``.

Example:
""""""""

.. code-block:: text

   <result> = icmp eq i32 4, 5          ; yields: result=false
   <result> = icmp ne float* %X, %X     ; yields: result=false
   <result> = icmp ult i16  4, 5        ; yields: result=true
   <result> = icmp sgt i16  4, 5        ; yields: result=false
   <result> = icmp ule i16 -4, 5        ; yields: result=false
   <result> = icmp sge i16  4, 5        ; yields: result=false

.. _i_fcmp:

'``fcmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``fcmp``' instruction returns a boolean value or vector of boolean
values based on comparison of its operands.

If the operands are floating-point scalars, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

Arguments:
""""""""""

The '``fcmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``false``: no comparison, always returns false
#. ``oeq``: ordered and equal
#. ``ogt``: ordered and greater than
#. ``oge``: ordered and greater than or equal
#. ``olt``: ordered and less than
#. ``ole``: ordered and less than or equal
#. ``one``: ordered and not equal
#. ``ord``: ordered (no nans)
#. ``ueq``: unordered or equal
#. ``ugt``: unordered or greater than
#. ``uge``: unordered or greater than or equal
#. ``ult``: unordered or less than
#. ``ule``: unordered or less than or equal
#. ``une``: unordered or not equal
#. ``uno``: unordered (either nans)
#. ``true``: no comparison, always returns true

*Ordered* means that neither operand is a QNAN while *unordered* means
that either operand may be a QNAN.

Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating-point
<t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
They must have identical types.

Semantics:
""""""""""

The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. If the operands are vectors, then the
vectors are compared element by element. Each comparison performed
always yields an :ref:`i1 <t_integer>` result, as follows:

#. ``false``: always yields ``false``, regardless of operands.
#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
   is equal to ``op2``.
#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than ``op2``.
#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than or equal to ``op2``.
#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than ``op2``.
#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than or equal to ``op2``.
#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
   is not equal to ``op2``.
#. ``ord``: yields ``true`` if both operands are not a QNAN.
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
   equal to ``op2``.
#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than ``op2``.
#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than or equal to ``op2``.
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than ``op2``.
#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than or equal to ``op2``.
#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
   not equal to ``op2``.
#. ``uno``: yields ``true`` if either operand is a QNAN.
#. ``true``: always yields ``true``, regardless of operands.
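
The difference between the ordered and unordered predicates listed above only
shows up when a NaN is involved. The following is a minimal sketch of that
case; the function name ``@nan_compare`` and the value names are hypothetical.

.. code-block:: llvm

   define i1 @nan_compare(float %x) {
     ; 0x7FF8000000000000 is a quiet NaN written as a hexadecimal constant
     %o = fcmp oeq float 0x7FF8000000000000, %x   ; false for every %x: one operand is a QNAN
     %u = fcmp ueq float 0x7FF8000000000000, %x   ; true for every %x: one operand is a QNAN
     %n = fcmp uno float %x, %x                   ; true only if %x itself is a NaN
     ret i1 %n
   }

Comparing a value against itself with ``uno``, as in the last instruction, is
one way to test whether that value is a NaN.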

The ``fcmp`` instruction can also optionally take any number of
:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
otherwise unsafe floating-point optimizations.

Any set of fast-math flags are legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.

Example:
""""""""

.. code-block:: text

   <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
   <result> = fcmp one float 4.0, 5.0    ; yields: result=true
   <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
   <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false

.. _i_phi:

'``phi``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = phi <ty> [ <val0>, <label0>], ...

Overview:
"""""""""

The '``phi``' instruction is used to implement the φ node in the SSA
graph representing the function.

Arguments:
""""""""""

The type of the incoming values is specified with the first type field.
After this, the '``phi``' instruction takes a list of pairs as
arguments, with one pair for each predecessor basic block of the current
block. Only values of :ref:`first class <t_firstclass>` type may be used as
the value arguments to the PHI node. Only labels may be used as the
label arguments.

There must be no non-phi instructions between the start of a basic block
and the PHI instructions: i.e. PHI instructions must be first in a basic
block.
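
As a sketch of these rules, the loop block below has two predecessors
(``%entry`` and ``%loop`` itself), so each PHI lists exactly two pairs, and
both PHIs come before any other instruction in the block. The function
``@sum_below`` and all value names are hypothetical, and the loop assumes
``%n`` is at least 1.

.. code-block:: llvm

   define i32 @sum_below(i32 %n) {
   entry:
     br label %loop

   loop:
     ; one [ value, label ] pair per predecessor, PHIs first in the block
     %i   = phi i32 [ 0, %entry ], [ %i.next, %loop ]
     %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
     %acc.next = add i32 %acc, %i
     %i.next = add i32 %i, 1
     %done = icmp eq i32 %i.next, %n
     br i1 %done, label %exit, label %loop

   exit:
     ret i32 %acc.next
   }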
9434 9435For the purposes of the SSA form, the use of each incoming value is 9436deemed to occur on the edge from the corresponding predecessor block to 9437the current block (but after any definition of an '``invoke``' 9438instruction's return value on the same edge). 9439 9440Semantics: 9441"""""""""" 9442 9443At runtime, the '``phi``' instruction logically takes on the value 9444specified by the pair corresponding to the predecessor basic block that 9445executed just prior to the current block. 9446 9447Example: 9448"""""""" 9449 9450.. code-block:: llvm 9451 9452 Loop: ; Infinite loop that counts from 0 on up... 9453 %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ] 9454 %nextindvar = add i32 %indvar, 1 9455 br label %Loop 9456 9457.. _i_select: 9458 9459'``select``' Instruction 9460^^^^^^^^^^^^^^^^^^^^^^^^ 9461 9462Syntax: 9463""""""" 9464 9465:: 9466 9467 <result> = select selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty 9468 9469 selty is either i1 or {<N x i1>} 9470 9471Overview: 9472""""""""" 9473 9474The '``select``' instruction is used to choose one value based on a 9475condition, without IR-level branching. 9476 9477Arguments: 9478"""""""""" 9479 9480The '``select``' instruction requires an 'i1' value or a vector of 'i1' 9481values indicating the condition, and two values of the same :ref:`first 9482class <t_firstclass>` type. 9483 9484Semantics: 9485"""""""""" 9486 9487If the condition is an i1 and it evaluates to 1, the instruction returns 9488the first value argument; otherwise, it returns the second value 9489argument. 9490 9491If the condition is a vector of i1, then the value arguments must be 9492vectors of the same size, and the selection is done element by element. 9493 9494If the condition is an i1 and the value arguments are vectors of the 9495same size, then an entire vector is selected. 9496 9497Example: 9498"""""""" 9499 9500.. code-block:: llvm 9501 9502 %X = select i1 true, i8 17, i8 42 ; yields i8:17 9503 9504.. 
_i_call: 9505 9506'``call``' Instruction 9507^^^^^^^^^^^^^^^^^^^^^^ 9508 9509Syntax: 9510""""""" 9511 9512:: 9513 9514 <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] 9515 [ operand bundles ] 9516 9517Overview: 9518""""""""" 9519 9520The '``call``' instruction represents a simple function call. 9521 9522Arguments: 9523"""""""""" 9524 9525This instruction requires several arguments: 9526 9527#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers 9528 should perform tail call optimization. The ``tail`` marker is a hint that 9529 `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker 9530 means that the call must be tail call optimized in order for the program to 9531 be correct. The ``musttail`` marker provides these guarantees: 9532 9533 #. The call will not cause unbounded stack growth if it is part of a 9534 recursive cycle in the call graph. 9535 #. Arguments with the :ref:`inalloca <attr_inalloca>` attribute are 9536 forwarded in place. 9537 9538 Both markers imply that the callee does not access allocas from the caller. 9539 The ``tail`` marker additionally implies that the callee does not access 9540 varargs from the caller, while ``musttail`` implies that varargs from the 9541 caller are passed to the callee. Calls marked ``musttail`` must obey the 9542 following additional rules: 9543 9544 - The call must immediately precede a :ref:`ret <i_ret>` instruction, 9545 or a pointer bitcast followed by a ret instruction. 9546 - The ret instruction must return the (possibly bitcasted) value 9547 produced by the call or void. 9548 - The caller and callee prototypes must match. Pointer types of 9549 parameters or return types may differ in pointee type, but not 9550 in address space. 9551 - The calling conventions of the caller and callee must match. 
9552 - All ABI-impacting function attributes, such as sret, byval, inreg, 9553 returned, and inalloca, must match. 9554 - The callee must be varargs iff the caller is varargs. Bitcasting a 9555 non-varargs function to the appropriate varargs type is legal so 9556 long as the non-varargs prefixes obey the other rules. 9557 9558 Tail call optimization for calls marked ``tail`` is guaranteed to occur if 9559 the following conditions are met: 9560 9561 - Caller and callee both have the calling convention ``fastcc``. 9562 - The call is in tail position (ret immediately follows call and ret 9563 uses value of call or is void). 9564 - Option ``-tailcallopt`` is enabled, or 9565 ``llvm::GuaranteedTailCallOpt`` is ``true``. 9566 - `Platform-specific constraints are 9567 met. <CodeGenerator.html#tailcallopt>`_ 9568 9569#. The optional ``notail`` marker indicates that the optimizers should not add 9570 ``tail`` or ``musttail`` markers to the call. It is used to prevent tail 9571 call optimization from being performed on the call. 9572 9573#. The optional ``fast-math flags`` marker indicates that the call has one or more 9574 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable 9575 otherwise unsafe floating-point optimizations. Fast-math flags are only valid 9576 for calls that return a floating-point scalar or vector type. 9577 9578#. The optional "cconv" marker indicates which :ref:`calling 9579 convention <callingconv>` the call should use. If none is 9580 specified, the call defaults to using C calling conventions. The 9581 calling convention of the call must match the calling convention of 9582 the target function, or else the behavior is undefined. 9583#. The optional :ref:`Parameter Attributes <paramattrs>` list for return 9584 values. Only '``zeroext``', '``signext``', and '``inreg``' attributes 9585 are valid here. 9586#. '``ty``': the type of the call instruction itself which is also the 9587 type of the return value. 
   Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect ``call``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

The '``call``' instruction is used to cause control flow to transfer to
a specified function, with its incoming arguments bound to the specified
values. Upon a '``ret``' instruction in the called function, control
flow continues with the instruction after the function call, and the
return value of the function is bound to the result argument.

Example:
""""""""

.. code-block:: llvm

      %retval = call i32 @test(i32 %argc)
      call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)   ; yields i32
      %X = tail call i32 @foo()                              ; yields i32
      %Y = tail call fastcc i32 @foo()                       ; yields i32
      call void %foo(i8 signext 97)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                             ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                     ; yields i32
      %gr1 = extractvalue %struct.A %r, 1                    ; yields i8
      call void @foo() noreturn                              ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                          ; return value is zero-extended

LLVM treats calls to some functions with names and arguments that match
the standard C99 library as being the C99 library functions, and may
perform optimizations or generate code for them under that assumption.
This is something we'd like to change in the future to provide better
support for freestanding environments and non-C-based languages.

.. _i_va_arg:

'``va_arg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = va_arg <va_list*> <arglist>, <argty>

Overview:
"""""""""

The '``va_arg``' instruction is used to access arguments passed through
the "variable argument" area of a function call. It is used to implement
the ``va_arg`` macro in C.

Arguments:
""""""""""

This instruction takes a ``va_list*`` value and the type of the
argument. It returns a value of the specified argument type and
increments the ``va_list`` to point to the next argument. The actual
type of ``va_list`` is target specific.

Semantics:
""""""""""

The '``va_arg``' instruction loads an argument of the specified type
from the specified ``va_list`` and causes the ``va_list`` to point to
the next argument.
For more information, see the variable argument
handling :ref:`Intrinsic Functions <int_varargs>`.

It is legal for this instruction to be called in a function which does
not take a variable number of arguments, for example, the ``vfprintf``
function.

``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
function <intrinsics>` because it takes a type as an argument.

Example:
""""""""

See the :ref:`variable argument processing <int_varargs>` section.

Note that the code generator does not yet fully support va\_arg on many
targets. Also, it does not currently support va\_arg with aggregate
types on any target.

.. _i_landingpad:

'``landingpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = landingpad <resultty> <clause>+
      <resultval> = landingpad <resultty> cleanup <clause>*

      <clause> := catch <type> <value>
      <clause> := filter <array constant type> <array constant>

Overview:
"""""""""

The '``landingpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a landing pad --- one where the exception lands, and corresponds to the
code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
defines values supplied by the :ref:`personality function <personalityfn>` upon
re-entry to the function. The ``resultval`` has the type ``resultty``.

Arguments:
""""""""""

The optional
``cleanup`` flag indicates that the landing pad block is a cleanup.

A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
contains the global variable representing the "type" that may be caught
or filtered respectively. Unlike the ``catch`` clause, the ``filter``
clause takes an array constant as its argument.
Use
"``[0 x i8**] undef``" for a filter which cannot throw. The
'``landingpad``' instruction must contain *at least* one ``clause`` or
the ``cleanup`` flag.

Semantics:
""""""""""

The '``landingpad``' instruction defines the values which are set by the
:ref:`personality function <personalityfn>` upon re-entry to the function, and
therefore the "result type" of the ``landingpad`` instruction. As with
calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The clauses are applied in order from top to bottom. If two
``landingpad`` instructions are merged together through inlining, the
clauses from the calling function are appended to the list of clauses.
When the call stack is being unwound due to an exception being thrown,
the exception is compared against each ``clause`` in turn. If it doesn't
match any of the clauses, and the ``cleanup`` flag is not set, then
unwinding continues further up the call stack.

The ``landingpad`` instruction has several restrictions:

-  A landing pad block is a basic block which is the unwind destination
   of an '``invoke``' instruction.
-  A landing pad block must have a '``landingpad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``landingpad``' instruction within the landing
   pad block.
-  A basic block that is not a landing pad block may not include a
   '``landingpad``' instruction.

Example:
""""""""

.. code-block:: llvm

      ;; A landing pad which can catch an integer.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
      ;; A landing pad that is a cleanup.
      %res = landingpad { i8*, i32 }
               cleanup
      ;; A landing pad which can catch an integer and can only throw a double.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
               filter [1 x i8**] [i8** @_ZTId]

.. _i_catchpad:

'``catchpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchpad within <catchswitch> [<args>*]

Overview:
"""""""""

The '``catchpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
begins a catch handler --- one where a personality routine attempts to transfer
control to catch an exception.

Arguments:
""""""""""

The ``catchswitch`` operand must always be a token produced by a
:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
ensures that each ``catchpad`` has exactly one predecessor block, and it always
terminates in a ``catchswitch``.

The ``args`` correspond to whatever information the personality routine
requires to know if this is an appropriate handler for the exception. Control
will transfer to the ``catchpad`` if this is the first appropriate handler for
the exception.

The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
pads.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown, the
exception is compared against the ``args``. If it doesn't match, control will
not reach the ``catchpad`` instruction. The representation of ``args`` is
entirely target and personality function-specific.

Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
instruction must be the first non-phi of its parent basic block.

The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
instructions is described in the
`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.

When a ``catchpad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller
      ;; A catch block which can catch an integer.
    handler0:
      %tok = catchpad within %cs [i8** @_ZTIi]

.. _i_cleanuppad:

'``cleanuppad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = cleanuppad within <parent> [<args>*]

Overview:
"""""""""

The '``cleanuppad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a cleanup block --- one where a personality routine attempts to
transfer control to run cleanup actions.
The ``args`` correspond to whatever additional
information the :ref:`personality function <personalityfn>` requires to
execute the cleanup.
The ``resultval`` has the type :ref:`token <t_token>` and is used to
match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
The ``parent`` argument is the token of the funclet that contains the
``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
this operand may be the token ``none``.

Arguments:
""""""""""

The instruction takes a list of arbitrary values which are interpreted
by the :ref:`personality function <personalityfn>`.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown,
the :ref:`personality function <personalityfn>` transfers control to the
``cleanuppad`` with the aid of the personality-specific arguments.
As with calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The ``cleanuppad`` instruction has several restrictions:

-  A cleanup block is a basic block which is the unwind destination of
   an exceptional instruction.
-  A cleanup block must have a '``cleanuppad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``cleanuppad``' instruction within the
   cleanup block.
-  A basic block that is not a cleanup block may not include a
   '``cleanuppad``' instruction.

When a ``cleanuppad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

      %tok = cleanuppad within %cs []

.. _intrinsics:

Intrinsic Functions
===================

LLVM supports the notion of an "intrinsic function". These functions
have well known names and semantics and are required to follow certain
restrictions. Overall, these intrinsics represent an extension mechanism
for the LLVM language that does not require changing all of the
transformations in LLVM when adding to the language (or the bitcode
reader/writer, the parser, etc...).

Intrinsic function names must all start with an "``llvm.``" prefix. This
prefix is reserved in LLVM for intrinsic names; thus, function names may
not begin with this prefix.
Intrinsic functions must always be external
functions: you cannot define the body of intrinsic functions. Intrinsic
functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, it is required if any
are added that they be documented here.

Some intrinsic functions can be overloaded, i.e., the intrinsic
represents a family of functions that perform the same operation but on
different data types. Because LLVM can represent over 8 million
different integer types, overloading is used commonly to allow an
intrinsic function to operate on any integer type. One or more of the
argument types or the result type can be overloaded to accept any
integer type. Argument types may also be defined as exactly matching a
previous argument's type or the result type. This allows an intrinsic
function which accepts multiple arguments, but needs all of them to be
of the same type, to only be overloaded with respect to a single
argument or the result.

Overloaded intrinsics have the names of their overloaded argument
types encoded into their function names, each preceded by a period. Only
those types which are overloaded result in a name suffix. Arguments
whose type is matched against another type do not. For example, the
``llvm.ctpop`` function can take an integer of any width and returns an
integer of exactly the same integer width. This leads to a family of
functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.
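
As a minimal sketch of the naming rule above, the following declares two
members of the ``llvm.ctpop`` family and calls one of them. The wrapper
function ``@popcount8`` is a hypothetical name chosen for illustration.

.. code-block:: llvm

      ; Each overload is a separate declaration; the suffix names the
      ; overloaded type.
      declare i8 @llvm.ctpop.i8(i8)
      declare i29 @llvm.ctpop.i29(i29)

      ; Hypothetical wrapper: counts the set bits in an 8-bit value.
      define i8 @popcount8(i8 %val) {
      entry:
        %n = call i8 @llvm.ctpop.i8(i8 %val)
        ret i8 %n
      }

Both declarations need only one suffix because the argument type is
matched against the return type.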

To learn how to add an intrinsic function, please see the `Extending
LLVM Guide <ExtendingLLVM.html>`_.

.. _int_varargs:

Variable Argument Handling Intrinsics
-------------------------------------

Variable argument support is defined in LLVM with the
:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
functions. These functions are related to the similarly named macros
defined in the ``<stdarg.h>`` header file.

All of these functions operate on arguments that use a target-specific
value type "``va_list``". The LLVM assembly language reference manual
does not define what this type is, so all transformations should be
prepared to handle these functions regardless of the type used.

This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
variable argument handling intrinsic functions are used.

.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely an i8*.
    %struct.va_list = type { i8* }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, i8*, i8* }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      %ap2 = bitcast %struct.va_list* %ap to i8*
      call void @llvm.va_start(i8* %ap2)

      ; Read a single integer argument
      %tmp = va_arg i8* %ap2, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end
      %aq = alloca i8*
      %aq2 = bitcast i8** %aq to i8*
      call void @llvm.va_copy(i8* %aq2, i8* %ap2)
      call void @llvm.va_end(i8* %aq2)

      ; Stop processing of arguments.
      call void @llvm.va_end(i8* %ap2)
      ret i32 %tmp
    }

    declare void @llvm.va_start(i8*)
    declare void @llvm.va_copy(i8*, i8*)
    declare void @llvm.va_end(i8*)

.. _int_va_start:

'``llvm.va_start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_start(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
subsequent use by ``va_arg``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` element to initialize.

Semantics:
""""""""""

The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
available in C. In a target-dependent way, it initializes the
``va_list`` element to which the argument points, so that the next call
to ``va_arg`` will produce the first variable argument passed to the
function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure
that out.

'``llvm.va_end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_end(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` to destroy.

Semantics:
""""""""""

The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
available in C. In a target-dependent way, it destroys the ``va_list``
element to which the argument points. Calls to
:ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
``llvm.va_end``.

.. _int_va_copy:

'``llvm.va_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)

Overview:
"""""""""

The '``llvm.va_copy``' intrinsic copies the current argument position
from the source argument list to the destination argument list.

Arguments:
""""""""""

The first argument is a pointer to a ``va_list`` element to initialize.
The second argument is a pointer to a ``va_list`` element to copy from.

Semantics:
""""""""""

The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
available in C. In a target-dependent way, it copies the source
``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
arbitrarily complex and require, for example, memory allocation.

Accurate Garbage Collection Intrinsics
--------------------------------------

LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
(GC) requires the frontend to generate code containing appropriate intrinsic
calls and select an appropriate GC strategy which knows how to lower these
intrinsics in a manner which is appropriate for the target collector.

These intrinsics allow identification of :ref:`GC roots on the
stack <int_gcroot>`, as well as garbage collector implementations that
require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
Frontends for type-safe garbage collected languages should generate
these intrinsics to make use of the LLVM garbage collectors. For more
details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
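
As a rough sketch of what a frontend might emit, the function below selects
a GC strategy and registers one stack slot as a root. The function name
``@uses_root`` is hypothetical, and the ``"shadow-stack"`` strategy is used
here only as an assumed example of a strategy name; a real frontend would
pick the strategy that matches its collector.

.. code-block:: llvm

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

      ; Hypothetical function that opts in to a GC strategy.
      define void @uses_root() gc "shadow-stack" {
      entry:
        ; The root lives in a stack slot allocated in the entry block.
        %root = alloca i8*
        ; Register the slot as a GC root; no metadata is attached (null).
        call void @llvm.gcroot(i8** %root, i8* null)
        ret void
      }

The individual intrinsics and their constraints are described in the
sections that follow.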

Experimental Statepoint Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

LLVM provides a second, experimental set of intrinsics for describing garbage
collection safepoints in compiled code. These intrinsics are an alternative
to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
differences in approach are covered in the `Garbage Collection with LLVM
<GarbageCollection.html>`_ documentation. The intrinsics themselves are
described in :doc:`Statepoints`.

.. _int_gcroot:

'``llvm.gcroot``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

Overview:
"""""""""

The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
the code generator, and allows some metadata to be associated with it.

Arguments:
""""""""""

The first argument specifies the address of a stack object that contains
the root pointer. The second pointer (which must be either a constant or
a global value address) contains the meta-data to be associated with the
root.

Semantics:
""""""""""

At runtime, a call to this intrinsic stores a null pointer into the
"ptrloc" location. At compile-time, the code generator generates
information to allow the runtime to find the pointer at GC safe points.
The '``llvm.gcroot``' intrinsic may only be used in a function which
:ref:`specifies a GC algorithm <gc>`.

.. _int_gcread:

'``llvm.gcread``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

Overview:
"""""""""

The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.

Arguments:
""""""""""

The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).

Semantics:
""""""""""

The '``llvm.gcread``' intrinsic has the same semantics as a load
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

.. _int_gcwrite:

'``llvm.gcwrite``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

Overview:
"""""""""

The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
locations, allowing garbage collector implementations that require write
barriers (such as generational or reference counting collectors).

Arguments:
""""""""""

The first argument is the reference to store, the second is the start of
the object to store it to, and the third is the address of the field of
Obj to store to. If the runtime does not require a pointer to the
object, Obj may be null.

Semantics:
""""""""""

The '``llvm.gcwrite``' intrinsic has the same semantics as a store
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

Code Generator Intrinsics
-------------------------

These intrinsics are provided by LLVM to expose special features that
may only be implemented with code generator support.

'``llvm.returnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.returnaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.returnaddress``' intrinsic attempts to compute a
target-specific value indicating the return address of the current
function or one of its callers.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.returnaddress``' intrinsic either returns a pointer
indicating the return address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

'``llvm.addressofreturnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.addressofreturnaddress()

Overview:
"""""""""

The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
pointer to the place in the stack frame where the return address of the
current function is stored.

Semantics:
""""""""""

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

This intrinsic is only implemented for x86.

'``llvm.frameaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.frameaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.frameaddress``' intrinsic attempts to return the
target-specific frame pointer value for the specified stack frame.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
frame pointer for. Zero indicates the calling function, one indicates
its caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.frameaddress``' intrinsic either returns a pointer
indicating the frame address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.
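
The following sketch shows how ``llvm.returnaddress`` and
``llvm.frameaddress`` might be queried for the current frame (level 0). The
function names ``@my_return_address`` and ``@my_frame_address`` are
hypothetical, and the ``noinline`` attribute is an illustrative choice to
reduce the chance that inlining changes which frame is observed.

.. code-block:: llvm

      declare i8* @llvm.returnaddress(i32)
      declare i8* @llvm.frameaddress(i32)

      ; Hypothetical helper: return address of this function's own frame.
      define i8* @my_return_address() noinline {
      entry:
        ; The level must be a constant; 0 selects the current function.
        %ra = call i8* @llvm.returnaddress(i32 0)
        ret i8* %ra
      }

      ; Hypothetical helper: frame pointer value of this function's own frame.
      define i8* @my_frame_address() noinline {
      entry:
        %fp = call i8* @llvm.frameaddress(i32 0)
        ret i8* %fp
      }

Levels other than zero are accepted syntactically, but as noted above the
results are only suitable for debugging.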

'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.localescape(...)
      declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)

Overview:
"""""""""

The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
live frame pointer to recover the address of the allocation. The offset is
computed during frame layout of the caller of ``llvm.localescape``.

Arguments:
""""""""""

All arguments to '``llvm.localescape``' must be pointers to static allocas or
casts of static allocas. Each function can only call '``llvm.localescape``'
once, and it can only do so from the entry block.

The ``func`` argument to '``llvm.localrecover``' must be a constant
bitcasted pointer to a function defined in the current module. The code
generator cannot determine the frame allocation offset of functions defined in
other modules.

The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
call frame that is currently live. The return value of '``llvm.localaddress``'
is one way to produce such a value, but various runtimes also expose a suitable
pointer in platform-specific ways.

The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
'``llvm.localescape``' to recover. It is zero-indexed.

Semantics:
""""""""""

These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function.
The parent function may call the
'``llvm.localescape``' intrinsic once from the function entry block, and the
child functions can use '``llvm.localrecover``' to access the escaped allocas.
The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
the escaped allocas are allocated, which would break attempts to use
'``llvm.localrecover``'.

.. _int_read_register:
.. _int_write_register:

'``llvm.read_register``' and '``llvm.write_register``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.read_register.i32(metadata)
      declare i64 @llvm.read_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 @value)
      declare void @llvm.write_register.i64(metadata, i64 @value)
      !0 = !{!"sp\00"}

Overview:
"""""""""

The '``llvm.read_register``' and '``llvm.write_register``' intrinsics
provide access to the named register. The register must be valid on
the architecture being compiled to. The type needs to be compatible
with the register being read.

Semantics:
""""""""""

The '``llvm.read_register``' intrinsic returns the current value of the
register, where possible. The '``llvm.write_register``' intrinsic sets
the current value of the register, where possible.

This is useful to implement named register global variables that need
to always be mapped to a specific register, as is common practice on
bare-metal programs including OS kernels.

The compiler doesn't check for register availability or use of the used
register in surrounding code, including inline assembly. Because of that,
allocatable registers are not supported.
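
A minimal sketch of reading a named register is shown below. It assumes a
64-bit target on which the stack pointer is spelled ``sp``, reusing the
metadata string from the syntax block above; other targets may need a
different register name. The function name ``@read_sp`` is hypothetical.

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      ; Hypothetical helper: returns the current value of the register
      ; named by !0.
      define i64 @read_sp() {
      entry:
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      ; Assumed register name; this must be valid for the target.
      !0 = !{!"sp\00"}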

Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and, even more so, allocatable
registers.

.. _int_stacksave:

'``llvm.stacksave``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stacksave()

Overview:
"""""""""

The '``llvm.stacksave``' intrinsic is used to remember the current state
of the function stack, for use with
:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
implementing language features like scoped automatic variable sized
arrays in C99.

Semantics:
""""""""""

This intrinsic returns an opaque pointer value that can be passed to
:ref:`llvm.stackrestore <int_stackrestore>`. When an
``llvm.stackrestore`` intrinsic is executed with a value saved from
``llvm.stacksave``, it effectively restores the state of the stack to
the state it was in when the ``llvm.stacksave`` intrinsic executed. In
practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
were allocated after the ``llvm.stacksave`` was executed.

.. _int_stackrestore:

'``llvm.stackrestore``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackrestore(i8* %ptr)

Overview:
"""""""""

The '``llvm.stackrestore``' intrinsic is used to restore the state of
the function stack to the state it was in when the corresponding
:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
useful for implementing language features like scoped automatic variable
sized arrays in C99.

Semantics:
""""""""""

See the description for :ref:`llvm.stacksave <int_stacksave>`.
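
The pairing of the two intrinsics can be sketched as follows for a scoped,
variable-sized array. The names ``@scoped_vla`` and ``@use`` are
hypothetical and stand in for whatever a frontend would generate.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8*)

      ; Hypothetical external function that consumes the array.
      declare void @use(i32*)

      define void @scoped_vla(i32 %n) {
      entry:
        ; Remember the stack state before the dynamic allocation.
        %saved = call i8* @llvm.stacksave()
        ; Allocate %n elements of i32 on the stack.
        %vla = alloca i32, i32 %n
        call void @use(i32* %vla)
        ; Leaving the scope: release everything allocated since the save.
        call void @llvm.stackrestore(i8* %saved)
        ret void
      }

After the restore, ``%vla`` must not be accessed, since its storage has
been popped from the stack.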

.. _int_get_dynamic_area_offset:

'``llvm.get.dynamic.area.offset``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.get.dynamic.area.offset.i32()
      declare i64 @llvm.get.dynamic.area.offset.i64()

Overview:
"""""""""

The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
get the offset from the native stack pointer to the address of the most
recent dynamic alloca on the caller's stack. These intrinsics are
intended for use in combination with
:ref:`llvm.stacksave <int_stacksave>` to get a
pointer to the most recent dynamic alloca. This is useful, for example,
for AddressSanitizer's stack unpoisoning routines.

Semantics:
""""""""""

These intrinsics return a non-negative integer value that can be used to
get the address of the most recent dynamic alloca, allocated by
:ref:`alloca <i_alloca>` on the caller's stack. In particular, for targets
where the stack grows downwards, adding this offset to the native stack
pointer would get the address of the most recent dynamic alloca. For targets
where the stack grows upwards, the situation is a bit more complicated,
because subtracting this value from the stack pointer would get the address
one past the end of the most recent dynamic alloca.

Although for most targets
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
compile-time-known constant value.

The return value type of
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
must match the target's default address space's (address space 0) pointer type.
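
The combination with ``llvm.stacksave`` described above might look like the
following sketch. It assumes a target with 64-bit pointers and a
downward-growing stack, and it assumes that the value returned by
``llvm.stacksave`` can be treated as the native stack pointer, which the
semantics of that intrinsic do not guarantee in general. ``@find_dynamic_area``
and ``@use`` are hypothetical names.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare i64 @llvm.get.dynamic.area.offset.i64()
      declare void @use(i8*)            ; hypothetical consumer

      define void @find_dynamic_area(i64 %n) {
      entry:
        ; A dynamic alloca whose address we want to recover.
        %buf = alloca i8, i64 %n
        ; Assumed to yield the native stack pointer on this target.
        %sp = call i8* @llvm.stacksave()
        ; Zero on most targets, a compile-time constant on others.
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        ; Downward-growing stack: stack pointer plus offset.
        %top = getelementptr i8, i8* %sp, i64 %off
        call void @use(i8* %top)
        call void @use(i8* %buf)
        ret void
      }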

'``llvm.prefetch``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)

Overview:
"""""""""

The '``llvm.prefetch``' intrinsic is a hint to the code generator to
insert a prefetch instruction if supported; otherwise, it is a no-op.
Prefetches have no effect on the behavior of the program but can change
its performance characteristics.

Arguments:
""""""""""

``address`` is the address to be prefetched, ``rw`` is the specifier
determining if the fetch should be for a read (0) or write (1), and
``locality`` is a temporal locality specifier ranging from (0) - no
locality, to (3) - extremely local, keep in cache. The ``cache type``
specifies whether the prefetch is performed on the data (1) or
instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
arguments must be constant integers.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. In
particular, prefetches cannot trap and do not produce a value. On
targets that support this intrinsic, the prefetch can provide hints to
the processor cache for better performance.

'``llvm.pcmarker``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.pcmarker(i32 <id>)

Overview:
"""""""""

The '``llvm.pcmarker``' intrinsic is a method to export a Program
Counter (PC) in a region of code to simulators and other tools. The
method is target specific, but it is expected that the marker will use
exported symbols to transmit the PC of the marker. The marker makes no
guarantees that it will remain with any specific instruction after
optimizations.
It is possible that the presence of a marker will inhibit
optimizations. The intended use is to be inserted after optimizations to
allow correlations of simulation runs.

Arguments:
""""""""""

``id`` is a numerical id identifying the marker.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. Backends
that do not support this intrinsic may ignore it.

'``llvm.readcyclecounter``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i64 @llvm.readcyclecounter()

Overview:
"""""""""

The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
counter register (or similar low latency, high accuracy clocks) on those
targets that support it. On X86, it should map to RDTSC. On Alpha, it
should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
timings.

Semantics:
""""""""""

When directly supported, reading the cycle counter should not modify any
memory. Implementations are allowed to either return an
application-specific value or a system-wide value. On backends without
support, this is lowered to a constant 0.

Note that runtime support may be conditional on the privilege level the
code is running at and on the host platform.

'``llvm.clear_cache``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.clear_cache(i8*, i8*)

Overview:
"""""""""

The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
in the specified range to the execution unit of the processor. On
targets with non-unified instruction and data cache, the implementation
flushes the instruction cache.

Semantics:
""""""""""

On platforms with coherent instruction and data caches (e.g. x86), this
intrinsic is a nop. On platforms with non-coherent instruction and data
caches (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
instructions or a system call, if cache flushing requires special
privileges.

The default behavior is to emit a call to ``__clear_cache`` from the run
time library.

This intrinsic does *not* empty the instruction pipeline. Modifications
of the current function are outside the scope of the intrinsic.

'``llvm.instrprof.increment``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>,
                                             i32 <num-counters>, i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.increment``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. These will be
lowered by the ``-instrprof`` pass to generate execution counts of a
program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. This should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source, and
the third is the number of counters associated with ``name``. It is an
error if ``hash`` or ``num-counters`` differ between two instances of
``instrprof.increment`` that refer to the same name.

The last argument refers to which of the counters for ``name`` should
be incremented. It should be a value between 0 and ``num-counters``.
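
To illustrate the arguments described above, here is a sketch of a call as
a frontend might emit it for a function with a single counter. The global
``@__profn_foo``, the function ``@foo`` and the hash value ``0`` are
placeholders chosen for this example, not values any particular frontend is
known to produce.

.. code-block:: llvm

      ; Name of the instrumented entity (placeholder).
      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

      define void @foo() {
      entry:
        ; name = "foo", hash = 0 (placeholder), num-counters = 1, index = 0
        call void @llvm.instrprof.increment(
            i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
            i64 0, i32 1, i32 0)
        ret void
      }

Every call that names ``@__profn_foo`` would have to pass the same ``hash``
and ``num-counters`` values.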

Semantics:
""""""""""

This intrinsic represents an increment of a profiling counter. It will
cause the ``-instrprof`` pass to generate the appropriate data
structures and the code to increment the appropriate value, in a
format that can be written out by a compiler runtime and consumed via
the ``llvm-profdata`` tool.

'``llvm.instrprof.increment.step``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
                                                  i32 <num-counters>,
                                                  i32 <index>, i64 <step>)

Overview:
"""""""""

The '``llvm.instrprof.increment.step``' intrinsic is an extension to
the '``llvm.instrprof.increment``' intrinsic with an additional fifth
argument to specify the step of the increment.

Arguments:
""""""""""

The first four arguments are the same as those of the
'``llvm.instrprof.increment``' intrinsic.

The last argument specifies the value of the increment of the counter variable.

Semantics:
""""""""""

See the description of the '``llvm.instrprof.increment``' intrinsic.


'``llvm.instrprof.value.profile``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
                                                 i64 <value>, i32 <value_kind>,
                                                 i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. ``name`` should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source. It
is an error if ``hash`` differs between two instances of
``llvm.instrprof.*`` that refer to the same name.

The third argument is the value of the expression being profiled. The profiled
expression's value should be representable as an unsigned 64-bit value. The
fourth argument represents the kind of value profiling that is being done. The
supported value profiling kinds are enumerated through the
``InstrProfValueKind`` type declared in the
``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
index of the instrumented expression within ``name``. It should be >= 0.

Semantics:
""""""""""

This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The
``-instrprof`` pass will generate the appropriate data structures and
replace the ``llvm.instrprof.value.profile`` intrinsic with a call to the
profile runtime library with the proper arguments.

'``llvm.thread.pointer``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.thread.pointer()

Overview:
"""""""""

The '``llvm.thread.pointer``' intrinsic returns the value of the thread
pointer.

Semantics:
""""""""""

The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
for the current thread.
The exact semantics of this value are target
specific: it may point to the start of the TLS area, to the end, or somewhere
in the middle. Depending on the target, this intrinsic may read a register,
call a helper function, read from an alternate memory space, or perform
other operations necessary to locate the TLS area. Not all targets support
this intrinsic.

Standard C Library Intrinsics
-----------------------------

LLVM provides intrinsics for a few important standard C library
functions. These intrinsics allow source-language front-ends to pass
information about the alignment of the pointer arguments to the code
generator, providing opportunity for more efficient code generation.

.. _int_memcpy:

'``llvm.memcpy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths, however.

::

      declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                              i32 <len>, i1 <isvolatile>)
      declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                              i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location.

Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location, which are not allowed to
overlap. It copies "len" bytes of memory over. If the argument is known
to be aligned to some boundary, this can be specified as an
:ref:`align <attr_align>` attribute on the argument.

.. _int_memmove:

'``llvm.memmove``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths, however.

::

      declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                               i32 <len>, i1 <isvolatile>)
      declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                               i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memmove.*``' intrinsics move a block of memory from the
source location to the destination location. It is similar to the
'``llvm.memcpy``' intrinsic but allows the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source.
The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
is a :ref:`volatile operation <volatile>`. The detailed access behavior is
not very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memmove.*``' intrinsics copy a block of memory from the
source location to the destination location, which may overlap. It
copies "len" bytes of memory over. If the argument is known to be
aligned to some boundary, this can be specified as an
:ref:`align <attr_align>` attribute on the argument.

.. _int_memset:

'``llvm.memset.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
                                         i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
                                         i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memset.*``' intrinsics fill a block of memory with a
particular byte value.

Note that, unlike the standard libc function, the ``llvm.memset``
intrinsic does not return a value and takes an extra ``isvolatile``
argument. Also, the destination can be in an arbitrary address space.

Arguments:
""""""""""

The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is an
integer argument specifying the number of bytes to fill, and the fourth
is a boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
at the destination location.

'``llvm.sqrt.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.sqrt.f32(float %Val)
      declare double    @llvm.sqrt.f64(double %Val)
      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
      declare fp128     @llvm.sqrt.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sqrt``' intrinsics return the square root of the specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sqrt``' function but without
trapping or setting ``errno``. For types specified by IEEE-754, the result
matches a conforming libm implementation.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
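
The difference between the default and the 'afn' form can be seen in a
short sketch. The function name ``@sqrt_examples`` is hypothetical, and the
``v4f32`` overload is shown only as one instance of the vector support
mentioned in the syntax section.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)
      declare <4 x float> @llvm.sqrt.v4f32(<4 x float>)

      define float @sqrt_examples(float %x, <4 x float> %v) {
      entry:
        ; Default form: matches a conforming libm sqrt for IEEE-754 types.
        %exact = call float @llvm.sqrt.f32(float %x)
        ; With 'afn' the result may be a less accurate approximation.
        %approx = call afn float @llvm.sqrt.f32(float %x)
        ; Vector overload: the square root is taken element-wise.
        %vec = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %v)
        %lane0 = extractelement <4 x float> %vec, i32 0
        %sum = fadd float %exact, %approx
        %res = fadd float %sum, %lane0
        ret float %res
      }

The same 'afn' flag placement applies to the other math intrinsics in this
section whose semantics mention it.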

'``llvm.powi.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.powi`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.powi.f32(float %Val, i32 %power)
      declare double    @llvm.powi.f64(double %Val, i32 %power)
      declare x86_fp80  @llvm.powi.f80(x86_fp80 %Val, i32 %power)
      declare fp128     @llvm.powi.f128(fp128 %Val, i32 %power)
      declare ppc_fp128 @llvm.powi.ppcf128(ppc_fp128 %Val, i32 %power)

Overview:
"""""""""

The '``llvm.powi.*``' intrinsics return the first operand raised to the
specified (positive or negative) power. The order of evaluation of
multiplications is not defined. When a vector of floating-point type is
used, the second argument remains a scalar integer value.

Arguments:
""""""""""

The second argument is an integer power, and the first is a value to
raise to that power.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.

'``llvm.sin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sin`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.sin.f32(float %Val)
      declare double    @llvm.sin.f64(double %Val)
      declare x86_fp80  @llvm.sin.f80(x86_fp80 %Val)
      declare fp128     @llvm.sin.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sin.*``' intrinsics return the sine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sin``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.cos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cos`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.cos.f32(float %Val)
      declare double    @llvm.cos.f64(double %Val)
      declare x86_fp80  @llvm.cos.f80(x86_fp80 %Val)
      declare fp128     @llvm.cos.f128(fp128 %Val)
      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.cos.*``' intrinsics return the cosine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``cos``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.pow.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.pow`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.pow.f32(float %Val, float %Power)
      declare double    @llvm.pow.f64(double %Val, double %Power)
      declare x86_fp80  @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
      declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)

Overview:
"""""""""

The '``llvm.pow.*``' intrinsics return the first operand raised to the
specified (positive or negative) power.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``pow``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.exp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.exp.f32(float %Val)
      declare double    @llvm.exp.f64(double %Val)
      declare x86_fp80  @llvm.exp.f80(x86_fp80 %Val)
      declare fp128     @llvm.exp.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.exp2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.exp2.f32(float %Val)
      declare double    @llvm.exp2.f64(double %Val)
      declare x86_fp80  @llvm.exp2.f80(x86_fp80 %Val)
      declare fp128     @llvm.exp2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.log.f32(float %Val)
      declare double    @llvm.log.f64(double %Val)
      declare x86_fp80  @llvm.log.f80(x86_fp80 %Val)
      declare fp128     @llvm.log.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log10.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log10`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.log10.f32(float %Val)
      declare double    @llvm.log10.f64(double %Val)
      declare x86_fp80  @llvm.log10.f80(x86_fp80 %Val)
      declare fp128     @llvm.log10.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log10``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log2`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.log2.f32(float %Val)
      declare double    @llvm.log2.f64(double %Val)
      declare x86_fp80  @llvm.log2.f80(x86_fp80 %Val)
      declare fp128     @llvm.log2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.fma.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fma`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.fma.f32(float %a, float %b, float %c)
      declare double    @llvm.fma.f64(double %a, double %b, double %c)
      declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
      declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)

Overview:
"""""""""

The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``fma``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.fabs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.fabs.f32(float %Val)
      declare double    @llvm.fabs.f64(double %Val)
      declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
      declare fp128     @llvm.fabs.f128(fp128 %Val)
      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.fabs.*``' intrinsics return the absolute value of the
operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``fabs`` functions
would, and handles error conditions in the same way.

'``llvm.minnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float     @llvm.minnum.f32(float %Val0, float %Val1)
      declare double    @llvm.minnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minnum.*``' intrinsics return the minimum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for minNum, which also match libm's
``fmin``.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns a value that compares equal to both operands. This means that
fmin(+/-0.0, +/-0.0) could return either -0.0 or 0.0.

'``llvm.maxnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.maxnum.f32(float %Val0, float %Val1)
      declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maxnum.*``' intrinsics return the maximum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum, which also match libm's
``fmax``.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns a value that compares equal to both operands. This means that
fmax(+/-0.0, +/-0.0) could return either -0.0 or 0.0.
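
As a minimal sketch of the NaN rules above, the two intrinsics can be combined
into a clamp. The function name ``@clamp`` and its parameter names are
illustrative assumptions, not part of LLVM, and the sketch assumes that
``%lo <= %hi`` and that neither bound is a NaN.

.. code-block:: llvm

      declare float @llvm.maxnum.f32(float, float)
      declare float @llvm.minnum.f32(float, float)

      ; Hypothetical helper: clamp %x to the range [%lo, %hi].
      define float @clamp(float %x, float %lo, float %hi) {
        ; If %x is a NaN, maxnum returns the other (non-NaN) operand, %lo.
        %t = call float @llvm.maxnum.f32(float %x, float %lo)
        %r = call float @llvm.minnum.f32(float %t, float %hi)
        ret float %r
      }

Under these assumptions, a NaN ``%x`` would yield ``%lo`` rather than
propagating the NaN, which may or may not be what a caller wants.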

'``llvm.copysign.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.copysign.f32(float  %Mag, float  %Sgn)
      declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
      declare x86_fp80  @llvm.copysign.f80(x86_fp80  %Mag, x86_fp80  %Sgn)
      declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128  %Mag, ppc_fp128  %Sgn)

Overview:
"""""""""

The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
first operand and the sign of the second operand.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``copysign``
functions would, and handles error conditions in the same way.

'``llvm.floor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.floor`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.floor.f32(float  %Val)
      declare double    @llvm.floor.f64(double %Val)
      declare x86_fp80  @llvm.floor.f80(x86_fp80  %Val)
      declare fp128     @llvm.floor.f128(fp128 %Val)
      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128  %Val)

Overview:
"""""""""

The '``llvm.floor.*``' intrinsics return the floor of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would, and handles error conditions in the same way.

'``llvm.ceil.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.ceil.f32(float  %Val)
      declare double    @llvm.ceil.f64(double %Val)
      declare x86_fp80  @llvm.ceil.f80(x86_fp80  %Val)
      declare fp128     @llvm.ceil.f128(fp128 %Val)
      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128  %Val)

Overview:
"""""""""

The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would, and handles error conditions in the same way.

'``llvm.trunc.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.trunc.f32(float  %Val)
      declare double    @llvm.trunc.f64(double %Val)
      declare x86_fp80  @llvm.trunc.f80(x86_fp80  %Val)
      declare fp128     @llvm.trunc.f128(fp128 %Val)
      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128  %Val)

Overview:
"""""""""

The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would, and handles error conditions in the same way.

'``llvm.rint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.rint`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.rint.f32(float  %Val)
      declare double    @llvm.rint.f64(double %Val)
      declare x86_fp80  @llvm.rint.f80(x86_fp80  %Val)
      declare fp128     @llvm.rint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128  %Val)

Overview:
"""""""""

The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand isn't an integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way.

'``llvm.nearbyint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.nearbyint.f32(float  %Val)
      declare double    @llvm.nearbyint.f64(double %Val)
      declare x86_fp80  @llvm.nearbyint.f80(x86_fp80  %Val)
      declare fp128     @llvm.nearbyint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128  %Val)

Overview:
"""""""""

The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint``
functions would, and handles error conditions in the same way.

'``llvm.round.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.round`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.round.f32(float  %Val)
      declare double    @llvm.round.f64(double %Val)
      declare x86_fp80  @llvm.round.f80(x86_fp80  %Val)
      declare fp128     @llvm.round.f128(fp128 %Val)
      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128  %Val)

Overview:
"""""""""

The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``round``
functions would, and handles error conditions in the same way.

Bit Manipulation Intrinsics
---------------------------

LLVM provides intrinsics for a few important bit manipulation
operations. These allow efficient code generation for some algorithms.
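
As one illustration of such an algorithm, a power-of-two test can be written
with a single population count. This is only a sketch: ``@is_pow2`` is a
hypothetical helper name, and it relies on the ``llvm.ctpop`` intrinsic
described later in this section.

.. code-block:: llvm

      declare i32 @llvm.ctpop.i32(i32)

      ; Hypothetical helper: true when exactly one bit of %x is set.
      define i1 @is_pow2(i32 %x) {
        %n = call i32 @llvm.ctpop.i32(i32 %x)
        %r = icmp eq i32 %n, 1
        ret i1 %r
      }

On targets with a native population-count instruction, the call would
typically lower to that instruction; other targets expand it in software.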

'``llvm.bitreverse.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bitreverse`` on
any integer type.

::

      declare i16 @llvm.bitreverse.i16(i16 <id>)
      declare i32 @llvm.bitreverse.i32(i32 <id>)
      declare i64 @llvm.bitreverse.i64(i64 <id>)

Overview:
"""""""""

The '``llvm.bitreverse``' family of intrinsics is used to reverse the
bitpattern of an integer value; for example ``0b10110110`` becomes
``0b01101101``.

Semantics:
""""""""""

The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output.

'``llvm.bswap.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bswap`` on any
integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).

::

      declare i16 @llvm.bswap.i16(i16 <id>)
      declare i32 @llvm.bswap.i32(i32 <id>)
      declare i64 @llvm.bswap.i64(i64 <id>)

Overview:
"""""""""

The '``llvm.bswap``' family of intrinsics is used to byte swap integer
values with an even number of bytes (positive multiple of 16 bits).
These are useful for performing operations on data that is not in the
target's native byte order.

Semantics:
""""""""""

The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
intrinsic returns an i32 value that has the four bytes of the input i32
swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
returned i32 will have its bytes in 3, 2, 1, 0 order.
The ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
concept to additional even-byte lengths (6 bytes, 8 bytes and more,
respectively).

'``llvm.ctpop.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8 @llvm.ctpop.i8(i8  <src>)
      declare i16 @llvm.ctpop.i16(i16 <src>)
      declare i32 @llvm.ctpop.i32(i32 <src>)
      declare i64 @llvm.ctpop.i64(i64 <src>)
      declare i256 @llvm.ctpop.i256(i256 <src>)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)

Overview:
"""""""""

The '``llvm.ctpop``' family of intrinsics counts the number of bits set
in a value.

Arguments:
""""""""""

The only argument is the value to be counted. The argument may be of any
integer type, or a vector with integer elements. The return type must
match the argument type.

Semantics:
""""""""""

The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
each element of a vector.

'``llvm.ctlz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
integer bit width, or any vector whose elements are integers. Not all
targets support all bit widths or vector types, however.

::

      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.ctlz``' family of intrinsic functions counts the number of
leading zeros in a variable.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.ctlz``' intrinsic counts the leading (most significant)
zeros in a variable, or within each element of the vector. If
``src == 0`` then the result is the size in bits of the type of ``src``
if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.ctlz(i32 2) = 30``.

'``llvm.cttz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
integer bit width, or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.cttz``' intrinsic counts the trailing (least significant)
zeros in a variable, or within each element of a vector. If ``src == 0``
then the result is the size in bits of the type of ``src`` if
``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.cttz(2) = 1``.

.. _int_overflow:

'``llvm.fshl.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted left, and the most
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first 2 arguments are identical, this is equivalent
to a rotate left operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)

'``llvm.fshr.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted right, and the least
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first 2 arguments are identical, this is equivalent
to a rotate right operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)

Arithmetic with Overflow Intrinsics
-----------------------------------

LLVM provides intrinsics for fast arithmetic overflow checking.

Each of these intrinsics returns a two-element struct.
The first element of this struct contains the result of the corresponding
arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
the result. Therefore, for example, the first element of the struct
returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
result of a 32-bit ``add`` instruction with the same operands, where
the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.

The second element of the result is an ``i1`` that is 1 if the
arithmetic operation overflowed and 0 otherwise. An operation
overflows if, for any values of its operands ``A`` and ``B`` and for
any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
``sext`` for signed overflow and ``zext`` for unsigned overflow, and
``op`` is the underlying arithmetic operation.

The behavior of these intrinsics is well-defined for all argument
values.

'``llvm.sadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments, and indicates whether an overflow
occurred during the signed summation.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width.
The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
addition.

Semantics:
""""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments. They return a structure whose
first element is the signed sum and whose second element is a bit
indicating whether the signed addition overflowed.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.uadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments, and indicates whether a carry
occurred during the unsigned summation.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
addition.

Semantics:
""""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments. They return a structure whose
first element is the sum and whose second element is a bit indicating
whether the unsigned addition resulted in a carry.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %carry, label %normal

'``llvm.ssub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments, and indicates whether an
overflow occurred during the signed subtraction.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
subtraction.

Semantics:
""""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments.
They return a structure whose first element is the difference and whose
second element is a bit indicating whether the signed subtraction
overflowed.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.usub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments, and indicates whether an
overflow occurred during the unsigned subtraction.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
subtraction.

Semantics:
""""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments. They return a structure
whose first element is the difference and whose second element is a bit
indicating whether the unsigned subtraction overflowed.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.smul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments, and indicates whether an
overflow occurred during the signed multiplication.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
multiplication.

Semantics:
""""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments. They return a structure
whose first element is the product and whose second element is a bit
indicating whether the signed multiplication overflowed.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.umul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments, and indicates whether an
overflow occurred during the unsigned multiplication.

Arguments:
""""""""""

The arguments (%a and %b) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
multiplication.

Semantics:
""""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments. They return a structure
whose first element is the product and whose second element is a bit
indicating whether the unsigned multiplication overflowed.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

Specialised Arithmetic Intrinsics
---------------------------------

'``llvm.canonicalize.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.canonicalize.f32(float %a)
      declare double @llvm.canonicalize.f64(double %b)

Overview:
"""""""""

The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
encoding of a floating-point number. This canonicalization is useful for
implementing certain numeric primitives such as frexp. The canonical encoding is
defined by IEEE-754-2008 to be:

::

      2.1.8 canonical encoding: The preferred encoding of a floating-point
      representation in a format. Applied to declets, significands of finite
      numbers, infinities, and NaNs, especially in decimal formats.

This operation can also be considered equivalent to the IEEE-754-2008
conversion of a floating-point value to the same format. NaNs are handled
according to section 6.2.

Examples of non-canonical encodings:

- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
  converted to a canonical representation per hardware-specific protocol.
- Many normal decimal floating-point numbers have non-canonical alternative
  encodings.
- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
  These are treated as non-canonical encodings of zero and will be flushed to
  a zero of the same sign by this operation.

Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
default exception handling must signal an invalid exception, and produce a
quiet NaN result.

This function should always be implementable as multiplication by 1.0, provided
that the compiler does not constant fold the operation. Likewise, division
by 1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition
with -0.0 is also sufficient provided that the rounding mode is not -Infinity.

``@llvm.canonicalize`` must preserve the equality relation. That is:

- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``

Additionally, the sign of zero must be conserved:
``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``

The payload bits of a NaN must be conserved, with two exceptions.
First, environments which use only a single canonical representation of NaN
must perform said canonicalization. Second, SNaNs must be quieted per the
usual methods.

The canonicalization operation may be optimized away if:

- The input is known to be canonical. For example, it was produced by a
  floating-point operation that is required by the standard to be canonical.
- The result is consumed only by (or fused with) other floating-point
  operations. That is, the bits of the floating-point value are not examined.
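
As an illustration of the two conditions above, the following minimal sketch
shows a case where neither holds: nothing is known about the input, and the
bits of the result are examined through a ``bitcast``. The function name
``@canonical_bits`` is hypothetical and chosen only for this example.

.. code-block:: llvm

      declare float @llvm.canonicalize.f32(float)

      ; Hypothetical helper: returns the bit pattern of the canonical
      ; encoding of %x.
      define i32 @canonical_bits(float %x) {
      entry:
        %c = call float @llvm.canonicalize.f32(float %x)
        ; The bitcast exposes the raw bits, so the result of the call
        ; above is not consumed only by floating-point operations.
        %bits = bitcast float %c to i32
        ret i32 %bits
      }
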

'``llvm.fmuladd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)

Overview:
"""""""""

The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) that the
fused operation is more efficient than the equivalent, separate pair of
``fmul`` and ``fadd`` instructions.

Arguments:
""""""""""

The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
multiplicands, ``a`` and ``b``, and an addend ``c``.

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)

is equivalent to the expression ``a * b + c``, except that rounding will
not be performed between the multiplication and addition steps if the
code generator fuses the operations. Fusion is not guaranteed, even if
the target platform supports it. If a fused multiply-add is required, the
corresponding '``llvm.fma.*``' intrinsic function should be used
instead. Like '``llvm.fma.*``', this intrinsic never sets ``errno``.

Examples:
"""""""""

.. code-block:: llvm

      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c


Experimental Vector Reduction Intrinsics
----------------------------------------

Horizontal reductions of vectors can be expressed using the following
intrinsics. Each one takes a vector operand as an input and applies its
respective operation across all elements of the vector, returning a single
scalar result of the same element type.
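
To make the notion of a horizontal reduction concrete, the sketch below pairs
a call to ``llvm.experimental.vector.reduce.add`` with a scalarized expansion
that computes the same value. The function names ``@sum4`` and
``@sum4_scalarized`` are hypothetical and serve only as an illustration.

.. code-block:: llvm

      declare i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32>)

      ; Sums the four lanes of %v with a single reduction call.
      define i32 @sum4(<4 x i32> %v) {
      entry:
        %r = call i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32> %v)
        ret i32 %r
      }

      ; Scalarized counterpart: extract each lane and add them one by one.
      ; Integer addition wraps and is associative, so the order of the
      ; additions does not affect the result.
      define i32 @sum4_scalarized(<4 x i32> %v) {
      entry:
        %e0 = extractelement <4 x i32> %v, i32 0
        %e1 = extractelement <4 x i32> %v, i32 1
        %e2 = extractelement <4 x i32> %v, i32 2
        %e3 = extractelement <4 x i32> %v, i32 3
        %s01 = add i32 %e0, %e1
        %s012 = add i32 %s01, %e2
        %s0123 = add i32 %s012, %e3
        ret i32 %s0123
      }
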


'``llvm.experimental.vector.reduce.add.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32> %a)
      declare i64 @llvm.experimental.vector.reduce.add.i64.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.add.*``' intrinsics do an integer ``ADD``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.fadd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %a)
      declare double @llvm.experimental.vector.reduce.fadd.f64.v2f64(double %acc, <2 x double> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.fadd.*``' intrinsics do a floating-point
``ADD`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has fast-math flags, then the reduction will not preserve
the associativity of an equivalent scalarized counterpart. If it does not have
fast-math flags, then the reduction will be *ordered*, implying that the
operation respects the associativity of a scalarized reduction.


Arguments:
""""""""""

The first argument to this intrinsic is a scalar accumulator value, which is
only used when there are no fast-math flags attached. This argument may be
``undef`` when fast-math flags are used.

The second argument must be a vector of floating-point values.

Examples:
"""""""""

.. code-block:: llvm

      %fast = call fast float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float undef, <4 x float> %input) ; fast reduction
      %ord = call float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction


'``llvm.experimental.vector.reduce.mul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.mul.i32.v4i32(<4 x i32> %a)
      declare i64 @llvm.experimental.vector.reduce.mul.i64.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.fmul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %a)
      declare double @llvm.experimental.vector.reduce.fmul.f64.v2f64(double %acc, <2 x double> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.fmul.*``' intrinsics do a floating-point
``MUL`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has fast-math flags, then the reduction will not preserve
the associativity of an equivalent scalarized counterpart. If it does not have
fast-math flags, then the reduction will be *ordered*, implying that the
operation respects the associativity of a scalarized reduction.


Arguments:
""""""""""

The first argument to this intrinsic is a scalar accumulator value, which is
only used when there are no fast-math flags attached. This argument may be
``undef`` when fast-math flags are used.

The second argument must be a vector of floating-point values.

Examples:
"""""""""

.. code-block:: llvm

      %fast = call fast float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float undef, <4 x float> %input) ; fast reduction
      %ord = call float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction

'``llvm.experimental.vector.reduce.and.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.or.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.or.*``' intrinsics do a bitwise ``OR``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.
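
One plausible use of the bitwise ``AND`` and ``OR`` reductions is to ask
whether a property holds in every lane or in at least one lane. The sketch
below is only an illustration; the function names
``@low_bit_set_in_all_lanes`` and ``@any_lane_nonzero`` are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32>)
      declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32>)

      ; Returns true if bit 0 is set in every lane of %v: a bit survives
      ; the AND reduction only when it is set in all lanes.
      define i1 @low_bit_set_in_all_lanes(<4 x i32> %v) {
      entry:
        %and = call i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32> %v)
        %bit = and i32 %and, 1
        %all = icmp ne i32 %bit, 0
        ret i1 %all
      }

      ; Returns true if at least one lane of %v is non-zero: the OR
      ; reduction is zero only when every lane is zero.
      define i1 @any_lane_nonzero(<4 x i32> %v) {
      entry:
        %or = call i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %v)
        %any = icmp ne i32 %or, 0
        ret i1 %any
      }
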

'``llvm.experimental.vector.reduce.xor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.xor.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.smax.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.smax.*``' intrinsics do a signed integer
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.smin.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.smin.*``' intrinsics do a signed integer
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.
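
As a small illustration of the signed ``MAX`` and ``MIN`` reductions, the
sketch below computes the spread between the largest and smallest lane of a
vector. The function name ``@signed_range`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.experimental.vector.reduce.smax.i32.v4i32(<4 x i32>)
      declare i32 @llvm.experimental.vector.reduce.smin.i32.v4i32(<4 x i32>)

      ; Returns (signed max of %v) - (signed min of %v).
      define i32 @signed_range(<4 x i32> %v) {
      entry:
        %max = call i32 @llvm.experimental.vector.reduce.smax.i32.v4i32(<4 x i32> %v)
        %min = call i32 @llvm.experimental.vector.reduce.smin.i32.v4i32(<4 x i32> %v)
        ; Plain 'sub' wraps if the true difference does not fit in i32.
        %range = sub i32 %max, %min
        ret i32 %range
      }
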

'``llvm.experimental.vector.reduce.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.umax.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.umax.*``' intrinsics do an unsigned
integer ``MAX`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.umin.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.umin.*``' intrinsics do an unsigned
integer ``MIN`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.fmax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.experimental.vector.reduce.fmax.f32.v4f32(<4 x float> %a)
      declare double @llvm.experimental.vector.reduce.fmax.f64.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.fmax.*``' intrinsics do a floating-point
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the ``nnan`` fast-math flag then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of floating-point values.

'``llvm.experimental.vector.reduce.fmin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.experimental.vector.reduce.fmin.f32.v4f32(<4 x float> %a)
      declare double @llvm.experimental.vector.reduce.fmin.f64.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.fmin.*``' intrinsics do a floating-point
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the ``nnan`` fast-math flag then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of floating-point values.

Half Precision Floating-Point Intrinsics
----------------------------------------

For most target platforms, half precision floating-point is a
storage-only format. This means that it is a dense encoding (in memory)
but does not support computation in the format.

This means that code must first load the half-precision floating-point
value as an ``i16``, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to ``i16`` with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
``i16`` value.

.. _int_convert_to_fp16:

'``llvm.convert.to.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i16 @llvm.convert.to.fp16.f32(float %a)
      declare i16 @llvm.convert.to.fp16.f64(double %a)

Overview:
"""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point type to half precision floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point format to half precision floating-point format. The
return value is an ``i16`` which contains the converted number.

Examples:
"""""""""

.. code-block:: llvm

      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, i16* @x, align 2

.. _int_convert_from_fp16:

'``llvm.convert.from.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.convert.from.fp16.f32(i16 %a)
      declare double @llvm.convert.from.fp16.f64(i16 %a)

Overview:
"""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single precision
floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
precision floating-point format. The input half-float value is
represented by an ``i16`` value.
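
The load, convert, compute, convert, store sequence described at the start of
this section can be sketched as follows. This is only an illustration: the
global ``@h`` and the function ``@double_half`` are hypothetical names.

.. code-block:: llvm

      ; Hypothetical global holding one half-precision value as raw bits.
      @h = global i16 0, align 2

      declare float @llvm.convert.from.fp16.f32(i16)
      declare i16 @llvm.convert.to.fp16.f32(float)

      ; Doubles the half-precision value stored in @h.
      define void @double_half() {
      entry:
        ; Load the storage-only representation as an i16.
        %raw = load i16, i16* @h, align 2
        ; Widen to float, where computation is supported.
        %f = call float @llvm.convert.from.fp16.f32(i16 %raw)
        %sum = fadd float %f, %f
        ; Narrow back to half precision and store the bits.
        %out = call i16 @llvm.convert.to.fp16.f32(float %sum)
        store i16 %out, i16* @h, align 2
        ret void
      }
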

Examples:
"""""""""

.. code-block:: llvm

      %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)

.. _dbg_intrinsics:

Debugger Intrinsics
-------------------

The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.

Exception Handling Intrinsics
-----------------------------

The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.

.. _int_trampoline:

Trampoline Intrinsics
---------------------

These intrinsics make it possible to excise one parameter, marked with
the :ref:`nest <nest>` attribute, from a function. The result is a
callable function pointer lacking the nest parameter; the caller does
not need to provide a value for it. Instead, the value to use is stored
in advance in a "trampoline", a block of memory usually allocated on the
stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.

For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
then the resulting function pointer has signature ``i32 (i32, i32)*``.
It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
      %fp = bitcast i8* %p to i32 (i32, i32)*

The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.

.. _int_it:

'``llvm.init.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)

Overview:
"""""""""

This fills the memory pointed to by ``tramp`` with executable code,
turning it into a trampoline.

Arguments:
""""""""""

The ``llvm.init.trampoline`` intrinsic takes three arguments, all
pointers. The ``tramp`` argument must point to a sufficiently large and
sufficiently aligned block of memory; this memory is written to by the
intrinsic. Note that the size and the alignment are target-specific:
LLVM currently provides no portable way of determining them, so a
front-end that generates this intrinsic needs to have some
target-specific knowledge. The ``func`` argument must hold a function
bitcast to an ``i8*``.

Semantics:
""""""""""

The block of memory pointed to by ``tramp`` is filled with target
dependent code, turning it into a function. Then ``tramp`` needs to be
passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
function's signature is the same as that of ``func`` with any arguments
marked with the ``nest`` attribute removed.
At most one such ``nest``
argument is allowed, and it must be of pointer type. Calling the new
function is equivalent to calling ``func`` with the same argument list,
but with ``nval`` used for the missing ``nest`` argument. If, after
calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
modified, then the effect of any later call to the returned function
pointer is undefined.

.. _int_at:

'``llvm.adjust.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.adjust.trampoline(i8* <tramp>)

Overview:
"""""""""

This performs any required machine-specific adjustment to the address of
a trampoline (passed as ``tramp``).

Arguments:
""""""""""

``tramp`` must point to a block of memory which already has trampoline
code filled in by a previous call to
:ref:`llvm.init.trampoline <int_it>`.

Semantics:
""""""""""

On some architectures the address of the code to be executed needs to be
different than the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.

.. _int_mload_mstore:

Masked Vector Load and Store Intrinsics
---------------------------------------

LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store.
When all bits are off, no memory is accessed.

.. _int_mload:

'``llvm.masked.load.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.

::

      declare <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.load.v2f64.p0v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      ;; The data is a vector of pointers to double
      declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64 (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
      ;; The data is a vector of function pointers
      declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)

Overview:
"""""""""

Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.
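
To show the operand order in context, here is a minimal sketch of a call with
a constant mask whose last lane is switched off. The function name
``@load_first_three`` is hypothetical, and the ``v4f32`` overload is assumed
to follow the same naming pattern as the declarations above.

.. code-block:: llvm

      declare <4 x float> @llvm.masked.load.v4f32.p0v4f32(<4 x float>*, i32, <4 x i1>, <4 x float>)

      ; Reads only the first three floats at %ptr. The fourth lane is
      ; masked off, so its memory is not accessed and the result lane is
      ; taken from the passthru operand (0.0 here).
      define <4 x float> @load_first_three(<4 x float>* %ptr) {
      entry:
        %v = call <4 x float> @llvm.masked.load.v4f32.p0v4f32(<4 x float>* %ptr, i32 4, <4 x i1> <i1 true, i1 true, i1 true, i1 false>, <4 x float> zeroinitializer)
        ret <4 x float> %v
      }
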


Semantics:
""""""""""

The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.


::

       %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

       ;; The result of the two following instructions is identical aside from potential memory access exception
       %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru

.. _int_mstore:

'``llvm.masked.store.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.

::

       declare void @llvm.masked.store.v8i32.p0v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
       declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>)
       ;; The data is a vector of pointers to double
       declare void @llvm.masked.store.v8p0f64.p0v8p0f64 (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
       ;; The data is a vector of function pointers
       declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.

::

       call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask)

       ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
       %oldval = load <16 x float>, <16 x float>* %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
       store <16 x float> %res, <16 x float>* %ptr, align 4


Masked Vector Gather and Scatter Intrinsics
-------------------------------------------

LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.

.. _int_mgather:

'``llvm.masked.gather.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.

::

      declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32 (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64 (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float*> <passthru>)

Overview:
"""""""""

Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be a constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.


Semantics:
""""""""""

The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads with subsequent gathering of all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.


::

       %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)

       ;; The gather with all-true mask is equivalent to the following instruction sequence
       %ptr0 = extractelement <4 x double*> %ptrs, i32 0
       %ptr1 = extractelement <4 x double*> %ptrs, i32 1
       %ptr2 = extractelement <4 x double*> %ptrs, i32 2
       %ptr3 = extractelement <4 x double*> %ptrs, i32 3

       %val0 = load double, double* %ptr0, align 8
       %val1 = load double, double* %ptr1, align 8
       %val2 = load double, double* %ptr2, align 8
       %val3 = load double, double* %ptr3, align 8

       %vec0 = insertelement <4 x double> undef, double %val0, i32 0
       %vec01 = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012 = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3

.. _int_mscatter:

'``llvm.masked.scatter.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.

::

      declare void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
      declare void @llvm.masked.scatter.v16f32.v16p1f32 (<16 x float> <value>, <16 x float addrspace(1)*> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
      declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is an alignment of the destination addresses. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
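
The branch-guarded lowering mentioned above can be pictured roughly as follows. This is a hand-written sketch for a two-lane scatter, not the output of any particular target; the function name ``@scatter_v2i32_lowered``, the block labels, and the alignment of 4 are assumptions made for illustration.

.. code-block:: llvm

    define void @scatter_v2i32_lowered(<2 x i32> %value, <2 x i32*> %ptrs, <2 x i1> %mask) {
    entry:
      %m0 = extractelement <2 x i1> %mask, i32 0
      br i1 %m0, label %store0, label %lane1

    store0:                                 ; lane 0 is enabled
      %val0 = extractelement <2 x i32> %value, i32 0
      %ptr0 = extractelement <2 x i32*> %ptrs, i32 0
      store i32 %val0, i32* %ptr0, align 4
      br label %lane1

    lane1:
      %m1 = extractelement <2 x i1> %mask, i32 1
      br i1 %m1, label %store1, label %done

    store1:                                 ; lane 1 is enabled
      %val1 = extractelement <2 x i32> %value, i32 1
      %ptr1 = extractelement <2 x i32*> %ptrs, i32 1
      store i32 %val1, i32* %ptr1, align 4
      br label %done

    done:
      ret void
    }

Lane 0 is stored before lane 1, which keeps the least-significant to most-significant ordering required when addresses overlap.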

::

       ;; This instruction unconditionally stores data vector in multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <true, true, .. true>)

       ;; It is equivalent to a list of scalar stores
       %val0 = extractelement <8 x i32> %value, i32 0
       %val1 = extractelement <8 x i32> %value, i32 1
       ..
       %val7 = extractelement <8 x i32> %value, i32 7
       %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
       %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
       ..
       %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
       ;; Note: the order of the following stores is important when they overlap:
       store i32 %val0, i32* %ptr0, align 4
       store i32 %val1, i32* %ptr1, align 4
       ..
       store i32 %val7, i32* %ptr7, align 4


Masked Vector Expanding Load and Compressing Store Intrinsics
-------------------------------------------------------------

LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.

.. _int_expandload:

'``llvm.masked.expandload.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. Several values of integer, floating-point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.

::

      declare <16 x float>  @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x i64>     @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1> <mask>, <2 x i64> <passthru>)

Overview:
"""""""""

Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 respectively. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.

Semantics:
""""""""""

The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies, as in the following example:

.. code-block:: c

    // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C;
    int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        A[i] = B[j++];
    }


.. code-block:: llvm

    ; Load several elements from array B and expand them in a vector.
    ; The number of loaded elements is equal to the number of '1' elements in the Mask.
    %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef)
    ; Store the result in A
    call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask)

    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
    %MaskI = bitcast <8 x i1> %Mask to i8
    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
    %MaskI64 = zext i8 %MaskIPopcnt to i64
    %BNextInd = add i64 %BInd, %MaskI64


Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.

.. _int_compressstore:

'``llvm.masked.compressstore.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. A number of scalar values of integer, floating-point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.

::

      declare void @llvm.masked.compressstore.v8i32  (<8  x i32>   <value>, i32*   <ptr>, <8  x i1> <mask>)
      declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>)

Overview:
"""""""""

Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.

Arguments:
""""""""""

The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependencies, as in the following example:

.. code-block:: c

    // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C;
    int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
    }


.. code-block:: llvm

    ; Load elements from A.
    %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
    ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask)

    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
    %MaskI = bitcast <8 x i1> %Mask to i8
    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
    %MaskI64 = zext i8 %MaskIPopcnt to i64
    %BNextInd = add i64 %BInd, %MaskI64


Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
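
As noted in the introduction to this group of intrinsics, a mask that consists of '1' bits followed by '0' bits makes a compressing store behave like an ordinary masked store. The sketch below illustrates that correspondence for four lanes. The function name ``@prefix_mask_store`` is hypothetical, and the alignment of 1 passed to ``llvm.masked.store`` is an assumption chosen so that nothing is presumed about ``%ptr``.

.. code-block:: llvm

    declare void @llvm.masked.compressstore.v4i32(<4 x i32>, i32*, <4 x i1>)
    declare void @llvm.masked.store.v4i32.p0v4i32(<4 x i32>, <4 x i32>*, i32, <4 x i1>)

    define void @prefix_mask_store(<4 x i32> %value, i32* %ptr) {
      ; Lanes 0 and 1 are selected and written to ptr[0] and ptr[1].
      call void @llvm.masked.compressstore.v4i32(<4 x i32> %value, i32* %ptr,
                  <4 x i1> <i1 true, i1 true, i1 false, i1 false>)

      ; With this mask the same two elements land at the same two addresses.
      %vptr = bitcast i32* %ptr to <4 x i32>*
      call void @llvm.masked.store.v4i32.p0v4i32(<4 x i32> %value, <4 x i32>* %vptr, i32 1,
                  <4 x i1> <i1 true, i1 true, i1 false, i1 false>)
      ret void
    }

For any mask with a gap, such as ``<i1 true, i1 false, i1 true, i1 false>``, the two calls differ: the compressing store writes lane 2 to ``ptr[1]``, while the masked store writes it to ``ptr[2]``.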


Memory Use Markers
------------------

This class of intrinsics provides information about the lifetime of
memory objects and ranges where variables are immutable.

.. _int_lifestart:

'``llvm.lifetime.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that before this point in the code, the value
of the memory pointed to by ``ptr`` is dead. This means that it is known
to never be used and has an undefined value. A load from the pointer
that precedes this intrinsic can be replaced with ``'undef'``.

.. _int_lifeend:

'``llvm.lifetime.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.end``' intrinsic specifies the end of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that after this point in the code, the value of
the memory pointed to by ``ptr`` is dead. This means that it is known to
never be used and has an undefined value. Any stores into the memory
object following this intrinsic may be removed as dead.

'``llvm.invariant.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.start``' intrinsic specifies that the contents of
a memory object will not change.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that until an ``llvm.invariant.end`` that uses
the return value, the referenced memory location is constant and
unchanging.

'``llvm.invariant.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.end``' intrinsic specifies that the contents of a
memory object are mutable.

Arguments:
""""""""""

The first argument is the matching ``llvm.invariant.start`` intrinsic.
The second argument is a constant integer representing the size of the
object, or -1 if it is variable sized, and the third argument is a
pointer to the object.

Semantics:
""""""""""

This intrinsic indicates that the memory is mutable again.
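
The four markers above are typically used together around a stack object. The following is a minimal sketch that reuses the declarations shown in this section; the function name ``@markers_example`` and the stored constants are hypothetical, and the comments describe what an optimizer is permitted to assume rather than what any given pass does.

.. code-block:: llvm

    declare void @llvm.lifetime.start(i64, i8* nocapture)
    declare void @llvm.lifetime.end(i64, i8* nocapture)
    declare {}* @llvm.invariant.start.p0i8(i64, i8* nocapture)
    declare void @llvm.invariant.end.p0i8({}*, i64, i8* nocapture)

    define i32 @markers_example() {
    entry:
      %x = alloca i32
      %p = bitcast i32* %x to i8*

      ; The 4 bytes at %p are dead before this call.
      call void @llvm.lifetime.start(i64 4, i8* %p)
      store i32 42, i32* %x

      ; From here until the matching invariant.end, the object is unchanging.
      %inv = call {}* @llvm.invariant.start.p0i8(i64 4, i8* %p)
      %v = load i32, i32* %x
      call void @llvm.invariant.end.p0i8({}* %inv, i64 4, i8* %p)

      ; The object is mutable again.
      store i32 7, i32* %x
      %w = load i32, i32* %x

      ; The 4 bytes at %p are dead after this call.
      call void @llvm.lifetime.end(i64 4, i8* %p)
      %r = add i32 %v, %w
      ret i32 %r
    }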

'``llvm.launder.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new
pointer value that carries fresh invariant group information. It is an
experimental intrinsic, which means that its semantics might change in the
future.


Arguments:
""""""""""

The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which is considered different
for the purposes of ``load``/``store`` ``invariant.group`` metadata.
It does not read any accessible memory and the execution can be speculated.

'``llvm.strip.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
value that does not carry the invariant information. It is an experimental
intrinsic, which means that its semantics might change in the future.


Arguments:
""""""""""

The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which has no associated
``invariant.group`` metadata.
It does not read any memory and can be speculated.



.. _constrainedfp:

Constrained Floating-Point Intrinsics
-------------------------------------

These intrinsics are used to provide special handling of floating-point
operations when a specific rounding mode or floating-point exception behavior is
required. By default, LLVM optimization passes assume that the rounding mode is
round-to-nearest and that floating-point exceptions will not be monitored.
Constrained FP intrinsics are used to support non-default rounding modes and
accurately preserve exception behavior without compromising LLVM's ability to
optimize FP code when the default behavior is used.

Each of these intrinsics corresponds to a normal floating-point operation. The
first two arguments and the return value are the same as the corresponding FP
operation.

The third argument is a metadata argument specifying the rounding mode to be
assumed. This argument must be one of the following strings:

::

      "round.dynamic"
      "round.tonearest"
      "round.downward"
      "round.upward"
      "round.towardzero"

If this argument is "round.dynamic", optimization passes must assume that the
rounding mode is unknown and may change at runtime. No transformations that
depend on rounding mode may be performed in this case.

The other possible values for the rounding mode argument correspond to the
similarly named IEEE rounding modes. If the argument is any of these values,
optimization passes may perform transformations as long as they are consistent
with the specified rounding mode.

For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
'x-0' should evaluate to '-0' when rounding downward. However, this
transformation is legal for all other rounding modes.

For values other than "round.dynamic", optimization passes may assume that the
actual runtime rounding mode (as defined in a target-specific manner) matches
the specified rounding mode, but this is not guaranteed. Using a specific
non-dynamic rounding mode which does not match the actual rounding mode at
runtime results in undefined behavior.

The fourth argument to the constrained floating-point intrinsics specifies the
required exception behavior. This argument must be one of the following
strings:

::

      "fpexcept.ignore"
      "fpexcept.maytrap"
      "fpexcept.strict"

If this argument is "fpexcept.ignore", optimization passes may assume that the
exception status flags will not be read and that floating-point exceptions will
be masked. This allows transformations to be performed that may change the
exception semantics of the original code. For example, FP operations may be
speculatively executed in this case whereas they must not be for either of the
other possible values of this argument.

If the exception behavior argument is "fpexcept.maytrap", optimization passes
must avoid transformations that may raise exceptions that would not have been
raised by the original code (such as speculatively executing FP operations), but
passes are not required to preserve all exceptions that are implied by the
original code. For example, exceptions may be potentially hidden by constant
folding.

If the exception behavior argument is "fpexcept.strict", all transformations must
strictly preserve the floating-point exception semantics of the original code.
Any FP exception that would have been raised by the original code must be raised
by the transformed code, and the transformed code must not raise any FP
exceptions that would not have been raised by the original code. This is the
exception behavior argument that will be used if the code being compiled reads
the FP exception status flags, but this mode can also be used with code that
unmasks FP exceptions.

The number and order of floating-point exceptions is NOT guaranteed. For
example, a series of FP operations that each may raise exceptions may be
vectorized into a single instruction that raises each unique exception a single
time.


'``llvm.experimental.constrained.fadd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fadd``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point sum of the two value operands and has
the same type as the operands.
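
A call to this intrinsic might look like the sketch below. It assumes the ``.f64`` suffix that overloaded intrinsics normally receive for a ``double`` operand type; the function name ``@add_strict`` is hypothetical. The two metadata strings select the most conservative settings described above.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

    define double @add_strict(double %a, double %b) {
      ; The rounding mode is unknown at compile time, and every FP exception
      ; raised by the addition must be preserved.
      %sum = call double @llvm.experimental.constrained.fadd.f64(
                            double %a, double %b,
                            metadata !"round.dynamic",
                            metadata !"fpexcept.strict")
      ret double %sum
    }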


'``llvm.experimental.constrained.fsub``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fsub``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point difference of the two value operands
and has the same type as the operands.


'``llvm.experimental.constrained.fmul``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fmul``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point product of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.fdiv``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fdiv``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.frem``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
from the division of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.frem``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above. The rounding mode argument has no effect, since
the result of frem is never rounded, but the argument is included for
consistency with the other constrained floating-point intrinsics.

Semantics:
""""""""""

The value produced is the floating-point remainder from the division of the two
value operands and has the same type as the operands. The remainder has the
same sign as the dividend.

'``llvm.experimental.constrained.fma``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
fused-multiply-add operation on its operands.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fma``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
<t_vector>` of floating-point values. All arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The result produced is the product of the first two operands added to the third
operand computed with infinite precision, and then rounded to the target
precision.
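
To make the single-rounding property concrete, the sketch below contrasts a fused operation with a separate multiply and add. The ``.f32`` suffixes assume the usual name mangling for overloaded intrinsics, and the function names ``@fused`` and ``@unfused`` are hypothetical.

.. code-block:: llvm

    declare float @llvm.experimental.constrained.fma.f32(float, float, float, metadata, metadata)
    declare float @llvm.experimental.constrained.fmul.f32(float, float, metadata, metadata)
    declare float @llvm.experimental.constrained.fadd.f32(float, float, metadata, metadata)

    define float @fused(float %a, float %b, float %c) {
      ; a * b + c is computed exactly and rounded once.
      %r = call float @llvm.experimental.constrained.fma.f32(
                         float %a, float %b, float %c,
                         metadata !"round.tonearest",
                         metadata !"fpexcept.ignore")
      ret float %r
    }

    define float @unfused(float %a, float %b, float %c) {
      ; The product is rounded before the addition, so the final result
      ; may differ from @fused in the last bit.
      %p = call float @llvm.experimental.constrained.fmul.f32(
                         float %a, float %b,
                         metadata !"round.tonearest",
                         metadata !"fpexcept.ignore")
      %r = call float @llvm.experimental.constrained.fadd.f32(
                         float %p, float %c,
                         metadata !"round.tonearest",
                         metadata !"fpexcept.ignore")
      ret float %r
    }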

Constrained libm-equivalent Intrinsics
--------------------------------------

In addition to the basic floating-point operations for which constrained
intrinsics are described above, there are constrained versions of various
operations which provide equivalent behavior to a corresponding libm function.
These intrinsics allow the precise behavior of these operations with respect to
rounding mode and exception behavior to be controlled.

As with the basic constrained floating-point intrinsics, the rounding mode
and exception behavior arguments only control the behavior of the optimizer.
They do not change the runtime floating-point environment.


'``llvm.experimental.constrained.sqrt``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sqrt(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
of the specified value, returning the same value as the libm '``sqrt``'
functions would, but without setting ``errno``.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the nonnegative square root of the specified value.
If the value is less than negative zero, a floating-point exception occurs
and the return value is architecture specific.


'``llvm.experimental.constrained.pow``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers of the
same type. The second argument specifies the power to which the first argument
should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power,
returning the same values as the libm ``pow`` functions would, and
handles error conditions in the same way.


'``llvm.experimental.constrained.powi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand. The
order of evaluation of multiplications is not defined. When a vector of
floating-point type is used, the second argument remains a scalar integer value.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type. The second argument is a 32-bit signed integer specifying the power to
which the first argument should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.


'``llvm.experimental.constrained.sin``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sin(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the sine of the specified operand, returning the
same values as the libm ``sin`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.cos``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.cos(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.
14100 14101The second and third arguments specify the rounding mode and exception 14102behavior as described above. 14103 14104Semantics: 14105"""""""""" 14106 14107This function returns the cosine of the specified operand, returning the 14108same values as the libm ``cos`` functions would, and handles error 14109conditions in the same way. 14110 14111 14112'``llvm.experimental.constrained.exp``' Intrinsic 14113^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 14114 14115Syntax: 14116""""""" 14117 14118:: 14119 14120 declare <type> 14121 @llvm.experimental.constrained.exp(<type> <op1>, 14122 metadata <rounding mode>, 14123 metadata <exception behavior>) 14124 14125Overview: 14126""""""""" 14127 14128The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e 14129exponential of the specified value. 14130 14131Arguments: 14132"""""""""" 14133 14134The first argument and the return value are floating-point numbers of the same 14135type. 14136 14137The second and third arguments specify the rounding mode and exception 14138behavior as described above. 14139 14140Semantics: 14141"""""""""" 14142 14143This function returns the same values as the libm ``exp`` functions 14144would, and handles error conditions in the same way. 14145 14146 14147'``llvm.experimental.constrained.exp2``' Intrinsic 14148^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 14149 14150Syntax: 14151""""""" 14152 14153:: 14154 14155 declare <type> 14156 @llvm.experimental.constrained.exp2(<type> <op1>, 14157 metadata <rounding mode>, 14158 metadata <exception behavior>) 14159 14160Overview: 14161""""""""" 14162 14163The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2 14164exponential of the specified value. 14165 14166 14167Arguments: 14168"""""""""" 14169 14170The first argument and the return value are floating-point numbers of the same 14171type. 14172 14173The second and third arguments specify the rounding mode and exception 14174behavior as described above. 
14175 14176Semantics: 14177"""""""""" 14178 14179This function returns the same values as the libm ``exp2`` functions 14180would, and handles error conditions in the same way. 14181 14182 14183'``llvm.experimental.constrained.log``' Intrinsic 14184^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 14185 14186Syntax: 14187""""""" 14188 14189:: 14190 14191 declare <type> 14192 @llvm.experimental.constrained.log(<type> <op1>, 14193 metadata <rounding mode>, 14194 metadata <exception behavior>) 14195 14196Overview: 14197""""""""" 14198 14199The '``llvm.experimental.constrained.log``' intrinsic computes the base-e 14200logarithm of the specified value. 14201 14202Arguments: 14203"""""""""" 14204 14205The first argument and the return value are floating-point numbers of the same 14206type. 14207 14208The second and third arguments specify the rounding mode and exception 14209behavior as described above. 14210 14211 14212Semantics: 14213"""""""""" 14214 14215This function returns the same values as the libm ``log`` functions 14216would, and handles error conditions in the same way. 14217 14218 14219'``llvm.experimental.constrained.log10``' Intrinsic 14220^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 14221 14222Syntax: 14223""""""" 14224 14225:: 14226 14227 declare <type> 14228 @llvm.experimental.constrained.log10(<type> <op1>, 14229 metadata <rounding mode>, 14230 metadata <exception behavior>) 14231 14232Overview: 14233""""""""" 14234 14235The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10 14236logarithm of the specified value. 14237 14238Arguments: 14239"""""""""" 14240 14241The first argument and the return value are floating-point numbers of the same 14242type. 14243 14244The second and third arguments specify the rounding mode and exception 14245behavior as described above. 
14246 14247Semantics: 14248"""""""""" 14249 14250This function returns the same values as the libm ``log10`` functions 14251would, and handles error conditions in the same way. 14252 14253 14254'``llvm.experimental.constrained.log2``' Intrinsic 14255^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 14256 14257Syntax: 14258""""""" 14259 14260:: 14261 14262 declare <type> 14263 @llvm.experimental.constrained.log2(<type> <op1>, 14264 metadata <rounding mode>, 14265 metadata <exception behavior>) 14266 14267Overview: 14268""""""""" 14269 14270The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2 14271logarithm of the specified value. 14272 14273Arguments: 14274"""""""""" 14275 14276The first argument and the return value are floating-point numbers of the same 14277type. 14278 14279The second and third arguments specify the rounding mode and exception 14280behavior as described above. 14281 14282Semantics: 14283"""""""""" 14284 14285This function returns the same values as the libm ``log2`` functions 14286would, and handles error conditions in the same way. 14287 14288 14289'``llvm.experimental.constrained.rint``' Intrinsic 14290^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 14291 14292Syntax: 14293""""""" 14294 14295:: 14296 14297 declare <type> 14298 @llvm.experimental.constrained.rint(<type> <op1>, 14299 metadata <rounding mode>, 14300 metadata <exception behavior>) 14301 14302Overview: 14303""""""""" 14304 14305The '``llvm.experimental.constrained.rint``' intrinsic returns the first 14306operand rounded to the nearest integer. It may raise an inexact floating-point 14307exception if the operand is not an integer. 14308 14309Arguments: 14310"""""""""" 14311 14312The first argument and the return value are floating-point numbers of the same 14313type. 14314 14315The second and third arguments specify the rounding mode and exception 14316behavior as described above. 
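
As an illustration of the argument layout described above, the following sketch
calls the ``f64`` overload of the intrinsic. The function name
``@round_current_mode`` is hypothetical, and the metadata strings
``!"round.dynamic"`` and ``!"fpexcept.strict"`` are assumed to be among the
rounding mode and exception behavior values listed in the introduction to the
constrained floating-point intrinsics.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)

      ; Hypothetical helper: round %x using whatever rounding mode is active
      ; in the floating-point environment at run time.
      define double @round_current_mode(double %x) {
      entry:
        %r = call double @llvm.experimental.constrained.rint.f64(
                             double %x,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict")
        ret double %r
      }

The other single-operand constrained intrinsics in this section take their two
trailing metadata operands in the same positions.
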

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.nearbyint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.nearbyint(<type> <op1>,
                                               metadata <rounding mode>,
                                               metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
operand rounded to the nearest integer. It will not raise an inexact
floating-point exception if the operand is not an integer.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


General Intrinsics
------------------

This class of intrinsics is designed to be generic and has no specific
purpose.

'``llvm.var.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.var.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to a value, the second is a pointer to a
global string, the third is a pointer to a global string which is the
source file name, and the last argument is the line number.

Semantics:
""""""""""

This intrinsic allows annotation of local variables with arbitrary
strings. This can be useful for special purpose optimizations that want
to look for these annotations. These have no other defined use; they are
ignored by code generation and optimization.

'``llvm.ptr.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
pointer to an integer of any width. *NOTE* you must specify an address space for
the pointer. The identifier for the default address space is the integer
'``0``'.

::

      declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.ptr.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to an integer value of arbitrary bitwidth
(result of some expression), the second is a pointer to a global string, the
third is a pointer to a global string which is the source file name, and the
last argument is the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotation of a pointer to an integer with arbitrary
strings. This can be useful for special purpose optimizations that want to look
for these annotations. These have no other defined use; they are ignored by code
generation and optimization.

'``llvm.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.annotation``' on
any integer bit width.

::

      declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is an integer value (result of some expression), the
second is a pointer to a global string, the third is a pointer to a
global string which is the source file name, and the last argument is
the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotations to be put on arbitrary expressions
with arbitrary strings. This can be useful for special purpose
optimizations that want to look for these annotations. These have no
other defined use; they are ignored by code generation and optimization.

'``llvm.codeview.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This annotation emits a label at its program point and an associated
``S_ANNOTATION`` codeview record with some additional string metadata. This is
used to implement MSVC's ``__annotation`` intrinsic. It is marked
``noduplicate``, so calls to this intrinsic prevent inlining and should be
considered expensive.

::

      declare void @llvm.codeview.annotation(metadata)

Arguments:
""""""""""

The argument should be an MDTuple containing any number of MDStrings.

'``llvm.trap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.trap() noreturn nounwind

Overview:
"""""""""

The '``llvm.trap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to the target dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.

'``llvm.debugtrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.debugtrap() nounwind

Overview:
"""""""""

The '``llvm.debugtrap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an
execution trap with the intention of requesting the attention of a
debugger.
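
The following is a minimal sketch of how a frontend might emit the two trap
intrinsics above. The function names ``@deref_or_trap`` and ``@checkpoint`` are
hypothetical. Because ``llvm.trap`` is declared ``noreturn``, the call is
followed by ``unreachable``; ``llvm.debugtrap`` is not declared ``noreturn``,
so the IR continues after the call.

.. code-block:: llvm

      declare void @llvm.trap() noreturn nounwind
      declare void @llvm.debugtrap() nounwind

      ; Hypothetical helper: load through %p, trapping if it is null.
      define i32 @deref_or_trap(i32* %p) {
      entry:
        %is.null = icmp eq i32* %p, null
        br i1 %is.null, label %fail, label %ok

      fail:
        call void @llvm.trap()
        unreachable

      ok:
        %v = load i32, i32* %p
        ret i32 %v
      }

      ; Hypothetical helper: request a debugger's attention, then return.
      define void @checkpoint() {
      entry:
        call void @llvm.debugtrap()
        ret void
      }
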

'``llvm.stackprotector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)

Overview:
"""""""""

The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
is placed on the stack before local variables.

Arguments:
""""""""""

The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.

Semantics:
""""""""""

This intrinsic causes the prologue/epilogue inserter to force the position of
the ``AllocaInst`` stack slot to be before local variables on the stack. This is
to ensure that if a local variable on the stack is overwritten, it will destroy
the value of the guard. When the function exits, the guard on the stack is
checked against the original guard by ``llvm.stackprotectorcheck``. If they are
different, then ``llvm.stackprotectorcheck`` causes the program to abort by
calling the ``__stack_chk_fail()`` function.

'``llvm.stackguard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stackguard()

Overview:
"""""""""

The ``llvm.stackguard`` intrinsic returns the system stack guard value.

It should not be generated by frontends, since it is only for internal usage.
This intrinsic exists because the IR form of Stack Protector is still
supported in FastISel.

Arguments:
""""""""""

None.

Semantics:
""""""""""

On some platforms, the value returned by this intrinsic remains unchanged
between loads in the same thread. On other platforms, it returns the same
global variable value, if any, e.g. ``@__stack_chk_guard``.

Currently, some platforms (e.g. X86 Linux) have IR-level customized stack guard
loading that is not handled by ``llvm.stackguard()``; it should be handled by
this intrinsic in the future.

'``llvm.objectsize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>)
      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>)

Overview:
"""""""""

The ``llvm.objectsize`` intrinsic is designed to provide information to
the optimizers to determine at compile time whether a) an operation
(like memcpy) will overflow a buffer that corresponds to an object, or
b) that a runtime check for overflow isn't necessary. An object in this
context means an allocation of a specific class, structure, array, or
other object.

Arguments:
""""""""""

The ``llvm.objectsize`` intrinsic takes three arguments. The first argument is
a pointer to or into the ``object``. The second argument determines whether
``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size
is unknown. The third argument controls how ``llvm.objectsize`` acts when
``null`` in address space 0 is used as its pointer argument. If it's ``false``,
``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
the ``null`` is in a non-zero address space or if ``true`` is given for the
third argument of ``llvm.objectsize``, we assume its size is unknown.

The second and third arguments only accept constants.
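
As a sketch of the argument conventions just described, the call below asks for
the number of bytes available from a pointer four bytes into a 16-byte stack
buffer. The function name ``@bytes_after_offset`` is hypothetical. The
``.p0i8`` suffix on the intrinsic name is an assumption about the pointer-type
mangling used by LLVM releases of this era; the declarations above omit it.

.. code-block:: llvm

      declare i64 @llvm.objectsize.i64.p0i8(i8*, i1, i1)

      ; Hypothetical helper: how many bytes of %buf remain after offset 4?
      define i64 @bytes_after_offset() {
      entry:
        %buf = alloca [16 x i8]
        %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 4
        ; min = false: -1 is returned if the size cannot be determined.
        ; nullunknown = false: null in address space 0 reports 0 bytes.
        %n = call i64 @llvm.objectsize.i64.p0i8(i8* %p, i1 false, i1 false)
        ret i64 %n
      }

Since both the allocation size and the offset are visible here, the optimizer
would be expected to fold ``%n`` to the 12 bytes that remain in ``%buf``.
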

Semantics:
""""""""""

The ``llvm.objectsize`` intrinsic is lowered to a constant representing
the size of the object concerned. If the size cannot be determined at
compile time, ``llvm.objectsize`` returns ``i32/i64 -1 or 0`` (depending
on the ``min`` argument).

'``llvm.expect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.expect`` on any
integer bit width.

::

      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)

Overview:
"""""""""

The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is the expected value; it must be a constant,
and variables are not allowed.

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.

.. _int_assume:

'``llvm.assume``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.assume(i1 %cond)

Overview:
"""""""""

The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.

Arguments:
""""""""""

The condition which the optimizer may assume is always true.

Semantics:
""""""""""

The intrinsic allows the optimizer to assume that the provided condition is
always true whenever the control flow reaches the intrinsic call. No code is
generated for this intrinsic, and instructions that contribute only to the
provided condition are not used for code generation. If the condition is
violated during execution, the behavior is undefined.

Note that the optimizer might limit the transformations performed on values
used by the ``llvm.assume`` intrinsic in order to preserve the instructions
only used to form the intrinsic's input argument. This might prove undesirable
if the extra information provided by the ``llvm.assume`` intrinsic does not cause
sufficient overall improvement in code quality. For this reason,
``llvm.assume`` should not be used to document basic mathematical invariants
that the optimizer can otherwise deduce or facts that are of little use to the
optimizer.

.. _int_ssa_copy:

'``llvm.ssa_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.ssa_copy(type %operand) returned(1) readnone

Arguments:
""""""""""

The first argument is an operand which is used as the returned value.

Overview:
"""""""""

The ``llvm.ssa_copy`` intrinsic can be used to attach information to
operations by copying them and giving them new names. For example,
the PredicateInfo utility uses it to build Extended SSA form, and
attach various forms of information to operands that dominate specific
uses. It is not meant for general use, only for building temporary
renaming forms that require value splits at certain points.

.. _type.test:

'``llvm.type.test``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone


Arguments:
""""""""""

The first argument is a pointer to be tested. The second argument is a
metadata object representing a :doc:`type identifier <TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
with the given type identifier.

'``llvm.type.checked.load``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly


Arguments:
""""""""""

The first argument is a pointer from which to load a function pointer. The
second argument is the byte offset from which to load the function pointer. The
third argument is a metadata object representing a :doc:`type identifier
<TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
virtual table pointer using type metadata. This intrinsic is used to implement
control flow integrity in conjunction with virtual call optimization. The
virtual call optimization pass will optimize away ``llvm.type.checked.load``
intrinsics associated with devirtualized calls, thereby removing the type
check in cases where it is not needed to enforce the control flow integrity
constraint.

If the given pointer is associated with a type metadata identifier, this
function returns true as the second element of its return value. (Note that
the function may also return true if the given pointer is not associated
with a type metadata identifier.) If the function's return value's second
element is true, the following rules apply to the first element:

- If the given pointer is associated with the given type metadata identifier,
  it is the function pointer loaded from the given byte offset from the given
  pointer.

- If the given pointer is not associated with the given type metadata
  identifier, it is one of the following (the choice of which is unspecified):

  1. The function pointer that would have been loaded from an arbitrarily chosen
     (through an unspecified mechanism) pointer associated with the type
     metadata.

  2. If the function has a non-void return type, a pointer to a function that
     returns an unspecified value without causing side effects.

If the function's return value's second element is false, the value of the
first element is undefined.


'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind readnone

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
three intrinsics (besides ``llvm.experimental.patchpoint`` and
``llvm.experimental.gc.statepoint``) that can be called with an invoke
instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.

'``llvm.experimental.deoptimize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.

In languages with a fully integrated managed runtime like Java and JavaScript
this intrinsic can be used to implement "uncommon trap" or "side exit" like
functionality. In unmanaged languages like C and C++, this intrinsic can be
used to represent the slow paths of specialized functions.


Arguments:
""""""""""

The intrinsic takes an arbitrary number of arguments, whose meaning is
decided by the :ref:`lowering strategy<deoptimize_lowering>`.

Semantics:
""""""""""

The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
deoptimization continuation (denoted using a :ref:`deoptimization
operand bundle <deopt_opbundles>`) and returns the value returned by
the deoptimization continuation. Defining the semantic properties of
the continuation itself is out of scope of the language reference --
as far as LLVM is concerned, the deoptimization continuation can
invoke arbitrary side effects, including reading from and writing to
the entire heap.

Deoptimization continuations expressed using ``"deopt"`` operand bundles always
continue execution to the end of the physical frame containing them, so all
calls to ``@llvm.experimental.deoptimize`` must be in "tail position":

   - ``@llvm.experimental.deoptimize`` cannot be invoked.
   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
   - The ``ret`` instruction must return the value produced by the
     ``@llvm.experimental.deoptimize`` call if there is one, or void.

Note that the above restrictions imply that the return type for a call to
``@llvm.experimental.deoptimize`` will match the return type of its immediate
caller.
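
The tail-position rules above can be sketched as follows. The function name
``@fast_path`` and the contents of the ``"deopt"`` bundle are hypothetical. The
``.i32`` suffix selects the overload whose return type matches that of the
calling function.

.. code-block:: llvm

      declare i32 @llvm.experimental.deoptimize.i32(...)

      ; Hypothetical specialized function: handles non-negative inputs itself
      ; and deoptimizes for everything else.
      define i32 @fast_path(i32 %x) {
      entry:
        %ok = icmp sge i32 %x, 0
        br i1 %ok, label %fast, label %slow

      fast:
        %r = add i32 %x, 1
        ret i32 %r

      slow:
        ; The call is a plain call (not an invoke), immediately precedes the
        ; ret, and the ret returns the value the call produced.
        %d = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
        ret i32 %d
      }
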

The inliner composes the ``"deopt"`` continuations of the caller into the
``"deopt"`` continuations present in the inlinee, and also updates calls to this
intrinsic to return directly from the frame of the function it inlined into.

All declarations of ``@llvm.experimental.deoptimize`` must share the
same calling convention.

.. _deoptimize_lowering:

Lowering:
"""""""""

Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
ensure that this symbol is defined). The call arguments to
``@llvm.experimental.deoptimize`` are lowered as if they were formal
arguments of the specified types, and not as varargs.


'``llvm.experimental.guard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express guards or checks on
optimistic assumptions made during compilation. The semantics of
``@llvm.experimental.guard`` is defined in terms of
``@llvm.experimental.deoptimize`` -- its body is defined to be
equivalent to:

.. code-block:: text

  define void @llvm.experimental.guard(i1 %pred, <args...>) {
    %realPred = and i1 %pred, undef
    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]

  leave:
    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
    ret void

  continue:
    ret void
  }


with the optional ``[, !make.implicit !{}]`` present if and only if it
is present on the call site. For more details on ``!make.implicit``,
see :doc:`FaultMaps`.
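
For comparison with the defining body above, a call site might look like the
following sketch, in which a bounds check is expressed as a guard. The function
name ``@load_checked`` and the state recorded in the ``"deopt"`` bundle are
hypothetical.

.. code-block:: llvm

      declare void @llvm.experimental.guard(i1, ...)

      ; Hypothetical helper: load element %i of %p, deoptimizing when the
      ; index is out of bounds (or whenever the guard fails spuriously).
      define i32 @load_checked(i32* %p, i32 %i, i32 %len) {
      entry:
        %in.bounds = icmp ult i32 %i, %len
        call void (i1, ...) @llvm.experimental.guard(i1 %in.bounds) [ "deopt"(i32 %i) ]
        %addr = getelementptr inbounds i32, i32* %p, i32 %i
        %v = load i32, i32* %addr
        ret i32 %v
      }
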
14994 14995In words, ``@llvm.experimental.guard`` executes the attached 14996``"deopt"`` continuation if (but **not** only if) its first argument 14997is ``false``. Since the optimizer is allowed to replace the ``undef`` 14998with an arbitrary value, it can optimize guard to fail "spuriously", 14999i.e. without the original condition being false (hence the "not only 15000if"); and this allows for "check widening" type optimizations. 15001 15002``@llvm.experimental.guard`` cannot be invoked. 15003 15004 15005'``llvm.load.relative``' Intrinsic 15006^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15007 15008Syntax: 15009""""""" 15010 15011:: 15012 15013 declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly 15014 15015Overview: 15016""""""""" 15017 15018This intrinsic loads a 32-bit value from the address ``%ptr + %offset``, 15019adds ``%ptr`` to that value and returns it. The constant folder specifically 15020recognizes the form of this intrinsic and the constant initializers it may 15021load from; if a loaded constant initializer is known to have the form 15022``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``. 15023 15024LLVM provides that the calculation of such a constant initializer will 15025not overflow at link time under the medium code model if ``x`` is an 15026``unnamed_addr`` function. However, it does not provide this guarantee for 15027a constant initializer folded into a function body. This intrinsic can be 15028used to avoid the possibility of overflows when loading from such a constant. 15029 15030'``llvm.sideeffect``' Intrinsic 15031^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15032 15033Syntax: 15034""""""" 15035 15036:: 15037 15038 declare void @llvm.sideeffect() inaccessiblememonly nounwind 15039 15040Overview: 15041""""""""" 15042 15043The ``llvm.sideeffect`` intrinsic doesn't perform any operation. 
Optimizers 15044treat it as having side effects, so it can be inserted into a loop to 15045indicate that the loop shouldn't be assumed to terminate (which could 15046potentially lead to the loop being optimized away entirely), even if it's 15047an infinite loop with no other side effects. 15048 15049Arguments: 15050"""""""""" 15051 15052None. 15053 15054Semantics: 15055"""""""""" 15056 15057This intrinsic actually does nothing, but optimizers must assume that it 15058has externally observable side effects. 15059 15060Stack Map Intrinsics 15061-------------------- 15062 15063LLVM provides experimental intrinsics to support runtime patching 15064mechanisms commonly desired in dynamic language JITs. These intrinsics 15065are described in :doc:`StackMaps`. 15066 15067Element Wise Atomic Memory Intrinsics 15068------------------------------------- 15069 15070These intrinsics are similar to the standard library memory intrinsics except 15071that they perform memory transfer as a sequence of atomic memory accesses. 15072 15073.. _int_memcpy_element_unordered_atomic: 15074 15075'``llvm.memcpy.element.unordered.atomic``' Intrinsic 15076^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15077 15078Syntax: 15079""""""" 15080 15081This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on 15082any integer bit width and for different address spaces. Not all targets 15083support all bit widths however. 15084 15085:: 15086 15087 declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>, 15088 i8* <src>, 15089 i32 <len>, 15090 i32 <element_size>) 15091 declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>, 15092 i8* <src>, 15093 i64 <len>, 15094 i32 <element_size>) 15095 15096Overview: 15097""""""""" 15098 15099The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the 15100'``llvm.memcpy.*``' intrinsic. 
It differs in that the ``dest`` and ``src`` are treated
as arrays with elements that are exactly ``element_size`` bytes, and the copy between
buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
memory from the source location to the destination location. These locations are not
allowed to overlap. The memory copy is performed as a sequence of load/store operations
where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source and
destination provided those reads and writes are unordered atomic when specified.
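
As a minimal sketch of a call that satisfies the argument constraints of
'``llvm.memcpy.element.unordered.atomic.*``', the following copies 16 bytes as
four 4-byte elements. The function name ``@copy_four_words`` and the chosen
sizes are hypothetical and serve only as an illustration.

.. code-block:: llvm

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8*, i8*, i32, i32)

      ; Hypothetical caller, for illustration only.
      define void @copy_four_words(i8* %dst, i8* %src) {
      entry:
        ; len (16) is a positive multiple of element_size (4), and both
        ; pointers carry an align attribute no smaller than element_size.
        call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(
            i8* align 4 %dst, i8* align 4 %src, i32 16, i32 4)
        ret void
      }

Passing a ``len`` of, say, 18 with the same ``element_size`` would make the
behaviour of the call undefined, since 18 is not a multiple of 4.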

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memcpy.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``,
where '*' is replaced with the actual element size.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

'``llvm.memmove.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use
``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
different address spaces. Not all targets support all bit widths however.

::

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                        i8* <src>,
                                                                        i32 <len>,
                                                                        i32 <element_size>)
      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                        i8* <src>,
                                                                        i64 <len>,
                                                                        i32 <element_size>)

Overview:
"""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
``src`` are treated as arrays with elements that are exactly ``element_size``
bytes, and the copy between buffers uses a sequence of
:ref:`unordered atomic <ordering>` load/store operations that are a positive
integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the
:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
``len`` is required to be a positive integer multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of ``element_size``, then the
behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no
greater than a target-specific atomic access size limit.

For each of the input pointers the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``. The caller
guarantees that both the source and destination pointers are aligned to that
boundary.

Semantics:
""""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
of memory from the source location to the destination location. These locations
are allowed to overlap. The memory copy is performed as a sequence of load/store
operations where each access is guaranteed to be a multiple of ``element_size``
bytes wide and aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source
and destination provided those reads and writes are unordered atomic when
specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the
'``llvm.memmove.element.unordered.atomic.*``' intrinsic is lowered to a call to
the symbol ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced
with the actual element size.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

.. _int_memset_element_unordered_atomic:

'``llvm.memset.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
any integer bit width and for different address spaces. Not all targets
support all bit widths however.

::

      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
                                                                  i8 <value>,
                                                                  i32 <len>,
                                                                  i32 <element_size>)
      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
                                                                  i8 <value>,
                                                                  i64 <len>,
                                                                  i32 <element_size>)

Overview:
"""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.
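
A minimal sketch of a call that meets the constraints above, assuming a
4-byte element size: it zeroes 32 bytes as eight 4-byte elements. The function
name ``@zero_eight_words`` and the sizes are hypothetical.

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8*, i8, i32, i32)

      ; Hypothetical caller, for illustration only.
      define void @zero_eight_words(i8* %dst) {
      entry:
        ; len (32) is a positive multiple of element_size (4); the align
        ; attribute on dest is a power of two no less than element_size.
        call void @llvm.memset.element.unordered.atomic.p0i8.i32(
            i8* align 4 %dst, i8 0, i32 32, i32 4)
        ret void
      }

As with ``llvm.memset``, ``value`` is a single byte; each element written
consists of that byte repeated ``element_size`` times.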

Semantics:
""""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
memory starting at the destination location to the given ``value``. The memory is
set with a sequence of store operations where each access is guaranteed to be a
multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.

The order of the assignment is unspecified. Only one write is issued to the
destination buffer per element. It is well defined to have concurrent reads and
writes to the destination provided those reads and writes are unordered atomic
when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered stores to the destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memset.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``,
where '*' is replaced with the actual element size.

The optimizer is allowed to inline the memory assignment when it's profitable to do so.
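
To illustrate the symbol naming, a call with an ``element_size`` of 4 that is
not inlined could end up as a call to the symbol with a ``_4`` suffix. The
sketch below expresses this in IR purely for illustration. The parameter list
of the runtime function (destination, value, length) is an assumption made for
this example; this section specifies only the symbol name.

.. code-block:: llvm

      ; Assumed prototype of the runtime routine; not specified in this section.
      declare void @__llvm_memset_element_unordered_atomic_4(i8*, i8, i64)

      ; Hypothetical function showing what the lowered call might look like.
      define void @lowered_form(i8* %dst) {
      entry:
        call void @__llvm_memset_element_unordered_atomic_4(i8* %dst, i8 0, i64 32)
        ret void
      }

A runtime that supplies this symbol would need to perform the stores in
units of the element size to preserve the per-element atomicity described
under Semantics.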