*This document was originally written for a broad audience, and it was*
*determined that it'd be good to hold in Bionic's docs, too. Due to the*
*ever-changing nature of code, it tries to link to a stable tag of*
*Bionic's libc, rather than the live code in Bionic. Same for Clang.*
*Reader beware. :)*

# The Anatomy of Clang FORTIFY

## Objective

The intent of this document is to run through the minutiae of how Clang FORTIFY
actually works in Bionic at the time of writing. Other FORTIFY implementations
that target Clang should use very similar mechanics. This document exists in part
because many Clang-specific features serve multiple purposes simultaneously, so
getting up-to-speed on how things function can be quite difficult.

## Background

FORTIFY is a broad suite of extensions to libc aimed at catching misuses of
common library functions. Textually, these extensions exist purely in libc, but
all implementations of FORTIFY rely heavily on C language extensions in order
to function at all.

Broadly, FORTIFY implementations try to guard against many misuses of C
standard(-ish) libraries:
- Buffer overruns in functions where pointers+sizes are passed (e.g., `memcpy`,
  `poll`), or where sizes exist implicitly (e.g., `strcpy`).
- Arguments with incorrect values passed to libc functions (e.g.,
  out-of-bounds bits in `umask`).
- Missing arguments to functions (e.g., `open()` with `O_CREAT`, but no mode
  bits).

FORTIFY is traditionally enabled by passing `-D_FORTIFY_SOURCE=N` to your
compiler. `N==0` disables FORTIFY, whereas `N==1`, `N==2`, and `N==3` enable
increasingly strict versions of it. In general, FORTIFY doesn't require user
code changes; that said, some code patterns
are [incompatible with stricter versions of FORTIFY checking]. This is largely
because FORTIFY has significant flexibility in what it considers to be an
"out-of-bounds" access.
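
As a rough illustration of how the `N` in `-D_FORTIFY_SOURCE=N` is visible to
code, the sketch below reports the level the current translation unit was built
with. `example_fortify_level` is a hypothetical helper invented for this
example, not part of Bionic or any libc; it only inspects the macro and
performs no checking of its own.

```c
#include <assert.h>

/* Hypothetical helper (not part of any libc): reports the FORTIFY level
 * requested for this translation unit via -D_FORTIFY_SOURCE=N. */
static int example_fortify_level(void) {
#if defined(_FORTIFY_SOURCE)
  return _FORTIFY_SOURCE;
#else
  return 0; /* Macro not defined: FORTIFY was not requested. */
#endif
}
```

A libc header makes the same kind of preprocessor decision when it chooses
whether to expose its FORTIFY'ed declarations.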

FORTIFY implementations use a mix of compiler diagnostics and runtime checks to
flag and/or mitigate the impacts of the misuses mentioned above.

Further, given FORTIFY's design, the effectiveness of FORTIFY is a function of
-- among other things -- the optimization level you're compiling your code at.
Many FORTIFY implementations are implicitly disabled when building with `-O0`,
since FORTIFY's design for both Clang and GCC relies on optimizations in order
to provide useful run-time checks. For the purpose of this document, all
analysis of FORTIFY functions and commentary on builtins assume that code is
being built with some optimization level > `-O0`.

### A note on GCC

This document talks specifically about Bionic's FORTIFY implementation targeted
at Clang. While GCC also provides a set of language extensions necessary to
implement FORTIFY, these tools are different from what Clang offers. This
divergence is an artifact of Clang and GCC's differing architecture as
compilers.

Textually, quite a bit can be shared between a FORTIFY implementation for GCC
and one for Clang (e.g., see [ChromeOS' Glibc patch]), but this kind of sharing
requires things like macros that expand to unbalanced braces depending on your
compiler:

```c
/*
 * Highly simplified; if you're interested in FORTIFY's actual implementation,
 * please see the patch linked above.
 */
#ifdef __clang__
# define FORTIFY_PRECONDITIONS
# define FORTIFY_FUNCTION_END
#else
# define FORTIFY_PRECONDITIONS {
# define FORTIFY_FUNCTION_END }
#endif

/*
 * FORTIFY_WARNING_ONLY_IF_SIZE_OF_BUF_LESS_THAN is not defined, due to its
 * complexity and irrelevance. It turns into a compile-time warning if the
 * compiler can determine `*buf` has fewer than `size` bytes available.
 */

char *getcwd(char *buf, size_t size)
FORTIFY_PRECONDITIONS
  FORTIFY_WARNING_ONLY_IF_SIZE_OF_BUF_LESS_THAN(buf, size, "`buf` is too smol.")
{
  // Actual shared function implementation goes here.
}
FORTIFY_FUNCTION_END
```

All talk of GCC-focused implementations and how to merge Clang and GCC
implementations is out-of-scope for this doc, however.

## The Life of a Clang FORTIFY Function

As referenced in the Background section, FORTIFY performs many different checks
for many functions. This section intends to go through real-world examples of
FORTIFY functions in Bionic, breaking down how each part of these functions
works, and how the pieces fit together to provide FORTIFY-like functionality.

While FORTIFY implementations may differ between stdlibs, they broadly follow
the same patterns when implementing their checks for Clang, and they try to
make similar promises with respect to FORTIFY compiling to be zero-overhead in
some cases, etc. Moreover, while this document specifically examines Bionic,
many stdlibs will operate _very similarly_ to Bionic in their Clang FORTIFY
implementations.

**In general, when reading the below, be prepared for exceptions, subtlety, and
corner cases. The individual function breakdowns below try to not offer
redundant information. Each one focuses on different aspects of FORTIFY.**

### Terminology

Because FORTIFY should be mostly transparent to developers, there are inherent
naming collisions here: `memcpy(x, y, z)` turns into fundamentally different
generated code depending on the value of `_FORTIFY_SOURCE`. Further, said
`memcpy` call with `_FORTIFY_SOURCE` enabled needs to be able to refer to the
`memcpy` that would have been called, had `_FORTIFY_SOURCE` been disabled.
Hence, the following convention is followed in the subsections below for all
prose (namely, multiline code blocks are exempted from this):

- Standard library function names preceded by `__builtin_` refer to the use of
  the function with `_FORTIFY_SOURCE` disabled.
- Standard library function names without a prefix refer to the use of the
  function with `_FORTIFY_SOURCE` enabled.

This convention also applies in `clang`. `__builtin_memcpy` will always call
`memcpy` as though `_FORTIFY_SOURCE` were disabled.

## Breakdown of `mempcpy`

The [FORTIFY'ed version of `mempcpy`] is a full, featureful example of a
FORTIFY'ed function from Bionic. From the user's perspective, it supports a few
things:
- Producing a compile-time error if the number of bytes to copy trivially
  exceeds the number of bytes available at the destination pointer.
- If the `mempcpy` has the potential to write to more bytes than what is
  available at the destination, a run-time check is inserted to crash the
  program if more bytes are written than what is allowed.
- Compiling away to be zero overhead when none of the buffer sizes can be
  determined at compile-time[^1].

The declaration in Bionic's headers for `__builtin_mempcpy` is:
```c
void* mempcpy(void* __dst, const void* __src, size_t __n) __INTRODUCED_IN(23);
```

Which is annotated with nothing special, except for Bionic's versioner, which
is Android-specific (and orthogonal to FORTIFY anyway), so it will be ignored.

The [source for `mempcpy`] in Bionic's headers is:
```c
__BIONIC_FORTIFY_INLINE
void* mempcpy(void* const dst __pass_object_size0, const void* src, size_t copy_amount)
        __overloadable
        __clang_error_if(__bos_unevaluated_lt(__bos0(dst), copy_amount),
                         "'mempcpy' called with size bigger than buffer") {
#if __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    size_t bos_dst = __bos0(dst);
    if (!__bos_trivially_ge(bos_dst, copy_amount)) {
        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
    }
#endif
    return __builtin_mempcpy(dst, src, copy_amount);
}
```

Expanding some of the important macros here, this function expands to roughly:
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
void* mempcpy(
        void* const dst __attribute__((pass_object_size(0))),
        const void* src,
        size_t copy_amount)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
            __builtin_object_size(dst, 0) != -1 && __builtin_object_size(dst, 0) < copy_amount,
            "'mempcpy' called with size bigger than buffer",
            "error"))) {
#if __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    size_t bos_dst = __builtin_object_size(dst, 0);
    if (!(__bos_trivially_ge(bos_dst, copy_amount))) {
        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
    }
#endif
    return __builtin_mempcpy(dst, src, copy_amount);
}
```

So let's walk through this step by step, to see how FORTIFY does what it says on
the tin here.

[^1]: "Zero overhead" in a way [similar to C++11's `std::unique_ptr`]: this will
turn into a direct call to `__builtin_mempcpy` (or an optimized form thereof) with
no other surrounding checks at runtime. However, the additional complexity may
hinder optimizations that are performed before the optimizer can prove that the
`if (...) { ... }` can be optimized out.
Depending on how late this happens,
the additional complexity may skew inlining costs, hide opportunities for e.g.,
`memcpy` coalescing, etc.

### How does Clang select `mempcpy`?

First, it's critical to notice that `mempcpy` is marked `overloadable`. This
function is a `static inline __attribute__((always_inline))` overload of
`__builtin_mempcpy`:
- `__attribute__((overloadable))` allows us to perform overloading in C.
- `__attribute__((overloadable))` mangles all calls to functions marked with
  `__attribute__((overloadable))`.
- `__attribute__((overloadable))` allows exactly one function signature with a
  given name to not be marked with `__attribute__((overloadable))`. Calls to
  this overload will not be mangled.

Second, one might note that this `mempcpy` implementation has the same C-level
signature as `__builtin_mempcpy`. `pass_object_size` is a Clang attribute that
is generally needed by FORTIFY, but it carries the side-effect that functions
may be overloaded simply on the presence (or lack of presence) of
`pass_object_size` attributes. Given two overloads of a function that only
differ on the presence of `pass_object_size` attributes, the candidate with
`pass_object_size` attributes is preferred.

Finally, the prior paragraph gets thrown out if one tries to take the address of
`mempcpy`. It is impossible to take the address of a function with one or more
parameters that are annotated with `pass_object_size`. Hence,
`&__builtin_mempcpy == &mempcpy`. Further, because this is an issue of overload
resolution, `(&mempcpy)(x, y, z);` is functionally identical to
`__builtin_mempcpy(x, y, z);`.

All of this accomplishes the following:
- Direct calls to `mempcpy` should call the FORTIFY-protected `mempcpy`.
- Indirect calls to `&mempcpy` should call `__builtin_mempcpy`.

### How does Clang offer compile-time diagnostics?

Once one is convinced that the FORTIFY-enabled overload of `mempcpy` will be
selected for direct calls, Clang's `diagnose_if` and `__builtin_object_size` do
all of the work from there.

Subtleties here primarily fall out of the discussion in the above section about
`&__builtin_mempcpy == &mempcpy`:
```c
#define _FORTIFY_SOURCE 2
#include <string.h>
void example_code() {
  char buf[4]; // ...Assume sizeof(char) == 1.
  const char input_buf[] = "Hello, World";
  mempcpy(buf, input_buf, 4); // Valid, no diagnostic issued.

  mempcpy(buf, input_buf, 5); // Emits a compile-time error since sizeof(buf) < 5.
  __builtin_mempcpy(buf, input_buf, 5); // No compile-time error.
  (&mempcpy)(buf, input_buf, 5); // No compile-time error, since __builtin_mempcpy is selected.
}
```

Otherwise, the rest of this subsection is dedicated to preliminary discussion
about `__builtin_object_size`.

Clang's frontend can do one of two things with `__builtin_object_size(p, n)`:
- Evaluate it as a constant.
  - This can either mean declaring that the number of bytes at `p` is definitely
    impossible to know, so the default value is used, or the number of bytes at
    `p` can be known without optimizations.
- Declare that the expression cannot form a constant, and lower it to
  `@llvm.objectsize`, which is discussed in depth later.

In the examples above, since `diagnose_if` is evaluated with context from the
caller, Clang should be able to trivially determine that `buf` refers to a
`char` array with 4 elements.

The primary consequence of the above is that diagnostics can only be emitted if
no optimizations are required to detect a broken code pattern.
To be specific,
Clang's constexpr evaluator must be able to determine the logical object that
any given pointer points to in order to fold `__builtin_object_size` to a
constant, non-default answer:

```c
#define _FORTIFY_SOURCE 2
#include <string.h>
void example_code() {
  char buf[4]; // ...Assume sizeof(char) == 1.
  const char input_buf[] = "Hello, World";
  mempcpy(buf, input_buf, 4); // Valid, no diagnostic issued.
  mempcpy(buf, input_buf, 5); // Emits a compile-time error since sizeof(buf) < 5.
  char *buf_ptr = buf;
  mempcpy(buf_ptr, input_buf, 5); // No compile-time error; `buf_ptr`'s target can't be determined.
}
```

### How does Clang insert run-time checks?

This section expands on the following statement: FORTIFY has zero runtime cost
in instances where there is no chance of catching a bug at run-time. Otherwise,
it introduces a tiny additional run-time cost to ensure that functions aren't
misused.

In prior sections, the following was established:
- `overloadable` and `pass_object_size` prompt Clang to always select this
  overload of `mempcpy` over `__builtin_mempcpy` for direct calls.
- If a call to `mempcpy` was trivially broken, Clang would produce a
  compile-time error, rather than producing a binary.

Hence, the case we're interested in here is one where Clang's frontend selected
a FORTIFY'ed function's implementation for a function call, but was unable to
find anything seriously wrong with said function call. Since the frontend is
powerless to detect bugs at this point, our focus shifts to the mechanisms that
LLVM uses to support FORTIFY.
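
The difference between a size the compiler can see and one it cannot can be
observed directly with `__builtin_object_size`. The following is a minimal
sketch rather than Bionic code: all three function names are hypothetical, and
`noinline` is used only so that the origin of the pointer stays hidden from the
optimizer even when optimizations are enabled.

```c
#include <assert.h>
#include <stddef.h>

/* The array is named directly, so its size is known at this point. */
static size_t size_seen_directly(void) {
  char buf[8];
  return __builtin_object_size(buf, 0);
}

/* `p` could point anywhere; noinline keeps the optimizer from finding out. */
__attribute__((noinline))
static size_t size_seen_through_parameter(char *p) {
  return __builtin_object_size(p, 0);
}

/* Passes an 8-byte buffer through the opaque parameter above. */
static size_t size_of_hidden_buffer(void) {
  char buf[8] = {0};
  return size_seen_through_parameter(buf);
}
```

Here `size_seen_directly()` should report 8, while `size_of_hidden_buffer()`
should report `(size_t)-1`, the "unknown" answer for mode 0. A FORTIFY'ed
function has nothing to check in the second situation.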

Going back to Bionic's `mempcpy` implementation, we have the following (ignoring
`diagnose_if` and assuming run-time checks are enabled):
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
void* mempcpy(
        void* const dst __attribute__((pass_object_size(0))),
        const void* src,
        size_t copy_amount)
        __attribute__((overloadable)) {
    size_t bos_dst = __builtin_object_size(dst, 0);
    if (bos_dst != -1 &&
        !(__builtin_constant_p(copy_amount) && bos_dst >= copy_amount)) {
        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
    }
    return __builtin_mempcpy(dst, src, copy_amount);
}
```

In other words, we have a `static`, `always_inline` function which:
- If `__builtin_object_size(dst, 0)` cannot be determined (in which case, it
  returns -1), calls `__builtin_mempcpy`.
- Otherwise, if `copy_amount` can be folded to a constant, and if
  `__builtin_object_size(dst, 0) >= copy_amount`, calls `__builtin_mempcpy`.
- Otherwise, calls `__builtin___mempcpy_chk`.

How can this be "zero overhead"? Let's focus on the following part of the
function:

```c
    size_t bos_dst = __builtin_object_size(dst, 0);
    if (bos_dst != -1 &&
        !(__builtin_constant_p(copy_amount) && bos_dst >= copy_amount)) {
```

If Clang's frontend cannot determine a value for `__builtin_object_size`, Clang
lowers it to LLVM's `@llvm.objectsize` intrinsic. The `@llvm.objectsize`
invocation corresponding to `__builtin_object_size(p, 0)` is guaranteed to
always fold to a constant value by the time LLVM emits machine code.

Hence, `bos_dst` is guaranteed to be a constant; if it's -1, the above branch
can be eliminated entirely, since it folds to `if (false && ...)`. Further, the
RHS of the `&&` in this branch has us call `__builtin_mempcpy` if `copy_amount`
is a known value no greater than `bos_dst` (yet another constant value).
Therefore,
the entire condition is always knowable when LLVM is done with LLVM IR-level
optimizations, so no condition is ever emitted to machine code in practice.

#### Why is "zero overhead" in quotes? Why is `unique_ptr` relevant?

`__builtin_object_size` and `__builtin_constant_p` are forced to be constants
after most optimizations take place. Until LLVM replaces both of these with
constants and optimizes them out, we have additional branches and function calls
in our IR. This can have negative effects, such as distorting inlining costs and
inhibiting optimizations that are conservative around branches in control-flow.

So FORTIFY is free in these cases _in isolation of any of the code around it_.
Due to its implementation, it may impact the optimizations that occur on code
around the literal call to the FORTIFY-hardened libc function.

`unique_ptr` was just the first thing that came to the author's mind for "the
type should be zero cost with any level of optimization enabled, but edge-cases
might make it only-mostly-free to use."

### How is checking actually performed?

In cases where checking can be performed (e.g., where we call
`__builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);`), Bionic provides [an
implementation for `__mempcpy_chk`]. This is:

```c
extern "C" void* __mempcpy_chk(void* dst, const void* src, size_t count, size_t dst_len) {
  __check_count("mempcpy", "count", count);
  __check_buffer_access("mempcpy", "write into", count, dst_len);
  return mempcpy(dst, src, count);
}
```
This function itself boils down to a few small branches which abort the program
if they fail, and a direct call to `__builtin_mempcpy`.
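
To make "a few small branches" concrete, here is a minimal sketch of the shape
such a checked function can take. It is not Bionic's code: the `example_*`
names are hypothetical, `EXAMPLE_MAX_COUNT` is an assumed sanity limit chosen
for this illustration, and plain `memcpy` stands in for `mempcpy` so that the
sketch does not depend on a GNU extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for illustration: counts above this limit are treated as bogus. */
#define EXAMPLE_MAX_COUNT (SIZE_MAX / 2)

/* True when writing `count` bytes into a `dst_len`-byte buffer is acceptable. */
static bool example_write_is_safe(size_t count, size_t dst_len) {
  return count <= EXAMPLE_MAX_COUNT && count <= dst_len;
}

/* Hypothetical checked copy: abort on a bad request, else do the plain copy. */
static void *example_memcpy_chk(void *dst, const void *src, size_t count,
                                size_t dst_len) {
  if (!example_write_is_safe(count, dst_len)) {
    abort();
  }
  return memcpy(dst, src, count);
}
```

The `dst_len` argument plays the role of `bos_dst` above: it is a size the
compiler worked out, handed to the library so the comparison can happen at run
time.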

### Wrapping up

In the above breakdown, it was shown how Clang and Bionic work together to:
- represent FORTIFY-hardened overloads of functions,
- report misuses of stdlib functions at compile-time, and
- insert run-time checks for uses of functions that might be incorrect, but only
  if we have the potential of proving the incorrectness of these.

## Breakdown of open

In Bionic, the [FORTIFY'ed implementation of `open`] is quite large. Much like
`mempcpy`, the `__builtin_open` declaration is simple:

```c
int open(const char* __path, int __flags, ...);
```

With some macros expanded, the FORTIFY-hardened header implementation is:
```c
int __open_2(const char*, int);
int __open_real(const char*, int, ...) __asm__("open");

#define __open_modes_useful(flags) (((flags) & O_CREAT) || ((flags) & O_TMPFILE) == O_TMPFILE)

static
int open(const char* pathname, int flags, mode_t modes, ...) __overloadable
        __attribute__((diagnose_if(1, "too many arguments", "error")));

static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
            __open_modes_useful(flags),
            "'open' called with O_CREAT or O_TMPFILE, but missing mode",
            "error"))) {
#if __ANDROID_API__ >= 17 && __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
  return __open_2(pathname, flags);
#else
  return __open_real(pathname, flags);
#endif
}
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags, mode_t modes)
        __attribute__((overloadable))
        __clang_warning_if(!__open_modes_useful(flags) && modes,
                           "'open' has superfluous mode bits; missing O_CREAT?") {
  return __open_real(pathname, flags,
                     modes);
}
```

Which may be a lot to take in.

Before diving too deeply, please note that the remainder of these subsections
assume that the programmer didn't make any egregious typos. Moreover, there's no
real way that Bionic tries to prevent calls to `open` like
`open("foo", 0, "how do you convert a const char[N] to mode_t?");`. The only
real C-compatible solution the author can think of is "stamp out many overloads
to catch sort-of-common instances of this very uncommon typo." This isn't great.

More directly, no effort is made below to recognize calls that, due to
incompatible argument types, cannot go to any `open` implementation other than
`__builtin_open`, since it's recognized right here. :)

### Implementation breakdown

This `open` implementation does a few things:
- Turns calls to `open` with too many arguments into a compile-time error.
- Diagnoses calls to `open` with missing modes at compile-time and run-time
  (both cases turn into errors).
- Emits warnings on calls to `open` with useless mode bits, unless the mode bits
  are all 0.

One common bit of code not explained below is the `__open_real` declaration above:
```c
int __open_real(const char*, int, ...) __asm__("open");
```

This exists as a way for us to call `__builtin_open` without needing Clang to
have a pre-defined `__builtin_open` function.

#### Compile-time error on too many arguments

```c
static
int open(const char* pathname, int flags, mode_t modes, ...) __overloadable
        __attribute__((diagnose_if(1, "too many arguments", "error")));
```

This matches most calls to `open` that supply too many arguments, since
`int(const char *, int, ...)` matches less strongly than
`int(const char *, int, mode_t, ...)` for calls where the 3rd arg can be
converted to `mode_t` without too much effort.
Because of the `diagnose_if`
attribute, all of these calls turn into compile-time errors.

#### Compile-time or run-time error on missing arguments
The following overload handles all two-argument calls to `open`.
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
            __open_modes_useful(flags),
            "'open' called with O_CREAT or O_TMPFILE, but missing mode",
            "error"))) {
#if __ANDROID_API__ >= 17 && __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
  return __open_2(pathname, flags);
#else
  return __open_real(pathname, flags);
#endif
}
```

Like `mempcpy`, `diagnose_if` handles emitting a compile-time error if the call
to `open` is broken in a way that's visible to Clang's frontend. This
essentially boils down to "`open` is being called with a `flags` value that
requires mode bits to be set."
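
The "requires mode bits" question can be written as an ordinary function, which
makes the `== O_TMPFILE` comparison easier to see. This is a hedged sketch that
mirrors the `__open_modes_useful` macro shown earlier; `example_open_needs_mode`
is a hypothetical name, not a Bionic function.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "does this call need a mode?" predicate. */
static bool example_open_needs_mode(int flags) {
  if (flags & O_CREAT) {
    return true;
  }
#ifdef O_TMPFILE
  /* On Linux, O_TMPFILE includes the O_DIRECTORY bit, so compare all bits
   * rather than testing for any overlap. */
  if ((flags & O_TMPFILE) == O_TMPFILE) {
    return true;
  }
#endif
  return false;
}
```

Because the predicate depends only on `flags`, the same condition can be used
for a compile-time diagnostic when `flags` is a constant and for a run-time
check when it is not.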

If the compile-time diagnostic fails to catch a bug, we
[unconditionally call `__open_2`], which performs a run-time check:
```c
int __open_2(const char* pathname, int flags) {
  if (needs_mode(flags)) __fortify_fatal("open: called with O_CREAT/O_TMPFILE but no mode");
  return FDTRACK_CREATE_NAME("open", __openat(AT_FDCWD, pathname, force_O_LARGEFILE(flags), 0));
}
```

#### Compile-time warning if modes are pointless

Finally, we have the following `open` overload:
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags, mode_t modes)
        __attribute__((overloadable))
        __clang_warning_if(!__open_modes_useful(flags) && modes,
                           "'open' has superfluous mode bits; missing O_CREAT?") {
  return __open_real(pathname, flags, modes);
}
```

This simply issues a warning if Clang's frontend can determine that `modes`
isn't necessary. Due to conventions in existing code, a `modes` value of `0` is
not diagnosed.

#### What about `&open`?
One yet-unaddressed aspect of the above is how `&open` works. This is thankfully
a short answer:
- It happens that `open` takes a parameter of type `const char*`.
- It happens that `pass_object_size` -- an attribute only applicable to
  parameters of type `T*` -- makes it impossible to take the address of a
  function.

Since Clang doesn't support a "this function should never have its address
taken" attribute, Bionic uses the next best thing: `pass_object_size`. :)

## Breakdown of poll

(Preemptively: at the time of writing, Clang has no literal `__builtin_poll`
builtin. `__builtin_poll` is referenced below to remain consistent with the
convention established in the Terminology section.)

Bionic's `poll` implementation is closest to `mempcpy` above, though it has a
few interesting aspects worth examining.

The [full header implementation of `poll`] is, with some macros expanded:
```c
#define __bos_fd_count_trivially_safe(bos_val, fds, fd_count) \
  (((bos_val) == -1) ||                                       \
   (__builtin_constant_p(fd_count) &&                         \
    (bos_val) >= sizeof(*fds) * (fd_count)))

static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int poll(struct pollfd* const fds __attribute__((pass_object_size(1))), nfds_t fd_count, int timeout)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
            __builtin_object_size(fds, 1) != -1 && __builtin_object_size(fds, 1) < sizeof(*fds) * fd_count,
            "in call to 'poll', fd_count is larger than the given buffer",
            "error"))) {
  size_t bos_fds = __builtin_object_size(fds, 1);
  if (!__bos_fd_count_trivially_safe(bos_fds, fds, fd_count)) {
    return __poll_chk(fds, fd_count, timeout, bos_fds);
  }
  return (&poll)(fds, fd_count, timeout);
}
```

To get the commonality with `mempcpy` and `open` out of the way:
- This function is an overload of `__builtin_poll`.
- The signature is the same, modulo the presence of a `pass_object_size`
  attribute. Hence, for direct calls, overload resolution will always prefer it
  over `__builtin_poll`. Taking the address of `poll` is forbidden, so all
  references to `&poll` actually reference `__builtin_poll`.
- When `fds` is too small to hold `fd_count` `pollfd`s, Clang will emit a
  compile-time error if possible using `diagnose_if`.
- If this can't be observed until run-time, `__poll_chk` verifies this.
- When `fd_count` is a constant according to `__builtin_constant_p`, this always
  compiles into `__poll_chk` for always-broken calls to `poll`, or
  `__builtin_poll` for always-safe calls to `poll`.
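
The size comparison at the heart of these checks can be modeled as a plain
function. The sketch below is an illustration only: `example_fd_count_fits` is
a hypothetical name, and the multiplication-overflow guard is an addition made
for this sketch, not something the macro above performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the question being asked: do `fd_count` elements of
 * `elem_size` bytes fit in a buffer known to hold `buf_size` bytes? */
static bool example_fd_count_fits(size_t buf_size, size_t elem_size,
                                  size_t fd_count) {
  if (buf_size == (size_t)-1) {
    return true; /* Size unknown: nothing can be checked. */
  }
  if (elem_size != 0 && fd_count > SIZE_MAX / elem_size) {
    return false; /* elem_size * fd_count would overflow size_t. */
  }
  return buf_size >= elem_size * fd_count;
}
```

For instance, a 16-byte buffer holds two 8-byte elements but not three, and an
unknown buffer size always passes because there is nothing to compare against.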

The critical bits of the `poll` implementation to highlight are on this line:
```c
int poll(struct pollfd* const fds __attribute__((pass_object_size(1))), nfds_t fd_count, int timeout)
```

And this line:
```c
  return (&poll)(fds, fd_count, timeout);
```

Starting with the simplest, we call `__builtin_poll` with `(&poll)(...);`. As
referenced above, taking the address of an overloaded function where all but one
overload has a `pass_object_size` attribute on one or more parameters always
resolves to the function without any `pass_object_size` attributes.

The other line deserves a section. The subtlety of it is almost entirely in the
use of `pass_object_size(1)` instead of `pass_object_size(0)` on the `fds`
parameter, and the corresponding use of `__builtin_object_size(fds, 1);` in the
body of `poll`.

### Subtleties of __builtin_object_size(p, N)

Earlier in this document, it was said that a full description of each
attribute/builtin necessary to power FORTIFY was out of scope. This is... only
somewhat the case when we talk about `__builtin_object_size` and
`pass_object_size`, especially when their second argument is `1`.

#### tl;dr
`__builtin_object_size(p, N)` and `pass_object_size(N)`, where `(N & 1) == 1`,
can only be accurately determined by Clang. LLVM's `@llvm.objectsize` intrinsic
ignores the value of `N & 1`, since handling `(N & 1) == 1` accurately requires
data that's currently entirely inaccessible to LLVM, and that is difficult to
preserve through LLVM's optimization passes.

`pass_object_size`'s "lifting" of the evaluation of
`__builtin_object_size(p, N)` to the caller is critical, since it allows Clang
full visibility into the expression passed to e.g., `poll(&foo->bar, baz, qux)`.
It's not a perfect solution, but it allows `N == 1` to be fully accurate in at
least some cases.
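
The following is a minimal sketch of what the low bit of `N` changes, in a case
simple enough for the compiler's frontend to answer on its own. The struct and
function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct ExamplePair {
  int a;
  int b;
};

static struct ExamplePair example_pair;

/* Mode 0: bytes from `&example_pair.a` to the end of the whole variable. */
static size_t example_whole_object_size(void) {
  return __builtin_object_size(&example_pair.a, 0);
}

/* Mode 1: bytes left in the closest surrounding subobject, the field `a`. */
static size_t example_subobject_size(void) {
  return __builtin_object_size(&example_pair.a, 1);
}
```

The first function should report `sizeof(struct ExamplePair)` and the second
`sizeof(int)`. Telling the two apart requires knowing that the pointer refers
to a field, which is type information the frontend has.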

#### Background
Clang's implementation of `__builtin_object_size` aims to be compatible with
GCC's, which has [a decent bit of documentation]. Put simply,
`__builtin_object_size(p, N)` is intended to evaluate at compile-time how many
bytes can be accessed after `p` in a well-defined way. Straightforward examples
of this are:
```c
char buf[8];
assert(__builtin_object_size(buf, N) == 8);
assert(__builtin_object_size(buf + 1, N) == 7);
```

This should hold for all values of N that are valid to pass to
`__builtin_object_size`. The `N` value of `__builtin_object_size` is a mask of
settings.

##### (N & 2) == ?

This is mostly for completeness' sake; in Bionic's FORTIFY implementation, N is
always either 0 or 1.

If there are multiple possible values of `p` in a call to
`__builtin_object_size(p, N)`, the second bit in `N` determines the behavior of
the compiler. If `(N & 2) == 0`, `__builtin_object_size` should return the
greatest possible size across all possible values of `p`. Otherwise, it should
return the least possible size. For example:

```c
char smol_buf[7];
char buf[8];
char *p = rand() ? smol_buf : buf;
assert(__builtin_object_size(p, 0) == 8);
assert(__builtin_object_size(p, 2) == 7);
```

##### (N & 1) == 0

`__builtin_object_size(p, 0)` is more or less as simple as the example in the
Background section directly above. When Clang attempts to evaluate
`__builtin_object_size(p, 0);` and when LLVM tries to determine the result of a
corresponding `@llvm.objectsize` call, they search for the storage underlying
the pointer in question. If that can be determined, Clang or LLVM can provide an
answer; otherwise, they cannot.

##### (N & 1) == 1, and the true magic of pass_object_size

`__builtin_object_size(p, 1)` has a less uniform implementation between LLVM and
Clang.
According to GCC's documentation, "If the least significant bit [of
`__builtin_object_size`'s second argument] is clear, objects are whole variables,
if it is set, a closest surrounding subobject is considered the object a pointer
points to."

The "closest surrounding subobject" wording means that `(N & 1) == 1` depends on
type information in order to operate in many cases. Consider the following examples:
```c
struct Foo {
  int a;
  int b;
};

struct Foo foo;
assert(__builtin_object_size(&foo, 0) == sizeof(foo));
assert(__builtin_object_size(&foo, 1) == sizeof(foo));
assert(__builtin_object_size(&foo.a, 0) == sizeof(foo));
assert(__builtin_object_size(&foo.a, 1) == sizeof(int));

struct Foo foos[2];
assert(__builtin_object_size(&foos[0], 0) == 2 * sizeof(foo));
assert(__builtin_object_size(&foos[0], 1) == sizeof(foo));
assert(__builtin_object_size(&foos[0].a, 0) == 2 * sizeof(foo));
assert(__builtin_object_size(&foos[0].a, 1) == sizeof(int));
```

...And perhaps somewhat surprisingly:
```c
void example(struct Foo *foo) {
  // (As a reminder, `-1` is "I don't know" when `(N & 2) == 0`.)
  assert(__builtin_object_size(foo, 0) == -1);
  assert(__builtin_object_size(foo, 1) == -1);
  assert(__builtin_object_size(&foo->a, 0) == -1);
  assert(__builtin_object_size(&foo->a, 1) == sizeof(int));
}
```

In Clang, [this type-aware requirement poses problems for us]: Clang's frontend
knows everything we could possibly want about the types of variables, but
optimizations are only performed by LLVM. LLVM has no reliable source for C or
C++ data types, so calls to `__builtin_object_size(p, N)` that cannot be
resolved by Clang are lowered to the equivalent of
`__builtin_object_size(p, N & ~1)` in LLVM IR.

Moreover, Clang's frontend is the best-equipped part of the compiler to
accurately determine the answer for `__builtin_object_size(p, N)`, given we know
what `p` is. LLVM is the best-equipped part of the compiler to determine the
value of `p`. This ordering issue is unfortunate.

This is where `pass_object_size(N)` comes in. To summarize [the docs for
`pass_object_size`], it evaluates `__builtin_object_size(p, N)` within the
context of the caller of the function annotated with `pass_object_size`, and
passes the value of that into the callee as an invisible parameter. All calls to
`__builtin_object_size(parameter, N)` are substituted with references to this
invisible parameter.

Putting this plainly, Clang's frontend struggles to evaluate the following:
```c
int foo(void *p) {
  return __builtin_object_size(p, 1);
}

void bar() {
  struct { int i, j; } k;
  // The frontend can't figure this interprocedural objectsize out, so it gets lowered to
  // LLVM, which determines that the answer here is sizeof(k).
  int baz = foo(&k.i);
}
```

However, with the magic of `pass_object_size`, we get one level of inlining to
look through:
```c
int foo(void *const __attribute__((pass_object_size(1))) p) {
  return __builtin_object_size(p, 1);
}

void bar() {
  struct { int i, j; } k;
  // Due to pass_object_size, this is equivalent to:
  // int baz = foo(&k.i, __builtin_object_size(&k.i, 1));
  // ...and `int foo(void *)` is actually equivalent to:
  // int foo(void *const, size_t size) {
  //   return size;
  // }
  int baz = foo(&k.i);
}
```

So we can obtain an accurate result in this case.

##### What about pass_object_size(0)?
It's sort of tangential, but if you find yourself wondering about the utility of
`pass_object_size(0)` ... it's somewhat split.
`pass_object_size(0)` in Bionic's
FORTIFY exists mostly for visual consistency, simplicity, and as a useful way to
have e.g., `&mempcpy` == `&__builtin_mempcpy`.

Outside of these fringe benefits, all of the functions with
`pass_object_size(0)` on parameters are marked with `always_inline`, so
"lifting" the `__builtin_object_size` call isn't ultimately very helpful. In
theory, users can always have something like:

```c
// In some_header.h
// This function does cool and interesting things with the `__builtin_object_size` of its parameter,
// and is able to work with that as though the function were defined inline.
void out_of_line_function(void *__attribute__((pass_object_size(0))));
```

Though the author isn't aware of uses like this in practice, beyond a few folks
on LLVM's mailing list seeming interested in trying it someday.

#### Wrapping up
In the (long) section above, two things were covered:
- The use of `(&poll)(...);` is a convenient shorthand for calling
  `__builtin_poll`.
- `__builtin_object_size(p, N)` with `(N & 1) == 1` is not easy for Clang to
  answer accurately, since it relies on type info only available in the
  frontend, and it sometimes relies on optimizations only available in the
  middle-end. `pass_object_size` helps mitigate this.

## Miscellaneous Notes
The above should be a roughly comprehensive view of how FORTIFY works in the
real world. The main thing it fails to mention is the use of [the `diagnose_as_builtin` attribute] in Clang.

As time has moved on, Clang has increasingly gained support for emitting
warnings that were previously emitted by FORTIFY machinery.
`diagnose_as_builtin` allows us to remove the `diagnose_if`s from some of the
`static inline` overloads of stdlib functions above, so Clang may diagnose them
instead.
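
As a rough illustration of the attribute's shape (not a copy of anything in
Bionic), the sketch below annotates a hypothetical wrapper, `fill_bytes`, whose
parameters are `memset`'s in reverse order. The index list says which of the
wrapper's parameters, counted from 1, correspond to each of `memset`'s
parameters in order. `DIAGNOSE_AS_MEMSET`, `fill_bytes`, and `fill_demo` are
names invented here, and the `__has_attribute` guard is an assumption made so
the snippet still builds on compilers that lack the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Hypothetical helper macro; expands to nothing where the attribute is unsupported. */
#if __has_attribute(diagnose_as_builtin)
#define DIAGNOSE_AS_MEMSET(dst_idx, val_idx, len_idx) \
  __attribute__((diagnose_as_builtin(__builtin_memset, dst_idx, val_idx, len_idx)))
#else
#define DIAGNOSE_AS_MEMSET(dst_idx, val_idx, len_idx)
#endif

/*
 * Hypothetical wrapper. memset's (dst, value, len) are this function's
 * parameters 3, 2, and 1 respectively.
 */
DIAGNOSE_AS_MEMSET(3, 2, 1)
void *fill_bytes(size_t len, int value, void *dst) {
  return memset(dst, value, len);
}

int fill_demo(void) {
  char buf[4];
  int filled = 0;

  /* An in-bounds call: nothing for the compiler to complain about. */
  fill_bytes(sizeof(buf), 'x', buf);

  for (size_t i = 0; i < sizeof(buf); i++)
    filled += (buf[i] == 'x');
  assert(filled == 4);
  return filled;
}
```

With the attribute in place, a call such as `fill_bytes(sizeof(buf) + 1, 'x', buf)`
becomes a candidate for the same overflow warning Clang would give the
equivalent `memset` call, without any `diagnose_if` on the wrapper.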

Clang's built-in diagnostics are often better than `diagnose_if` diagnostics,
since Clang can format its diagnostics to include e.g., information about the
sizes of buffers in a suspect call to a function. `diagnose_if` can only have
the compiler output constant strings.

[ChromeOS' Glibc patch]: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/90fa9b27731db10a6010c7f7c25b24028145b091/sys-libs/glibc/files/local/glibc-2.33/0007-glibc-add-clang-style-FORTIFY.patch
[FORTIFY'ed implementation of `open`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/fcntl.h#41
[FORTIFY'ed version of `mempcpy`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/string.h#45
[a decent bit of documentation]: https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
[an implementation for `__mempcpy_chk`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/bionic/fortify.cpp#501
[full header implementation of `poll`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/poll.h#43
[incompatible with stricter versions of FORTIFY checking]: https://godbolt.org/z/fGfEYxfnf
[similar to C++11's `std::unique_ptr`]: https://stackoverflow.com/questions/58339165/why-can-a-t-be-passed-in-register-but-a-unique-ptrt-cannot
[source for `mempcpy`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/string.h#55
[the `diagnose_as_builtin` attribute]: https://releases.llvm.org/14.0.0/tools/clang/docs/AttributeReference.html#diagnose-as-builtin
[the docs for `pass_object_size`]: https://releases.llvm.org/14.0.0/tools/clang/docs/AttributeReference.html#pass-object-size-pass-dynamic-object-size
[this type-aware requirement poses problems for us]:
https://github.com/llvm/llvm-project/issues/55742
[unconditionally call `__open_2`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/bionic/open.cpp#70