1*This document was originally written for a broad audience, and it was*
2*determined that it'd be good to hold in Bionic's docs, too. Due to the*
3*ever-changing nature of code, it tries to link to a stable tag of*
4*Bionic's libc, rather than the live code in Bionic. Same for Clang.*
5*Reader beware. :)*
6
7# The Anatomy of Clang FORTIFY
8
9## Objective
10
11The intent of this document is to run through the minutiae of how Clang FORTIFY
12actually works in Bionic at the time of writing. Other FORTIFY implementations
13that target Clang should use very similar mechanics. This document exists in part
14because many Clang-specific features serve multiple purposes simultaneously, so
15getting up-to-speed on how things function can be quite difficult.
16
17## Background
18
19FORTIFY is a broad suite of extensions to libc aimed at catching misuses of
20common library functions. Textually, these extensions exist purely in libc, but
21all implementations of FORTIFY rely heavily on C language extensions in order
22to function at all.
23
24Broadly, FORTIFY implementations try to guard against many misuses of C
25standard(-ish) libraries:
26- Buffer overruns in functions where pointers+sizes are passed (e.g., `memcpy`,
27  `poll`), or where sizes exist implicitly (e.g., `strcpy`).
28- Arguments with incorrect values passed to libc functions (e.g.,
29  out-of-bounds bits in `umask`).
30- Missing arguments to functions (e.g., `open()` with `O_CREAT`, but no mode
31  bits).
32
33FORTIFY is traditionally enabled by passing `-D_FORTIFY_SOURCE=N` to your
34compiler. `N==0` disables FORTIFY, whereas `N==1`, `N==2`, and `N==3` enable
35increasingly strict versions of it. In general, FORTIFY doesn't require user
36code changes; that said, some code patterns
37are [incompatible with stricter versions of FORTIFY checking]. This is largely
38because FORTIFY has significant flexibility in what it considers to be an
39"out-of-bounds" access.
40
41FORTIFY implementations use a mix of compiler diagnostics and runtime checks to
42flag and/or mitigate the impacts of the misuses mentioned above.
43
44Further, given FORTIFY's design, the effectiveness of FORTIFY is a function of
45-- among other things -- the optimization level you're compiling your code at.
46Many FORTIFY implementations are implicitly disabled when building with `-O0`,
47since FORTIFY's design for both Clang and GCC relies on optimizations in order
48to provide useful run-time checks. For the purpose of this document, all
49analysis of FORTIFY functions and commentary on builtins assume that code is
50being built with some optimization level > `-O0`.
51
52### A note on GCC
53
54This document talks specifically about Bionic's FORTIFY implementation targeted
55at Clang. While GCC also provides a set of language extensions necessary to
56implement FORTIFY, these tools are different from what Clang offers. This
57divergence is an artifact of Clang and GCC's differing architecture as
58compilers.
59
60Textually, quite a bit can be shared between a FORTIFY implementation for GCC
61and one for Clang (e.g., see [ChromeOS' Glibc patch]), but this kind of sharing
62requires things like macros that expand to unbalanced braces depending on your
63compiler:
64
65```c
66/*
67 * Highly simplified; if you're interested in FORTIFY's actual implementation,
68 * please see the patch linked above.
69 */
70#ifdef __clang__
71# define FORTIFY_PRECONDITIONS
72# define FORTIFY_FUNCTION_END
73#else
74# define FORTIFY_PRECONDITIONS {
75# define FORTIFY_FUNCTION_END }
76#endif
77
78/*
79 * FORTIFY_WARNING_ONLY_IF_SIZE_OF_BUF_LESS_THAN is not defined, due to its
80 * complexity and irrelevance. It turns into a compile-time warning if the
81 * compiler can determine `*buf` has fewer than `size` bytes available.
82 */
83
84char *getcwd(char *buf, size_t size)
85FORTIFY_PRECONDITIONS
86  FORTIFY_WARNING_ONLY_IF_SIZE_OF_BUF_LESS_THAN(buf, size, "`buf` is too smol.")
87{
88  // Actual shared function implementation goes here.
89}
90FORTIFY_FUNCTION_END
91```
92
93All talk of GCC-focused implementations and how to merge Clang and GCC
94implementations is out-of-scope for this doc, however.
95
96## The Life of a Clang FORTIFY Function
97
98As referenced in the Background section, FORTIFY performs many different checks
99for many functions. This section intends to go through real-world examples of
FORTIFY functions in Bionic, breaking down how each part of these functions
works, and how the pieces fit together to provide FORTIFY-like functionality.
102
103While FORTIFY implementations may differ between stdlibs, they broadly follow
104the same patterns when implementing their checks for Clang, and they try to
105make similar promises with respect to FORTIFY compiling to be zero-overhead in
106some cases, etc. Moreover, while this document specifically examines Bionic,
107many stdlibs will operate _very similarly_ to Bionic in their Clang FORTIFY
108implementations.
109
110**In general, when reading the below, be prepared for exceptions, subtlety, and
111corner cases. The individual function breakdowns below try to not offer
112redundant information. Each one focuses on different aspects of FORTIFY.**
113
114### Terminology
115
116Because FORTIFY should be mostly transparent to developers, there are inherent
117naming collisions here: `memcpy(x, y, z)` turns into fundamentally different
118generated code depending on the value of `_FORTIFY_SOURCE`. Further, said
119`memcpy` call with `_FORTIFY_SOURCE` enabled needs to be able to refer to the
120`memcpy` that would have been called, had `_FORTIFY_SOURCE` been disabled.
121Hence, the following convention is followed in the subsections below for all
122prose (namely, multiline code blocks are exempted from this):
123
124- Standard library function names preceded by `__builtin_` refer to the use of
125  the function with `_FORTIFY_SOURCE` disabled.
126- Standard library function names without a prefix refer to the use of the
127  function with `_FORTIFY_SOURCE` enabled.
128
129This convention also applies in `clang`. `__builtin_memcpy` will always call
130`memcpy` as though `_FORTIFY_SOURCE` were disabled.
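
As a small illustration of this convention, the sketch below calls
`__builtin_memcpy` directly, so the copy is not routed through any FORTIFY
wrapper, whatever the value of `_FORTIFY_SOURCE`. The helper names
(`copy_greeting`, `copy_greeting_example`) are hypothetical and exist only for
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies a short greeting into `out`, which must have
 * room for at least 6 bytes. */
static void copy_greeting(char *out) {
    static const char greeting[] = "hello";
    /* Bypasses any FORTIFY'ed `memcpy` wrapper; no checks are added here. */
    __builtin_memcpy(out, greeting, sizeof(greeting));
}

static void copy_greeting_example(void) {
    char buf[8];
    copy_greeting(buf);
    assert(strcmp(buf, "hello") == 0);
}
```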
131
132## Breakdown of `mempcpy`
133
134The [FORTIFY'ed version of `mempcpy`] is a full, featureful example of a
135FORTIFY'ed function from Bionic. From the user's perspective, it supports a few
136things:
137- Producing a compile-time error if the number of bytes to copy trivially
138  exceeds the number of bytes available at the destination pointer.
139- If the `mempcpy` has the potential to write to more bytes than what is
140  available at the destination, a run-time check is inserted to crash the
141  program if more bytes are written than what is allowed.
142- Compiling away to be zero overhead when none of the buffer sizes can be
143  determined at compile-time[^1].
144
145The declaration in Bionic's headers for `__builtin_mempcpy` is:
146```c
147void* mempcpy(void* __dst, const void* __src, size_t __n) __INTRODUCED_IN(23);
148```
149
150Which is annotated with nothing special, except for Bionic's versioner, which
151is Android-specific (and orthogonal to FORTIFY anyway), so it will be ignored.
152
The [source for `mempcpy`] in Bionic's headers is:
154```c
155__BIONIC_FORTIFY_INLINE
156void* mempcpy(void* const dst __pass_object_size0, const void* src, size_t copy_amount)
157        __overloadable
158        __clang_error_if(__bos_unevaluated_lt(__bos0(dst), copy_amount),
159                         "'mempcpy' called with size bigger than buffer") {
160#if __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
161    size_t bos_dst = __bos0(dst);
162    if (!__bos_trivially_ge(bos_dst, copy_amount)) {
163        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
164    }
165#endif
166    return __builtin_mempcpy(dst, src, copy_amount);
167}
168```
169
170Expanding some of the important macros here, this function expands to roughly:
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
void* mempcpy(
        void* const dst __attribute__((pass_object_size(0))),
        const void* src,
        size_t copy_amount)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
            __builtin_object_size(dst, 0) != -1 && __builtin_object_size(dst, 0) < copy_amount,
            "'mempcpy' called with size bigger than buffer",
            "error"))) {
#if __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    size_t bos_dst = __builtin_object_size(dst, 0);
    if (!(__bos_trivially_ge(bos_dst, copy_amount))) {
        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
    }
#endif
    return __builtin_mempcpy(dst, src, copy_amount);
}
```
193
194So let's walk through this step by step, to see how FORTIFY does what it says on
195the tin here.
196
[^1]: "Zero overhead" in a way [similar to C++11's `std::unique_ptr`]: this will
turn into a direct call to `__builtin_mempcpy` (or an optimized form thereof)
with no other surrounding checks at runtime. However, the additional complexity
may hinder optimizations that are performed before the optimizer can prove that
the `if (...) { ... }` can be optimized out. Depending on how late this happens,
the additional complexity may skew inlining costs, hide opportunities for e.g.,
`memcpy` coalescing, etc.
204
205### How does Clang select `mempcpy`?
206
207First, it's critical to notice that `mempcpy` is marked `overloadable`. This
208function is a `static inline __attribute__((always_inline))` overload of
209`__builtin_mempcpy`:
210- `__attribute__((overloadable))` allows us to perform overloading in C.
211- `__attribute__((overloadable))` mangles all calls to functions marked with
212  `__attribute__((overloadable))`.
213- `__attribute__((overloadable))` allows exactly one function signature with a
214  given name to not be marked with `__attribute__((overloadable))`. Calls to
215  this overload will not be mangled.
216
217Second, one might note that this `mempcpy` implementation has the same C-level
218signature as `__builtin_mempcpy`. `pass_object_size` is a Clang attribute that
219is generally needed by FORTIFY, but it carries the side-effect that functions
220may be overloaded simply on the presence (or lack of presence) of
221`pass_object_size` attributes. Given two overloads of a function that only
222differ on the presence of `pass_object_size` attributes, the candidate with
223`pass_object_size` attributes is preferred.
224
225Finally, the prior paragraph gets thrown out if one tries to take the address of
226`mempcpy`. It is impossible to take the address of a function with one or more
227parameters that are annotated with `pass_object_size`. Hence,
228`&__builtin_mempcpy == &mempcpy`. Further, because this is an issue of overload
229resolution, `(&mempcpy)(x, y, z);` is functionally identical to
230`__builtin_mempcpy(x, y, z);`.
231
232All of this accomplishes the following:
233- Direct calls to `mempcpy` should call the FORTIFY-protected `mempcpy`.
234- Indirect calls to `&mempcpy` should call `__builtin_mempcpy`.
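
The selection rules above can be sketched with a toy function. Everything named
`toy_*` is hypothetical and is not part of Bionic; the checked overload is only
compiled on Clang, since `overloadable` and `pass_object_size` are
Clang-specific, and `__builtin_trap()` merely stands in for a real failure
handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The single declaration that is allowed to lack `overloadable`; it plays the
 * role of `__builtin_mempcpy` in the prose above. */
void *toy_copy(void *dst, const void *src, size_t n) {
    return memcpy(dst, src, n);
}

#if defined(__clang__)
/* Checked overload: same C-level signature, plus `pass_object_size`. Direct
 * calls prefer this one; `&toy_copy` can only name the plain function. */
static inline __attribute__((always_inline, overloadable))
void *toy_copy(void *const dst __attribute__((pass_object_size(0))),
               const void *src, size_t n) {
    size_t dst_size = __builtin_object_size(dst, 0);
    if (dst_size != (size_t)-1 && dst_size < n) {
        __builtin_trap(); /* stand-in for a real FORTIFY failure handler */
    }
    return (&toy_copy)(dst, src, n);
}
#endif

static void toy_copy_example(void) {
    char buf[4];
    toy_copy(buf, "abc", 4);     /* direct call: checked overload on Clang */
    assert(strcmp(buf, "abc") == 0);
    (&toy_copy)(buf, "xyz", 4);  /* address-of: always the plain function */
    assert(strcmp(buf, "xyz") == 0);
}
```

Note that the unmarked declaration comes first here, mirroring how libc's plain
declaration precedes the FORTIFY'ed overload in Bionic's headers.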
235
236### How does Clang offer compile-time diagnostics?
237
238Once one is convinced that the FORTIFY-enabled overload of `mempcpy` will be
239selected for direct calls, Clang's `diagnose_if` and `__builtin_object_size` do
240all of the work from there.
241
242Subtleties here primarily fall out of the discussion in the above section about
243`&__builtin_mempcpy == &mempcpy`:
244```c
245#define _FORTIFY_SOURCE 2
246#include <string.h>
247void example_code() {
248  char buf[4]; // ...Assume sizeof(char) == 1.
249  const char input_buf[] = "Hello, World";
250  mempcpy(buf, input_buf, 4); // Valid, no diagnostic issued.
251
252  mempcpy(buf, input_buf, 5); // Emits a compile-time error since sizeof(buf) < 5.
253  __builtin_mempcpy(buf, input_buf, 5); // No compile-time error.
254  (&mempcpy)(buf, input_buf, 5); // No compile-time error, since __builtin_mempcpy is selected.
255}
256```
257
258Otherwise, the rest of this subsection is dedicated to preliminary discussion
259about `__builtin_object_size`.
260
261Clang's frontend can do one of two things with `__builtin_object_size(p, n)`:
262- Evaluate it as a constant.
263  - This can either mean declaring that the number of bytes at `p` is definitely
264    impossible to know, so the default value is used, or the number of bytes at
265    `p` can be known without optimizations.
266- Declare that the expression cannot form a constant, and lower it to
267  `@llvm.objectsize`, which is discussed in depth later.
268
269In the examples above, since `diagnose_if` is evaluated with context from the
270caller, Clang should be able to trivially determine that `buf` refers to a
271`char` array with 4 elements.
272
273The primary consequence of the above is that diagnostics can only be emitted if
274no optimizations are required to detect a broken code pattern. To be specific,
275clang's constexpr evaluator must be able to determine the logical object that
276any given pointer points to in order to fold `__builtin_object_size` to a
277constant, non-default answer:
278
279```c
280#define _FORTIFY_SOURCE 2
281#include <string.h>
282void example_code() {
283  char buf[4]; // ...Assume sizeof(char) == 1.
284  const char input_buf[] = "Hello, World";
285  mempcpy(buf, input_buf, 4); // Valid, no diagnostic issued.
286  mempcpy(buf, input_buf, 5); // Emits a compile-time error since sizeof(buf) < 5.
287  char *buf_ptr = buf;
288  mempcpy(buf_ptr, input_buf, 5); // No compile-time error; `buf_ptr`'s target can't be determined.
289}
290```
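
For a feel of `diagnose_if` in isolation, here is a minimal sketch on a
hypothetical function, `toy_set_mask`, loosely modeled on the `umask` check
mentioned in the Background section. The `TOY_ERROR_IF` macro is an assumption
made for this example; it expands to nothing on compilers other than Clang so
that the snippet still builds there.

```c
#include <assert.h>

/* `diagnose_if` is Clang-specific, so it is hidden behind a hypothetical macro
 * that expands to nothing on other compilers. */
#if defined(__clang__)
# define TOY_ERROR_IF(cond, msg) __attribute__((diagnose_if(cond, msg, "error")))
#else
# define TOY_ERROR_IF(cond, msg)
#endif

/* Hypothetical function that only accepts the low nine permission bits. */
static int toy_set_mask(int mask)
        TOY_ERROR_IF(mask & ~0777, "'toy_set_mask' called with invalid mask") {
    return mask & 0777;
}

static void toy_set_mask_example(void) {
    assert(toy_set_mask(0644) == 0644); /* constant and valid: no diagnostic */
    /* toy_set_mask(01000); would be rejected at compile time by Clang. */
    int runtime_mask = toy_set_mask(0022) | 0700;
    /* Not a constant expression at the call below, so nothing is diagnosed. */
    assert(toy_set_mask(runtime_mask) == 0722);
}
```

As with `buf_ptr` above, the frontend only diagnoses what it can evaluate
without optimizations; the last call is never flagged, valid or not.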
291
292### How does Clang insert run-time checks?
293
294This section expands on the following statement: FORTIFY has zero runtime cost
295in instances where there is no chance of catching a bug at run-time. Otherwise,
296it introduces a tiny additional run-time cost to ensure that functions aren't
297misused.
298
299In prior sections, the following was established:
300- `overloadable` and `pass_object_size` prompt Clang to always select this
301  overload of `mempcpy` over `__builtin_mempcpy` for direct calls.
302- If a call to `mempcpy` was trivially broken, Clang would produce a
303  compile-time error, rather than producing a binary.
304
305Hence, the case we're interested in here is one where Clang's frontend selected
306a FORTIFY'ed function's implementation for a function call, but was unable to
307find anything seriously wrong with said function call. Since the frontend is
308powerless to detect bugs at this point, our focus shifts to the mechanisms that
309LLVM uses to support FORTIFY.
310
311Going back to Bionic's `mempcpy` implementation, we have the following (ignoring
312diagnose_if and assuming run-time checks are enabled):
313```c
314static
315__inline__
316__attribute__((no_stack_protector))
317__attribute__((always_inline))
318void* mempcpy(
319        void* const dst __attribute__((pass_object_size(0))),
320        const void* src,
321        size_t copy_amount)
322        __attribute__((overloadable)) {
323    size_t bos_dst = __builtin_object_size(dst, 0);
324    if (bos_dst != -1 &&
325        !(__builtin_constant_p(copy_amount) && bos_dst >= copy_amount)) {
326        return __builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);
327    }
328    return __builtin_mempcpy(dst, src, copy_amount);
329}
330```
331
332In other words, we have a `static`, `always_inline` function which:
333- If `__builtin_object_size(dst, 0)` cannot be determined (in which case, it
334  returns -1), calls `__builtin_mempcpy`.
335- Otherwise, if `copy_amount` can be folded to a constant, and if
336  `__builtin_object_size(dst, 0) >= copy_amount`, calls `__builtin_mempcpy`.
337- Otherwise, calls `__builtin___mempcpy_chk`.
338
339
340How can this be "zero overhead"? Let's focus on the following part of the
341function:
342
343```c
344    size_t bos_dst = __builtin_object_size(dst, 0);
345    if (bos_dst != -1 &&
346        !(__builtin_constant_p(copy_amount) && bos_dst >= copy_amount)) {
347```
348
349If Clang's frontend cannot determine a value for `__builtin_object_size`, Clang
350lowers it to LLVM's `@llvm.objectsize` intrinsic. The `@llvm.objectsize`
351invocation corresponding to `__builtin_object_size(p, 0)` is guaranteed to
352always fold to a constant value by the time LLVM emits machine code.
353
Hence, `bos_dst` is guaranteed to be a constant; if it's -1, the above branch
can be eliminated entirely, since it folds to `if (false && ...)`. Further, the
RHS of the `&&` in this branch has us call `__builtin_mempcpy` if `copy_amount`
is a known value less than or equal to `bos_dst` (yet another constant value).
Therefore, the entire condition is always knowable when LLVM is done with LLVM
IR-level optimizations, so no condition is ever emitted to machine code in
practice.
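
One way to convince oneself of this is to model the decision as an ordinary
function whose inputs are the two values that LLVM eventually folds to
constants. This is only a model with hypothetical names (`toy_select_path`,
`toy_path`); the real code never calls anything like it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_path { TOY_PLAIN_CALL, TOY_CHECKED_CALL };

/* `bos` models __builtin_object_size(dst, 0); `n_is_constant` models
 * __builtin_constant_p(copy_amount). Both are known once LLVM is done. */
static enum toy_path toy_select_path(size_t bos, bool n_is_constant, size_t n) {
    if (bos != (size_t)-1 && !(n_is_constant && bos >= n)) {
        return TOY_CHECKED_CALL;
    }
    return TOY_PLAIN_CALL;
}

static void toy_select_path_example(void) {
    /* Unknown destination size: nothing to check against. */
    assert(toy_select_path((size_t)-1, false, 16) == TOY_PLAIN_CALL);
    /* Known size, constant amount that fits: provably safe. */
    assert(toy_select_path(8, true, 8) == TOY_PLAIN_CALL);
    /* Known size, constant amount that is too big: checked call. */
    assert(toy_select_path(8, true, 9) == TOY_CHECKED_CALL);
    /* Known size, amount only known at run time: checked call. */
    assert(toy_select_path(8, false, 4) == TOY_CHECKED_CALL);
}
```

Since every input is a compile-time constant by the end of optimization, each
call site keeps exactly one of the two calls and no branch.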
360
361#### Why is "zero overhead" in quotes? Why is `unique_ptr` relevant?
362
363`__builtin_object_size` and `__builtin_constant_p` are forced to be constants
364after most optimizations take place. Until LLVM replaces both of these with
365constants and optimizes them out, we have additional branches and function calls
366in our IR. This can have negative effects, such as distorting inlining costs and
367inhibiting optimizations that are conservative around branches in control-flow.
368
369So FORTIFY is free in these cases _in isolation of any of the code around it_.
370Due to its implementation, it may impact the optimizations that occur on code
371around the literal call to the FORTIFY-hardened libc function.
372
373`unique_ptr` was just the first thing that came to the author's mind for "the
374type should be zero cost with any level of optimization enabled, but edge-cases
375might make it only-mostly-free to use."
376
377### How is checking actually performed?
378
379In cases where checking can be performed (e.g., where we call
380`__builtin___mempcpy_chk(dst, src, copy_amount, bos_dst);`), Bionic provides [an
381implementation for `__mempcpy_chk`]. This is:
382
383```c
384extern "C" void* __mempcpy_chk(void* dst, const void* src, size_t count, size_t dst_len) {
385  __check_count("mempcpy", "count", count);
386  __check_buffer_access("mempcpy", "write into", count, dst_len);
387  return mempcpy(dst, src, count);
388}
389```
390This function itself boils down to a few small branches which abort the program
391if they fail, and a direct call to `__builtin_mempcpy`.
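
A rough, self-contained sketch of that shape follows. The bodies of
`__check_count` and `__check_buffer_access` are not shown in this document, so
the single comparison below is an assumption about their overall effect rather
than a copy of them, and `toy_mempcpy_chk` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical checked copy: `dst_len` is the object size that the inline
 * wrapper computed for `dst`. */
static void *toy_mempcpy_chk(void *dst, const void *src, size_t count,
                             size_t dst_len) {
    if (count > dst_len) {
        abort(); /* a failed check ends the program instead of overflowing */
    }
    /* `mempcpy` semantics: return a pointer just past the last byte written. */
    return (char *)memcpy(dst, src, count) + count;
}

static void toy_mempcpy_chk_example(void) {
    char buf[8];
    char *end = toy_mempcpy_chk(buf, "abc", 4, sizeof(buf));
    assert(end == buf + 4);
    assert(strcmp(buf, "abc") == 0);
}
```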
392
393### Wrapping up
394
395In the above breakdown, it was shown how Clang and Bionic work together to:
396- represent FORTIFY-hardened overloads of functions,
397- report misuses of stdlib functions at compile-time, and
398- insert run-time checks for uses of functions that might be incorrect, but only
399  if we have the potential of proving the incorrectness of these.
400
## Breakdown of `open`
402
403In Bionic, the [FORTIFY'ed implementation of `open`] is quite large. Much like
404`mempcpy`, the `__builtin_open` declaration is simple:
405
406```c
407int open(const char* __path, int __flags, ...);
408```
409
410With some macros expanded, the FORTIFY-hardened header implementation is:
```c
int __open_2(const char*, int);
int __open_real(const char*, int, ...) __asm__("open");

#define __open_modes_useful(flags) (((flags) & O_CREAT) || ((flags) & O_TMPFILE) == O_TMPFILE)

static
int open(const char* pathname, int flags, mode_t modes, ...) __overloadable
        __attribute__((diagnose_if(1, "too many arguments", "error")));

static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
            __open_modes_useful(flags),
            "'open' called with O_CREAT or O_TMPFILE, but missing mode",
            "error"))) {
#if __ANDROID_API__ >= 17 && __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    return __open_2(pathname, flags);
#else
    return __open_real(pathname, flags);
#endif
}

static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags, mode_t modes)
        __attribute__((overloadable))
        __clang_warning_if(!__open_modes_useful(flags) && modes,
                           "'open' has superfluous mode bits; missing O_CREAT?") {
    return __open_real(pathname, flags, modes);
}
```
448
449Which may be a lot to take in.
450
451Before diving too deeply, please note that the remainder of these subsections
452assume that the programmer didn't make any egregious typos. Moreover, there's no
453real way that Bionic tries to prevent calls to `open` like
454`open("foo", 0, "how do you convert a const char[N] to mode_t?");`. The only
455real C-compatible solution the author can think of is "stamp out many overloads
456to catch sort-of-common instances of this very uncommon typo." This isn't great.
457
458More directly, no effort is made below to recognize calls that, due to
459incompatible argument types, cannot go to any `open` implementation other than
460`__builtin_open`, since it's recognized right here. :)
461
462### Implementation breakdown
463
464This `open` implementation does a few things:
465- Turns calls to `open` with too many arguments into a compile-time error.
466- Diagnoses calls to `open` with missing modes at compile-time and run-time
467  (both cases turn into errors).
468- Emits warnings on calls to `open` with useless mode bits, unless the mode bits
469  are all 0.
470
471One common bit of code not explained below is the `__open_real` declaration above:
```c
int __open_real(const char*, int, ...) __asm__("open");
```
475
476This exists as a way for us to call `__builtin_open` without needing clang to
477have a pre-defined `__builtin_open` function.
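
The same asm-label trick can be tried on any libc function. The sketch below
binds a hypothetical second name, `toy_strlen_real`, to the `strlen` symbol. It
assumes an ELF target such as Android or Linux, where C symbol names carry no
prefix; other object formats may need a different label.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alias: a second C-level name bound to the `strlen` symbol. */
size_t toy_strlen_real(const char *s) __asm__("strlen");

static void toy_strlen_real_example(void) {
    /* Calls libc's `strlen` under a name that no header-level overload of
     * `strlen` can intercept. */
    assert(toy_strlen_real("fortify") == 7);
}
```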
478
479#### Compile-time error on too many arguments
480
```c
static
int open(const char* pathname, int flags, mode_t modes, ...) __overloadable
        __attribute__((diagnose_if(1, "too many arguments", "error")));
```

This overload matches most calls to `open` that supply too many arguments, since
`int(const char *, int, ...)` matches less strongly than
`int(const char *, int, mode_t, ...)` for calls where the 3rd arg can be
converted to `mode_t` without too much effort. Because of the `diagnose_if`
attribute, all of these calls turn into compile-time errors.
492
493#### Compile-time or run-time error on missing arguments
494The following overload handles all two-argument calls to `open`.
```c
static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags)
        __attribute__((overloadable))
        __attribute__((diagnose_if(
            __open_modes_useful(flags),
            "'open' called with O_CREAT or O_TMPFILE, but missing mode",
            "error"))) {
#if __ANDROID_API__ >= 17 && __BIONIC_FORTIFY_RUNTIME_CHECKS_ENABLED
    return __open_2(pathname, flags);
#else
    return __open_real(pathname, flags);
#endif
}
```
513
514Like `mempcpy`, `diagnose_if` handles emitting a compile-time error if the call
515to `open` is broken in a way that's visible to Clang's frontend. This
516essentially boils down to "`open` is being called with a `flags` value that
517requires mode bits to be set."
518
If that fails to catch a bug and run-time checks are enabled, we
[unconditionally call `__open_2`], which performs a run-time check:
521```c
522int __open_2(const char* pathname, int flags) {
523  if (needs_mode(flags)) __fortify_fatal("open: called with O_CREAT/O_TMPFILE but no mode");
524  return FDTRACK_CREATE_NAME("open", __openat(AT_FDCWD, pathname, force_O_LARGEFILE(flags), 0));
525}
526```
527
528#### Compile-time warning if modes are pointless
529
Finally, we have the following `open` overload:
531```c
532static
533__inline__
534__attribute__((no_stack_protector))
535__attribute__((always_inline))
536int open(const char* const __attribute__((pass_object_size(1))) pathname, int flags, mode_t modes)
537        __attribute__((overloadable))
538        __clang_warning_if(!__open_modes_useful(flags) && modes,
539                           "'open' has superfluous mode bits; missing O_CREAT?") {
540    return __open_real(pathname, flags, modes);
541}
542```
543
This simply issues a warning if Clang's frontend can determine that `modes`
isn't necessary, i.e., that `flags` contains neither `O_CREAT` nor `O_TMPFILE`.
Due to conventions in existing code, a `modes` value of `0` is not diagnosed.
547
548#### What about `&open`?
549One yet-unaddressed aspect of the above is how `&open` works. This is thankfully
550a short answer:
551- It happens that `open` takes a parameter of type `const char*`.
552- It happens that `pass_object_size` -- an attribute only applicable to
553  parameters of type `T*` --  makes it impossible to take the address of a
554  function.
555
Since clang doesn't support a "this function should never have its address
taken" attribute, Bionic uses the next best thing: `pass_object_size`. :)
558
## Breakdown of `poll`
560
561(Preemptively: at the time of writing, Clang has no literal `__builtin_poll`
562builtin. `__builtin_poll` is referenced below to remain consistent with the
563convention established in the Terminology section.)
564
565Bionic's `poll` implementation is closest to `mempcpy` above, though it has a
566few interesting aspects worth examining.
567
568The [full header implementation of `poll`] is, with some macros expanded:
```c
#define __bos_fd_count_trivially_safe(bos_val, fds, fd_count) \
  (((bos_val) == -1) || \
    (__builtin_constant_p(fd_count) && \
    (bos_val) >= sizeof(*fds) * (fd_count)))

static
__inline__
__attribute__((no_stack_protector))
__attribute__((always_inline))
int poll(struct pollfd* const fds __attribute__((pass_object_size(1))), nfds_t fd_count, int timeout)
    __attribute__((overloadable))
    __attribute__((diagnose_if(
       __builtin_object_size(fds, 1) != -1 && __builtin_object_size(fds, 1) < sizeof(*fds) * fd_count,
        "in call to 'poll', fd_count is larger than the given buffer",
        "error"))) {
  size_t bos_fds = __builtin_object_size(fds, 1);
  if (!__bos_fd_count_trivially_safe(bos_fds, fds, fd_count)) {
    return __poll_chk(fds, fd_count, timeout, bos_fds);
  }
  return (&poll)(fds, fd_count, timeout);
}
```
592
593To get the commonality with `mempcpy` and `open` out of the way:
- This function is an overload of `__builtin_poll`.
595- The signature is the same, modulo the presence of a `pass_object_size`
596  attribute. Hence, for direct calls, overload resolution will always prefer it
597  over `__builtin_poll`. Taking the address of `poll` is forbidden, so all
598  references to `&poll` actually reference `__builtin_poll`.
599- When `fds` is too small to hold `fd_count` `pollfd`s, Clang will emit a
600  compile-time error if possible using `diagnose_if`.
601- If this can't be observed until run-time, `__poll_chk` verifies this.
- When `fd_count` is a constant according to `__builtin_constant_p` and the size
  of `fds` is known, this always compiles into `__poll_chk` for always-broken
  calls to `poll`, or `__builtin_poll` for always-safe calls to `poll`.
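
The size comparison shared by the `diagnose_if` condition and the run-time path
can be modeled as a plain function. The name `toy_fds_fit` is hypothetical, and
the object size is passed in explicitly here rather than computed by the
compiler.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: does a buffer of `bos` bytes hold `fd_count` entries?
 * `bos` models __builtin_object_size(fds, 1); (size_t)-1 means "unknown". */
static bool toy_fds_fit(size_t bos, nfds_t fd_count) {
    if (bos == (size_t)-1) {
        return true; /* nothing is known, so nothing can be reported */
    }
    return bos >= sizeof(struct pollfd) * fd_count;
}

static void toy_fds_fit_example(void) {
    struct pollfd fds[2];
    assert(toy_fds_fit(sizeof(fds), 2));
    assert(!toy_fds_fit(sizeof(fds), 3));
    assert(toy_fds_fit((size_t)-1, 1000));
}
```

Unlike `mempcpy`, the count is scaled by the element size before it is compared
against the object size.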
605
606The critical bits to highlight here are on this line:
607```c
608int poll(struct pollfd* const fds __attribute__((pass_object_size(1))), nfds_t fd_count, int timeout)
609```
610
611And this line:
612```c
613  return (&poll)(fds, fd_count, timeout);
614```
615
616Starting with the simplest, we call `__builtin_poll` with `(&poll)(...);`. As
617referenced above, taking the address of an overloaded function where all but one
618overload has a `pass_object_size` attribute on one or more parameters always
619resolves to the function without any `pass_object_size` attributes.
620
The other line deserves a section. The subtlety of it is almost entirely in the
use of `pass_object_size(1)` instead of `pass_object_size(0)` on the `fds`
parameter, and the corresponding use of `__builtin_object_size(fds, 1);` in the
body of `poll`.
625
626### Subtleties of __builtin_object_size(p, N)
627
628Earlier in this document, it was said that a full description of each
629attribute/builtin necessary to power FORTIFY was out of scope. This is... only
630somewhat the case when we talk about `__builtin_object_size` and
631`pass_object_size`, especially when their second argument is `1`.
632
633#### tl;dr
634`__builtin_object_size(p, N)` and `pass_object_size(N)`, where `(N & 1) == 1`,
635can only be accurately determined by Clang. LLVM's `@llvm.objectsize` intrinsic
636ignores the value of `N & 1`, since handling `(N & 1) == 1` accurately requires
637data that's currently entirely inaccessible to LLVM, and that is difficult to
638preserve through LLVM's optimization passes.
639
640`pass_object_size`'s "lifting" of the evaluation of
641`__builtin_object_size(p, N)` to the caller is critical, since it allows Clang
642full visibility into the expression passed to e.g., `poll(&foo->bar, baz, qux)`.
643It's not a perfect solution, but it allows `N == 1` to be fully accurate in at
644least some cases.
645
646#### Background
647Clang's implementation of `__builtin_object_size` aims to be compatible with
648GCC's, which has [a decent bit of documentation]. Put simply,
649`__builtin_object_size(p, N)` is intended to evaluate at compile-time how many
650bytes can be accessed after `p` in a well-defined way. Straightforward examples
651of this are:
652```c
653char buf[8];
654assert(__builtin_object_size(buf, N) == 8);
655assert(__builtin_object_size(buf + 1, N) == 7);
656```
657
658This should hold for all values of N that are valid to pass to
659`__builtin_object_size`. The `N` value of `__builtin_object_size` is a mask of
660settings.
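
A compilable variant of the snippet above, with concrete values for `N`, might
look like the following. The names are hypothetical, and a global array is used
so that the compiler can resolve both sizes without relying on optimizations.

```c
#include <assert.h>
#include <stddef.h>

static char toy_buf[8];

/* Bytes available from the start of the array. */
static size_t toy_whole_size(void) {
    return __builtin_object_size(toy_buf, 0);
}

/* Bytes available from the fourth element onward. */
static size_t toy_size_from_third(void) {
    return __builtin_object_size(&toy_buf[3], 1);
}

static void toy_object_size_example(void) {
    assert(toy_whole_size() == 8);
    assert(toy_size_from_third() == 5);
}
```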
661
662##### (N & 2) == ?
663
This is mostly for completeness' sake; in Bionic's FORTIFY implementation, N is
always either 0 or 1.

If there are multiple possible values of `p` in a call to
`__builtin_object_size(p, N)`, the second bit in `N` determines the behavior of
the compiler. If `(N & 2) == 0`, `__builtin_object_size` should return the
greatest size among all possible values of `p`. Otherwise, it should return the
least such size. For example:
672
673```c
674char smol_buf[7];
675char buf[8];
676char *p = rand() ? smol_buf : buf;
677assert(__builtin_object_size(p, 0) == 8);
678assert(__builtin_object_size(p, 2) == 7);
679```
680
681##### (N & 1) == 0
682
`__builtin_object_size(p, 0)` is more or less as simple as the example in the
Background section directly above. When Clang attempts to evaluate
`__builtin_object_size(p, 0);` and when LLVM tries to determine the result of a
corresponding `@llvm.objectsize` call, they search for the storage underlying
the pointer in question. If that can be determined, Clang or LLVM can provide an
answer; otherwise, they cannot.
689
690##### (N & 1) == 1, and the true magic of pass_object_size
691
692`__builtin_object_size(p, 1)` has a less uniform implementation between LLVM and
693Clang. According to GCC's documentation, "If the least significant bit [of
694__builtin_object_size's second argument] is clear, objects are whole variables,
695if it is set, a closest surrounding subobject is considered the object a pointer
696points to."
697
The "closest surrounding subobject" means that `(N & 1) == 1` depends on type
information in order to operate in many cases. Consider the following examples:
```c
struct Foo {
  int a;
  int b;
};

struct Foo foo;
assert(__builtin_object_size(&foo, 0) == sizeof(foo));
assert(__builtin_object_size(&foo, 1) == sizeof(foo));
assert(__builtin_object_size(&foo.a, 0) == sizeof(foo));
assert(__builtin_object_size(&foo.a, 1) == sizeof(int));

struct Foo foos[2];
assert(__builtin_object_size(&foos[0], 0) == 2 * sizeof(foo));
assert(__builtin_object_size(&foos[0], 1) == sizeof(foo));
assert(__builtin_object_size(&foos[0].a, 0) == 2 * sizeof(foo));
assert(__builtin_object_size(&foos[0].a, 1) == sizeof(int));
```
718
...And perhaps somewhat surprisingly:
```c
void example(struct Foo *foo) {
  // (As a reminder, `-1` is "I don't know" when `(N & 2) == 0`.)
  assert(__builtin_object_size(foo, 0) == -1);
  assert(__builtin_object_size(foo, 1) == -1);
  assert(__builtin_object_size(&foo->a, 0) == -1);
  assert(__builtin_object_size(&foo->a, 1) == sizeof(int));
}
```

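To see why the distinction matters for bounds checking, here is a minimal
sketch. `struct Record`, `write_fits`, `whole_object_bound`, and
`subobject_bound` are all hypothetical names invented for this example, and
`write_fits` is only a stand-in for the kind of comparison a FORTIFY'ed function
performs. A 12-byte write into `name` stays inside the whole struct, so a check
based on `(N & 1) == 0` would likely accept it, while a check based on
`(N & 1) == 1` has the 8-byte member size to compare against.

```c
#include <stddef.h>

struct Record {
  char name[8];
  int id;
};

// Hypothetical stand-in for a FORTIFY-style check: accept the write when the
// object size is unknown, or when the write fits within it.
int write_fits(size_t object_size, size_t write_len) {
  return object_size == (size_t)-1 || write_len <= object_size;
}

// Bytes from the start of `name` to the end of the whole `struct Record`.
size_t whole_object_bound(void) {
  struct Record r;
  return __builtin_object_size(r.name, 0);
}

// Bytes in the `name` member alone, when the compiler can determine it.
size_t subobject_bound(void) {
  struct Record r;
  return __builtin_object_size(r.name, 1);
}
```
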
In Clang, [this type-aware requirement poses problems for us]: Clang's frontend
knows everything we could possibly want about the types of variables, but
optimizations are only performed by LLVM. LLVM has no reliable source for C or
C++ data types, so calls to `__builtin_object_size(p, N)` that cannot be
resolved by Clang are lowered to the equivalent of
`__builtin_object_size(p, N & ~1)` in LLVM IR.

Moreover, Clang's frontend is the best-equipped part of the compiler to
accurately determine the answer for `__builtin_object_size(p, N)`, given we know
what `p` is. LLVM is the best-equipped part of the compiler to determine the
value of `p`. This ordering issue is unfortunate.

This is where `pass_object_size(N)` comes in. To summarize [the docs for
`pass_object_size`], it evaluates `__builtin_object_size(p, N)` within the
context of the caller of the function annotated with `pass_object_size`, and
passes the value of that into the callee as an invisible parameter. All calls to
`__builtin_object_size(parameter, N)` are substituted with references to this
invisible parameter.

Putting this plainly, Clang's frontend struggles to evaluate the following:
```c
int foo(void *p) {
  return __builtin_object_size(p, 1);
}

void bar() {
  struct { int i, j; } k;
  // The frontend can't figure this interprocedural objectsize out, so it gets
  // lowered to LLVM, which determines that the answer here is sizeof(k).
  int baz = foo(&k.i);
}
```

However, with the magic of `pass_object_size`, we get one level of inlining to
look through:
```c
int foo(void *const __attribute__((pass_object_size(1))) p) {
  return __builtin_object_size(p, 1);
}

void bar() {
  struct { int i, j; } k;
  // Due to pass_object_size, this is equivalent to:
  // int baz = foo(&k.i, __builtin_object_size(&k.i, 1));
  // ...and `int foo(void *)` is actually equivalent to:
  // int foo(void *const, size_t size) {
  //   return size;
  // }
  int baz = foo(&k.i);
}
```

So we can obtain an accurate result in this case.

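For readers who want to try this, the following is a self-contained variant of
the example above. It is a sketch under a few assumptions: `struct Pair`,
`subobject_size`, `size_of_first_member`, and the `HAS_PASS_OBJECT_SIZE` macro
are names invented here, and the macro fallback for compilers without
`pass_object_size` is not something Bionic does. The callee is marked
`noinline` to show that the answer does not depend on the optimizer seeing
through the call.

```c
#include <stddef.h>

// pass_object_size is a Clang extension, so probe for it before relying on it.
#if defined(__has_attribute)
#  if __has_attribute(pass_object_size)
#    define HAS_PASS_OBJECT_SIZE 1
#  endif
#endif

struct Pair {
  int i, j;
};

#ifdef HAS_PASS_OBJECT_SIZE
// Clang evaluates __builtin_object_size(p, 1) at each call site and passes the
// result along as a hidden argument, so `noinline` costs no precision here.
__attribute__((noinline)) static size_t
subobject_size(void *const p __attribute__((pass_object_size(1)))) {
  return __builtin_object_size(p, 1);
}
#else
// Assumed fallback for other compilers: a macro is also evaluated at the call
// site, which approximates the same effect.
#  define subobject_size(p) __builtin_object_size((p), 1)
#endif

size_t size_of_first_member(void) {
  struct Pair k;
  return subobject_size(&k.i);
}
```

With Clang, `size_of_first_member()` is expected to return `sizeof(int)`, since
the frontend resolves the size of `&k.i` in `size_of_first_member` itself.
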
##### What about pass_object_size(0)?
It's sort of tangential, but if you find yourself wondering about the utility of
`pass_object_size(0)` ... it's somewhat split. `pass_object_size(0)` in Bionic's
FORTIFY exists mostly for visual consistency, simplicity, and as a useful way to
have e.g., `&mempcpy` == `&__builtin_mempcpy`.

Outside of these fringe benefits, all of the functions with
`pass_object_size(0)` on parameters are marked with `always_inline`, so
"lifting" the `__builtin_object_size` call isn't ultimately very helpful. In
theory, users can always have something like:

```c
// In some_header.h
// This function does cool and interesting things with the
// `__builtin_object_size` of its parameter, and is able to work with that as
// though the function were defined inline.
void out_of_line_function(void *__attribute__((pass_object_size(0))));
```

Though the author isn't aware of uses like this in practice, beyond a few folks
on LLVM's mailing list seeming interested in trying it someday.

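As a purely illustrative sketch of what such an out-of-line function could look
like, consider the hypothetical `checked_fill` below. None of these names come
from Bionic, and the explicit-size fallback for compilers without
`pass_object_size` is an assumption made so that the example builds elsewhere.
The point is that the body never needs to be visible to callers, yet it still
sees the object size each caller computed.

```c
#include <stddef.h>
#include <string.h>

#if defined(__has_attribute)
#  if __has_attribute(pass_object_size)
#    define HAS_PASS_OBJECT_SIZE 1
#  endif
#endif

#ifdef HAS_PASS_OBJECT_SIZE
// The size of `dst` arrives as a hidden argument computed by each caller, so
// this body could live in a separate translation unit and still see it.
__attribute__((noinline)) int
checked_fill(void *const dst __attribute__((pass_object_size(0))), int byte,
             size_t len) {
  if (len > __builtin_object_size(dst, 0)) {
    return -1;  // The caller's buffer is known to be too small.
  }
  memset(dst, byte, len);
  return 0;
}
#else
// Assumed fallback for compilers without the attribute: pass the size by hand.
static int checked_fill_impl(void *dst, size_t dst_size, int byte, size_t len) {
  if (len > dst_size) {
    return -1;
  }
  memset(dst, byte, len);
  return 0;
}
#  define checked_fill(dst, byte, len) \
    checked_fill_impl((dst), __builtin_object_size((dst), 0), (byte), (len))
#endif
```

A caller simply writes `checked_fill(buf, 0, sizeof(buf))`. When the size of
`buf` is unknown at the call site, the check degrades to always passing, since
`(size_t)-1` is larger than any `len`.
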
#### Wrapping up
In the (long) section above, two things were covered:
- The use of `(&poll)(...);` is a convenient shorthand for calling
  `__builtin_poll`.
- `__builtin_object_size(p, N)` with `(N & 1) == 1` is not easy for Clang to
  answer accurately, since it relies on type info only available in the
  frontend, and it sometimes relies on optimizations only available in the
  middle-end. `pass_object_size` helps mitigate this.

## Miscellaneous Notes
The above should be a roughly comprehensive view of how FORTIFY works in the
real world. The main thing it fails to mention is the use of
[the `diagnose_as_builtin` attribute] in Clang.

As time has moved on, Clang has increasingly gained support for emitting
warnings that were previously emitted by FORTIFY machinery.
`diagnose_as_builtin` allows us to remove the `diagnose_if`s from some of the
`static inline` overloads of stdlib functions above, so Clang may diagnose them
instead.

Clang's built-in diagnostics are often better than `diagnose_if` diagnostics,
since Clang can format its diagnostics to include e.g., information about the
sizes of buffers in a suspect call to a function. `diagnose_if` can only have
the compiler output constant strings.

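A minimal sketch of the attribute, modeled on the shape of the example in
Clang's attribute reference rather than on anything in Bionic: `my_fill` and
the `DIAGNOSE_AS_MEMSET` macro are hypothetical names, and the macro expands to
nothing on compilers that lack the attribute. The trailing numbers say which of
`my_fill`'s parameters plays the role of each `memset` parameter, in `memset`'s
order.

```c
#include <stddef.h>
#include <string.h>

// diagnose_as_builtin is a Clang extension; expand to nothing elsewhere.
#if defined(__has_attribute)
#  if __has_attribute(diagnose_as_builtin)
// memset takes (dst, byte, len); those are my_fill's parameters 3, 2, and 1.
#    define DIAGNOSE_AS_MEMSET \
      __attribute__((diagnose_as_builtin(__builtin_memset, 3, 2, 1)))
#  endif
#endif
#ifndef DIAGNOSE_AS_MEMSET
#  define DIAGNOSE_AS_MEMSET
#endif

// Hypothetical wrapper that takes memset's parameters in reverse order. With
// the attribute, Clang can apply its memset diagnostics to calls of my_fill.
DIAGNOSE_AS_MEMSET
void *my_fill(size_t len, int byte, void *dst) {
  return memset(dst, byte, len);
}
```
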
[ChromeOS' Glibc patch]: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/90fa9b27731db10a6010c7f7c25b24028145b091/sys-libs/glibc/files/local/glibc-2.33/0007-glibc-add-clang-style-FORTIFY.patch
[FORTIFY'ed implementation of `open`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/fcntl.h#41
[FORTIFY'ed version of `mempcpy`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/string.h#45
[a decent bit of documentation]: https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
[an implementation for `__mempcpy_chk`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/bionic/fortify.cpp#501
[full header implementation of `poll`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/bits/fortify/poll.h#43
[incompatible with stricter versions of FORTIFY checking]: https://godbolt.org/z/fGfEYxfnf
[similar to C++11's `std::unique_ptr`]: https://stackoverflow.com/questions/58339165/why-can-a-t-be-passed-in-register-but-a-unique-ptrt-cannot
[source for `mempcpy`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/include/string.h#55
[the `diagnose_as_builtin` attribute]: https://releases.llvm.org/14.0.0/tools/clang/docs/AttributeReference.html#diagnose-as-builtin
[the docs for `pass_object_size`]: https://releases.llvm.org/14.0.0/tools/clang/docs/AttributeReference.html#pass-object-size-pass-dynamic-object-size
[this type-aware requirement poses problems for us]: https://github.com/llvm/llvm-project/issues/55742
[unconditionally call `__open_2`]: https://android.googlesource.com/platform/bionic/+/refs/heads/android12-release/libc/bionic/open.cpp#70
