========
GWP-ASan
========

.. contents::
   :local:
   :depth: 2

Introduction
============

GWP-ASan is a sampled allocator framework that assists in finding use-after-free
and heap-buffer-overflow bugs in production environments. Informally, it is a
recursive acronym, "**G**\WP-ASan **W**\ill **P**\rovide **A**\llocation
**SAN**\ity".

GWP-ASan is based on the classic
`Electric Fence Malloc Debugger <https://linux.die.net/man/3/efence>`_, with a
key adaptation. Notably, we only choose a very small percentage of allocations
to sample, and apply guard pages to these sampled allocations only. The sampling
rate is low enough to keep the performance overhead very low.

There is a small, tunable memory overhead that is fixed for the lifetime of the
process. This is approximately 40KiB per process using the default settings,
depending on the average size of your allocations.

GWP-ASan vs. ASan
=================

Unlike `AddressSanitizer <https://clang.llvm.org/docs/AddressSanitizer.html>`_,
GWP-ASan does not induce a significant performance overhead. ASan often requires
the use of dedicated canaries to be viable in production environments, and as
such is often impractical.

GWP-ASan is only capable of finding a subset of the memory issues detected by
ASan. Furthermore, GWP-ASan's bug detection capabilities are only probabilistic.
As such, we recommend using ASan over GWP-ASan in testing, as well as anywhere
else that guaranteed error detection is more valuable than the 2x execution
slowdown/binary size bloat. For the majority of production environments, this
impact is too high, and GWP-ASan proves extremely useful.

Design
======

**Please note:** The implementation of GWP-ASan is largely in flux, and these
details are subject to change. There are currently other implementations of
GWP-ASan, such as the implementation featured in
`Chromium <https://cs.chromium.org/chromium/src/components/gwp_asan/>`_. The
long-term support goal is to ensure feature-parity where reasonable, and to
support compiler-rt as the reference implementation.

Allocator Support
-----------------

GWP-ASan is not a replacement for a traditional allocator. Instead, it works by
inserting stubs into a supporting allocator to redirect allocations to GWP-ASan
when they're chosen to be sampled. These stubs are generally placed in the
implementations of ``malloc()``, ``free()`` and ``realloc()``. The stubs are
extremely small, which makes using GWP-ASan in most allocators fairly trivial.
The stubs follow the same general pattern (example ``malloc()`` pseudocode
below):

.. code:: cpp

  #ifdef INSTALL_GWP_ASAN_STUBS
    gwp_asan::GuardedPoolAllocator GWPASanAllocator;
  #endif

  void* YourAllocator::malloc(..) {
  #ifdef INSTALL_GWP_ASAN_STUBS
    // Redirect the allocation to GWP-ASan if it was chosen to be sampled.
    if (GWPASanAllocator.shouldSample(..))
      return GWPASanAllocator.allocate(..);
  #endif

    // ... the rest of your allocator code here.
  }

Then, all the supporting allocator needs to do is compile with
``-DINSTALL_GWP_ASAN_STUBS`` and link against the GWP-ASan library! For
performance reasons, we strongly recommend static linkage of the GWP-ASan
library.
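
The ``free()`` stub has the same shape as the ``malloc()`` stub: check whether
the pointer belongs to the guarded pool, and only then hand it to GWP-ASan. The
following is a minimal, self-contained sketch of both stubs. ``ToyGuardedPool``,
``toy_malloc()`` and ``toy_free()`` are hypothetical names invented for this
illustration (a single static slot with deterministic sampling) and are not the
GWP-ASan implementation.

.. code:: cpp

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <cstdlib>

  // Hypothetical stand-in for a guarded pool. It owns one fixed slot and
  // samples every fourth call so that the control flow is easy to follow.
  class ToyGuardedPool {
  public:
    bool shouldSample() { return ++Counter % SampleRate == 0; }

    // Returns nullptr if the request cannot be served from the toy slot.
    void *allocate(std::size_t Size) {
      if (Size > SlotSize || InUse)
        return nullptr;
      InUse = true;
      return Slot;
    }

    bool pointerIsMine(const void *Ptr) const {
      auto P = reinterpret_cast<std::uintptr_t>(Ptr);
      auto Base = reinterpret_cast<std::uintptr_t>(Slot);
      return P >= Base && P < Base + SlotSize;
    }

    void deallocate(void *Ptr) {
      assert(pointerIsMine(Ptr));
      InUse = false;
    }

  private:
    static constexpr std::size_t SlotSize = 4096;
    static constexpr unsigned SampleRate = 4;
    alignas(16) unsigned char Slot[SlotSize];
    bool InUse = false;
    unsigned Counter = 0;
  };

  static ToyGuardedPool ToyPool;

  // malloc() stub: try the guarded pool first, then fall back.
  void *toy_malloc(std::size_t Size) {
    if (ToyPool.shouldSample())
      if (void *Ptr = ToyPool.allocate(Size))
        return Ptr;
    return std::malloc(Size); // ... the rest of the allocator.
  }

  // free() stub: only pointers owned by the guarded pool are redirected.
  void toy_free(void *Ptr) {
    if (ToyPool.pointerIsMine(Ptr)) {
      ToyPool.deallocate(Ptr);
      return;
    }
    std::free(Ptr); // ... the rest of the allocator.
  }

In this sketch the unsampled path costs one counter update and one comparison,
which is the property that makes the stubs cheap enough to leave enabled.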

Guarded Allocation Pool
-----------------------

The core of GWP-ASan is the guarded allocation pool. Each sampled allocation is
backed using its own *guarded* slot, which may consist of one or more accessible
pages. Each guarded slot is surrounded by two *guard* pages, which are mapped as
inaccessible. The collection of all guarded slots makes up the *guarded
allocation pool*.
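
As an illustration of this layout, the sketch below reserves a region of
alternating guard pages and slots using POSIX ``mmap()`` and ``mprotect()``. It
assumes exactly one page per slot and adjacent slots sharing a guard page; the
names ``ToyPoolLayout``, ``mapToyPool()`` and ``openToySlot()`` are hypothetical
and this is not how the GWP-ASan library itself is structured.

.. code:: cpp

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <sys/mman.h>
  #include <unistd.h>

  // Layout: [guard][slot 0][guard][slot 1] ... [slot N-1][guard]
  struct ToyPoolLayout {
    std::uintptr_t Base = 0; // Start of the whole mapping.
    std::size_t PageSize = 0;
    std::size_t NumSlots = 0;

    // N slots need N + 1 guard pages.
    std::size_t totalBytes() const { return (2 * NumSlots + 1) * PageSize; }

    // Slot I sits between guard page I and guard page I + 1.
    std::uintptr_t slotStart(std::size_t I) const {
      assert(I < NumSlots);
      return Base + (2 * I + 1) * PageSize;
    }
  };

  // Reserve the whole pool as inaccessible. Returns false on failure.
  bool mapToyPool(std::size_t NumSlots, ToyPoolLayout &Out) {
    assert(NumSlots > 0);
    long PS = sysconf(_SC_PAGESIZE);
    if (PS <= 0)
      return false;
    Out.PageSize = static_cast<std::size_t>(PS);
    Out.NumSlots = NumSlots;
    void *Mem = mmap(nullptr, Out.totalBytes(), PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (Mem == MAP_FAILED)
      return false;
    Out.Base = reinterpret_cast<std::uintptr_t>(Mem);
    return true;
  }

  // Make one slot usable. Its neighbouring guard pages stay inaccessible.
  bool openToySlot(const ToyPoolLayout &L, std::size_t I) {
    return mprotect(reinterpret_cast<void *>(L.slotStart(I)), L.PageSize,
                    PROT_READ | PROT_WRITE) == 0;
  }

Mapping everything as ``PROT_NONE`` up front and opening slots on demand keeps
the address range fixed for the lifetime of the process in this sketch.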

Buffer Underflow/Overflow Detection
-----------------------------------

We gain buffer-overflow and buffer-underflow detection through these guard
pages. When a memory access overruns the allocated buffer, it will touch the
inaccessible guard page, causing a memory exception. This exception is caught
and handled by the internal crash handler. Because each allocation is recorded
with metadata about where (and by what thread) it was allocated and deallocated,
we can provide information that will help identify the root cause of the bug.

Allocations are randomly selected to be either left- or right-aligned to provide
equal detection of both underflows and overflows.
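
To make the alignment choice concrete, the sketch below computes where an
allocation would be placed inside a slot. It assumes a page-aligned slot and
borrows the rounding rule described for the ``PerfectlyRightAlign`` option
(power-of-two alignment capped at 16 bytes). The function names are
hypothetical, and the random left/right choice is left to the caller.

.. code:: cpp

  #include <cassert>
  #include <cstddef>
  #include <cstdint>

  // Smallest power of two that is >= Size, capped at 16.
  std::size_t toyRightAlignment(std::size_t Size) {
    std::size_t Alignment = 1;
    while (Alignment < Size && Alignment < 16)
      Alignment <<= 1;
    return Alignment;
  }

  // Returns the address handed to the user for an allocation of Size bytes.
  std::uintptr_t toyPlaceInSlot(std::uintptr_t SlotStart, std::size_t SlotSize,
                                std::size_t Size, bool RightAlign,
                                bool PerfectlyRightAlign) {
    assert(Size > 0 && Size <= SlotSize);
    // Left-aligned: the byte before the allocation is in the leading guard
    // page, so any underflow faults immediately.
    if (!RightAlign)
      return SlotStart;
    // Right-aligned: the allocation ends exactly at the trailing guard page.
    std::uintptr_t Addr = SlotStart + SlotSize - Size;
    // Unless perfect alignment is requested, round the start down, which
    // leaves a few unguarded bytes between the allocation and the guard page.
    if (!PerfectlyRightAlign)
      Addr &= ~(static_cast<std::uintptr_t>(toyRightAlignment(Size)) - 1);
    return Addr;
  }

For a 41-byte allocation in a 4096-byte slot starting at ``0x10000``, perfect
right-alignment yields ``0x10fd7``, while the rounded placement yields
``0x10fd0`` and leaves 7 bytes in which a small overflow would go undetected.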

Use after Free Detection
------------------------

The guarded allocation pool also provides use-after-free detection. Whenever a
sampled allocation is deallocated, we map its guarded slot as inaccessible. Any
memory accesses after deallocation will thus trigger the crash handler, and we
can provide useful information about the source of the error.

Please note that the use-after-free detection for a sampled allocation is
transient. To keep memory overhead fixed while still detecting bugs, deallocated
slots are randomly reused to guard future allocations.
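
The mechanism can be demonstrated with a small POSIX sketch: a page stands in
for a guarded slot, "deallocation" revokes access with ``mprotect()``, and a
forked child shows that a later read faults. All function names here are
hypothetical, and the sketch omits the crash handler and slot reuse entirely.

.. code:: cpp

  #include <cassert>
  #include <csignal>
  #include <cstddef>
  #include <sys/mman.h>
  #include <sys/wait.h>
  #include <unistd.h>

  // "Allocate": map one accessible page standing in for a guarded slot.
  void *toySlotAllocate(std::size_t PageSize) {
    void *P = mmap(nullptr, PageSize, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return P == MAP_FAILED ? nullptr : P;
  }

  // "Deallocate": keep the mapping, but make it inaccessible.
  bool toySlotDeallocate(void *Slot, std::size_t PageSize) {
    return mprotect(Slot, PageSize, PROT_NONE) == 0;
  }

  // Returns true if reading the first byte of Slot kills a forked child
  // with SIGSEGV or SIGBUS. The parent process is never at risk.
  bool toyReadFaults(const void *Slot) {
    pid_t Pid = fork();
    assert(Pid >= 0);
    if (Pid == 0) {
      volatile char C = *static_cast<const volatile char *>(Slot);
      (void)C;
      _exit(0);
    }
    int Status = 0;
    if (waitpid(Pid, &Status, 0) != Pid)
      return false;
    return WIFSIGNALED(Status) &&
           (WTERMSIG(Status) == SIGSEGV || WTERMSIG(Status) == SIGBUS);
  }

In this sketch, a read succeeds while the slot is live and faults once
``toySlotDeallocate()`` has run, which is the point at which a crash handler
would produce its report.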

Usage
=====

GWP-ASan already ships by default in the
`Scudo Hardened Allocator <https://llvm.org/docs/ScudoHardenedAllocator.html>`_,
so building with ``-fsanitize=scudo`` is the quickest and easiest way to try out
GWP-ASan.

Options
-------

GWP-ASan's configuration is managed by the supporting allocator. We provide a
generic configuration management library that is used by Scudo. It allows
several aspects of GWP-ASan to be configured through the following methods:

- When the GWP-ASan library is compiled, by setting
  ``-DGWP_ASAN_DEFAULT_OPTIONS`` to the options string you want set by default.
  If you're building GWP-ASan as part of a compiler-rt/LLVM build, add it during
  cmake configure time (e.g. ``cmake ... -DGWP_ASAN_DEFAULT_OPTIONS="..."``). If
  you're building GWP-ASan outside of compiler-rt, simply ensure that you
  specify ``-DGWP_ASAN_DEFAULT_OPTIONS="..."`` when building
  ``optional/options_parser.cpp``.

- By defining a ``__gwp_asan_default_options`` function in one's program that
  returns the options string to be parsed. Said function must have the following
  prototype: ``extern "C" const char* __gwp_asan_default_options(void)``, with a
  default visibility. This will override the compile-time define.

- Depending on allocator support (Scudo has support for this mechanism), through
  the environment variable ``GWP_ASAN_OPTIONS``, containing the options string
  to be parsed. Options defined this way will override any definition made
  through ``__gwp_asan_default_options``.

The options string follows a syntax similar to ASan, where distinct options
can be assigned in the same string, separated by colons.

For example, using the environment variable:

.. code:: console

  GWP_ASAN_OPTIONS="MaxSimultaneousAllocations=16:SampleRate=5000" ./a.out

Or using the function:

.. code:: cpp

  extern "C" const char *__gwp_asan_default_options() {
    return "MaxSimultaneousAllocations=16:SampleRate=5000";
  }
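
The colon-separated syntax and the override order described above can be
sketched as a small parser. This is an illustration only: ``ToyOptions``,
``parseToyOptions()`` and ``resolveToyOptions()`` are hypothetical names, and
the per-key layering shown here is an assumption about how "override" behaves,
not a description of ``optional/options_parser.cpp``.

.. code:: cpp

  #include <cassert>
  #include <cstddef>
  #include <map>
  #include <string>

  using ToyOptions = std::map<std::string, std::string>;

  // Parse "Key=Value:Key=Value" into Out, overwriting existing keys.
  // Returns false on a token with no '=' or with an empty key.
  bool parseToyOptions(const std::string &Str, ToyOptions &Out) {
    std::size_t Pos = 0;
    while (Pos <= Str.size()) {
      std::size_t End = Str.find(':', Pos);
      if (End == std::string::npos)
        End = Str.size();
      assert(End >= Pos); // find() never returns a position before Pos.
      std::string Token = Str.substr(Pos, End - Pos);
      Pos = End + 1;
      if (Token.empty())
        continue; // Tolerate empty strings and stray colons.
      std::size_t Eq = Token.find('=');
      if (Eq == std::string::npos || Eq == 0)
        return false;
      Out[Token.substr(0, Eq)] = Token.substr(Eq + 1);
    }
    return true;
  }

  // Layer the three sources from lowest to highest priority. A source that
  // fails to parse is skipped as a whole.
  ToyOptions resolveToyOptions(const std::string &CompileTimeDefault,
                               const std::string &FromFunction,
                               const std::string &FromEnvironment) {
    ToyOptions Result;
    for (const std::string *Source :
         {&CompileTimeDefault, &FromFunction, &FromEnvironment}) {
      ToyOptions Parsed;
      if (parseToyOptions(*Source, Parsed))
        for (const auto &KV : Parsed)
          Result[KV.first] = KV.second;
    }
    return Result;
  }

With this sketch, ``resolveToyOptions("SampleRate=5000:Enabled=true",
"SampleRate=1000", "SampleRate=100")`` resolves ``SampleRate`` to ``100`` and
keeps ``Enabled`` at ``true``.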

The following options are available:

+----------------------------+---------+--------------------------------------------------------------------------------+
| Option                     | Default | Description                                                                    |
+----------------------------+---------+--------------------------------------------------------------------------------+
| Enabled                    | true    | Is GWP-ASan enabled?                                                           |
+----------------------------+---------+--------------------------------------------------------------------------------+
| PerfectlyRightAlign        | false   | When allocations are right-aligned, should we perfectly align them up to the   |
|                            |         | page boundary? By default (false), we round up allocation size to the nearest  |
|                            |         | power of two (2, 4, 8, 16) up to a maximum of 16-byte alignment for            |
|                            |         | performance reasons. Setting this to true can find single byte                 |
|                            |         | buffer-overflows at the cost of performance, and may be incompatible with      |
|                            |         | some architectures.                                                            |
+----------------------------+---------+--------------------------------------------------------------------------------+
| MaxSimultaneousAllocations | 16      | Number of simultaneously-guarded allocations available in the pool.            |
+----------------------------+---------+--------------------------------------------------------------------------------+
| SampleRate                 | 5000    | The probability (1 / SampleRate) that an allocation is selected for GWP-ASan   |
|                            |         | sampling. Sample rates up to (2^31 - 1) are supported.                         |
+----------------------------+---------+--------------------------------------------------------------------------------+
| InstallSignalHandlers      | true    | Install GWP-ASan signal handlers for SIGSEGV during dynamic loading. This      |
|                            |         | allows better error reports by providing stack traces for allocation and       |
|                            |         | deallocation when reporting a memory error. GWP-ASan's signal handler will     |
|                            |         | forward the signal to any previously-installed handler, and user programs      |
|                            |         | that install further signal handlers should make sure they do the same. Note,  |
|                            |         | if the previously installed SIGSEGV handler is SIG_IGN, we terminate the       |
|                            |         | process after dumping the error report.                                        |
+----------------------------+---------+--------------------------------------------------------------------------------+
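
One way to realise a ``1 / SampleRate`` selection probability cheaply is a
countdown that is re-armed with a random gap whose mean is ``SampleRate``. The
sketch below shows that idea. ``ToySampler`` is a hypothetical class and the
uniform gap distribution is an assumption made for illustration, not a
description of the scheme the library uses.

.. code:: cpp

  #include <cassert>
  #include <cstdint>
  #include <random>

  class ToySampler {
  public:
    ToySampler(std::uint32_t SampleRate, std::uint32_t Seed)
        : Rate(SampleRate), Rng(Seed) {
      assert(SampleRate >= 1 && SampleRate <= 0x7fffffffu);
      reset();
    }

    // Returns true for, on average, one call in every SampleRate calls.
    bool shouldSample() {
      if (--Countdown != 0)
        return false;
      reset();
      return true;
    }

  private:
    // A gap drawn uniformly from [1, 2 * Rate - 1] has a mean of Rate.
    void reset() {
      std::uniform_int_distribution<std::uint32_t> Gap(1, 2 * Rate - 1);
      Countdown = Gap(Rng);
    }

    std::uint32_t Rate;
    std::mt19937 Rng;
    std::uint32_t Countdown = 0;
  };

The fast path in this sketch is a single decrement and comparison, and the
random gap avoids sampling at a fixed, predictable interval. With a rate of 1,
every call is sampled.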

Example
-------

The code below has a use-after-free bug: the ``string_view`` is created as a
view of the temporary result of ``operator+`` on ``std::string``, and that
temporary is destroyed at the end of line 7. The use-after-free occurs when
``sv`` is dereferenced on line 8.

.. code:: cpp

  1: #include <iostream>
  2: #include <string>
  3: #include <string_view>
  4:
  5: int main() {
  6:   std::string s = "Hellooooooooooooooo ";
  7:   std::string_view sv = s + "World\n";
  8:   std::cout << sv;
  9: }

Compiling this code with Scudo+GWP-ASan will probabilistically catch this bug
and provide us a detailed error report:

.. code:: console

  $ clang++ -fsanitize=scudo -std=c++17 -g buggy_code.cpp
  $ for i in `seq 1 200`; do
      GWP_ASAN_OPTIONS="SampleRate=100" ./a.out > /dev/null;
    done
  |
  | *** GWP-ASan detected a memory error ***
  | Use after free at 0x7feccab26000 (0 bytes into a 41-byte allocation at 0x7feccab26000) by thread 31027 here:
  |   ...
  |   #9 ./a.out(_ZStlsIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_St17basic_string_viewIS3_S4_E+0x45) [0x55585c0afa55]
  |   #10 ./a.out(main+0x9f) [0x55585c0af7cf]
  |   #11 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7fecc966952b]
  |   #12 ./a.out(_start+0x2a) [0x55585c0867ba]
  |
  | 0x7feccab26000 was deallocated by thread 31027 here:
  |   ...
  |   #7 ./a.out(main+0x83) [0x55585c0af7b3]
  |   #8 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7fecc966952b]
  |   #9 ./a.out(_start+0x2a) [0x55585c0867ba]
  |
  | 0x7feccab26000 was allocated by thread 31027 here:
  |   ...
  |   #12 ./a.out(main+0x57) [0x55585c0af787]
  |   #13 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7fecc966952b]
  |   #14 ./a.out(_start+0x2a) [0x55585c0867ba]
  |
  | *** End GWP-ASan report ***
  | Segmentation fault

To symbolize these stack traces, some care has to be taken. Scudo currently uses
GNU's ``backtrace_symbols()`` from ``<execinfo.h>`` to unwind. The unwinder
provides human-readable stack traces in ``function+offset`` form, rather than
the normal ``binary+offset`` form. In order to use addr2line or similar tools to
recover the exact line number, we must convert the ``function+offset`` to
``binary+offset``. A helper script is available at
``compiler-rt/lib/gwp_asan/scripts/symbolize.sh``. Using this script will
attempt to symbolize each possible line, falling back to the previous output if
anything fails. This results in the following output:

.. code:: console

  $ cat my_gwp_asan_error.txt | symbolize.sh
  |
  | *** GWP-ASan detected a memory error ***
  | Use after free at 0x7feccab26000 (0 bytes into a 41-byte allocation at 0x7feccab26000) by thread 31027 here:
  | ...
  | #9 /usr/lib/gcc/x86_64-linux-gnu/8.0.1/../../../../include/c++/8.0.1/string_view:547
  | #10 /tmp/buggy_code.cpp:8
  |
  | 0x7feccab26000 was deallocated by thread 31027 here:
  | ...
  | #7 /tmp/buggy_code.cpp:8
  | #8 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7fecc966952b]
  | #9 ./a.out(_start+0x2a) [0x55585c0867ba]
  |
  | 0x7feccab26000 was allocated by thread 31027 here:
  | ...
  | #12 /tmp/buggy_code.cpp:7
  | #13 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7fecc966952b]
  | #14 ./a.out(_start+0x2a) [0x55585c0867ba]
  |
  | *** End GWP-ASan report ***
  | Segmentation fault