==========================
UndefinedBehaviorSanitizer
==========================

.. contents::
   :local:

Introduction
============

UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector.
UBSan modifies the program at compile time to catch various kinds of undefined
behavior during program execution, for example:

* Using a misaligned or null pointer
* Signed integer overflow
* Conversion to, from, or between floating-point types which would
  overflow the destination

See the full list of available :ref:`checks <ubsan-checks>` below.

UBSan has an optional run-time library which provides better error reporting.
The checks have a small runtime cost and no impact on address space layout or ABI.

How to build
============

Build LLVM/Clang with `CMake <http://llvm.org/docs/CMake.html>`_.

Usage
=====

Use ``clang++`` to compile and link your program with the
``-fsanitize=undefined`` flag. Make sure to use ``clang++`` (not ``ld``) as the
linker, so that your executable is linked with the proper UBSan runtime
libraries. You can use ``clang`` instead of ``clang++`` if you're
compiling/linking C code.

.. code-block:: console

  % cat test.cc
  int main(int argc, char **argv) {
    int k = 0x7fffffff;
    k += argc;
    return 0;
  }
  % clang++ -fsanitize=undefined test.cc
  % ./a.out
  test.cc:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
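
The report points at ``k += argc``, where ``2147483647 + 1`` does not fit in an
``int``. As a rough illustration of that condition (a sketch only, not UBSan's
actual instrumentation), the same check can be written by hand.
``checked_add`` is a hypothetical helper name; ``__builtin_add_overflow`` is a
GCC/Clang builtin, so the sketch assumes one of those compilers.

```cpp
#include <climits>

// Hypothetical helper, not part of UBSan: stores a + b in *out and returns
// true if the sum fits in an int; returns false if it would overflow.
bool checked_add(int a, int b, int *out) {
  int result;
  // __builtin_add_overflow (GCC/Clang builtin) returns true when the
  // mathematically exact sum cannot be represented in the result type.
  if (__builtin_add_overflow(a, b, &result))
    return false;
  *out = result;
  return true;
}
```

With this helper, ``checked_add(0x7fffffff, 1, &sum)`` returns ``false``
instead of performing the overflowing addition.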

You can enable only a subset of :ref:`checks <ubsan-checks>` offered by UBSan,
and define the desired behavior for each kind of check:

* print a verbose error report and continue execution (default);
* print a verbose error report and exit the program;
* execute a trap instruction (doesn't require UBSan run-time support).

For example, if you compile/link your program as:

.. code-block:: console

  % clang++ -fsanitize=signed-integer-overflow,null,alignment -fno-sanitize-recover=null -fsanitize-trap=alignment

the program will continue execution after signed integer overflows, exit after
the first invalid use of a null pointer, and trap after the first use of a
misaligned pointer.

.. _ubsan-checks:

Available checks
================

Available checks are:

  -  ``-fsanitize=alignment``: Use of a misaligned pointer or creation
     of a misaligned reference.
  -  ``-fsanitize=bool``: Load of a ``bool`` value which is neither
     ``true`` nor ``false``.
  -  ``-fsanitize=bounds``: Out of bounds array indexing, in cases
     where the array bound can be statically determined.
  -  ``-fsanitize=enum``: Load of a value of an enumerated type which
     is not in the range of representable values for that enumerated
     type.
  -  ``-fsanitize=float-cast-overflow``: Conversion to, from, or
     between floating-point types which would overflow the
     destination.
  -  ``-fsanitize=float-divide-by-zero``: Floating point division by
     zero.
  -  ``-fsanitize=function``: Indirect call of a function through a
     function pointer of the wrong type (Linux, C++ and x86/x86_64 only).
  -  ``-fsanitize=integer-divide-by-zero``: Integer division by zero.
  -  ``-fsanitize=nonnull-attribute``: Passing a null pointer as a function
     parameter which is declared to never be null.
  -  ``-fsanitize=null``: Use of a null pointer or creation of a null
     reference.
  -  ``-fsanitize=object-size``: An attempt to use bytes which the
     optimizer can determine are not part of the object being
     accessed. The sizes of objects are determined using
     ``__builtin_object_size``, and consequently may be able to detect
     more problems at higher optimization levels.
  -  ``-fsanitize=return``: In C++, reaching the end of a
     value-returning function without returning a value.
  -  ``-fsanitize=returns-nonnull-attribute``: Returning a null pointer
     from a function which is declared to never return null.
  -  ``-fsanitize=shift``: Shift operators where the amount shifted is
     greater than or equal to the promoted bit-width of the left hand side
     or less than zero, or where the left hand side is negative. For a
     signed left shift, also checks for signed overflow in C, and for
     unsigned overflow in C++. You can use ``-fsanitize=shift-base`` or
     ``-fsanitize=shift-exponent`` to check only the left-hand side or
     the right-hand side of the shift operation, respectively.
  -  ``-fsanitize=signed-integer-overflow``: Signed integer overflow,
     including all the checks added by ``-ftrapv``, and checking for
     overflow in signed division (``INT_MIN / -1``).
  -  ``-fsanitize=unreachable``: If control flow reaches
     ``__builtin_unreachable``.
  -  ``-fsanitize=unsigned-integer-overflow``: Unsigned integer
     overflows.
  -  ``-fsanitize=vla-bound``: A variable-length array whose bound
     does not evaluate to a positive value.
  -  ``-fsanitize=vptr``: Use of an object whose vptr indicates that
     it is of the wrong dynamic type, or that its lifetime has not
     begun or has ended. Incompatible with ``-fno-rtti``. Linking must
     be performed by ``clang++``, not ``clang``, to make sure C++-specific
     parts of the runtime library and C++ standard libraries are present.
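
To make two of the integer checks above concrete, here is a minimal sketch of
a division guarded against both undefined cases: a zero divisor
(``integer-divide-by-zero``) and ``INT_MIN / -1`` (covered by
``signed-integer-overflow``). ``checked_div`` is a hypothetical helper used
only for illustration; it is not part of UBSan or its runtime.

```cpp
#include <climits>

// Hypothetical helper: stores lhs / rhs in *out and returns true when the
// division is well defined; returns false for the two undefined cases.
bool checked_div(int lhs, int rhs, int *out) {
  if (rhs == 0)  // would be reported by -fsanitize=integer-divide-by-zero
    return false;
  if (lhs == INT_MIN && rhs == -1)  // quotient does not fit in an int
    return false;
  *out = lhs / rhs;
  return true;
}
```

Code that validates its operands this way never reaches the instrumented
failure path, so neither check has anything to report for it.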

You can also use the following check groups:

  -  ``-fsanitize=undefined``: All of the checks listed above other than
     ``unsigned-integer-overflow``.
  -  ``-fsanitize=undefined-trap``: Deprecated alias of
     ``-fsanitize=undefined``.
  -  ``-fsanitize=integer``: Checks for undefined or suspicious integer
     behavior (e.g. unsigned integer overflow).
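
Unsigned overflow is listed as "suspicious" rather than undefined because the
language defines unsigned arithmetic to wrap modulo 2^N. The sketch below
(``wrapping_add`` is a hypothetical name) is therefore well defined, yet would
still be flagged when ``unsigned-integer-overflow`` is enabled.

```cpp
#include <cstdint>

// Unsigned arithmetic wraps modulo 2^32 here, so this function has no
// undefined behavior even when the exact sum does not fit in 32 bits.
std::uint32_t wrapping_add(std::uint32_t a, std::uint32_t b) {
  return a + b;
}
```

For instance, ``wrapping_add(0xFFFFFFFFu, 1u)`` yields ``0``.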

Stack traces and report symbolization
=====================================

If you want UBSan to print a symbolized stack trace for each error report, you
will need to:

#. Compile with ``-g`` and ``-fno-omit-frame-pointer`` to get proper debug
   information in your binary.
#. Run your program with the environment variable
   ``UBSAN_OPTIONS=print_stacktrace=1``.
#. Make sure the ``llvm-symbolizer`` binary is in ``PATH``.

Issue Suppression
=================

UndefinedBehaviorSanitizer is not expected to produce false positives.
If you see one, look again; most likely it is a true positive!

Disabling Instrumentation with ``__attribute__((no_sanitize("undefined")))``
----------------------------------------------------------------------------

You can disable UBSan checks for particular functions with
``__attribute__((no_sanitize("undefined")))``. You can use all values of the
``-fsanitize=`` flag in this attribute, e.g. if your function deliberately
contains possible signed integer overflow, you can use
``__attribute__((no_sanitize("signed-integer-overflow")))``.

This attribute may not be supported by other compilers, so consider using it
together with ``#if defined(__clang__)``.
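
One way to follow that advice is to hide the attribute behind a macro that
expands to nothing on other compilers. The sketch below is an illustration
under assumptions: ``NO_SANITIZE_UNSIGNED_OVERFLOW`` and ``fnv1a`` are
hypothetical names, and the 32-bit FNV-1a hash is used only because its
multiplication is meant to wrap around.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical macro name: expands to the attribute under Clang and to
// nothing for compilers that may not understand it.
#if defined(__clang__)
#define NO_SANITIZE_UNSIGNED_OVERFLOW \
  __attribute__((no_sanitize("unsigned-integer-overflow")))
#else
#define NO_SANITIZE_UNSIGNED_OVERFLOW
#endif

// 32-bit FNV-1a hash. The multiplication wraps modulo 2^32 by design, so the
// unsigned-integer-overflow check is switched off for this function only.
NO_SANITIZE_UNSIGNED_OVERFLOW
std::uint32_t fnv1a(const char *data, std::size_t size) {
  std::uint32_t hash = 2166136261u;  // FNV offset basis
  for (std::size_t i = 0; i < size; ++i) {
    hash ^= static_cast<unsigned char>(data[i]);
    hash *= 16777619u;  // FNV prime; intended to wrap around
  }
  return hash;
}
```

The rest of the program stays instrumented; only the annotated function is
exempt from the named check.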

Suppressing Errors in Recompiled Code (Blacklist)
-------------------------------------------------

UndefinedBehaviorSanitizer supports ``src`` and ``fun`` entity types in
:doc:`SanitizerSpecialCaseList`, which can be used to suppress error reports
in the specified source files or functions.

Supported Platforms
===================

UndefinedBehaviorSanitizer is supported on the following operating systems:

* Android
* Linux
* FreeBSD
* OS X 10.6 onwards

and for the following architectures:

* i386/x86\_64
* ARM
* AArch64
* PowerPC64
* MIPS/MIPS64

Current Status
==============

UndefinedBehaviorSanitizer is available on selected platforms starting from
LLVM 3.3. The test suite is integrated into the CMake build and can be run with
the ``check-ubsan`` command.

More Information
================

* From LLVM project blog:
  `What Every C Programmer Should Know About Undefined Behavior
  <http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>`_
* From John Regehr's *Embedded in Academia* blog:
  `A Guide to Undefined Behavior in C and C++
  <http://blog.regehr.org/archives/213>`_
