1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" 2 "http://www.w3.org/TR/html4/strict.dtd"> 3<html> 4<head> 5 <title>Open Projects</title> 6 <link type="text/css" rel="stylesheet" href="menu.css"> 7 <link type="text/css" rel="stylesheet" href="content.css"> 8 <script type="text/javascript" src="scripts/menu.js"></script> 9</head> 10<body> 11 12<div id="page"> 13<!--#include virtual="menu.html.incl"--> 14<div id="content"> 15 16<h1>Open Projects</h1> 17 18<p>This page lists several projects that would boost analyzer's usability and 19power. Most of the projects listed here are infrastructure-related so this list 20is an addition to the <a href="potential_checkers.html">potential checkers 21list</a>. If you are interested in tackling one of these, please send an email 22to the <a href=https://lists.llvm.org/mailman/listinfo/cfe-dev>cfe-dev 23mailing list</a> to notify other members of the community.</p> 24 25<ul> 26 <li>Release checkers from "alpha" 27 <p>New checkers which were contributed to the analyzer, 28 but have not passed a rigorous evaluation process, 29 are committed as "alpha checkers" (from "alpha version"), 30 and are not enabled by default.</p> 31 32 <p>Ideally, only the checkers which are actively being worked on should be in 33 "alpha", 34 but over the years the development of many of those has stalled. 35 Such checkers should either be improved 36 up to a point where they can be enabled by default, 37 or removed from the analyzer entirely. 38 39 <ul> 40 <li><code>alpha.security.ArrayBound</code> and 41 <code>alpha.security.ArrayBoundV2</code> 42 <p>Array bounds checking is a desired feature, 43 but having an acceptable rate of false positives might not be possible 44 without a proper 45 <a href="https://en.wikipedia.org/wiki/Widening_(computer_science)">loop widening</a> support. 46 Additionally, it might be more promising to perform index checking based on 47 <a href="https://en.wikipedia.org/wiki/Taint_checking">tainted</a> index values. 48 <p><i>(Difficulty: Medium)</i></p></p> 49 </li> 50 51 <li><code>alpha.unix.StreamChecker</code> 52 <p>A SimpleStreamChecker has been presented in the Building a Checker in 24 53 Hours talk 54 (<a href="https://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a> 55 <a href="https://youtu.be/kdxlsP5QVPw">video</a>).</p> 56 57 <p>This alpha checker is an attempt to write a production grade stream checker. 58 However, it was found to have an unacceptably high false positive rate. 59 One of the found problems was that eagerly splitting the state 60 based on whether the system call may fail leads to too many reports. 61 A <em>delayed</em> split where the implication is stored in the state 62 (similarly to nullability implications in <code>TrustNonnullChecker</code>) 63 may produce much better results.</p> 64 <p><i>(Difficulty: Medium)</i></p> 65 </li> 66 </ul> 67 </li> 68 69 <li>Improve C++ support 70 <ul> 71 <li>Handle construction as part of aggregate initialization. 72 <p><a href="https://en.cppreference.com/w/cpp/language/aggregate_initialization">Aggregates</a> 73 are objects that can be brace-initialized without calling a 74 constructor (that is, <code><a href="https://clang.llvm.org/doxygen/classclang_1_1CXXConstructExpr.html"> 75 CXXConstructExpr</a></code> does not occur in the AST), 76 but potentially calling 77 constructors for their fields and base classes 78 These 79 constructors of sub-objects need to know what object they are constructing. 80 Moreover, if the aggregate contains 81 references, lifetime extension needs to be properly modeled. 82 83 One can start untangling this problem by trying to replace the 84 current ad-hoc <code><a href="https://clang.llvm.org/doxygen/classclang_1_1ParentMap.html"> 85 ParentMap</a></code> lookup in <a href="https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#l00430"> 86 <code>CXXConstructExpr::CK_NonVirtualBase</code></a> branch of 87 <code>ExprEngine::VisitCXXConstructExpr()</code> 88 with proper support for the feature. 89 <p><i>(Difficulty: Medium) </i></p></p> 90 </li> 91 92 <li>Handle array constructors. 93 <p>When an array of objects is allocated (say, using the 94 <code>operator new[]</code> or defining a stack array), 95 constructors for all elements of the array are called. 96 We should model (potentially some of) such evaluations, 97 and the same applies for destructors called from 98 <code>operator delete[]</code>. 99 See tests cases in <a href="https://github.com/llvm/llvm-project/tree/master/clang/test/Analysis/handle_constructors_with_new_array.cpp">handle_constructors_with_new_array.cpp</a>. 100 </p> 101 <p> 102 Constructing an array requires invoking multiple (potentially unknown) 103 amount of constructors with the same construct-expression. 104 Apart from the technical difficulties of juggling program points around 105 correctly to avoid accidentally merging paths together, we'll have to 106 be a judge on when to exit the loop and how to widen it. 107 Given that the constructor is going to be a default constructor, 108 a nice 95% solution might be to execute exactly one constructor and 109 then default-bind the resulting LazyCompoundVal to the whole array; 110 it'll work whenever the default constructor doesn't touch global state 111 but only initializes the object to various default values. 112 But if, say, we're making an array of strings, 113 depending on the implementation you might have to allocate a new buffer 114 for each string, and in this case default-binding won't cut it. 115 We might want to come up with an auxiliary analysis in order to perform 116 widening of these simple loops more precisely. 117 </p> 118 </li> 119 120 <li>Handle constructors that can be elided due to Named Return Value Optimization (NRVO) 121 <p>Local variables which are returned by values on all return statements 122 may be stored directly at the address for the return value, 123 eliding the copy or move constructor call. 124 Such variables can be identified using the AST call <code>VarDecl::isNRVOVariable</code>. 125 </p> 126 </li> 127 128 <li>Handle constructors of lambda captures 129 <p>Variables which are captured by value into a lambda require a call to 130 a copy constructor. 131 This call is not currently modeled. 132 </p> 133 </li> 134 135 <li>Handle constructors for default arguments 136 <p>Default arguments in C++ are recomputed at every call, 137 and are therefore local, and not static, variables. 138 See tests cases in <a href="https://github.com/llvm/llvm-project/tree/master/clang/test/Analysis/handle_constructors_for_default_arguments.cpp">handle_constructors_for_default_arguments.cpp</a>. 139 </p> 140 <p> 141 Default arguments are annoying because the initializer expression is 142 evaluated at the call site but doesn't syntactically belong to the 143 caller's AST; instead it belongs to the ParmVarDecl for the default 144 parameter. This can lead to situations when the same expression has to 145 carry different values simultaneously - 146 when multiple instances of the same function are evaluated as part of the 147 same full-expression without specifying the default arguments. 148 Even simply calling the function twice (not necessarily within the 149 same full-expression) may lead to program points agglutinating because 150 it's the same expression. There are some nasty test cases already 151 in temporaries.cpp (struct DefaultParam and so on). I recommend adding a 152 new LocationContext kind specifically to deal with this problem. It'll 153 also help you figure out the construction context when you evaluate the 154 construct-expression (though you might still need to do some additional 155 CFG work to get construction contexts right). 156 </p> 157 </li> 158 159 <li>Enhance the modeling of the standard library. 160 <p>The analyzer needs a better understanding of STL in order to be more 161 useful on C++ codebases. 162 While full library modeling is not an easy task, 163 large gains can be achieved by supporting only a few cases: 164 e.g. calling <code>.length()</code> on an empty 165 <code>std::string</code> always yields zero. 166 <p><i>(Difficulty: Medium)</i></p><p> 167 </li> 168 169 <li>Enhance CFG to model exception-handling. 170 <p>Currently exceptions are treated as "black holes", and exception-handling 171 control structures are poorly modeled in order to be conservative. 172 This could be improved for both C++ and Objective-C exceptions. 173 <p><i>(Difficulty: Hard)</i></p></p> 174 </li> 175 </ul> 176 </li> 177 178 <li>Core Analyzer Infrastructure 179 <ul> 180 <li>Handle unions. 181 <p>Currently in the analyzer the value of a union is always regarded as 182 an unknown. 183 This problem was 184 previously <a href="https://lists.llvm.org/pipermail/cfe-dev/2017-March/052864.html">discussed</a> 185 on the mailing list, but no solution was implemented. 186 <p><i> (Difficulty: Medium) </i></p></p> 187 </li> 188 189 <li>Floating-point support. 190 <p>Currently, the analyzer treats all floating-point values as unknown. 191 This project would involve adding a new <code>SVal</code> kind 192 for constant floats, generalizing the constraint manager to handle floats, 193 and auditing existing code to make sure it doesn't 194 make incorrect assumptions (most notably, that <code>X == X</code> 195 is always true, since it does not hold for <code>NaN</code>). 196 <p><i> (Difficulty: Medium)</i></p></p> 197 </li> 198 199 <li>Improved loop execution modeling. 200 <p>The analyzer simply unrolls each loop <tt>N</tt> times before 201 dropping the path, for a fixed constant <tt>N</tt>. 202 However, that results in lost coverage in cases where the loop always 203 executes more than <tt>N</tt> times. 204 A Google Summer Of Code 205 <a href="https://summerofcode.withgoogle.com/archive/2017/projects/6071606019358720/">project</a> 206 was completed to make the loop bound parameterizable, 207 but the <a href="https://en.wikipedia.org/wiki/Widening_(computer_science)">widening</a> 208 problem still remains open. 209 210 <p><i> (Difficulty: Hard)</i></p></p> 211 </li> 212 213 <li>Basic function summarization support 214 <p>The analyzer performs inter-procedural analysis using 215 either inlining or "conservative evaluation" (invalidating all data 216 passed to the function). 217 Often, a very simple summary 218 (e.g. "this function is <a href="https://en.wikipedia.org/wiki/Pure_function">pure</a>") would be 219 enough to be a large improvement over conservative evaluation. 220 Such summaries could be obtained either syntactically, 221 or using a dataflow framework. 222 <p><i>(Difficulty: Hard)</i></p><p> 223 </li> 224 225 <li>Implement a dataflow flamework. 226 <p>The analyzer core 227 implements a <a href="https://en.wikipedia.org/wiki/Symbolic_execution">symbolic execution</a> 228 engine, which performs checks 229 (use-after-free, uninitialized value read, etc.) 230 over a <em>single</em> program path. 231 However, many useful properties 232 (dead code, check-after-use, etc.) require 233 reasoning over <em>all</em> possible in a program. 234 Such reasoning requires a 235 <a href="https://en.wikipedia.org/wiki/Data-flow_analysis">dataflow analysis</a> framework. 236 Clang already implements 237 a few dataflow analyses (most notably, liveness), 238 but they implemented in an ad-hoc fashion. 239 A proper framework would enable us writing many more useful checkers. 240 <p><i> (Difficulty: Hard) </i></p></p> 241 </li> 242 243 <li>Track type information through casts more precisely. 244 <p>The <code>DynamicTypePropagation</code> 245 checker is in charge of inferring a region's 246 dynamic type based on what operations the code is performing. 247 Casts are a rich source of type information that the analyzer currently ignores. 248 <p><i>(Difficulty: Medium)</i></p></p> 249 </li> 250 251 </ul> 252 </li> 253 254 <li>Fixing miscellaneous bugs 255 <p>Apart from the open projects listed above, 256 contributors are welcome to fix any of the outstanding 257 <a href="https://bugs.llvm.org/buglist.cgi?component=Static%20Analyzer&list_id=147756&product=clang&resolution=---">bugs</a> 258 in the Bugzilla. 259 <p><i>(Difficulty: Anything)</i></p></p> 260 </li> 261 262</ul> 263 264</div> 265</div> 266</body> 267</html> 268