PerformanceTips.rst - OpenGrok cross reference for /external/llvm/docs/Frontend/PerformanceTips.rst

Lines Matching +refs:is +refs:effective +refs:target
12 The intended audience of this document is developers of language frontends 
13 targeting LLVM IR. This document is home to a collection of tips on how to 
24 mature frontend for LLVM is Clang.  As a result, the further your IR gets from what Clang might emi…
37    target triple. Without these pieces, none of the target-specific optimization
42    make LLVM's inter-procedural optimizations much more effective.
45    of predecessors).  Among other issues, the register allocator is known to 
47    this guidance is that a unified return block with high in-degree is fine.
57 multiple basic blocks. The end result is that a following alloca instruction 
62 SSA is the canonical form expected by much of the optimizer; if allocas
63 cannot be eliminated by Mem2Reg or SROA, the optimizer is likely to be less 
64 effective than it could be.
75 be an effective way to represent collections of small packed fields.  
80 On some architectures (X86_64 is one), sign extension can involve an extra 
109 The alignment is used to guarantee the alignment on allocas and globals, 
110 though in most cases this is unnecessary (most targets have a sufficiently 
111 high default alignment that they’ll be fine).  It is also used to provide a 
113 it is undefined behavior’.  This means that the back end is free to emit 
121 As a result, alignment is mandatory for atomic loads and stores.
137 #. If calling a function which is known to throw an exception (unwind), use 
142    desired.  This is generally not required because the optimizer will convert
146    dynamic profiling information is not available.  This can make a large 
151    block is a loop exiting conditional branch, the effectiveness of LICM will
152    be limited for loads not in the header.  (This is due to the fact that LLVM 
153    may not know such a load is safe to speculatively execute and thus can't 
155    condition is not taken.)  It can be profitable, in some cases, to emit such 
158    apply if the condition which terminates the loop header is itself invariant,
165    improvement.  Note that this is not always profitable and does involve a 
170    the type of comparison is inverted, but GVN only runs late in the pipeline.
177    is quite good at reasoning about general control flow and arithmetic, it is
180    intrinsics itself late in the optimization pipeline.  It is *very* rarely 
186    that fact is critical for optimization purposes.  Assumes are a great 
188    time and optimization effectiveness.  The former is fixable with enough 
189    effort, but the latter is fairly fundamental to their designed purpose.
203 additional semantic information.  It is *strongly* recommended that you become 
204 highly familiar with this document.  The list below is intended to highlight a 
205 couple of items of particular interest, but is by no means exhaustive.
209 #. Add nsw/nuw flags as appropriate.  Reasoning about overflow is 
248 One of the most common mistakes made by new language frontend projects is to 
249 use the existing -O2 or -O3 pass pipelines as is.  These pass pipelines make a
251 been carefully tuned for C and C++, not your target language.  You will almost 
258    which is tuned for C and C++ applications, may not be sufficient to remove 
261 #. If your language uses range checks, consider using the IRCE pass.  It is not 
264 #. A useful sanity check to run is to run your optimized IR back through the 
275 need to ensure that your proposal is sufficiently general so that it benefits 
294 context you are able to give to your question, the more likely it is to be