
C/C++ with LLVM. It is aimed at both users who want to compile CUDA with LLVM

CUDA support is still in development and works best with the trunk version
of LLVM. Below is a quick summary of downloading and building the trunk
The command line for compilation is similar to what you would use for C++.

``<CUDA install path>`` is the root directory where you installed the CUDA SDK,
typically ``/usr/local/cuda``. ``<GPU arch>`` is `the compute capability of
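Putting the pieces together, a typical invocation looks like the sketch below. The file name ``axpy.cu`` and the architecture ``sm_35`` are placeholders for illustration; substitute your own source file, GPU architecture, and CUDA install path.

```shell
# Compile a CUDA source file with clang for a GPU of compute capability 3.5.
# axpy.cu, sm_35, and /usr/local/cuda are placeholders -- adjust for your setup.
clang++ axpy.cu -o axpy \
    --cuda-path=/usr/local/cuda \
    --cuda-gpu-arch=sm_35 \
    -L/usr/local/cuda/lib64 \
    -lcudart_static -ldl -lrt -pthread
```

The ``-lcudart_static -ldl -lrt -pthread`` group links the static CUDA runtime and its system-library dependencies.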
Although clang's CUDA implementation is largely compatible with NVCC's, you may

This is tricky, because NVCC may invoke clang as part of its own compilation

When clang is actually compiling CUDA code -- rather than being used as a
subtool of NVCC's -- it defines the ``__CUDA__`` macro. ``__CUDA_ARCH__`` is
defined only in device mode (but will be defined if NVCC is using clang as a
preprocessor).
equivalents, but because the intermediate result in an fma is not rounded,
* ``-fcuda-flush-denormals-to-zero`` (default: off) When this is enabled,

* ``-fcuda-approx-transcendentals`` (default: off) When this is enabled, the

  This is implied by ``-ffast-math``.
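Both flags can be passed together on an ordinary compile line; a sketch, with ``kernel.cu`` and ``sm_35`` as placeholder names:

```shell
# Trade accuracy for speed in device code: flush denormals to zero and use
# fast approximate transcendentals (both are also implied by -ffast-math).
clang++ kernel.cu -o kernel --cuda-gpu-arch=sm_35 \
    -fcuda-flush-denormals-to-zero -fcuda-approx-transcendentals
```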
typical CPU has branch prediction, out-of-order execution, and is superscalar,

customizable target-independent optimization pipeline.
control flow transfer on a GPU is more expensive. They also promote other

code by over 10x. An empirical inline threshold for GPUs is 1100. This
configuration has yet to be upstreamed with a target-specific optimization
pipeline.
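Until a target-specific pipeline carries this default, the threshold can be raised by hand. A sketch, assuming ``-mllvm -inline-threshold`` (an internal LLVM option, not a stable interface) and placeholder names ``axpy.cu``/``sm_35``:

```shell
# Raise LLVM's inline threshold to the empirical GPU value of 1100.
# -mllvm forwards the option to LLVM's internals; it may change across releases.
clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_35 -mllvm -inline-threshold=1100
```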
<http://llvm.org/docs/doxygen/html/SpeculativeExecution_8cpp_source.html>`_ is

effective on code along dominator paths.
target-specific alias analysis.