======================
Using Polly with Clang
======================

This documentation discusses how Polly can be used in Clang to automatically
optimize C/C++ code during compilation.


.. warning::

  clang/LLVM/Polly need to be in sync (compiled from the same SVN
  revision).

Make Polly available from Clang
===============================

Polly is available through clang, opt, and bugpoint, if Polly was checked out
into tools/polly before compilation. No further configuration is needed.

Optimizing with Polly
=====================

Optimizing with Polly is as easy as adding -O3 -mllvm -polly to your compiler
flags. Polly is not available unless optimizations are enabled, for example
with -O1, -O2, or -O3; optimizing for size with -Os or -Oz is not recommended.

.. code-block:: console

  clang -O3 -mllvm -polly file.c

Automatic OpenMP code generation
================================

To automatically detect parallel loops and generate OpenMP code for them, you
also need to add -mllvm -polly-parallel -lgomp to your CFLAGS.

.. code-block:: console

  clang -O3 -mllvm -polly -mllvm -polly-parallel -lgomp file.c

Switching the OpenMP backend
----------------------------

The following command-line switch selects Polly's OpenMP backend:

       -polly-omp-backend[=BACKEND]
              choose the OpenMP backend; BACKEND can be 'GNU' (the default) or 'LLVM';

The OpenMP backends can be further influenced using the following command-line
switches:

       -polly-num-threads[=NUM]
              set the number of threads to use; NUM may be any positive integer (default: 0, which equals automatic/OMP runtime);

       -polly-scheduling[=SCHED]
              set the OpenMP scheduling type; SCHED can be 'static', 'dynamic', 'guided' or 'runtime' (the default);

       -polly-scheduling-chunksize[=CHUNK]
              set the chunksize (for the selected scheduling type); CHUNK may be any strictly positive integer (otherwise it will default to 1);

Note that at the time of writing, the GNU backend may only use the
`polly-num-threads` and `polly-scheduling` switches, where the latter also has
to be set to "runtime".

Example: use the alternative backend with dynamic scheduling, four threads, and
a chunksize of one (only the additional switches are shown).

.. code-block:: console

  -mllvm -polly-omp-backend=LLVM -mllvm -polly-num-threads=4
  -mllvm -polly-scheduling=dynamic -mllvm -polly-scheduling-chunksize=1

Automatic Vector code generation
================================

Automatic vector code generation can be enabled by adding -mllvm
-polly-vectorizer=stripmine to your CFLAGS.

.. code-block:: console

  clang -O3 -mllvm -polly -mllvm -polly-vectorizer=stripmine file.c

Isolate the Polly passes
========================

Polly's analysis and transformation passes run alongside many other
passes in the pass manager's pipeline. Some of the passes that run before
Polly are essential for it to work, for instance the canonicalization
of loops. Therefore Polly is unable to optimize code straight out of
clang's -O0 output.

To get the LLVM-IR that Polly sees in the optimization pipeline, use the
command:

.. code-block:: console

  clang file.c -c -O3 -mllvm -polly -mllvm -polly-dump-before-file=before-polly.ll

This writes a file 'before-polly.ll' containing the LLVM-IR as passed to
Polly, after SSA transformation, loop canonicalization, inlining and
other passes.

Thereafter, any Polly pass can be run over 'before-polly.ll' using the
'opt' tool. To find out which Polly passes are active in the standard
pipeline, see the output of

.. code-block:: console

  clang file.c -c -O3 -mllvm -polly -mllvm -debug-pass=Arguments

Polly's passes are those between '-polly-detect' and
'-polly-codegen'. Analysis passes can be omitted. At the time of this
writing, the default Polly pass pipeline is:

.. code-block:: console

  opt before-polly.ll -polly-simplify -polly-optree -polly-delicm -polly-simplify -polly-prune-unprofitable -polly-opt-isl -polly-codegen

Note that this uses LLVM's old/legacy pass manager.

For completeness, here are some other methods that generate IR
suitable for processing with Polly from C/C++/Objective C source code.
The previous method is the recommended one.

The following generates unoptimized LLVM-IR ('-O0', which is the
default) and runs the canonicalizing passes on it
('-polly-canonicalize'). This does *not* include all the passes that run
before Polly in the default pass pipeline. The '-disable-O0-optnone'
option is required because otherwise clang adds an 'optnone' attribute
to all functions, so that they are skipped by most optimization passes.
The attribute is meant to stop LTO builds from optimizing these functions
in the linking phase anyway.

.. code-block:: console

  clang file.c -c -O0 -Xclang -disable-O0-optnone -emit-llvm -S -o - | opt -polly-canonicalize -S

The option '-disable-llvm-passes' disables all LLVM passes, even those
that run at -O0. Passing -O1 (or any optimization level other than -O0)
prevents the 'optnone' attribute from being added.

.. code-block:: console

  clang file.c -c -O1 -Xclang -disable-llvm-passes -emit-llvm -S -o - | opt -polly-canonicalize -S

As another alternative, Polly can be pushed to the front of the pass
pipeline, and its input then dumped. This implicitly runs the
'-polly-canonicalize' passes.

.. code-block:: console

  clang file.c -c -O3 -mllvm -polly -mllvm -polly-position=early -mllvm -polly-dump-before-file=before-polly.ll

Further options
===============
Polly supports further options that are mainly useful for the development or the
analysis of Polly. The relevant options can be added to clang by appending
-mllvm -option-name to the CFLAGS or the clang command line.

Limit Polly to a single function
--------------------------------

To limit the execution of Polly to a single function, use the option
-polly-only-func=functionname.

Disable LLVM-IR generation
--------------------------

Polly normally regenerates LLVM-IR from the polyhedral representation. To see
only the effects of the preparing transformations and disable Polly's code
generation, add the option -polly-no-codegen.

Graphical view of the SCoPs
---------------------------
Polly can use graphviz to show the SCoPs it detects in a program. The relevant
options are -polly-show, -polly-show-only, -polly-dot and -polly-dot-only. The
'show' options automatically run dotty or another graphviz viewer to show the
SCoPs graphically. The 'dot' options store a dot file for each function that
highlights the detected SCoPs. If 'only' is appended at the end of the option,
the basic blocks are shown without the statements they contain.

Change/Disable the Optimizer
----------------------------

By default, Polly uses the isl scheduling optimizer. The isl optimizer optimizes
for data-locality and parallelism using the Pluto algorithm.
To disable the optimizer entirely, use the option -polly-optimizer=none.

Disable tiling in the optimizer
-------------------------------

By default both optimizers perform tiling, if possible. If this is not
wanted, the option -polly-tiling=false can be used to disable it. (This option
disables tiling for both optimizers.)

Import / Export
---------------

The flags -polly-import and -polly-export allow the export and reimport of the
polyhedral representation. By exporting, modifying, and reimporting the
polyhedral representation, externally calculated transformations can be
applied. This enables external optimizers or the manual optimization of
specific SCoPs.

Viewing Polly Diagnostics with opt-viewer
-----------------------------------------

The flag -fsave-optimization-record will generate .opt.yaml files when compiling
your program. These yaml files contain information about each emitted remark.
Ensure that you have Python 2.7 with the PyYAML and Pygments Python packages
installed. To run opt-viewer:

.. code-block:: console

   llvm/tools/opt-viewer/opt-viewer.py -source-dir /path/to/program/src/ \
      /path/to/program/src/foo.opt.yaml \
      /path/to/program/src/bar.opt.yaml \
      -o ./output

Include all yaml files (use \*.opt.yaml when specifying which yaml files to view)
to view all diagnostics from your program in opt-viewer. Compile with `PGO
<https://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation>`_ to view
Hotness information in opt-viewer. The resulting HTML files can be viewed in a
web browser.
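
To try the opt-viewer workflow, a small input file is useful. The following is
a minimal sketch of a hypothetical foo.c, not taken from the Polly sources. It
contains a matrix multiplication with constant loop bounds and affine array
subscripts, which is the kind of loop nest Polly is designed to model. The
names init_arrays, matmul and result_at, as well as the problem size N, are
assumptions made for illustration. Whether Polly actually transforms the
kernel depends on the Polly version and its profitability heuristics.

.. code-block:: c

  #include <stddef.h>

  /* Hypothetical problem size; any compile-time constant works. */
  #define N 64

  /* Static arrays with constant bounds keep every access affine. */
  static double A[N][N], B[N][N], C[N][N];

  /* Fill the inputs with known values and clear the result. */
  void init_arrays(void) {
    for (size_t i = 0; i < N; i++)
      for (size_t j = 0; j < N; j++) {
        A[i][j] = 1.0;
        B[i][j] = 2.0;
        C[i][j] = 0.0;
      }
  }

  /* Plain triple loop: C = C + A * B. */
  void matmul(void) {
    for (size_t i = 0; i < N; i++)
      for (size_t j = 0; j < N; j++)
        for (size_t k = 0; k < N; k++)
          C[i][j] += A[i][k] * B[k][j];
  }

  /* Accessor so that callers can inspect the result. */
  double result_at(size_t i, size_t j) {
    return C[i][j];
  }

Since the sketch has no main function, compile it to an object file, for
example with clang -O3 -mllvm -polly -fsave-optimization-record -c foo.c. This
writes foo.opt.yaml next to the object file, which can then be passed to
opt-viewer as described above.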