=============================================================
How To Build Clang and LLVM with Profile-Guided Optimizations
=============================================================

Introduction
============

PGO (Profile-Guided Optimization) allows your compiler to better optimize code
for how it actually runs. Users report that applying this to Clang and LLVM can
decrease overall compile time by 20%.

This guide walks you through how to build Clang with PGO, though it also applies
to other subprojects, such as LLD.

If you want to build other software with PGO, see the `end-user documentation
for PGO <https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization>`_.


Using preconfigured CMake caches
================================

See https://llvm.org/docs/AdvancedBuilds.html#multi-stage-pgo

Using the script
================

We have a script at ``utils/collect_and_build_with_pgo.py``. This script is
tested on a few Linux flavors, and requires a checkout of LLVM, Clang, and
compiler-rt. Despite the name, it performs four clean builds of Clang, so it
can take a while to run to completion. Please see the script's ``--help`` for
more information on how to run it, and the different options available to you.
If you want to get the most out of PGO for a particular use-case (e.g. compiling
a specific large piece of software), please do read the section below on
'benchmark' selection.

Please note that this script is only tested on a few Linux distros. Patches to
add support for other platforms, as always, are highly appreciated. :)

This script also supports a ``--dry-run`` option, which causes it to print
important commands instead of running them.


Selecting 'benchmarks'
======================

PGO does best when the profiles gathered represent how the user plans to use the
compiler. Notably, highly accurate profiles of llc building x86_64 code aren't
incredibly helpful if you're going to be targeting ARM.

By default, the script above does two things to get solid coverage. It:

- runs all of Clang and LLVM's lit tests, and
- uses the instrumented Clang to build Clang, LLVM, and all of the other
  LLVM subprojects available to it.

Together, these should give you:

- solid coverage of building C++,
- good coverage of building C,
- great coverage of running optimizations,
- great coverage of the backend for your host's architecture, and
- some coverage of other architectures (if other arches are supported backends).

Altogether, this should cover a diverse set of uses for Clang and LLVM. If you
have very specific needs (e.g. your compiler is meant to compile a large browser
for four different platforms, or similar), you may want to do something else.
This is configurable in the script itself.


Building Clang with PGO
=======================

If you prefer to not use the script or the cmake cache, this briefly goes over
how to build Clang/LLVM with PGO.

First, you should have at least LLVM, Clang, and compiler-rt checked out
locally.

Next, at a high level, you're going to need to do the following:

1. Build a standard Release Clang and the relevant libclang_rt.profile library
2. Build Clang using the Clang you built above, but with instrumentation
3. Use the instrumented Clang to generate profiles, which consists of two steps:

   - Running the instrumented Clang/LLVM/lld/etc. on tasks that represent how
     users will use said tools.
   - Using a tool to convert the "raw" profiles generated above into a single,
     final PGO profile.

4. Build a final release Clang (along with whatever other binaries you need)
   using the profile collected from your benchmark

In more detailed steps:

1. Configure a Clang build as you normally would. It's highly recommended that
   you use the Release configuration for this, since it will be used to build
   another Clang. Because you need Clang and supporting libraries, you'll want
   to build the ``all`` target (e.g. ``ninja all`` or ``make -j4 all``).

2. Configure a Clang build as above, but add the following CMake args:

   - ``-DLLVM_BUILD_INSTRUMENTED=IR`` -- This causes us to build everything
     with instrumentation.
   - ``-DLLVM_BUILD_RUNTIME=No`` -- A few projects have bad interactions when
     built with profiling, and aren't necessary to build. This flag turns them
     off.
   - ``-DCMAKE_C_COMPILER=/path/to/stage1/clang`` -- Use the Clang we built in
     step 1.
   - ``-DCMAKE_CXX_COMPILER=/path/to/stage1/clang++`` -- Same as above.

   In this build directory, you simply need to build the ``clang`` target (and
   whatever supporting tooling your benchmark requires).

3. As mentioned above, this has two steps: gathering profile data, and then
   massaging it into a useful form:

   a. Build your benchmark using the Clang generated in step 2. The 'standard'
      benchmark recommended is to run ``check-clang`` and ``check-llvm`` in your
      instrumented Clang's build directory, and to do a full build of Clang/LLVM
      using your instrumented Clang. So, create yet another build directory,
      with the following CMake arguments:

      - ``-DCMAKE_C_COMPILER=/path/to/stage2/clang`` -- Use the Clang we built in
        step 2.
      - ``-DCMAKE_CXX_COMPILER=/path/to/stage2/clang++`` -- Same as above.

      If your users are fans of debug info, you may want to consider using
      ``-DCMAKE_BUILD_TYPE=RelWithDebInfo`` instead of
      ``-DCMAKE_BUILD_TYPE=Release``. This will grant better coverage of
      debug info pieces of clang, but will take longer to complete and will
      result in a much larger build directory.

      It's recommended to build the ``all`` target with your instrumented Clang,
      since more coverage is often better.

   b. You should now have a few ``*.profraw`` files in
      ``path/to/stage2/profiles/``. You need to merge these using
      ``llvm-profdata`` (even if you only have one! The profile merge transforms
      profraw into actual profile data, as well). This can be done with
      ``/path/to/stage1/llvm-profdata merge
      -output=/path/to/output/profdata.prof path/to/stage2/profiles/*.profraw``.

4. Now, build your final, PGO-optimized Clang. To do this, you'll want to pass
   the following additional arguments to CMake.

   - ``-DLLVM_PROFDATA_FILE=/path/to/output/profdata.prof`` -- Use the PGO
     profile from the previous step.
   - ``-DCMAKE_C_COMPILER=/path/to/stage1/clang`` -- Use the Clang we built in
     step 1.
   - ``-DCMAKE_CXX_COMPILER=/path/to/stage1/clang++`` -- Same as above.

   From here, you can build whatever targets you need.

   .. note::
      You may see warnings about a mismatched profile in the build output. These
      are generally harmless. To silence them, you can add
      ``-DCMAKE_C_FLAGS='-Wno-backend-plugin'
      -DCMAKE_CXX_FLAGS='-Wno-backend-plugin'`` to your CMake invocation.


Congrats! You now have a Clang built with profile-guided optimizations, and you
can delete all but the final build directory if you'd like.

If this worked well for you and you plan on doing it often, there's a slight
optimization that can be made: LLVM and Clang have a tool called tblgen that's
built and run during the build process. While it's potentially nice to build
this for coverage as part of step 3, none of your other builds should benefit
from building it. You can pass the CMake options
``-DCLANG_TABLEGEN=/path/to/stage1/bin/clang-tblgen
-DLLVM_TABLEGEN=/path/to/stage1/bin/llvm-tblgen`` to steps 2 and onward to avoid
these useless rebuilds.
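
To make the manual steps above more concrete, here is a rough shell sketch that
strings the four builds together. It is only an illustration and is not taken
from ``utils/collect_and_build_with_pgo.py``. The directory layout (``SRC``,
``BUILD`` and the ``stage1``, ``stage2``, ``benchmark`` and ``final``
subdirectories), the use of Ninja, the ``-DLLVM_ENABLE_PROJECTS`` value, the
``bin/`` location of the built tools, and the ``run``, ``configure`` and
``pgo_build`` helpers with their ``DRY_RUN`` switch are all assumptions made
for this example. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the manual PGO flow. All paths and helper names are hypothetical.
set -eu

SRC="${SRC:-$PWD/llvm-project/llvm}"  # assumed location of the llvm/ source dir
BUILD="${BUILD:-$PWD/build}"          # assumed parent of all build directories
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print commands, 0 = run them
# Assumed project list; newer LLVM releases may expect compiler-rt to be
# enabled through LLVM_ENABLE_RUNTIMES instead.
PROJECTS="${PROJECTS:-clang;compiler-rt}"

# Print a command, then execute it unless DRY_RUN=1.
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" != 1 ]; then "$@"; fi
}

# configure <build-dir> [extra CMake args...]
configure() {
  dir="$1"; shift
  run cmake -G Ninja -S "$SRC" -B "$dir" -DCMAKE_BUILD_TYPE=Release \
    "-DLLVM_ENABLE_PROJECTS=$PROJECTS" "$@"
}

pgo_build() {
  s1="$BUILD/stage1"; s2="$BUILD/stage2"
  bench="$BUILD/benchmark"; final="$BUILD/final"
  prof="$BUILD/profdata.prof"

  # Step 1: plain Release Clang plus supporting libraries.
  configure "$s1"
  run ninja -C "$s1" all

  # Step 2: instrumented Clang, compiled with the stage 1 Clang.
  configure "$s2" -DLLVM_BUILD_INSTRUMENTED=IR -DLLVM_BUILD_RUNTIME=No \
    "-DCMAKE_C_COMPILER=$s1/bin/clang" "-DCMAKE_CXX_COMPILER=$s1/bin/clang++"
  run ninja -C "$s2" clang

  # Step 3a: exercise the instrumented Clang (lit tests, then a full build).
  run ninja -C "$s2" check-clang check-llvm
  configure "$bench" \
    "-DCMAKE_C_COMPILER=$s2/bin/clang" "-DCMAKE_CXX_COMPILER=$s2/bin/clang++"
  run ninja -C "$bench" all

  # Step 3b: merge the raw profiles into a single profile file.
  run "$s1/bin/llvm-profdata" merge "-output=$prof" "$s2"/profiles/*.profraw

  # Step 4: final build with the stage 1 Clang, guided by the merged profile.
  configure "$final" "-DLLVM_PROFDATA_FILE=$prof" \
    "-DCMAKE_C_COMPILER=$s1/bin/clang" "-DCMAKE_CXX_COMPILER=$s1/bin/clang++"
  run ninja -C "$final" clang
}

pgo_build
```

Running the sketch as-is prints each command prefixed with ``+``, which makes it
easy to check the paths before committing to four long builds. Setting
``DRY_RUN=0`` in the environment would execute the commands instead.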