============
CMake Primer
============

.. contents::
   :local:

.. warning::
   Disclaimer: This documentation is written by LLVM project contributors,
   *not* by anyone affiliated with the CMake project. This document may contain
   inaccurate terminology, phrasing, or technical details. It is provided with
   the best intentions.


Introduction
============

The LLVM project and many of the core projects built on LLVM build using CMake.
This document aims to provide a brief overview of CMake for developers modifying
LLVM projects or building their own projects on top of LLVM.

The official CMake language reference is available in the ``cmake-language``
manpage and the `cmake-language online documentation
<https://cmake.org/cmake/help/v3.4/manual/cmake-language.7.html>`_.

10,000 ft View
==============

CMake is a tool that reads script files in its own language that describe how a
software project builds. As CMake evaluates the scripts, it constructs an
internal representation of the software project. Once the scripts have been
fully processed, if there are no errors, CMake will generate build files to
actually build the project. CMake supports generating build files for a variety
of command line build tools as well as for popular IDEs.

When a user runs CMake, it performs a variety of checks similar to how autoconf
worked historically. During the checks and the evaluation of the build
description scripts, CMake caches values into the CMakeCache. This is useful
because it allows the build system to skip long-running checks during
incremental development. CMake caching also has some drawbacks, but those will
be discussed later.
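
As a minimal sketch of how a check result ends up in the cache, the snippet
below uses CMake's ``CheckIncludeFile`` module. The project name, the header,
and the ``HAVE_UNISTD_H`` result variable are illustrative choices, not
necessarily what LLVM's own build does:

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.15)
   project(CacheDemo C)

   include(CheckIncludeFile)

   # The first configure run compiles a small test program and stores the
   # result in CMakeCache.txt as HAVE_UNISTD_H. Later runs reuse the cached
   # value instead of repeating the check.
   check_include_file(unistd.h HAVE_UNISTD_H)

   if(HAVE_UNISTD_H)
     message(STATUS "unistd.h is available")
   endif()

Removing the cache entry, or the whole build directory, forces the check to run
again.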

Scripting Overview
==================

CMake's scripting language has a very simple grammar. Every language construct
is a command that matches the pattern ``name(args)``. Commands come in three
primary types: language-defined (commands implemented in C++ in CMake), defined
functions, and defined macros. The CMake distribution also contains a suite of
CMake modules that contain definitions for useful functionality.

The example below is the full CMake build for building a C++ "Hello World"
program. The example uses only CMake language-defined commands.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.15)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)

The CMake language provides control flow constructs in the form of foreach loops
and if blocks. To make the example above more complicated you could add an if
block to define "APPLE" when targeting Apple platforms:

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.15)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)
   if(APPLE)
     target_compile_definitions(HelloWorld PUBLIC APPLE)
   endif()

Variables, Types, and Scope
===========================

Dereferencing
-------------

In CMake, variables are "stringly" typed. All variables are represented as
strings throughout evaluation. Wrapping a variable in ``${}`` dereferences it
and results in a literal substitution of the value for the name. CMake refers to
this as "variable evaluation" in its documentation. Dereferences are performed
*before* the command being called receives the arguments. This means
dereferencing a list results in multiple separate arguments being passed to the
command.
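
A small sketch of this behavior, using a hypothetical ``sources`` variable.
Unquoted, the list expands into separate arguments; quoted, it stays a single
argument:

.. code-block:: cmake

   set(sources a.cpp b.cpp)

   # Expands to message(a.cpp b.cpp): two arguments, which message()
   # concatenates without a separator.
   message(${sources})    # prints: a.cppb.cpp

   # Expands to message("a.cpp;b.cpp"): one argument holding the whole list.
   message("${sources}")  # prints: a.cpp;b.cpp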

Variable dereferences can be nested and be used to model complex data. For
example:

.. code-block:: cmake

   set(var_name var1)
   set(${var_name} foo) # same as "set(var1 foo)"
   set(${${var_name}}_var bar) # same as "set(foo_var bar)"

Dereferencing an unset variable results in an empty expansion. It is a common
pattern in CMake to set variables conditionally, knowing that they will also be
used in code paths where the variable isn't set. There are examples of this
throughout the LLVM CMake build system.

An example of variable empty expansion is:

.. code-block:: cmake

   if(APPLE)
     set(extra_sources Apple.cpp)
   endif()
   add_executable(HelloWorld HelloWorld.cpp ${extra_sources})

In this example the ``extra_sources`` variable is only defined if you're
targeting an Apple platform. For all other targets ``extra_sources`` will be
evaluated as empty before ``add_executable`` is given its arguments.

Lists
-----

In CMake, lists are semi-colon delimited strings, and it is strongly advised
that you avoid using semi-colons in list elements; it doesn't go smoothly. A few
examples of defining lists:

.. code-block:: cmake

   # Both forms create a list with members a, b, c, and d
   set(my_list a b c d)
   set(my_list "a;b;c;d")

   # Creates a string "a b c d"
   set(my_string "a b c d")
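
Lists are usually manipulated with the ``list`` command rather than by editing
the underlying string directly. A brief sketch, with illustrative variable
names:

.. code-block:: cmake

   set(my_list a b c)
   list(APPEND my_list d)       # my_list is now "a;b;c;d"
   list(LENGTH my_list count)   # count is 4
   list(GET my_list 0 first)    # first is "a"
   message("${count} elements, first is ${first}")
   # prints: 4 elements, first is a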

Lists of Lists
--------------

One of the more complicated patterns in CMake is lists of lists. Because a list
cannot contain an element with a semi-colon, to construct a list of lists you
make a list of variable names that refer to other lists. For example:

.. code-block:: cmake

   set(list_of_lists a b c)
   set(a 1 2 3)
   set(b 4 5 6)
   set(c 7 8 9)

With this layout you can iterate through the list of lists printing each value
with the following code:

.. code-block:: cmake

   foreach(list_name IN LISTS list_of_lists)
     foreach(value IN LISTS ${list_name})
       message(${value})
     endforeach()
   endforeach()

You'll notice that the inner foreach loop's list is doubly dereferenced. This is
because the first dereference turns ``list_name`` into the name of the sub-list
(a, b, or c in the example), then the second dereference is to get the value of
the list.

This pattern is used throughout CMake. The most common example is the compiler
flags options, which CMake refers to using the following variable expansions:
``CMAKE_${LANGUAGE}_FLAGS`` and ``CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}``.
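
As a rough illustration of that naming scheme, the loop below builds variable
names from a hypothetical ``lang`` loop variable and prints the flags for each
language:

.. code-block:: cmake

   foreach(lang IN ITEMS C CXX)
     # The inner ${lang} is substituted first, producing CMAKE_C_FLAGS or
     # CMAKE_CXX_FLAGS, and the outer dereference then reads that variable.
     message(STATUS "${lang} flags: ${CMAKE_${lang}_FLAGS}")
   endforeach()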

Other Types
-----------

Variables that are cached or specified on the command line can have types
associated with them. The variable's type is used by CMake's UI tool to display
the right input field. A variable's type generally doesn't impact evaluation;
however, CMake does have special handling for some types such as ``PATH``.
You can read more about the special handling in `CMake's set documentation
<https://cmake.org/cmake/help/v3.5/command/set.html#set-cache-entry>`_.
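
For illustration, the cache entries below declare different types. The
``HELLO_*`` names are hypothetical and exist only for this sketch:

.. code-block:: cmake

   # STRING entries are shown as a free-form text field.
   set(HELLO_GREETING "Hello" CACHE STRING "Greeting printed by the program")

   # PATH entries are shown with a directory chooser.
   set(HELLO_DATA_DIR "" CACHE PATH "Directory containing data files")

   # option() creates a BOOL entry, shown as a checkbox.
   option(HELLO_ENABLE_EXTRAS "Build the optional extras" OFF)

A user could then override these on the command line, for example with
``-DHELLO_GREETING:STRING=Hi``.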

Scope
-----

CMake inherently has directory-based scoping. Setting a variable in a
CMakeLists file will set the variable for that file and all subdirectories.
Variables set in a CMake module that is included in a CMakeLists file will be
set in the scope they are included from, and all subdirectories.

When a variable that is already set is set again in a subdirectory, it overrides
the value in that scope and any deeper subdirectories.

The CMake ``set`` command provides two scope-related options. ``PARENT_SCOPE``
sets a variable into the parent scope, and not the current scope. The ``CACHE``
option sets the variable in the CMakeCache, which results in it being set in all
scopes. The ``CACHE`` option will not set a variable that already exists in the
cache unless the ``FORCE`` option is specified.

In addition to directory-based scope, CMake functions also have their own scope.
This means variables set inside functions do not bleed into the parent scope.
This is not true of macros, and it is for this reason LLVM prefers functions
over macros whenever reasonable.

.. note::
  Unlike C-based languages, CMake's loop and control flow blocks do not have
  their own scopes.
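
The function scoping rules can be sketched as follows. The function and
variable names are made up for this example:

.. code-block:: cmake

   function(set_locally)
     set(result "from function")               # visible only inside the function
   endfunction()

   function(set_in_caller)
     set(result "from function" PARENT_SCOPE)  # sets the caller's variable
   endfunction()

   set(result "original")
   set_locally()
   message("${result}")   # prints: original
   set_in_caller()
   message("${result}")   # prints: from function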

Control Flow
============

CMake features the same basic control flow constructs you would expect in any
scripting language, but there are a few quirks because, as with everything in
CMake, control flow constructs are commands.

If, ElseIf, Else
----------------

.. note::
  For the full documentation on the CMake if command, see the
  `if command reference <https://cmake.org/cmake/help/v3.4/command/if.html>`_.
  That resource is far more complete.

In general CMake if blocks work the way you'd expect:

.. code-block:: cmake

  if(<condition>)
    message("do stuff")
  elseif(<condition>)
    message("do other stuff")
  else()
    message("do other other stuff")
  endif()

The single most important thing to know about CMake's if blocks coming from a C
background is that they do not have their own scope. Variables set inside
conditional blocks persist after the ``endif()``.
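
A minimal sketch of this behavior (``inside_if`` is an illustrative name):

.. code-block:: cmake

   if(TRUE)
     set(inside_if "still set")
   endif()
   message("${inside_if}")   # prints: still set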

Loops
-----

The most common form of the CMake ``foreach`` block is:

.. code-block:: cmake

  foreach(var ...)
    message("do stuff")
  endforeach()

The variable argument portion of the ``foreach`` block can contain dereferenced
lists, values to iterate, or a mix of both:

.. code-block:: cmake

  foreach(var foo bar baz)
    message(${var})
  endforeach()
  # prints:
  #  foo
  #  bar
  #  baz

  set(my_list 1 2 3)
  foreach(var ${my_list})
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3

  foreach(var ${my_list} out_of_bounds)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3
  #  out_of_bounds

There is also a more modern CMake foreach syntax. The code below is equivalent
to the code above:

.. code-block:: cmake

  foreach(var IN ITEMS foo bar baz)
    message(${var})
  endforeach()
  # prints:
  #  foo
  #  bar
  #  baz

  set(my_list 1 2 3)
  foreach(var IN LISTS my_list)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3

  foreach(var IN LISTS my_list ITEMS out_of_bounds)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3
  #  out_of_bounds

Similar to the conditional statements, these generally behave how you would
expect, and they do not have their own scope.

CMake also supports ``while`` loops, although they are not widely used in LLVM.
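
For completeness, a small ``while`` loop might look like the sketch below. The
counter variable is illustrative:

.. code-block:: cmake

   set(counter 0)
   while(counter LESS 3)
     message("${counter}")
     math(EXPR counter "${counter} + 1")
   endwhile()
   # prints:
   #  0
   #  1
   #  2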

Modules, Functions and Macros
=============================

Modules
-------

Modules are CMake's vehicle for enabling code reuse. CMake modules are just
CMake script files. They can contain code to execute on include as well as
definitions for commands.

In CMake, macros and functions are universally referred to as commands, and they
are the primary method of defining code that can be called multiple times.

In LLVM we have several CMake modules that are included as part of our
distribution for developers who don't build our project from source. Those
modules are the fundamental pieces needed to build LLVM-based projects with
CMake. We also rely on modules as a way of organizing the build system's
functionality for maintainability and re-use within LLVM projects.
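
A sketch of how a project might load one of its own modules. The ``cmake``
directory and the ``MyProjectHelpers`` module name are assumptions made for this
example:

.. code-block:: cmake

   # Let include() find modules in <source dir>/cmake in addition to the
   # modules shipped with CMake.
   list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake")

   # Runs cmake/MyProjectHelpers.cmake in the current scope, making any
   # functions and macros it defines available from here on.
   include(MyProjectHelpers)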

Argument Handling
-----------------

When defining a CMake command, handling arguments is very useful. The examples
in this section will all use the CMake ``function`` block, but this all applies
to the ``macro`` block as well.

CMake commands can have named arguments that are required at every call site. In
addition, all commands will implicitly accept a variable number of extra
arguments (in C parlance, all commands are varargs functions). When a command is
invoked with extra arguments (beyond the named ones), CMake will store the full
list of arguments (both named and unnamed) in a list named ``ARGV``, and the
sublist of unnamed arguments in ``ARGN``. Below is a trivial example of
providing a wrapper function for CMake's built-in command ``add_dependencies``.

.. code-block:: cmake

   function(add_deps target)
     add_dependencies(${target} ${ARGN})
   endfunction()

This example defines a new function named ``add_deps`` which takes a required
first argument, and just calls another command, passing through the first
argument and all trailing arguments.

CMake provides a module ``CMakeParseArguments`` which provides an implementation
of advanced argument parsing. We use this all over LLVM, and it is recommended
for any function that has complex argument-based behaviors or optional
arguments. CMake's official documentation for the module is in the
``cmake-modules`` manpage, and is also available at the
`cmake-modules online documentation
<https://cmake.org/cmake/help/v3.4/module/CMakeParseArguments.html>`_.

.. note::
  As of CMake 3.5 the ``cmake_parse_arguments`` command has become a native
  command and the ``CMakeParseArguments`` module is empty and only left around
  for compatibility.
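
As a hedged sketch of how ``cmake_parse_arguments`` is typically used, the
hypothetical ``add_example_tool`` function below accepts one option, one
single-value keyword, and one multi-value keyword. None of these names come from
LLVM, and the sketch assumes CMake 3.5 or newer, where the command is native:

.. code-block:: cmake

   function(add_example_tool name)
     cmake_parse_arguments(ARG
       "INSTALL"       # options (boolean flags)
       "OUTPUT_NAME"   # keywords taking one value
       "SOURCES"       # keywords taking multiple values
       ${ARGN})

     add_executable(${name} ${ARG_SOURCES})
     if(ARG_OUTPUT_NAME)
       set_target_properties(${name} PROPERTIES OUTPUT_NAME ${ARG_OUTPUT_NAME})
     endif()
     if(ARG_INSTALL)
       install(TARGETS ${name} RUNTIME DESTINATION bin)
     endif()
   endfunction()

   add_example_tool(hello SOURCES main.cpp util.cpp OUTPUT_NAME hello-tool INSTALL)

Each parsed keyword is stored in a variable named after the prefix, so
``SOURCES`` becomes ``ARG_SOURCES`` inside the function.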

Functions Vs Macros
-------------------

Functions and macros look very similar in how they are used, but there is one
fundamental difference between the two. Functions have their own scope, and
macros don't. This means variables set in macros will bleed out into the calling
scope. That makes macros suitable for defining very small bits of functionality
only.

The other difference between CMake functions and macros is how arguments are
passed. Arguments to macros are not set as variables; instead, dereferences to
the parameters are resolved across the macro before executing it. This can
result in some unexpected behavior when a parameter name is used without being
dereferenced. For example:

.. code-block:: cmake

   macro(print_list my_list)
     foreach(var IN LISTS my_list)
       message("${var}")
     endforeach()
   endmacro()

   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list(my_list_of_numbers)
   # prints:
   # a
   # b
   # c
   # d

Because the macro parameter ``my_list`` is not a variable, ``IN LISTS my_list``
reads the caller's ``my_list`` variable rather than the argument. Generally
speaking this issue is uncommon because it requires using non-dereferenced
variables with names that overlap in the parent scope, but it is important to be
aware of because it can lead to subtle bugs.
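
For comparison, here is a sketch of the same helper written as a function.
Because function parameters are real variables, the list name can be
dereferenced explicitly (``list_name`` is an illustrative parameter name):

.. code-block:: cmake

   function(print_list list_name)
     # ${list_name} expands to the name passed by the caller, and
     # IN LISTS then reads the list stored under that name.
     foreach(var IN LISTS ${list_name})
       message("${var}")
     endforeach()
   endfunction()

   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list(my_list_of_numbers)
   # prints:
   # 1
   # 2
   # 3
   # 4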

LLVM Project Wrappers
=====================

LLVM projects provide lots of wrappers around critical CMake built-in commands.
We use these wrappers to provide consistent behaviors across LLVM components
and to reduce code duplication.

We generally (but not always) follow the convention that commands prefaced with
``llvm_`` are intended to be used only as building blocks for other commands.
Wrapper commands that are intended for direct use are generally named with the
project in the middle of the command name (e.g. ``add_llvm_executable`` is the
wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are all
defined in ``AddLLVM.cmake``, which is installed as part of the LLVM
distribution. It can be included and used by any LLVM sub-project that requires
LLVM.

.. note::

   Not all LLVM projects require LLVM for all use cases. For example compiler-rt
   can be built without LLVM, and the compiler-rt sanitizer libraries are used
   with GCC.
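
A rough sketch of how an out-of-tree project might pick up these wrappers,
assuming an installed LLVM that ``find_package`` can locate. The project and
source file names are placeholders, and a real project will likely need
additional settings:

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.15)
   project(MyTool)

   find_package(LLVM REQUIRED CONFIG)

   # LLVM_CMAKE_DIR is provided by LLVM's package configuration and points at
   # the directory containing the installed LLVM CMake modules.
   list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
   include(AddLLVM)

   include_directories(${LLVM_INCLUDE_DIRS})

   # Wrapper around add_executable() defined in AddLLVM.cmake.
   add_llvm_executable(mytool main.cpp)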

Useful Built-in Commands
========================

CMake has a bunch of useful built-in commands. This document isn't going to
go into details about them because the CMake project has excellent
documentation. To highlight a few useful commands, see:

* `add_custom_command <https://cmake.org/cmake/help/v3.4/command/add_custom_command.html>`_
* `add_custom_target <https://cmake.org/cmake/help/v3.4/command/add_custom_target.html>`_
* `file <https://cmake.org/cmake/help/v3.4/command/file.html>`_
* `list <https://cmake.org/cmake/help/v3.4/command/list.html>`_
* `math <https://cmake.org/cmake/help/v3.4/command/math.html>`_
* `string <https://cmake.org/cmake/help/v3.4/command/string.html>`_

The full documentation for CMake commands is in the ``cmake-commands`` manpage
and available on `CMake's website <https://cmake.org/cmake/help/v3.4/manual/cmake-commands.7.html>`_.
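
A few of these commands in action, as a brief sketch with illustrative variable
names:

.. code-block:: cmake

   string(TOUPPER "hello" shouted)    # shouted is "HELLO"
   math(EXPR total "2 + 3 * 4")       # total is 14
   set(letters c a b)
   list(SORT letters)                 # letters is "a;b;c"
   message("${shouted} ${total} ${letters}")
   # prints: HELLO 14 a;b;c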