1Kati internals
2==============
3
4This is an informal document about internals of kati. This document is not meant
5to be a comprehensive document of kati or GNU make. This explains some random
6topics which other programmers may be interested in.
7
8Motivation
9----------
10
11The motivation of kati was to speed up Android platform build. Especially, its
12incremental build time was the main focus. Android platform's build system is a
13very unique system. It provides a DSL, (ab)using Turing-completeness of GNU
14make. The DSL allows developers to write build rules in a descriptive way, but
15the downside is it's complicated and slow.
16
17When we say a build system is slow, we consider "null build" and "full
18build". Null build is a build which does nothing, because all output files are
19already up-to-date. Full build is a build which builds everything, because there
20were nothing which have been already built. Actual builds in daily development
21are somewhere between null build and full build. Most benchmarks below were done
22for null build.
23
24For Android with my fairly beefy workstation, null build took ~100 secs with GNU
25make. This means you needed to wait ~100 secs to see if there's a compile error
26when you changed a single C file. To be fair, things were not that bad. There
27are tools called mm/mmm. They allow developers to build an individual module. As
28they ignore dependencies between modules, they are fast. However, you need to be
29somewhat experienced to use them properly. You should know which modules will be
30affected by your change. It would be nicer if you can just type "make" whenever
31you change something.
32
33This is why we started this project. We decided to create a GNU make clone from
34scratch, but there were some other options. One option was to replace all
35Android.mk by files with a better format. There is actually a longer-term
36project for this. Kati was planned to be a short-term project. Another option
37was to hack GNU make instead of developing a clone. We didn't take this option
38because we thought the source code of GNU make is somewhat complicated due to
39historical reason. It's written in old-style C, has a lot of ifdefs for some
40unknown architectures, etc.
41
42Currently, kati's main mode is --ninja mode. Instead of executing build commands
43by itself, kati generates build.ninja file and
44[ninja](https://github.com/martine/ninja) actually runs commands. There were
45some back-and-forths before kati became the current form. Some experiments
46succeeded and some others failed. We even changed the language for kati. At
47first, we wrote kati in Go. We naively expected we can get enough performance
48with Go. I guessed at least one of the following statements are true: 1. GNU
49make is not very optimized for computation heavy Makefiles, 2. Go is fast for
50our purpose, or 3. we can come up with some optimization tricks for Android's
51build system. As for 3, some of such optimization succeeded but it's performance
52gain didn't cancel the slowness of Go.
53
54Go's performance would be somewhat interesting topic. I didn't study the
55performance difference in detail, but it seemed both our use of Go and Go
56language itself were making the Go version of kati slower. As for our fault, I
57think Go version has more unnecessary string allocations than C++ version
58has. As for Go itself, it seemed GC was the main show-stopper. For example,
59Android's build system defines about one million make variables, and buffers for
60them will be never freed. IIRC, this kind of allocation pattern isn't good for
61non-generational GC.
62
63Go version and test cases were written by ukai and me, and C++ rewrite was done
64mostly by me. The rest of this document is mostly about the C++ version.
65
66Overall architecture
67--------------------
68
69Kati consists of the following components:
70
71* Parser
72* Evaluator
73* Dependency builder
74* Executor
75* Ninja generator
76
77A Makefile has some statements which consist of zero or more expressions. There
78are two parsers and two evaluators - one for statements and the other for
79expressions.
80
81Most of users of GNU make may not care about the evaluator much. However, GNU
82make's evaluator is very powerful and is Turing-complete. For Android's null
83build, most time is spent in this phase. Other tasks, such as building
84dependency graphs and calling stat function for build targets, are not the
85bottleneck. This would be a very Android specific characteristics. Android's
86build system uses a lot of GNU make black magics.
87
88The evaluator outputs a list of build rules and a variable table. The dependency
89builder creates a dependency graph from the list of build rules. Note this step
90doesn't use the variable table.
91
92Then either executor or ninja generator will be used. Either way, kati runs its
93evaluator again for command lines. The variable table is used again for this
94step.
95
96We'll look at each components closely. GNU make is a somewhat different language
97from modern languages. Let's see.
98
99Parser for statements
100---------------------
101
102I'm not 100% sure, but I think GNU make parses and evaluates Makefiles
103simultaneously, but kati has two phases for parsing and evaluation. The reason
104of this design is for performance. For Android build, kati (or GNU make) needs
105to read ~3k files ~50k times. The file which is read most often is read ~5k
106times. It's waste of time to parse such files again and again. Kati can re-use
107parsed results when it needs to evaluate a Makefile second time. If we stop
108caching the parsed results, kati will be two times slower for Android's
109build. Caching parsed statements is done in *file_cache.cc*.
110
111The statement parser is defined in *parser.cc*. In kati, there are four kinds of
112statements:
113
114* Rules
115* Assignments
116* Commands
117* Make directives
118
119Data structures for them are defined in *stmt.h*. Here are examples of these
120statements:
121
122    VAR := yay!      # An assignment
123    all:             # A rule
124    	echo $(VAR)  # A command
125    include xxx.mk   # A make directive (include)
126
127In addition to include directive, there are ifeq/ifneq/ifdef/ifndef directives
128and export/unexport directives. Also, kati internally uses "parse error
129statement". As GNU make doesn't show parse errors in branches which are not
130taken, we need to delay parse errors to evaluation time.
131
132### Context dependent parser
133
134A tricky point of parsing make statements is that the parsing depends on the
135context of the evaluation. See the following Makefile chunk for example:
136
137    $(VAR)
138    	X=hoge echo $${X}
139
140You cannot tell whether the second line is a command or an assignment until
141*$(VAR)* is evaluated. If *$(VAR)* is a rule statement, the second line is a
142command and otherwise it's an assignment. If the previous line is
143
144    VAR := target:
145
146the second line will turn out to be a command.
147
148For some reason, GNU make expands expressions before it decides the type of
149a statement only for rules. Storing assignments or directives in a variable
150won't work as assignments or directives. For example
151
152    ASSIGN := A=B
153    $(ASSIGN):
154
155doesn't assign "*B:*" to *A*, but defines a build rule whose target is *A=B*.
156
157Anyway, as a line starts with a tab character can be either a command statement
158or other statements depending on the evaluation result of the previous line,
159sometimes kati's parser cannot tell the statement type of a line. In this case,
160kati's parser speculatively creates a command statement object, keeping the
161original line. If it turns out the line is actually not a command statement,
162the evaluator re-runs the parser.
163
164### Line concatenations and comments
165
166In most programming languages, line concatenations by a backslash character and
167comments are handled at a very early stage of a language
168implementation. However, GNU make changes the behavior for them depending on
169parse/eval context. For example, the following Makefile outputs "has space" and
170"hasnospace":
171
172    VAR := has\
173    space
174    all:
175    	echo $(VAR)
176    	echo has\
177    nospace
178
179GNU make usually inserts a whitespace between lines, but for command lines it
180doesn't. As we've seen in the previous subsection, sometimes kati cannot tell
181a line is a command statement or not. This means we should handle them after
182evaluating statements. Similar discussion applies for comments. GNU make usually
183trims characters after '#', but it does nothing for '#' in command lines.
184
185We have a bunch of comment/backslash related testcases in the testcase directory
186of kati's repository.
187
188Parser for expressions
189----------------------
190
191A statement may have one or more expressions. The number of expressions in a
192statement depends on the statement's type. For example,
193
194    A := $(X)
195
196This is an assignment statement, which has two expressions - *A* and
197*$(X)*. Types of expressions and their parser are defined in *expr.cc*. Like
198other programming languages, an expression is a tree of expressions. The type of
199a leaf expression is either literal, variable reference,
200[substitution references](http://www.gnu.org/software/make/manual/make.html#Substitution-Refs),
201or make functions.
202
203As written, backslashes and comments change their behavior depending on the
204context. Kati handles them in this phase. *ParseExprOpt* is the enum for the
205contexts.
206
207As a nature of old systems, GNU make is very permissive. For some reason, it
208allows some kind of unmatched pairs of parentheses. For example, GNU make
209doesn't think *$($(foo)* is an error - this is a reference to variable
210*$(foo*. If you have some experiences with parsers, you may wonder how one can
211implement a parser which allows such expressions. It seems GNU make
212intentionally allows this:
213
214http://git.savannah.gnu.org/cgit/make.git/tree/expand.c#n285
215
216No one won't use this feature intentionally. However, as GNU make allows this,
217some Makefiles have unmatched parentheses, so kati shouldn't raise an error for
218them, unfortunately.
219
220GNU make has a bunch of functions. Most users would use only simple ones such as
221*$(wildcard ...)* and *$(subst ...)*. There are also more complex functions such
222as *$(if ...)* and *$(call ...)*, which make GNU make Turing-complete. Make
223functions are defined in *func.cc*. Though *func.cc* is not short, the
224implementation is fairly simple. There is only one weirdness I remember around
225functions. GNU make slightly changes its parsing for *$(if ...)*, *$(and ...)*,
226and *$(or ...)*. See *trim_space* and *trim_right_space_1st* in *func.h* and how
227they are used in *expr.cc*.
228
229Evaluator for statements
230------------------------
231
232Evaluator for statements are defined in *eval.cc*. As written, there are four
233kinds of statements:
234
235* Rules
236* Assignments
237* Commands
238* Make directives
239
240There is nothing tricky around commands and make directives. A rule statement
241have some forms and should be parsed after evaluating expression by the third
242parser. This will be discussed in the next section.
243
244Assignments in GNU make is tricky a bit. There are two kinds of variables in GNU
245make - simple variables and recursive variables. See the following code snippet:
246
247    A = $(info world!)   # recursive
248    B := $(info Hello,)  # simple
249    $(A)
250    $(B)
251
252This code outputs "Hello," and "world!", in this order. The evaluation of
253a recursive variable is delayed until the variable is referenced. So the first
254line, which is an assignment of a recursive variable, outputs nothing. The
255content of the variable *$(A)* will be *$(info world!)* after the first
256line. The assignment in the second line uses *:=* which means this is a simple
257variable assignment. For simple variables, the right hand side is evaluated
258immediately. So "Hello," will be output and the value of *$(B)* will be an empty
259string ($(info ...) returns an empty string). Then, "world!" will be shown when
260the third line is evaluated as *$(A)* is evaluated, and lastly the forth line
261does nothing, as *$(B)* is an empty string.
262
263There are two more kinds of assignments (i.e., *+=* and *?=*). These assignments
264keep the type of the original variable. Evaluation of them will be done
265immediately only when the left hand side of the assignment is already defined
266and is a simple variable.
267
268Parser for rules
269----------------
270
271After evaluating a rule statement, kati needs to parse the evaluated result. A
272rule statement can actually be the following four things:
273
274* A rule
275* A [target specific variable](http://www.gnu.org/software/make/manual/make.html#Target_002dspecific)
276* An empty line
277* An error (there're non-whitespace characters without a colon)
278
279Parsing them is mostly done in *rule.cc*.
280
281### Rules
282
283A rule is something like *all: hello.exe*. You should be familiar with it. There
284are several kinds of rules such as pattern rules, double colon rules, and order
285only dependencies, but they don't complicate the rule parser.
286
287A feature which complicates the parser is semicolon. You can write the first
288build command on the same line as the rule. For example,
289
290    target:
291    	echo hi!
292
293and
294
295    target: ; echo hi!
296
297have the same meaning. This is tricky because kati shouldn't evaluate expressions
298in a command until the command is actually invoked. As a semicolon can appear as
299the result of expression evaluation, there are some corner cases. A tricky
300example:
301
302    all: $(info foo) ; $(info bar)
303    $(info baz)
304
305should output *foo*, *baz*, and then *bar*, in this order, but
306
307    VAR := all: $(info foo) ; $(info bar)
308    $(VAR)
309    $(info baz)
310
311outputs *foo*, *bar*, and then *baz*.
312
313Again, for the command line after a semicolon, kati should also change how
314backslashes and comments are handled.
315
316    target: has\
317    space ; echo no\
318    space
319
320The above example says *target* depends on two targets, *has* and *space*, and
321to build *target*, *echo nospace* should be executed.
322
323### Target specific variables
324
325You may not familiar with target specific variables. This feature allows you to
326define variable which can be referenced only from commands in a specified
327target. See the following code:
328
329    VAR := X
330    target1: VAR := Y
331    target1:
332    	echo $(VAR)
333    target2:
334    	echo $(VAR)
335
336In this example, *target1* shows *Y* and *target2* shows *X*. I think this
337feature is somewhat similar to namespaces in other programming languages. If a
338target specific variable is specified for a non-leaf target, the variable will
339be used even in build commands of prerequisite targets.
340
341In general, I like GNU make, but this is the only GNU make's feature I don't
342like. See the following Makefile:
343
344    hello: CFLAGS := -g
345    hello: hello.o
346    	gcc $(CFLAGS) $< -o $@
347    hello.o: hello.c
348    	gcc $(CFLAGS) -c $< -o $@
349
350If you run make for the target *hello*, *CFLAGS* is applied for both commands:
351
352    $ make hello
353    gcc -g -c hello.c -o hello.o
354    gcc -g hello.o -o hello
355
356However, *CFLAGS* for *hello* won't be used when you build only *hello.o*:
357
358    $ make hello.o
359    gcc  -c hello.c -o hello.o
360
361Things could be even worse when two targets with different target specific
362variables depend on a same target. The build result will be inconsistent. I
363think there is no valid usage of this feature for non-leaf targets.
364
365Let's go back to the parsing. Like for semicolons, we need to delay the
366evaluation of the right hand side of the assignment for recursive variables. Its
367implementation is very similar to the one for semicolons, but the combination of
368the assignment and the semicolon makes parsing a bit trickier. An example:
369
370    target1: ;X=Y echo $(X)  # A rule with a command
371    target2: X=;Y echo $(X)  # A target specific variable
372
373Evaluator for expressions
374-------------------------
375
376Evaluation of expressions is done in *expr.cc*, *func.cc*, and
377*command.cc*. The amount of code for this step is fairly large especially
378because of the number of GNU make functions. However, their implementations are
379fairly straightforward.
380
381One tricky function is $(wildcard ...). It seems GNU make is doing some kind of
382optimization only for this function and $(wildcard ...) in commands seem to be
383evaluated before the evaluation phase for commands. Both C++ kati and Go kati
384are different from GNU make's behavior in different ways, but it seems this
385incompatibility is OK for Android build.
386
387There is an important optimization done for Android. Android's build system has
388a lot of $(shell find ...) calls to create a list of all .java/.mk files under a
389directory, and they are slow. For this, kati has a builtin emulator of GNU
390find. The find emulator traverses the directory tree and creates an in-memory
391directory tree. Then the find emulator returns results of find commands using
392the cached tree. For my environment, the find command emulator makes kati ~1.6x
393faster for AOSP.
394
395The implementations of some IO-related functions in commands are tricky in the
396ninja generation mode. This will be described later.
397
398Dependency builder
399------------------
400
401Now we get a list of rules and a variable table. *dep.cc* builds a dependency
402graph using the list of rules. I think this step is what GNU make is supposed to
403do for normal users.
404
405This step is fairly complex like other components but there's nothing
406strange. There are three types of rules in GNU make:
407
408* explicit rule
409* implicit rule
410* suffix rule
411
412The following code shows the three types:
413
414    all: foo.o
415    foo.o:
416    	echo explicit
417    %.o:
418    	echo implicit
419    .c.o:
420    	echo suffix
421
422In the above example, all of these three rules match the target *foo.o*. GNU
423make prioritizes explicit rules first. When there's no explicit rule for a
424target, it uses an implicit rule with longer pattern string. Suffix rules are
425used only when there are no explicit/implicit rules.
426
427Android has more than one thousand implicit rules and there are ten thousands of
428targets. It's too slow to do matching for them with a naive O(NM)
429algorithm. Kati uses a trie to speed up this step.
430
431Multiple rules without commands should be merged into the rule with a
432command. For example:
433
434    foo.o: foo.h
435    %.o: %.c
436    	$(CC) -c $< -o $@
437
438*foo.o* depends not only on *foo.c*, but also on *foo.h*.
439
440Executor
441--------
442
443C++ kati's executor is fairly simple. This is defined in *exec.cc*. This is
444useful only for testing because this lacks some important features for a build
445system (e.g., parallel build).
446
447Expressions in commands are evaluated at this stage. When they are evaluated,
448target specific variables and some special variables (e.g., $< and $@) should be
449considered. *command.cc* is handling them. This file is used by both the
450executor and the ninja generator.
451
452Evaluation at this stage is tricky when both *+=* and target specific variables
453are involved. Here is an example code:
454
455    all: test1 test2 test3 test4
456
457    A:=X
458    B=X
459    X:=foo
460
461    test1: A+=$(X)
462    test1:
463    	@echo $(A)  # X bar
464
465    test2: B+=$(X)
466    test2:
467    	@echo $(B)  # X bar
468
469    test3: A:=
470    test3: A+=$(X)
471    test3:
472    	@echo $(A)  # foo
473
474    test4: B=
475    test4: B+=$(X)
476    test4:
477    	@echo $(B)  # bar
478
479    X:=bar
480
481*$(A)* in *test3* is a simple variable. Though *$(A)* in the global scope is
482simple, *$(A)* in *test1* is a recursive variable. This means types of global
483variables don't affect types of target specific variables. However, The result
484of *test1* ("X bar") shows the value of a target specific variable is
485concatenated to the value of a global variable.
486
487Ninja generator
488---------------
489
490*ninja.cc* generates a ninja file using the results of other components. This
491step is actually fairly complicated because kati needs to map GNU make's
492features to ninja's.
493
494A build rule in GNU make may have multiple commands, while ninja's has always a
495single command. To mitigate this, the ninja generator translates multiple
496commands into something like *(cmd1) && (cmd2) && ...*. Kati should also escape
497some special characters for ninja and shell.
498
499The tougher thing is $(shell ...) in commands. Current kati's implementation
500translates it into shell's $(...). This works for many cases. But this approach
501won't work when the result of $(shell ...) is passed to another make
502function. For example
503
504    all:
505    	echo $(if $(shell echo),FAIL,PASS)
506
507should output PASS, because the result of $(shell echo) is an empty string. GNU
508make and kati's executor mode output PASS correctly. However, kati's ninja
509generator emits a ninja file which shows FAIL.
510
511I wrote a few experimental patches for this issue, but they didn't
512work well. The current kati's implementation has an Android specific workaround
513for this. See *HasNoIoInShellScript* in *func.cc* for detail.
514
515Ninja regeneration
516------------------
517
518C++ kati has --regen flag. If this flag is specified, kati checks if anything
519in your environment was changed after the previous run. If kati thinks it doesn't
520need to regenerate the ninja file, it finishes quickly. For Android, running
521kati takes ~30 secs at the first run but the second run takes only ~1 sec.
522
523Kati thinks it needs to regenerate the ninja file when one of the followings is
524changed:
525
526* The command line flags passed to kati
527* A timestamp of a Makefile used to generate the previous ninja file
528* An environment variable used while evaluating Makefiles
529* A result of $(wildcard ...)
530* A result of $(shell ...)
531
532Quickly doing the last check is not trivial. It takes ~18 secs to run all
533$(shell ...) in Android's build system due to the slowness of $(shell find
534...). So, for find commands executed by kati's find emulator, kati stores the
535timestamps of traversed directories with the find command itself. For each find
536commands, kati checks the timestamps of them. If they are not changed, kati
537skips re-running the find command.
538
539Kati doesn't run $(shell date ...) and $(shell echo ...) during this check. The
540former always changes so there's no sense to re-run them. Android uses the
541latter to create a file and the result of them are empty strings. We don't want
542to update these files to get empty strings.
543
544TODO
545----
546
547A big TODO is sub-makes invoked by $(MAKE). I wrote some experimental patches
548but nothing is ready to be used as of writing.
549