# Metalava

(Also known as "doclava2", but deliberately not named doclava2 since crucially it
does not generate docs; it's intended only for **meta**data extraction and generation.)

Metalava is a metadata generator intended for the Android source tree, used for
a number of purposes:

* Extracting the API (into signature text files and into stub API files, which
  in turn get compiled into android.jar, the Android SDK library) and, more
  importantly, hiding code intended to be implementation only, driven by
  javadoc comments like @hide, @doconly, @removed, etc., as well as various
  annotations.

* Extracting source level annotations into an external annotations file (such as
  the typedef annotations, which cannot be stored in the SDK as .class level
  annotations).

* Diffing versions of the API and determining whether a newer version is compatible
  with the older version.

## Building and running

To build:

    $ ./gradlew

This builds a binary distribution in `../../out/host/common/install/metalava/bin/metalava`.

To run metalava:

    $ ../../out/host/common/install/metalava/bin/metalava
                    _        _
     _ __ ___   ___| |_ __ _| | __ ___   ____ _
    | '_ ` _ \ / _ \ __/ _` | |/ _` \ \ / / _` |
    | | | | | |  __/ || (_| | | (_| |\ V / (_| |
    |_| |_| |_|\___|\__\__,_|_|\__,_| \_/ \__,_|

    metalava extracts metadata from source code to generate artifacts such as the
    signature files, the SDK stub files, external annotations etc.

    Usage: metalava <flags>

    Flags:

    --help                                This message.
    --quiet                               Only include vital output
    --verbose                             Include extra diagnostic output

    ...
(*output truncated*)

Metalava has a new command line syntax, but it also understands the doclava1
flags and translates them on the fly. Flags that are ignored are reported in
the console output. If metalava is dropped into an Android framework build, for
example, you'll see something like this (unless running with --quiet):

    metalava: Ignoring unimplemented doclava1 flag -encoding (UTF-8 assumed)
    metalava: Ignoring unimplemented doclava1 flag -source  (1.8 assumed)
    metalava: Ignoring javadoc-related doclava1 flag -J-Xmx1600m
    metalava: Ignoring javadoc-related doclava1 flag -J-XX:-OmitStackTraceInFastThrow
    metalava: Ignoring javadoc-related doclava1 flag -XDignore.symbol.file
    metalava: Ignoring javadoc-related doclava1 flag -doclet
    metalava: Ignoring javadoc-related doclava1 flag -docletpath
    metalava: Ignoring javadoc-related doclava1 flag -templatedir
    metalava: Ignoring javadoc-related doclava1 flag -htmldir
    ...

## Features

* Compatibility with doclava1: in compat mode, metalava spits out the same
  signature files for the framework as doclava1.

* Ability to read in an existing android.jar file instead of reading from source,
  which means we can regenerate signature files etc. for older versions according
  to new formats (e.g. to fix past errors in doclava, such as annotation instance
  methods which were accidentally not included).

* Ability to merge in data (annotations etc.) from external sources, such as
  IntelliJ external annotations data as well as signature files containing
  annotations. This isn't just merged at export time; it's merged at codebase
  load time so that it can be part of the API analysis.

* Support for an updated signature file format:

  * Address errors in the doclava1 format, which for example was missing
    annotation class instance methods.

  * Improve the signature format such that it, for example, labels enums "enum"
    instead of "abstract class extends java.lang.Enum" and annotations "@interface"
    instead of "abstract class extends java.lang.Annotation", sorts modifiers in
    the canonical modifier order, uses "extends" instead of "implements" for
    the superclass of an interface, and applies many other similar tweaks outlined
    in the `Compatibility` class. (Metalava also allows (and ignores) block
    comments in the signature files.)

  * Add support for writing (and reading) annotations into the signature
    files. This is vital now that some of these annotations become part of
    the API contract (in particular nullness contracts, as well as parameter
    names and default values).

  * Support for a "compact" nullness format, one based on Kotlin's syntax. Since
    the goal is to have **all** API elements explicitly state their nullness
    contract, the signature files would very quickly become bloated with
    @NonNull and @Nullable annotations everywhere. So instead, the signature
    format now uses a suffix of `?` for nullable, `!` for not yet annotated, and
    nothing for non-null.

    Instead of

        method public java.lang.Double convert0(java.lang.Float);
        method @Nullable public java.lang.Double convert1(@NonNull java.lang.Float);

    we have

        method public java.lang.Double! convert0(java.lang.Float!);
        method public java.lang.Double? convert1(java.lang.Float);

  * Other compactness improvements: skip package names in some cases on
    export and reinsert them on import. Specifically, drop "java.lang."
    from type names such that you have

        method public void onUpdate(int, String);

    instead of

        method public void onUpdate(int, java.lang.String);

    Similarly, annotations (the ones considered part of the API; unknown
    annotations are not included in signature files) use just the simple
    name instead of the fully qualified name, e.g. `@UiThread` instead of
    `@android.annotation.UiThread`.

  * Misc documentation handling; for example, it attempts to fix sentences
    that javadoc will mistreat, such as sentences that "end" with "e.g. ".
    It also looks for various common typos and fixes those; here's a sample
    error message from running metalava on master:

        Enhancing docs:

        frameworks/base/core/java/android/content/res/AssetManager.java:166: error: Replaced Kitkat with KitKat in documentation for Method android.content.res.AssetManager.getLocales() [Typo]
        frameworks/base/core/java/android/print/PrinterCapabilitiesInfo.java:122: error: Replaced Kitkat with KitKat in documentation for Method android.print.PrinterCapabilitiesInfo.Builder.setColorModes(int, int) [Typo]

* Built-in support for injecting new annotations for use by the Kotlin compiler.
  This covers not just nullness annotations found in the source code and
  annotations merged in from external sources; metalava also infers whether
  nullness annotations have recently changed and, if so, marks them as @Migrate
  (which lets the Kotlin compiler treat errors in the user code as warnings
  instead of errors).

* Support for generating documentation into the stubs files (so we can run javadoc or
  [Dokka](https://github.com/Kotlin/dokka) on the stubs files instead of the source
  code). This means that the documentation tool itself does not need to be able to
  figure out which parts of the source code are included in the API and which are
  implementation; it is simply handed the filtered API stub sources that include
  documentation.

* Support for parsing Kotlin files. API files can now be implemented in Kotlin
  as well, and metalava will parse and extract API information from them just
  as is done for Java files.

* Like doclava1, metalava can diff two APIs and warn about API compatibility
  problems such as removing API elements. Metalava adds new warnings around
  nullness, such as attempting to change a nullness contract incompatibly
  (e.g. you can change a parameter from non-null to nullable for final classes,
  but not vice versa). It also lets you diff directly on a source tree; it does not
  require you to create two signature files to diff.

* Consistent stubs: In doclava1, the code which iterated over the API and generated
  the signature files and the code which generated the stubs had diverged, so there
  was some inconsistency. In metalava the stub files contain **exactly** the same
  signatures as the signature files.

  (This turned out to be incredibly important; it revealed for example that
  StringBuilder.setLength(int) was missing from the API signatures since it is
  a public method inherited from a package private super class, which the
  API extraction code in doclava1 missed, but which was accidentally included in
  the SDK anyway since it packages package private classes. Metalava strictly
  applies the exact same API as is listed in the signature files, and once this
  was hooked up to the build it immediately became apparent that the signature
  files were missing important methods that should really be part of the API.)

* Metalava can generate reports about nullness annotation coverage (which helps
  target efforts since we plan to annotate the entire API). First, it can
  generate a raw count:

      Nullness Annotation Coverage Statistics:
      1279 out of 46900 methods were annotated (2%)
      2 out of 21683 fields were annotated (0%)
      2770 out of 47492 parameters were annotated (5%)

  More importantly, you can also point it to some existing compiled applications
  (.class or .jar files) and it will then measure the annotation coverage of
  the APIs used by those applications. This lets us focus our annotation
  efforts on the APIs that are most heavily used by a corpus of apps. For
  example, you can run the analysis on the current version of the framework and
  point it to the [Plaid](https://github.com/nickbutcher/plaid) app's compiled
  output with

      ... --annotation-coverage-of ~/plaid/app/build/intermediates/classes/debug

  This produces the following output:

      324 methods and fields were missing nullness annotations out of 650 total API references.
      API nullness coverage is 50%

    ```
    | Qualified Class Name                                         |      Usage Count |
    |--------------------------------------------------------------|-----------------:|
    | android.os.Parcel                                            |              146 |
    | android.view.View                                            |              119 |
    | android.view.ViewPropertyAnimator                            |              114 |
    | android.content.Intent                                       |              104 |
    | android.graphics.Rect                                        |               79 |
    | android.content.Context                                      |               61 |
    | android.widget.TextView                                      |               53 |
    | android.transition.TransitionValues                          |               49 |
    | android.animation.Animator                                   |               34 |
    | android.app.ActivityOptions                                  |               34 |
    | android.view.LayoutInflater                                  |               31 |
    | android.app.Activity                                         |               28 |
    | android.content.SharedPreferences                            |               26 |
    | android.content.SharedPreferences.Editor                     |               26 |
    | android.text.SpannableStringBuilder                          |               23 |
    | android.view.ViewGroup.MarginLayoutParams                    |               21 |
    | ... (99 more items                                           |                  |
    ```

  Top referenced un-annotated members:

    ```
    | Member                                                       |      Usage Count |
    |--------------------------------------------------------------|-----------------:|
    | Parcel.readString()                                          |               62 |
    | Parcel.writeString(String)                                   |               62 |
    | TextView.setText(CharSequence)                               |               34 |
    | TransitionValues.values                                      |               28 |
    | View.getContext()                                            |               28 |
    | ViewPropertyAnimator.setDuration(long)                       |               26 |
    | ViewPropertyAnimator.setInterpolator(android.animation.Ti... |               26 |
    | LayoutInflater.inflate(int, android.view.ViewGroup, boole... |               23 |
    | Rect.left                                                    |               22 |
    | Rect.top                                                     |               22 |
    | Intent.Intent(android.content.Context, Class<?>)             |               21 |
    | Rect.bottom                                                  |               21 |
    | TransitionValues.view                                        |               21 |
    | VERSION.SDK_INT                                              |               18 |
    | Context.getResources()                                       |               18 |
    | EditText.getText()                                           |               18 |
    | ... (309 more items                                          |                  |
    ```

  From this it's clear that it would be useful to start annotating android.os.Parcel
  and android.view.View, for example, where there are unannotated APIs that are
  frequently used, at least by this app.

* Built on top of a full, type-resolved AST. Doclava1 was integrated with javadoc,
  which meant that most of the source tree was opaque. Therefore, as just one example,
  the code which generated documentation for typedef constants had to require the
  constants to all share a single prefix it could look for. However, in metalava,
  annotation references are available at the AST level, so it can resolve references
  and map them back to the original field references and include those directly.

* Support for extracting annotations. Metalava can also generate the external annotation
  files needed by Studio and lint in Gradle, which capture the typedefs (@IntDef and
  @StringDef classes) in the source code. Previously, this was generated manually
  via the development/tools/extract code. This also merges in manually curated data;
  some of this is in the manual/ folder in this project.

* Support for extracting API levels (api-versions.xml). This was previously generated
  by separate code (tools/base/misc/api-generator), invoked during the build. This
  functionality is now rolled into metalava, which has one very important benefit:
  metalava will use this information when recording API levels for API usage. (Prior
  to this, this was based on signature file parsing in doclava, which sometimes
  generated incorrect results. Metalava uses the android.jar files themselves to
  ensure that it computes the exact available SDK data for each API level.)

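To make the compact signature notation from the list above more concrete (the
`?` and `!` nullness suffixes and the dropped `java.lang.` prefix), here is a
minimal Kotlin sketch. The `Nullness` enum and the `compactType` function are
hypothetical names made up for this example; this is not metalava's actual
implementation, and the rule for which names get shortened is an assumption.

```kotlin
// Hypothetical sketch of the compact type notation; not metalava's real code.
enum class Nullness { NON_NULL, NULLABLE, UNKNOWN }

/** Renders a fully qualified type name in the compact signature style. */
fun compactType(qualifiedName: String, nullness: Nullness): String {
    // Assumption: only classes directly in java.lang lose their package,
    // so java.lang.reflect.Method keeps its full name.
    val rest = qualifiedName.removePrefix("java.lang.")
    val shortName = if (rest != qualifiedName && '.' !in rest) rest else qualifiedName

    val suffix = when (nullness) {
        Nullness.NON_NULL -> ""  // non-null is the unmarked default
        Nullness.NULLABLE -> "?"
        Nullness.UNKNOWN -> "!"  // not yet annotated
    }
    return shortName + suffix
}

fun main() {
    println(compactType("java.lang.Double", Nullness.UNKNOWN))  // Double!
    println(compactType("java.lang.Double", Nullness.NULLABLE)) // Double?
    println(compactType("java.lang.Float", Nullness.NON_NULL))  // Float
}
```

Leaving non-null unmarked keeps a fully annotated signature file short, while
the explicit `!` makes the not yet annotated elements easy to spot.
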
## Architecture & Implementation

Metalava is implemented on top of IntelliJ parsing APIs (PSI and UAST). However,
these are hidden behind a "model": an abstraction layer which only exposes high
level concepts like packages, classes and inner classes, methods, fields, and
modifier lists (including annotations).

This is done for multiple reasons:

(1) It allows us to have multiple "back-ends": for example, metalava can read
    in a model not just from parsing source code, but from reading older SDK
    android.jar files (i.e. backed by bytecode) or reading previous signature
    files. Reading in multiple versions of an API lets metalava perform "diffing",
    such as warning if an API is changing in an incompatible way. It can also
    generate signature files in the new format (including data that was missing
    in older signature files, such as annotation methods) without having to
    parse older source code which may no longer be easy to parse.

(2) There's a lot of logic for deciding whether code found in the source tree
    should be included in the API. With the model approach we can build up an
    API and for example mark a subset of its methods as included. By having
    a separate hierarchy we can easily perform this work once and pass around
    our filtered model instead of passing around PsiClass and PsiMethod instances
    and having to keep the filtered data separately and remembering to always
    consult the filter, not the PSI elements directly.

The basic API element class is "Item". (In doclava1 this was called a "DocInfo".)
There are several subinterfaces of Item: PackageItem, ClassItem, MemberItem,
MethodItem, FieldItem, ParameterItem, etc. And then there are several
implementation hierarchies: One is PSI based, where you point metalava to a
source tree or a .jar file, and it constructs Items built on top of PSI:
PsiPackageItem, PsiClassItem, PsiMethodItem, etc. Another is textual, based
on signature files: TextPackageItem, TextClassItem, and so on.

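The idea of one model with several back-ends can be sketched roughly as
follows. The interfaces here are heavily simplified stand-ins that only borrow
the names `Item`, `ClassItem` and `MethodItem`; the real interfaces expose far
more. `SourceClass`, `SourceMethod`, `TextClass`, `TextMethod` and
`methodNames` are hypothetical names for this example, and the text format
parsed by `TextClass` is a simplification, not the real signature format.

```kotlin
// Simplified stand-ins for the model interfaces; the real ones expose far more.
interface Item { val name: String }
interface MethodItem : Item
interface ClassItem : Item { val methods: List<MethodItem> }

// Back-end 1: stand-in for items built from parsed source (PSI in metalava).
data class SourceMethod(override val name: String) : MethodItem
data class SourceClass(
    override val name: String,
    override val methods: List<MethodItem>
) : ClassItem

// Back-end 2: items parsed from signature-style text lines.
data class TextMethod(override val name: String) : MethodItem
class TextClass(header: String, memberLines: List<String>) : ClassItem {
    // "class test.Foo {" -> "test.Foo"
    override val name: String =
        header.trim().removePrefix("class ").removeSuffix("{").trim()

    // "method public void bar();" -> "bar"
    override val methods: List<MethodItem> = memberLines
        .map { it.trim() }
        .filter { it.startsWith("method ") }
        .map { TextMethod(it.substringBefore('(').substringAfterLast(' ')) }
}

// Code written against the model works with either back-end.
fun methodNames(cls: ClassItem): List<String> = cls.methods.map { it.name }.sorted()

fun main() {
    val fromSource = SourceClass("test.Foo", listOf(SourceMethod("bar"), SourceMethod("baz")))
    val fromText = TextClass(
        "class test.Foo {",
        listOf("method public void baz();", "method public int bar(int);")
    )
    println(methodNames(fromSource) == methodNames(fromText)) // true
}
```
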
The "Codebase" class captures a complete API snapshot (including classes
that are hidden, which is why it's called a "Codebase" rather than an "API").

There are methods to load codebases - from source folders, from a .jar file,
from a signature file. That's how API diffing is performed: you load two
codebases (from whatever source you want, typically a previous API signature
file and the current set of source folders), and then you "diff" the two.

There are several key helpers that help with the implementation, detailed next.

### Visiting Items

First, metalava provides an ItemVisitor. This lets you visit the API easily.
For example, here's how you can visit every class:

    codebase.accept(object : ItemVisitor() {
        override fun visitClass(cls: ClassItem) {
            // code operating on the class here
        }
    })

Similarly, you can visit all items (regardless of type) by overriding
`visitItem`, or visit specific kinds of items (packages, classes, methods,
fields and so on) by overriding `visitPackage`, `visitClass`, `visitMethod`, etc.

There is also an `ApiVisitor`. This is a subclass of the `ItemVisitor`,
but which limits itself to visiting code elements that are part of the
API.

This is how, for example, the SignatureWriter and the StubWriter are both
implemented: they simply extend `ApiVisitor`, which means they'll
only export the API items in the codebase, and then in each relevant
method they emit the signature or stub data:

    class SignatureWriter(
        private val writer: PrintWriter,
        private val generateDefaultConstructors: Boolean,
        private val filter: (Item) -> Boolean
    ) : ApiVisitor(visitConstructorsAsMethods = false) {

        ....

        override fun visitConstructor(constructor: ConstructorItem) {
            writer.print("    ctor ")
            writeModifiers(constructor)
            writer.print(constructor.containingClass().fullName())
            writeParameterList(constructor)
            writeThrowsList(constructor)
            writer.print(";\n")
        }

        ....
    }

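The relationship between the two visitors can be illustrated with a toy model.
Everything below is a simplified sketch: the `hidden` flag, the `skip` hook and
the `visitedNames` helper are assumptions made for this example and do not
reflect how the real `ItemVisitor` and `ApiVisitor` decide what to visit.

```kotlin
// Toy model; the real Item types and visitors are far richer.
open class Item(val name: String, val hidden: Boolean = false)
class MethodItem(name: String, hidden: Boolean = false) : Item(name, hidden)
class ClassItem(
    name: String,
    val methods: List<MethodItem>,
    hidden: Boolean = false
) : Item(name, hidden)

open class ItemVisitor {
    open fun skip(item: Item): Boolean = false
    open fun visitItem(item: Item) {}
    open fun visitClass(cls: ClassItem) {}
    open fun visitMethod(method: MethodItem) {}
}

// Limits itself to items that are part of the API (here: not marked hidden).
open class ApiVisitor : ItemVisitor() {
    override fun skip(item: Item): Boolean = item.hidden
}

class Codebase(val classes: List<ClassItem>) {
    fun accept(visitor: ItemVisitor) {
        for (cls in classes) {
            if (visitor.skip(cls)) continue // a skipped class hides its members too
            visitor.visitItem(cls)
            visitor.visitClass(cls)
            for (method in cls.methods) {
                if (visitor.skip(method)) continue
                visitor.visitItem(method)
                visitor.visitMethod(method)
            }
        }
    }
}

/** Collects the names of all visited items, optionally restricted to the API. */
fun visitedNames(codebase: Codebase, apiOnly: Boolean): List<String> {
    val names = mutableListOf<String>()
    val visitor: ItemVisitor = if (apiOnly) {
        object : ApiVisitor() {
            override fun visitItem(item: Item) { names += item.name }
        }
    } else {
        object : ItemVisitor() {
            override fun visitItem(item: Item) { names += item.name }
        }
    }
    codebase.accept(visitor)
    return names
}

fun main() {
    val codebase = Codebase(listOf(
        ClassItem("Foo", listOf(MethodItem("bar"), MethodItem("secret", hidden = true))),
        ClassItem("Internal", listOf(MethodItem("run")), hidden = true)
    ))
    println(visitedNames(codebase, apiOnly = false)) // [Foo, bar, secret, Internal, run]
    println(visitedNames(codebase, apiOnly = true))  // [Foo, bar]
}
```

Because the filtering lives in the visitor base class, a writer that extends the
API-only visitor cannot accidentally emit hidden items.
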
### Visiting Types

There is a `TypeVisitor` similar to `ItemVisitor` which you can use
to visit all types in the codebase.

When computing the API, all types that are referenced by the API should be
included (e.g. if `List<Foo>` is part of the API then `Foo` must be too).
This is easy to do with the `TypeVisitor`.

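The "if `List<Foo>` is in the API then `Foo` must be too" check can be sketched
without the real `TypeVisitor` by working on type strings directly. This is only
an illustration under simplifying assumptions: `referencedClasses` and
`missingFromApi` are hypothetical helpers, types are plain strings, and type
variables such as `T` would be mistaken for class names.

```kotlin
// Tokens that can appear in a type string without naming a class.
val nonClassTokens = setOf(
    "extends", "super", "void", "boolean", "byte", "char",
    "short", "int", "long", "float", "double"
)

/** Splits e.g. "java.util.Map<test.Foo, test.Bar[]>" into the class names it mentions. */
fun referencedClasses(type: String): Set<String> =
    type.split('<', '>', ',', ' ', '[', ']', '?', '&')
        .filter { it.isNotEmpty() && it !in nonClassTokens }
        .toSet()

/** Returns the classes used in [apiTypes] that are not themselves in [apiClasses]. */
fun missingFromApi(apiClasses: Set<String>, apiTypes: List<String>): Set<String> =
    apiTypes.flatMap { referencedClasses(it) }
        .filter { it !in apiClasses }
        .toSet()

fun main() {
    val apiClasses = setOf("java.util.List")
    val apiTypes = listOf("java.util.List<test.Foo>", "int")
    println(missingFromApi(apiClasses, apiTypes)) // [test.Foo]
}
```
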
### Diffing Codebases

Another visitor which helps with implementation is the ComparisonVisitor:

    open class ComparisonVisitor {
        open fun compare(old: Item, new: Item) {}
        open fun added(item: Item) {}
        open fun removed(item: Item) {}

        open fun compare(old: PackageItem, new: PackageItem) { }
        open fun compare(old: ClassItem, new: ClassItem) { }
        open fun compare(old: MethodItem, new: MethodItem) { }
        open fun compare(old: FieldItem, new: FieldItem) { }
        open fun compare(old: ParameterItem, new: ParameterItem) { }

        open fun added(item: PackageItem) { }
        open fun added(item: ClassItem) { }
        open fun added(item: MethodItem) { }
        open fun added(item: FieldItem) { }
        open fun added(item: ParameterItem) { }

        open fun removed(item: PackageItem) { }
        open fun removed(item: ClassItem) { }
        open fun removed(item: MethodItem) { }
        open fun removed(item: FieldItem) { }
        open fun removed(item: ParameterItem) { }
    }

This makes it easy to perform API comparison operations.

For example, metalava has a feature to mark "newly annotated" nullness annotations
as migrated. To do this, it just extends `ComparisonVisitor`, overrides the
`compare(old: Item, new: Item)` method, and checks whether the old item
has no nullness annotations and the new one does, and if so, also marks
the new annotations as @Migrate.

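A stripped-down version of that idea might look like the sketch below. It uses
a toy `Item` with a single optional nullness annotation and a hypothetical
`diff` driver that pairs items by name; the `MigrationMarker` class name is also
made up for this example. The real visitor operates on the full model and has
many more overloads.

```kotlin
// Toy item: nullness is "@NonNull", "@Nullable", or null when unannotated.
data class Item(val name: String, val nullness: String?, var migrated: Boolean = false)

open class ComparisonVisitor {
    open fun compare(old: Item, new: Item) {}
    open fun added(item: Item) {}
    open fun removed(item: Item) {}
}

/** Hypothetical driver: pairs items by name and dispatches to the visitor. */
fun diff(old: List<Item>, new: List<Item>, visitor: ComparisonVisitor) {
    val oldByName = old.associateBy { it.name }
    val newByName = new.associateBy { it.name }
    for (item in new) {
        val previous = oldByName[item.name]
        if (previous == null) visitor.added(item) else visitor.compare(previous, item)
    }
    for (item in old) {
        if (item.name !in newByName) visitor.removed(item)
    }
}

/** Marks nullness annotations that did not exist in the old codebase as migrated. */
class MigrationMarker : ComparisonVisitor() {
    override fun compare(old: Item, new: Item) {
        if (old.nullness == null && new.nullness != null) {
            new.migrated = true
        }
    }
}

fun main() {
    val old = listOf(Item("Foo.bar()", null), Item("Foo.baz()", "@NonNull"))
    val new = listOf(Item("Foo.bar()", "@Nullable"), Item("Foo.baz()", "@NonNull"))
    diff(old, new, MigrationMarker())
    println(new.map { it.migrated }) // [true, false]
}
```
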
Similarly, the API Check can simply override

    override fun removed(item: Item) {
        reporter.report(error, item, "Removing ${Item.describe(item)} is not allowed")
    }

to flag all API elements that have been removed as invalid (since you cannot
remove API). (The real check is slightly more complicated; it looks into the
hierarchy to see if there still is an inherited method with the same signature,
in which case the deletion is allowed.)

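The inherited-method exception mentioned above can be sketched like this. The
`ClassInfo` data class and the `hasMethod` and `checkRemovals` functions are
hypothetical, methods are represented as plain signature strings, and only
superclasses (not interfaces) are searched; the real check works on the model
and is more thorough.

```kotlin
// Toy class description: methods are plain signature strings such as "foo()".
data class ClassInfo(val name: String, val superClass: String?, val methods: Set<String>)

/** True if [signature] is declared in [cls] or inherited from one of its superclasses. */
fun hasMethod(classes: Map<String, ClassInfo>, cls: String, signature: String): Boolean {
    var current = classes[cls]
    while (current != null) {
        if (signature in current.methods) return true
        current = current.superClass?.let { classes[it] }
    }
    return false
}

/** Reports every old method that is no longer reachable in the new codebase. */
fun checkRemovals(old: Map<String, ClassInfo>, new: Map<String, ClassInfo>): List<String> {
    val errors = mutableListOf<String>()
    for ((name, cls) in old) {
        for (signature in cls.methods) {
            if (!hasMethod(new, name, signature)) {
                errors += "Removing method $name.$signature is not allowed"
            }
        }
    }
    return errors
}

fun main() {
    val old = mapOf(
        "Base" to ClassInfo("Base", null, emptySet()),
        "Child" to ClassInfo("Child", "Base", setOf("foo()"))
    )
    // foo() moved up into Base: still callable on Child, so no error.
    val movedUp = mapOf(
        "Base" to ClassInfo("Base", null, setOf("foo()")),
        "Child" to ClassInfo("Child", "Base", emptySet())
    )
    // foo() deleted entirely.
    val deleted = mapOf(
        "Base" to ClassInfo("Base", null, emptySet()),
        "Child" to ClassInfo("Child", "Base", emptySet())
    )
    println(checkRemovals(old, movedUp)) // []
    println(checkRemovals(old, deleted)) // [Removing method Child.foo() is not allowed]
}
```
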
### Documentation Generation

As mentioned above, metalava generates documentation directly into the stubs
files, which can then be processed by Dokka and Javadoc to generate the
same docs as before.

Doclava1 was integrated with javadoc directly, so the way it generated
metadata docs (such as documenting permissions, ranges and typedefs from
annotations) was to insert auxiliary tags (`@range`, `@permission`, etc.) and
then this would get converted into English docs later via `macros_override.cs`.

This is not how metalava does it; it generates the English documentation
directly. This was not just convenient for the implementation (since metalava
does not use javadoc data structures to pass maps like the arguments for
the typedef macro), but should also help Dokka, and arguably the Kotlin
code which generates the documentation is easier to reason about and to
update when it's handling loop conditionals. (As a result I, for example,
improved some of the grammar: when it's listing a number of possible
constants the conjunction is usually "or", but if it's a flag, the sentence
begins with "a combination of " and then the conjunction at the end should
be "and".)

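The conjunction rule described above is the kind of logic that is easy to
express in plain Kotlin. The following is a minimal sketch; `describeConstants`
is a hypothetical function, and the exact wording metalava emits around the
list is not reproduced here.

```kotlin
/**
 * Lists typedef constants in English. Flags can be combined, so they are
 * joined with "and"; mutually exclusive constants are joined with "or".
 */
fun describeConstants(constants: List<String>, isFlag: Boolean): String {
    require(constants.isNotEmpty()) { "expected at least one constant" }
    val conjunction = if (isFlag) "and" else "or"
    val list = when (constants.size) {
        1 -> constants[0]
        2 -> "${constants[0]} $conjunction ${constants[1]}"
        else -> constants.dropLast(1).joinToString(", ") + ", $conjunction ${constants.last()}"
    }
    return if (isFlag) "a combination of $list" else list
}

fun main() {
    val constants = listOf("RED", "GREEN", "BLUE")
    println(describeConstants(constants, isFlag = false)) // RED, GREEN, or BLUE
    println(describeConstants(constants, isFlag = true))  // a combination of RED, GREEN, and BLUE
}
```
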
## Current Status

Some things are still missing before this tool can be integrated:

- there are some remaining bugs around type resolution in Kotlin, and
  reified methods (also in Kotlin) are not included

- the code needs cleanup and some performance optimizations (it's about 3x
  slower than doclava1)