page.title=Performance Tips
page.article=true
@jd:body

<div id="tb-wrapper">
<div id="tb">

<h2>In this document</h2>
<ol class="nolist">
  <li><a href="#ObjectCreation">Avoid Creating Unnecessary Objects</a></li>
  <li><a href="#PreferStatic">Prefer Static Over Virtual</a></li>
  <li><a href="#UseFinal">Use Static Final For Constants</a></li>
  <li><a href="#GettersSetters">Avoid Internal Getters/Setters</a></li>
  <li><a href="#Loops">Use Enhanced For Loop Syntax</a></li>
  <li><a href="#PackageInner">Consider Package Instead of Private Access with Private Inner Classes</a></li>
  <li><a href="#AvoidFloat">Avoid Using Floating-Point</a></li>
  <li><a href="#UseLibraries">Know and Use the Libraries</a></li>
  <li><a href="#NativeMethods">Use Native Methods Carefully</a></li>
  <li><a href="#Myths">Performance Myths</a></li>
  <li><a href="#Measure">Always Measure</a></li>
</ol>

</div>
</div>

<p>This document primarily covers micro-optimizations that can improve overall app performance
when combined, but it's unlikely that these changes will result in dramatic
performance effects. Choosing the right algorithms and data structures should always be your
priority, but is outside the scope of this document. You should use the tips in this document
as general coding practices that you can incorporate into your habits for general code
efficiency.</p>

<p>There are two basic rules for writing efficient code:</p>
<ul>
    <li>Don't do work that you don't need to do.</li>
    <li>Don't allocate memory if you can avoid it.</li>
</ul>

<p>One of the trickiest problems you'll face when micro-optimizing an Android
app is that your app is certain to be running on multiple types of
hardware: different versions of the VM run on different
processors at different speeds. It's not even generally the case
that you can simply say "device X is a factor F faster/slower than device Y",
and scale your results from one device to others. In particular, measurement
on the emulator tells you very little about performance on any device. There
are also huge differences between devices with and without a
<acronym title="Just In Time compiler">JIT</acronym>: the best
code for a device with a JIT is not always the best code for a device
without.</p>

<p>To ensure your app performs well across a wide variety of devices, make sure
your code is efficient at all levels and aggressively optimize its performance.</p>


<h2 id="ObjectCreation">Avoid Creating Unnecessary Objects</h2>

<p>Object creation is never free. A generational garbage collector with per-thread allocation
pools for temporary objects can make allocation cheaper, but allocating memory
is always more expensive than not allocating memory.</p>

<p>As you allocate more objects in your app, you will force a periodic
garbage collection, creating little "hiccups" in the user experience. The
concurrent garbage collector introduced in Android 2.3 helps, but unnecessary work
should always be avoided.</p>

<p>Thus, you should avoid creating object instances you don't need. Some
examples of things that can help:</p>

<ul>
    <li>If you have a method returning a string, and you know that its result
    will always be appended to a {@link java.lang.StringBuffer} anyway, change your signature
    and implementation so that the function does the append directly,
    instead of creating a short-lived temporary object.</li>
    <li>When extracting strings from a set of input data, try
    to return a substring of the original data, instead of creating a copy.
    You will create a new {@link java.lang.String} object, but it will share the {@code char[]}
    with the data. (The trade-off being that if you're only using a small
    part of the original input, you'll be keeping it all around in memory
    anyway if you go this route.)</li>
</ul>
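<p>As a minimal sketch of the first suggestion above, compare a method that returns a
temporary string with one that appends into the caller's buffer. The names
<code>Labels</code>, <code>describe()</code>, and <code>appendDescription()</code> are
hypothetical, and the sketch assumes {@link java.lang.StringBuilder} where the text mentions
<code>StringBuffer</code>; the same idea applies to either.</p>

```java
// Hypothetical example: the class and method names are illustrative only.
class Labels {
    // Builds and returns a temporary String that the caller will append and discard.
    static String describe(int id, String name) {
        return "#" + id + " " + name;
    }

    // Appends straight into the caller's builder, so no intermediate String is returned.
    static void appendDescription(StringBuilder out, int id, String name) {
        out.append('#').append(id).append(' ').append(name);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("Items: ");
        appendDescription(sb, 7, "widget");
        System.out.println(sb);
    }
}
```

<p>Whether the second form is worth the less convenient signature depends on how often the
method is called; it is most likely to matter in code that runs per item or per frame.</p>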

<p>A somewhat more radical idea is to slice up multidimensional arrays into
parallel one-dimensional arrays:</p>

<ul>
    <li>An array of {@code int}s is much better than an array of {@link java.lang.Integer}
    objects,
    but this also generalizes to the fact that two parallel arrays of ints
    are also a <strong>lot</strong> more efficient than an array of {@code (int,int)}
    objects.  The same goes for any combination of primitive types.</li>

    <li>If you need to implement a container that stores tuples of {@code (Foo,Bar)}
    objects, try to remember that two parallel {@code Foo[]} and {@code Bar[]} arrays are
    generally much better than a single array of custom {@code (Foo,Bar)} objects.
    (The exception to this, of course, is when you're designing an API for
    other code to access. In those cases, it's usually better to make a small
    compromise to the speed in order to achieve a good API design. But in your own internal
    code, you should try to be as efficient as possible.)</li>
</ul>
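<p>The parallel-array idea above might look like the following sketch. The
<code>PointList</code> class is hypothetical, and its fixed capacity is an assumption made
to keep the example short; a real container would grow its arrays as needed.</p>

```java
// Hypothetical container: stores (x, y) pairs in two parallel int arrays
// instead of allocating one small object per pair.
class PointList {
    private final int[] xs;
    private final int[] ys;
    private int size;

    // Fixed capacity is an assumption of this sketch.
    PointList(int capacity) {
        xs = new int[capacity];
        ys = new int[capacity];
    }

    // Adds a pair; throws ArrayIndexOutOfBoundsException when the list is full.
    void add(int x, int y) {
        xs[size] = x;
        ys[size] = y;
        size++;
    }

    int size() { return size; }

    int x(int index) { return xs[index]; }

    int y(int index) { return ys[index]; }
}
```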

<p>Generally speaking, avoid creating short-term temporary objects if you
can.  Fewer objects created mean less-frequent garbage collection, which has
a direct impact on user experience.</p>




<h2 id="PreferStatic">Prefer Static Over Virtual</h2>

<p>If you don't need to access an object's fields, make your method static.
Invocations will be about 15%-20% faster.
It's also good practice, because you can tell from the method
signature that calling the method can't alter the object's state.</p>
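<p>A small sketch of the distinction, using a hypothetical <code>Rect</code> class:
<code>area()</code> needs the object's fields and stays an instance method, while
<code>areaOf()</code> touches no instance state and can be static.</p>

```java
// Hypothetical example class; not part of any Android API.
class Rect {
    private final int width;
    private final int height;

    Rect(int width, int height) {
        this.width = width;
        this.height = height;
    }

    // Reads instance fields, so it has to be an instance method.
    int area() {
        return areaOf(width, height);
    }

    // Uses only its arguments, so it is declared static.
    static int areaOf(int w, int h) {
        return w * h;
    }
}
```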





<h2 id="UseFinal">Use Static Final For Constants</h2>

<p>Consider the following declaration at the top of a class:</p>

<pre>
static int intVal = 42;
static String strVal = "Hello, world!";
</pre>

<p>The compiler generates a class initializer method, called
<code>&lt;clinit&gt;</code>, that is executed when the class is first used.
The method stores the value 42 into <code>intVal</code>, and extracts a
reference from the classfile string constant table for <code>strVal</code>.
When these values are referenced later on, they are accessed with field
lookups.</p>

<p>We can improve matters with the "final" keyword:</p>

<pre>
static final int intVal = 42;
static final String strVal = "Hello, world!";
</pre>

<p>The class no longer requires a <code>&lt;clinit&gt;</code> method,
because the constants go into static field initializers in the dex file.
Code that refers to <code>intVal</code> will use
the integer value 42 directly, and accesses to <code>strVal</code> will
use a relatively inexpensive "string constant" instruction instead of a
field lookup.</p>

<p class="note"><strong>Note:</strong> This optimization applies only to primitive types and
{@link java.lang.String} constants, not arbitrary reference types. Still, it's good
practice to declare constants <code>static final</code> whenever possible.</p>





<h2 id="GettersSetters">Avoid Internal Getters/Setters</h2>

<p>In native languages like C++ it's common practice to use getters
(<code>i = getCount()</code>) instead of accessing the field directly (<code>i
= mCount</code>). This is an excellent habit for C++ and is often practiced in other
object-oriented languages like C# and Java, because the compiler can
usually inline the access, and if you need to restrict or debug field access
you can add the code at any time.</p>

<p>However, this is a bad idea on Android.  Virtual method calls are expensive,
much more so than instance field lookups.  It's reasonable to follow
common object-oriented programming practices and have getters and setters
in the public interface, but within a class you should always access
fields directly.</p>

<p>Without a <acronym title="Just In Time compiler">JIT</acronym>,
direct field access is about 3x faster than invoking a
trivial getter. With the JIT (where direct field access is as cheap as
accessing a local), direct field access is about 7x faster than invoking a
trivial getter.</p>

<p>Note that if you're using <a href="{@docRoot}tools/help/proguard.html">ProGuard</a>,
you can have the best of both worlds because ProGuard can inline accessors for you.</p>
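<p>As a sketch of the advice above, the hypothetical <code>Counter</code> class below keeps
a getter in its public interface but reads and writes <code>mCount</code> directly in its
own methods.</p>

```java
// Hypothetical example class; the names are illustrative only.
class Counter {
    private int mCount;

    // The getter stays in the public interface for code outside the class.
    public int getCount() {
        return mCount;
    }

    // Inside the class, the field is updated directly.
    public void increment() {
        mCount++;
    }

    // Inside the class, read the field directly instead of calling getCount().
    public boolean isEmpty() {
        return mCount == 0;
    }
}
```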





<h2 id="Loops">Use Enhanced For Loop Syntax</h2>

<p>The enhanced <code>for</code> loop (also sometimes known as the "for-each" loop) can be used
for collections that implement the {@link java.lang.Iterable} interface and for arrays.
With collections, an iterator is allocated to make interface calls
to {@code hasNext()} and {@code next()}. With an {@link java.util.ArrayList},
a hand-written counted loop is
about 3x faster (with or without JIT), but for other collections the enhanced
for loop syntax will be exactly equivalent to explicit iterator usage.</p>

<p>There are several alternatives for iterating through an array:</p>

<pre>
static class Foo {
    int mSplat;
}

Foo[] mArray = ...

// Reads the mArray field and its length on every iteration.
public void zero() {
    int sum = 0;
    for (int i = 0; i &lt; mArray.length; ++i) {
        sum += mArray[i].mSplat;
    }
}

// Caches the array reference and its length in locals before looping.
public void one() {
    int sum = 0;
    Foo[] localArray = mArray;
    int len = localArray.length;

    for (int i = 0; i &lt; len; ++i) {
        sum += localArray[i].mSplat;
    }
}

// Uses the enhanced for loop.
public void two() {
    int sum = 0;
    for (Foo a : mArray) {
        sum += a.mSplat;
    }
}
</pre>

<p><code>zero()</code> is slowest, because the JIT can't yet optimize away
the cost of getting the array length once for every iteration through the
loop.</p>

<p><code>one()</code> is faster. It pulls everything out into local
variables, avoiding the lookups. Only the array length offers a performance
benefit.</p>

<p><code>two()</code> is fastest for devices without a JIT, and
indistinguishable from <code>one()</code> for devices with a JIT.
It uses the enhanced for loop syntax introduced in version 1.5 of the Java
programming language.</p>

<p>So, you should use the enhanced <code>for</code> loop by default, but consider a
hand-written counted loop for performance-critical {@link java.util.ArrayList} iteration.</p>

<p class="note"><strong>Tip:</strong>
Also see Josh Bloch's <em>Effective Java</em>, item 46.</p>
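<p>The array examples above do not show the {@link java.util.ArrayList} case, so here is a
minimal sketch of both styles. The <code>ListLengths</code> class and its method names are
hypothetical.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: sums the lengths of the strings in a list two ways.
class ListLengths {
    // Enhanced for loop: allocates an Iterator and calls hasNext()/next().
    static int totalForEach(List<String> values) {
        int total = 0;
        for (String s : values) {
            total += s.length();
        }
        return total;
    }

    // Hand-written counted loop: indexes the ArrayList directly, with no Iterator.
    static int totalCounted(ArrayList<String> values) {
        int total = 0;
        final int size = values.size();
        for (int i = 0; i < size; ++i) {
            total += values.get(i).length();
        }
        return total;
    }
}
```

<p>Both methods return the same result; the counted form is only worth the extra code where
the loop is known to be performance-critical.</p>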



<h2 id="PackageInner">Consider Package Instead of Private Access with Private Inner Classes</h2>

<p>Consider the following class definition:</p>

<pre>
public class Foo {
    private class Inner {
        void stuff() {
            Foo.this.doStuff(Foo.this.mValue);
        }
    }

    private int mValue;

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    private void doStuff(int value) {
        System.out.println("Value is " + value);
    }
}</pre>

<p>What's important here is that we define a private inner class
(<code>Foo$Inner</code>) that directly accesses a private method and a private
instance field in the outer class. This is legal, and the code prints "Value is
27" as expected.</p>

<p>The problem is that the VM considers direct access to <code>Foo</code>'s
private members from <code>Foo$Inner</code> to be illegal because
<code>Foo</code> and <code>Foo$Inner</code> are different classes, even though
the Java language allows an inner class to access an outer class' private
members. To bridge the gap, the compiler generates a couple of synthetic
methods:</p>

<pre>
/*package*/ static int Foo.access$100(Foo foo) {
    return foo.mValue;
}
/*package*/ static void Foo.access$200(Foo foo, int value) {
    foo.doStuff(value);
}</pre>

<p>The inner class code calls these static methods whenever it needs to
access the <code>mValue</code> field or invoke the <code>doStuff()</code> method
in the outer class. What this means is that the code above really boils down to
a case where you're accessing member fields through accessor methods.
Earlier we talked about how accessors are slower than direct field
accesses, so this is an example of a certain language idiom resulting in an
"invisible" performance hit.</p>

<p>If you're using code like this in a performance hotspot, you can avoid the
overhead by declaring fields and methods accessed by inner classes to have
package access, rather than private access. Unfortunately this means the fields
can be accessed directly by other classes in the same package, so you shouldn't
use this in public API.</p>
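<p>Applied to the <code>Foo</code> example, the change might look like the sketch below. It
assumes the only edit is dropping <code>private</code> from <code>mValue</code> and
<code>doStuff()</code>; the class itself is declared without <code>public</code> here only
so the snippet compiles in any file.</p>

```java
// Sketch of Foo with package access on the members that Inner uses.
class Foo {
    private class Inner {
        void stuff() {
            // Direct access: these members are no longer private to Foo.
            Foo.this.doStuff(Foo.this.mValue);
        }
    }

    // Package access instead of private.
    int mValue;

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    // Package access instead of private.
    void doStuff(int value) {
        System.out.println("Value is " + value);
    }
}
```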




<h2 id="AvoidFloat">Avoid Using Floating-Point</h2>

<p>As a rule of thumb, floating-point is about 2x slower than integer on
Android-powered devices.</p>

<p>In speed terms, there's no difference between <code>float</code> and
<code>double</code> on the more modern hardware. Space-wise, <code>double</code>
is 2x larger. As with desktop machines, assuming space isn't an issue, you
should prefer <code>double</code> to <code>float</code>.</p>

<p>Also, even for integers, some processors have hardware multiply but lack
hardware divide. In such cases, integer division and modulus operations are
performed in software&mdash;something to think about if you're designing a
hash table or doing lots of math.</p>




<h2 id="UseLibraries">Know and Use the Libraries</h2>

<p>In addition to all the usual reasons to prefer library code over rolling
your own, bear in mind that the system is at liberty to replace calls
to library methods with hand-coded assembler, which may be better than the
best code the JIT can produce for the equivalent Java. The typical example
here is {@link java.lang.String#indexOf String.indexOf()} and
related APIs, which Dalvik replaces with
an inlined intrinsic. Similarly, the {@link java.lang.System#arraycopy
System.arraycopy()} method
is about 9x faster than a hand-coded loop on a Nexus One with the JIT.</p>


<p class="note"><strong>Tip:</strong>
Also see Josh Bloch's <em>Effective Java</em>, item 47.</p>
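<p>The <code>System.arraycopy()</code> comparison above refers to code along these lines.
This is a sketch with hypothetical names (<code>CopyDemo</code> and its two methods), not
the benchmark source.</p>

```java
// Hypothetical example: two ways to copy an int array.
class CopyDemo {
    // Hand-coded loop: copies one element per iteration.
    static int[] copyWithLoop(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; ++i) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Library call: lets the runtime use its own optimized copy routine.
    static int[] copyWithArraycopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }
}
```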




<h2 id="NativeMethods">Use Native Methods Carefully</h2>

<p>Developing your app with native code using the
<a href="{@docRoot}tools/sdk/ndk/index.html">Android NDK</a>
isn't necessarily more efficient than programming with the
Java language. For one thing,
there's a cost associated with the Java-native transition, and the JIT can't
optimize across these boundaries. If you're allocating native resources (memory
on the native heap, file descriptors, or whatever), it can be significantly
more difficult to arrange timely collection of these resources. You also
need to compile your code for each architecture you wish to run on (rather
than rely on it having a JIT). You may even have to compile multiple versions
for what you consider the same architecture: native code compiled for the ARM
processor in the G1 can't take full advantage of the ARM in the Nexus One, and
code compiled for the ARM in the Nexus One won't run on the ARM in the G1.</p>

<p>Native code is primarily useful when you have an existing native codebase
that you want to port to Android, not for "speeding up" parts of your Android app
written with the Java language.</p>

<p>If you do need to use native code, you should read our
<a href="{@docRoot}guide/practices/jni.html">JNI Tips</a>.</p>

<p class="note"><strong>Tip:</strong>
Also see Josh Bloch's <em>Effective Java</em>, item 54.</p>





<h2 id="Myths">Performance Myths</h2>


<p>On devices without a JIT, it is true that invoking methods via a
variable with an exact type rather than an interface is slightly more
efficient. (So, for example, it is cheaper to invoke methods on a
<code>HashMap map</code> than a <code>Map map</code>, even though in both
cases the map is a <code>HashMap</code>.) It is not the case that the interface
call is 2x slower; the actual difference is more like 6%. Furthermore,
the JIT makes the two effectively indistinguishable.</p>

<p>On devices without a JIT, caching field accesses is about 20% faster than
repeatedly accessing the field. With a JIT, field access costs about the same
as local access, so this isn't a worthwhile optimization unless you feel it
makes your code easier to read. (This is true of final, static, and static
final fields too.)</p>



<h2 id="Measure">Always Measure</h2>

<p>Before you start optimizing, make sure you have a problem that you
need to solve. Make sure you can accurately measure your existing performance,
or you won't be able to measure the benefit of the alternatives you try.</p>

<p>Every claim made in this document is backed up by a benchmark. The source
to these benchmarks can be found in the <a
href="http://code.google.com/p/dalvik/source/browse/#svn/trunk/benchmarks">code.google.com
"dalvik" project</a>.</p>

<p>The benchmarks are built with the
<a href="http://code.google.com/p/caliper/">Caliper</a> microbenchmarking
framework for Java. Microbenchmarks are hard to get right, so Caliper goes out
of its way to do the hard work for you, and even detect some cases where you're
not measuring what you think you're measuring (because, say, the VM has
managed to optimize all your code away). We highly recommend you use Caliper
to run your own microbenchmarks.</p>
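<p>For a quick first look before setting up a proper benchmark, a rough timing helper such
as the sketch below can show whether a change makes any difference at all. It is a naive
illustration built on {@link java.lang.System#nanoTime System.nanoTime()}, not a substitute
for Caliper: it does nothing to stop the VM from optimizing the work away, and the
<code>RoughTimer</code> name and its parameters are hypothetical.</p>

```java
// Hypothetical helper: naive wall-clock timing, for rough comparisons only.
class RoughTimer {
    // Runs the task `warmups` times untimed, then `runs` times timed,
    // and returns the average elapsed nanoseconds per timed run.
    static long averageNanos(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / runs;
    }
}
```

<p>Treat the numbers as indicative at best, and compare alternatives on the same device in
the same session.</p>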

<p>You may also find
<a href="{@docRoot}tools/debugging/debugging-tracing.html">Traceview</a> useful
for profiling, but it's important to realize that it currently disables the JIT,
which may cause it to misattribute time to code that the JIT may be able to win
back. It's especially important after making changes suggested by Traceview
data to ensure that the resulting code actually runs faster when run without
Traceview.</p>

<p>For more help profiling and debugging your apps, see the following documents:</p>

<ul>
  <li><a href="{@docRoot}tools/debugging/debugging-tracing.html">Profiling with
    Traceview and dmtracedump</a></li>
  <li><a href="{@docRoot}tools/debugging/systrace.html">Analyzing Display and Performance
    with Systrace</a></li>
</ul>
