# Image classification

<img src="../images/image.png" class="attempt-right">

Use a pre-trained and optimized model to identify hundreds of classes of
objects, including people, activities, animals, plants, and places.

## Get started

If you are unfamiliar with the concept of image classification, you should start
by reading <a href="#what_is_image_classification">What is image
classification?</a>

If you understand image classification but are new to TensorFlow Lite, and you
are working with Android or iOS, we recommend following the corresponding
tutorial, which walks you through our sample code.

<a class="button button-primary" href="android.md">Android</a>
<a class="button button-primary" href="ios.md">iOS</a>

We also provide <a href="#example_applications">example applications</a> you can
use to get started.

If you are using a platform other than Android or iOS, or you are already
familiar with the <a href="https://www.tensorflow.org/api_docs/python/tf/lite">TensorFlow Lite APIs</a>, you can
download our starter image classification model and the accompanying labels.

<a class="button button-primary" href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_1.0_224_quant_and_labels.zip">Download
starter model and labels</a>

Once you have the starter model running on your target device, you can
experiment with different models to find the optimal balance between
performance, accuracy, and model size. For guidance, see
<a href="#choose_a_different_model">Choose a different model</a>.

### Example applications

We have example applications for image classification for both Android and iOS.

<a class="button button-primary" href="https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android">Android
example</a>
<a class="button button-primary" href="https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios">iOS
example</a>

The following screenshot shows the Android image classification example:

<img src="images/android_banana.png" alt="Screenshot of Android example" width="30%">

## What is image classification?

A common use of machine learning is to identify what an image represents. For
example, we might want to know what type of animal appears in the following
photograph.

<img src="images/dog.png" alt="dog" width="50%">

The task of predicting what an image represents is called _image
classification_. An image classification model is trained to recognize various
classes of images. For example, a model might be trained to recognize photos
representing three different types of animals: rabbits, hamsters, and dogs.

When we subsequently provide a new image as input to the model, it will output
the probabilities of the image representing each of the types of animal it was
trained on. An example output might be as follows:

<table style="width: 40%;">
  <thead>
    <tr>
      <th>Animal type</th>
      <th>Probability</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Rabbit</td>
      <td>0.07</td>
    </tr>
    <tr>
      <td>Hamster</td>
      <td>0.02</td>
    </tr>
    <tr>
      <td style="background-color: #fcb66d;">Dog</td>
      <td style="background-color: #fcb66d;">0.91</td>
    </tr>
  </tbody>
</table>

Based on the output, we can see that the classification model has predicted that
the image has a high probability of representing a dog.

Note: Image classification can only tell you the probability that an image
represents one or more of the classes that the model was trained on. It cannot
tell you the position or identity of objects within the image. If you need to
identify objects and their positions within images, you should use an
<a href="../object_detection/overview.md">object detection</a> model.

### Training, labels, and inference

During training, an image classification model is fed images and their
associated _labels_. Each label is the name of a distinct concept, or class,
that the model will learn to recognize.

Given sufficient training data (often hundreds or thousands of images per
label), an image classification model can learn to predict whether new images
belong to any of the classes it has been trained on. This process of prediction
is called _inference_.

To perform inference, an image is passed as input to a model. The model will
then output an array of probabilities between 0 and 1. With our example model,
this process might look like the following:

<table style="width: 60%">
  <tr style="border-top: 0px;">
    <td style="width: 40%"><img src="images/dog.png" alt="dog"></td>
    <td style="width: 20%; font-size: 2em; vertical-align: middle; text-align: center;">→</td>
    <td style="width: 40%; vertical-align: middle; text-align: center;">[0.07, 0.02, 0.91]</td>
  </tr>
</table>

Each number in the output corresponds to a label in our training data.
Associating our output with the three labels the model was trained on, we can
see the model has predicted a high probability that the image represents a dog.

<table style="width: 40%;">
  <thead>
    <tr>
      <th>Label</th>
      <th>Probability</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>rabbit</td>
      <td>0.07</td>
    </tr>
    <tr>
      <td>hamster</td>
      <td>0.02</td>
    </tr>
    <tr>
      <td style="background-color: #fcb66d;">dog</td>
      <td style="background-color: #fcb66d;">0.91</td>
    </tr>
  </tbody>
</table>

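As a rough illustration of associating the output array with the labels, the following sketch pairs each probability with the label at the same index and picks the highest one. The `top_label` helper is hypothetical and not part of any TensorFlow Lite API; the values are the ones from the example above, and the label order is assumed to match the order of the model output.

```python
# Values from the example above; the label order is assumed to match
# the order of the probabilities in the model output.
labels = ["rabbit", "hamster", "dog"]
probabilities = [0.07, 0.02, 0.91]


def top_label(labels, probabilities):
    """Return the (label, probability) pair with the highest probability."""
    if len(labels) != len(probabilities):
        raise ValueError("labels and probabilities must have the same length")
    # Index of the largest probability.
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best_index], probabilities[best_index]


label, probability = top_label(labels, probabilities)
print(f"{label}: {probability:.2f}")  # dog: 0.91
```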
You might notice that the sum of all the probabilities (for rabbit, hamster, and
dog) is equal to 1. This is a common type of output for models with multiple
classes (see
<a href="https://developers.google.com/machine-learning/crash-course/multi-class-neural-networks/softmax">Softmax</a>
for more information).

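To see why such outputs sum to 1, here is a minimal sketch of the softmax function in plain Python. The raw scores below are invented for illustration and do not come from any real model; the point is only that softmax turns arbitrary scores into values between 0 and 1 that add up to 1.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(score - largest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical raw scores for rabbit, hamster, and dog.
raw_scores = [1.0, -0.25, 3.5]
probabilities = softmax(raw_scores)

print([round(p, 2) for p in probabilities])
print(round(sum(probabilities), 6))  # 1.0
```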
### Ambiguous results

Since the probabilities will always sum to 1, if the image is not confidently
recognized as belonging to any of the classes the model was trained on, you may
see the probability distributed throughout the labels without any one value
being significantly larger.

For example, the following might indicate an ambiguous result:

<table style="width: 40%;">
  <thead>
    <tr>
      <th>Label</th>
      <th>Probability</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>rabbit</td>
      <td>0.31</td>
    </tr>
    <tr>
      <td>hamster</td>
      <td>0.35</td>
    </tr>
    <tr>
      <td>dog</td>
      <td>0.34</td>
    </tr>
  </tbody>
</table>

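One possible way to flag an ambiguous result in application code is to require both a minimum top probability and a minimum gap to the runner-up. The sketch below is only an illustration: the `is_ambiguous` helper and its threshold values (0.6 and 0.2) are assumptions chosen for this example, and suitable values depend on your model and application.

```python
def is_ambiguous(probabilities, min_confidence=0.6, min_margin=0.2):
    """Return True when no single label clearly stands out.

    min_confidence and min_margin are illustrative thresholds, not
    values recommended by the model authors.
    """
    ranked = sorted(probabilities, reverse=True)
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    return top < min_confidence or (top - runner_up) < min_margin


print(is_ambiguous([0.07, 0.02, 0.91]))  # False: dog clearly stands out
print(is_ambiguous([0.31, 0.35, 0.34]))  # True: no label stands out
```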
### Uses and limitations

The image classification models that we provide are useful for single-label
classification, which means predicting which single label the image is most
likely to represent. They are trained to recognize 1000 classes of image. For a
full list of classes, see the labels file in the
<a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_1.0_224_quant_and_labels.zip">model
zip</a>.

If you want to train a model to recognize new classes, see
<a href="#customize_model">Customize model</a>.

For the following use cases, you should use a different type of model:

<ul>
  <li>Predicting the type and position of one or more objects within an image (see <a href="../object_detection/overview.md">object detection</a>)</li>
  <li>Predicting the composition of an image, for example subject versus background (see <a href="../segmentation/overview.md">segmentation</a>)</li>
</ul>

## Choose a different model

A large number of image classification models are available on our
<a href="../../guide/hosted_models.md">List of hosted models</a>. You should aim to choose the
optimal model for your application based on performance, accuracy, and model
size. There are trade-offs among these three factors.

### Performance

We measure performance in terms of the amount of time it takes for a model to
run inference on a given piece of hardware. The less time, the faster the model.

The performance you require depends on your application. Performance matters
for applications like real-time video, where each frame may need to be analyzed
in the time before the next frame is drawn (e.g., inference must take less than
33 ms to perform real-time inference on a 30 fps video stream).

Our quantized Mobilenet models’ performance ranges from 3.7 ms to 80.3 ms.

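The frame-budget arithmetic above can be sketched as follows. The helper names are hypothetical, and `placeholder_inference` is only a stand-in for a real model call, so the measured time says nothing about any actual model; the sketch just shows how a measured average latency could be compared against the per-frame budget (1000 ms / 30 fps ≈ 33.3 ms).

```python
import time


def frame_budget_ms(frames_per_second):
    """Time available per frame, in milliseconds."""
    return 1000.0 / frames_per_second


def average_latency_ms(run_inference, runs=20):
    """Average wall-clock time of run_inference() over several runs."""
    start = time.perf_counter()
    for _ in range(runs):
        run_inference()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


def placeholder_inference():
    # Stand-in workload; replace with a real call into your model.
    sum(i * i for i in range(10_000))


budget = frame_budget_ms(30)
latency = average_latency_ms(placeholder_inference)
print(f"budget: {budget:.1f} ms, measured: {latency:.2f} ms, "
      f"fits: {latency < budget}")
```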
### Accuracy

We measure accuracy in terms of how often the model correctly classifies an
image. For example, a model with a stated accuracy of 60% can be expected to
classify an image correctly an average of 60% of the time.

Our <a href="../../guide/hosted_models.md">list of hosted models</a> provides Top-1 and Top-5
accuracy statistics. Top-1 refers to how often the correct label appears as the
label with the highest probability in the model’s output. Top-5 refers to how
often the correct label appears in the top 5 highest probabilities in the
model’s output.

Our quantized Mobilenet models’ Top-5 accuracy ranges from 64.4% to 89.9%.

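As a small sketch of how Top-1 and Top-k accuracy could be computed, the following function counts how often the true label is among the k highest probabilities. The outputs and true label indices are made up for illustration, and the example uses k=2 rather than 5 because the toy model has only three labels.

```python
def top_k_accuracy(outputs, true_indices, k):
    """Fraction of examples whose true label is among the k highest probabilities."""
    if not true_indices or len(outputs) != len(true_indices):
        raise ValueError("outputs and true_indices must be non-empty and equal in length")
    hits = 0
    for probabilities, true_index in zip(outputs, true_indices):
        # Label indices sorted from most to least probable.
        ranked = sorted(range(len(probabilities)),
                        key=lambda i: probabilities[i], reverse=True)
        if true_index in ranked[:k]:
            hits += 1
    return hits / len(true_indices)


# Hypothetical model outputs for four images (rabbit, hamster, dog).
outputs = [
    [0.07, 0.02, 0.91],
    [0.31, 0.35, 0.34],
    [0.80, 0.15, 0.05],
    [0.10, 0.30, 0.60],
]
true_indices = [2, 2, 0, 0]  # hypothetical ground-truth label indices

print(top_k_accuracy(outputs, true_indices, k=1))  # 0.5
print(top_k_accuracy(outputs, true_indices, k=2))  # 0.75
```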
### Size

The size of a model on disk varies with its performance and accuracy. Size may
be important for mobile development (where it might impact app download sizes)
or when working with hardware (where available storage might be limited).

Our quantized Mobilenet models’ size ranges from 0.5 to 3.4 MB.

### Architecture

Several different model architectures are available in our
<a href="../../guide/hosted_models.md">List of hosted models</a>, indicated by the model’s name.
For example, you can choose between Mobilenet, Inception, and others.

The architecture of a model impacts its performance, accuracy, and size. All of
our hosted models are trained on the same data, meaning you can use the provided
statistics to compare them and choose which is optimal for your application.

Note: The image classification models we provide accept varying sizes of input. For some models, this is indicated in the filename. For example, the Mobilenet_V1_1.0_224 model accepts an input of 224x224 pixels. <br /><br />All of the models require three color channels per pixel (red, green, and blue). Quantized models require 1 byte per channel, and float models require 4 bytes per channel.<br /><br />Our <a href="android.md">Android</a> and <a href="ios.md">iOS</a> code samples demonstrate how to process full-sized camera images into the required format for each model.

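The input sizes described in the note can be worked out with simple arithmetic. The sketch below uses a hypothetical `input_size_bytes` helper to compute the input buffer size for a 224x224 model with three color channels, using 1 byte per channel for a quantized model and 4 bytes per channel for a float model.

```python
def input_size_bytes(width, height, channels=3, bytes_per_channel=1):
    """Size in bytes of one input image for the given layout."""
    return width * height * channels * bytes_per_channel


# 224x224 RGB input, as accepted by Mobilenet_V1_1.0_224.
quantized = input_size_bytes(224, 224, bytes_per_channel=1)
floating = input_size_bytes(224, 224, bytes_per_channel=4)

print(quantized)  # 150528
print(floating)   # 602112
```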
## Customize model

The pre-trained models we provide are trained to recognize 1000 classes of
image. For a full list of classes, see the labels file in the
<a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_1.0_224_quant_and_labels.zip">model
zip</a>.

You can use a technique known as _transfer learning_ to re-train a model to
recognize classes not in the original set. For example, you could re-train the
model to distinguish between different species of tree, despite there being no
trees in the original training data. To do this, you will need a set of training
images for each of the new labels you wish to train.

Learn how to perform transfer learning in the
<a href="https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/">TensorFlow
for Poets</a> codelab.
