# Image classification

<img src="../images/image.png" class="attempt-right">

The task of identifying what an image represents is called _image
classification_. An image classification model is trained to recognize various
classes of images. For example, you may train a model to recognize photos
representing three different types of animals: rabbits, hamsters, and dogs.
TensorFlow Lite provides optimized pre-trained models that you can deploy in
your mobile applications. To learn more, see
[Image classification using TensorFlow](https://www.tensorflow.org/tutorials/images/classification).

The following image shows the output of the image classification model on
Android.

<img src="images/android_banana.png" alt="Screenshot of Android example" width="30%">

## Get started

If you are new to TensorFlow Lite and are working with Android or iOS, it is
recommended that you explore the following example applications to help you get
started.

You can use the out-of-the-box API from the
[TensorFlow Lite Task Library](../../inference_with_metadata/task_library/image_classifier)
to integrate image classification models in just a few lines of code. You can
also build your own custom inference pipeline using the
[TensorFlow Lite Support Library](../../inference_with_metadata/lite_support).

The Android example below demonstrates the implementation for both methods as
[lib_task_api](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/lib_task_api)
and
[lib_support](https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/lib_support),
respectively.

<a class="button button-primary" href="https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android">View
Android example</a>

<a class="button button-primary" href="https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios">View
iOS example</a>

If you are using a platform other than Android/iOS, or if you are already
familiar with the
[TensorFlow Lite APIs](https://www.tensorflow.org/api_docs/python/tf/lite),
download the starter model and supporting files (if applicable).

<a class="button button-primary" href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_1.0_224_quant_and_labels.zip">Download
starter model</a>

## Model description

### How it works

During training, an image classification model is fed images and their
associated _labels_. Each label is the name of a distinct concept, or class,
that the model will learn to recognize.

Given sufficient training data (often hundreds or thousands of images per
label), an image classification model can learn to predict whether new images
belong to any of the classes it has been trained on. This process of prediction
is called _inference_. Note that you can also use
[transfer learning](https://www.tensorflow.org/tutorials/images/transfer_learning)
to identify new classes of images by using a pre-existing model. Transfer
learning does not require a very large training dataset.

When you subsequently provide a new image as input to the model, it will output
the probabilities of the image representing each of the types of animal it was
trained on.
An example output might be as follows:

<table style="width: 40%;">
  <thead>
    <tr>
      <th>Animal type</th>
      <th>Probability</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Rabbit</td>
      <td>0.07</td>
    </tr>
    <tr>
      <td>Hamster</td>
      <td>0.02</td>
    </tr>
    <tr>
      <td style="background-color: #fcb66d;">Dog</td>
      <td style="background-color: #fcb66d;">0.91</td>
    </tr>
  </tbody>
</table>

Each number in the output corresponds to a label in the training data.
Associating the output with the three labels the model was trained on, you can
see that the model has predicted a high probability that the image represents a
dog.

You might notice that the sum of all the probabilities (for rabbit, hamster, and
dog) is equal to 1. This is a common type of output for models with multiple
classes (see
<a href="https://developers.google.com/machine-learning/crash-course/multi-class-neural-networks/softmax">Softmax</a>
for more information).

Note: Image classification can only tell you the probability that an image
represents one or more of the classes that the model was trained on. It cannot
tell you the position or identity of objects within the image. If you need to
identify objects and their positions within images, you should use an
<a href="../object_detection/overview.md">object detection</a> model.

<h4>Ambiguous results</h4>

Since the output probabilities will always sum to 1, if an image is not
confidently recognized as belonging to any of the classes the model was trained
on you may see the probability distributed throughout the labels without any one
value being significantly larger.
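As a rough illustration of why the probabilities sum to 1, the following sketch applies a softmax to a set of raw scores and then checks whether the top two probabilities are too close to call. It uses only the Python standard library. The raw scores, the `is_ambiguous` helper, and its `min_margin` threshold are assumptions made for this example; they are not part of any TensorFlow Lite API.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps math.exp from overflowing and does not
    # change the resulting probabilities.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_ambiguous(probabilities, min_margin=0.2):
    """Return True when the two largest probabilities differ by less than
    min_margin. The threshold is a hypothetical value for illustration."""
    ranked = sorted(probabilities, reverse=True)
    return (ranked[0] - ranked[1]) < min_margin


labels = ["rabbit", "hamster", "dog"]
logits = [0.5, -0.8, 3.0]  # made-up raw model scores
probabilities = softmax(logits)

for label, probability in zip(labels, probabilities):
    print(f"{label}: {probability:.2f}")
print("ambiguous:", is_ambiguous(probabilities))
```

A margin check like this is only one possible heuristic; a suitable threshold depends on the model and the application.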

For example, the following might indicate an ambiguous result:

<table style="width: 40%;">
  <thead>
    <tr>
      <th>Label</th>
      <th>Probability</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>rabbit</td>
      <td>0.31</td>
    </tr>
    <tr>
      <td>hamster</td>
      <td>0.35</td>
    </tr>
    <tr>
      <td>dog</td>
      <td>0.34</td>
    </tr>
  </tbody>
</table>

If your model frequently returns ambiguous results, you may need a different,
more accurate model.

<h3>Choosing a model architecture</h3>

TensorFlow Lite provides you with a variety of image classification models which
are all trained on the original dataset. Model architectures like MobileNet,
Inception, and NASNet are available on the
<a href="../../guide/hosted_models.md">hosted models page</a>. To choose the best model for
your use case, you need to consider the individual architectures as well as some
of the tradeoffs between various models. Some of these model tradeoffs are based
on metrics such as performance, accuracy, and model size. For example, you might
need a faster model for building a barcode scanner while you might prefer a
slower, more accurate model for a medical imaging app.

Note that the
<a href="https://www.tensorflow.org/lite/guide/hosted_models#image_classification">image classification models</a>
provided accept varying sizes of input. For some models, this is indicated in
the filename. For example, the Mobilenet_V1_1.0_224 model accepts an input of
224x224 pixels. All of the models require three color channels per pixel (red,
green, and blue). Quantized models require 1 byte per channel, and float models
require 4 bytes per channel.
The <a href="https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android/EXPLORE_THE_CODE.md">Android</a>
and
<a href="https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/ios/EXPLORE_THE_CODE.md">iOS</a>
code samples demonstrate how to process full-sized camera images into the
required format for each model.

<h3>Uses and limitations</h3>

The TensorFlow Lite image classification models are useful for single-label
classification; that is, predicting which single label the image is most likely to
represent. They are trained to recognize 1000 image classes. For a full list of
classes, see the labels file in the
<a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_1.0_224_quant_and_labels.zip">model
zip</a>.

If you want to train a model to recognize new classes, see
<a href="#customize_model">Customize model</a>.

For the following use cases, you should use a different type of model:

<ul>
  <li>Predicting the type and position of one or more objects within an image (see <a href="../object_detection/overview.md">Object detection</a>)</li>
  <li>Predicting the composition of an image, for example subject versus background (see <a href="../segmentation/overview.md">Segmentation</a>)</li>
</ul>

Once you have the starter model running on your target device, you can
experiment with different models to find the optimal balance between
performance, accuracy, and model size.

<h3>Customize model</h3>

The pre-trained models provided are trained to recognize 1000 classes of images.
For a full list of classes, see the labels file in the
<a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_1.0_224_quant_and_labels.zip">model
zip</a>.

You can also use transfer learning to re-train a model to
recognize classes not in the original set.
For example, you could re-train the
model to distinguish between different species of tree, despite there being no
trees in the original training data. To do this, you will need a set of training
images for each of the new labels you wish to train.

Learn how to perform transfer learning in the
<a href="https://codelabs.developers.google.com/codelabs/recognize-flowers-with-tensorflow-on-android/index.html#0">Recognize
flowers with TensorFlow</a> codelab, or with the
<a href="https://www.tensorflow.org/lite/tutorials/model_maker_image_classification">Model Maker library</a>.

<h2>Performance benchmarks</h2>

Model performance is measured in terms of the amount of time it takes for a
model to run inference on a given piece of hardware. The lower the time, the faster
the model.

The performance you require depends on your application. Performance can be
critical for applications like real-time video, where each frame may need to be
analyzed before the next frame is drawn (for example, inference must take less
than 33 ms per frame to keep up with a 30 fps video stream).

Inference times for the TensorFlow Lite quantized MobileNet models range from
3.7 ms to 80.3 ms.

Performance benchmark numbers are generated with the
<a href="https://www.tensorflow.org/lite/performance/benchmarks">benchmarking tool</a>.
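The real-time budget mentioned above (roughly 33 ms per frame at 30 fps) is simple arithmetic, sketched below with only the Python standard library. The function names are hypothetical and chosen for this example. The sketch also assumes that inference is the only per-frame cost; a real pipeline would need to leave room for image preprocessing and rendering.

```python
def frame_budget_ms(frames_per_second):
    """Return the time available to process one frame, in milliseconds."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return 1000.0 / frames_per_second


def fits_real_time(inference_ms, frames_per_second):
    """Return True if a single inference finishes within one frame interval.

    Assumes inference is the only per-frame cost, which is a simplification.
    """
    return inference_ms < frame_budget_ms(frames_per_second)


budget = frame_budget_ms(30)
print(f"Budget at 30 fps: {budget:.1f} ms")

# 3.7 ms and 80.3 ms are the ends of the range quoted for the quantized
# MobileNet models.
for inference_ms in (3.7, 80.3):
    verdict = "fits" if fits_real_time(inference_ms, 30) else "does not fit"
    print(f"{inference_ms} ms {verdict} in a 30 fps frame budget")
```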

<table>
  <thead>
    <tr>
      <th>Model Name</th>
      <th>Model size</th>
      <th>Device</th>
      <th>NNAPI</th>
      <th>CPU</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td rowspan="3">
        <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_1.0_224_quant_and_labels.zip">Mobilenet_V1_1.0_224_quant</a>
      </td>
      <td rowspan="3">
        4.3 MB
      </td>
      <td>Pixel 3 (Android 10)</td>
      <td>6ms</td>
      <td>13ms*</td>
    </tr>
    <tr>
      <td>Pixel 4 (Android 10)</td>
      <td>3.3ms</td>
      <td>5ms*</td>
    </tr>
    <tr>
      <td>iPhone XS (iOS 12.4.1)</td>
      <td></td>
      <td>11ms**</td>
    </tr>
  </tbody>
</table>

\* 4 threads used.

\*\* 2 threads used on iPhone for the best performance result.

### Model accuracy

Accuracy is measured in terms of how often the model correctly classifies an
image. For example, a model with a stated accuracy of 60% can be expected to
classify an image correctly an average of 60% of the time.

The [list of hosted models](../../guide/hosted_models.md) provides Top-1 and
Top-5 accuracy statistics. Top-1 refers to how often the correct label appears
as the label with the highest probability in the model’s output. Top-5 refers to
how often the correct label appears in the 5 highest probabilities in the
model’s output.

The Top-5 accuracy of the TensorFlow Lite quantized MobileNet models ranges from
64.4% to 89.9%.

### Model size

The size of a model on-disk varies with its performance and accuracy. Size may
be important for mobile development (where it might impact app download sizes)
or when working with hardware (where available storage might be limited).

The TensorFlow Lite quantized MobileNet models' sizes range from 0.5 to 3.4 MB.
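The Top-1 and Top-5 metrics described under Model accuracy can be sketched in a few lines. The example below is a minimal illustration using only the Python standard library. The `top_k_accuracy` function and the tiny three-class dataset are invented for this example and do not come from the hosted models or any TensorFlow Lite API; with only three classes it evaluates Top-1 and Top-2 rather than Top-5.

```python
def top_k_accuracy(predictions, true_labels, k):
    """Return the fraction of examples whose true label is among the k
    highest-probability classes.

    predictions: one list of per-class probabilities per example.
    true_labels: the index of the correct class for each example.
    """
    if not predictions or len(predictions) != len(true_labels):
        raise ValueError("predictions and true_labels must be non-empty "
                         "and have the same length")
    hits = 0
    for probabilities, truth in zip(predictions, true_labels):
        # Class indices sorted from most to least probable.
        ranked = sorted(range(len(probabilities)),
                        key=lambda i: probabilities[i],
                        reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Made-up outputs for three images; class order is rabbit, hamster, dog.
predictions = [
    [0.07, 0.02, 0.91],
    [0.31, 0.35, 0.34],
    [0.60, 0.30, 0.10],
]
true_labels = [2, 2, 2]  # every image actually shows a dog

print("Top-1 accuracy:", top_k_accuracy(predictions, true_labels, 1))
print("Top-2 accuracy:", top_k_accuracy(predictions, true_labels, 2))
```

Increasing `k` can never lower the score, which is why Top-5 accuracy is always at least as high as Top-1 accuracy for the same model.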

## Further reading and resources

Use the following resources to learn more about concepts related to image
classification:

* [Image classification using TensorFlow](https://www.tensorflow.org/tutorials/images/classification)
* [Image classification with CNNs](https://www.tensorflow.org/tutorials/images/cnn)
* [Transfer learning](https://www.tensorflow.org/tutorials/images/transfer_learning)
* [Data augmentation](https://www.tensorflow.org/tutorials/images/data_augmentation)