Copyright 2017 The Android Open Source Project

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
------------------------------------------------------------------

This directory contains files for the Android MLTS (Machine Learning
Test Suite). MLTS lets you evaluate NNAPI acceleration latency and
accuracy on an Android device, using a few selected ML models and datasets.

The description and licensing of the models and datasets used can be
found in the platform/test/mlts/models/README.txt file.

Usage:
* Connect a target device to your workstation and make sure it is
  reachable through adb. If more than one device is connected, export the
  ANDROID_SERIAL environment variable for the target device.
* cd into the Android top-level source directory
> source build/
> lunch aosp_arm-userdebug # Or aosp_arm64-userdebug if available.
> ./test/mlts/benchmark/
* At the end of a benchmark run, the results are presented as an HTML
  page, which is passed to xdg-open.

# Crash test

The MLTS suite contains a series of tests to validate the behaviour of the
drivers under stress or in corner-case conditions.

To run the tests, use the specific targets available in the script.
By default, every test is run on each available accelerator in isolation.
It is possible to filter the accelerators to test against by invoking the
script with the option -f (--filter-driver) and specifying a regular
expression to filter the accelerator names with.
It is also possible to run additional tests without a specified target
accelerator, letting NNAPI partition the model and assign the best available
accelerator(s), by using the -r (--include-nnapi-reference) option.

Currently available tests are:

* parallel-inference-stress: tests the behaviour of drivers with different
  numbers of inferences executed in parallel. The tests run in a separate
  process so that crashes can be detected and reported as test failures.

* parallel-inference-stress-in-process: same as parallel-inference-stress,
  but the tests run in the same process as the test, so if a crash occurs
  the testing app crashes too.

* client-early-termination-stress: tests the resilience of device drivers to
  failing clients. It spawns a separate process that runs a set of parallel
  threads compiling different models. The process is then forcibly
  terminated. The test validates that the targeted driver does not crash
  or hang.

* multi-process-inference-stress: this extends the `parallel-inference-stress`
  running inference on a single model in multiple processes and threads with
  different probabilities of early client process termination.

* multi-process-model-load-stress: this extends the `parallel-inference-stress`
  running inference on a single model in multiple processes and threads with
  different probabilities of early client process termination.

* memory-mapped-model-load-stress: runs a series of parallel model
  compilations with memory-mapped TFLite models.

* model-load-random-stress: tests compiling a large set of randomly
  generated models.

* inference-random-stress: tests running a large set of randomly generated
  models.

* performance-degradation-stress: verifies that accelerator inference speed
  does not degrade beyond a certain threshold when running a concurrent
  workload.
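
The effect of a -f (--filter-driver) expression can be previewed before
launching a run. The sketch below is not part of MLTS: `preview_filter` is a
hypothetical helper, the accelerator names are made-up placeholders, and it
assumes the regular expression may match anywhere in the name (as `grep -E`
does), which this README does not specify.

```shell
#!/bin/sh
# Hypothetical helper, not part of MLTS: reads accelerator names from stdin
# (one per line) and prints those matching the extended regular expression
# given as the first argument.
# Assumption: the pattern may match anywhere in the name, as grep -E does.
preview_filter() {
    # grep exits with status 1 when nothing matches; treat that as an
    # empty result rather than an error.
    grep -E -- "$1" || true
}

# Placeholder accelerator names, used only for illustration.
printf '%s\n' vendor-gpu vendor-dsp sample-cpu | preview_filter '^vendor-'
```

With these placeholder names, the pattern `^vendor-` keeps `vendor-gpu` and
`vendor-dsp` and drops `sample-cpu`. Anchoring the pattern with `^` and `$`
gives the same result under either whole-name or substring matching, which
sidesteps the assumption above.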