# Output pipelines in gemmlowp

In gemmlowp, the "output pipeline" is the process that takes a final `int32`
accumulator value (the output of the compute/kernel stage), and processes it to
obtain the final value (typically a `uint8` value) and write it to the
destination matrix.

Gemmlowp allows some flexibility in which arithmetic transformations take place
in the output pipeline, so that different users can implement different
quantization paradigms. See [low-precision.md](low-precision.md) and
[quantization.md](quantization.md).

Besides implementing a quantization paradigm, output pipelines are also useful
for implementing fused operations, where a matrix multiplication feeds into
other operations applied to its result without additional array traversals. For
instance, when implementing neural network inference, one might have a
convolutional layer with a bias-addition and an activation function. One then
wants to feed the result of the matrix multiplication that implements the
convolution itself directly into the bias-addition and activation function.
gemmlowp's output pipelines allow exactly that: the bias-addition and
activation function are just additional stages in the output pipeline.

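To make the idea of fusion concrete, here is a minimal standalone sketch of what applying a bias-addition and an activation in a single traversal might look like. It does not use gemmlowp: the function `FusedBiasAndClamp`, the row-major layout, the per-row bias, and the choice of a clamp as the activation are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only; NOT part of gemmlowp.
// Applies a bias-addition followed by a clamp activation to every int32
// accumulator of a row-major `rows` x `cols` result, in one traversal.
std::vector<std::int32_t> FusedBiasAndClamp(
    const std::vector<std::int32_t>& accumulators, std::size_t rows,
    std::size_t cols,
    const std::vector<std::int32_t>& bias,  // assumed: one value per row
    std::int32_t clamp_min, std::int32_t clamp_max) {
  assert(accumulators.size() == rows * cols);
  assert(bias.size() == rows);
  std::vector<std::int32_t> result(accumulators.size());
  for (std::size_t r = 0; r < rows; ++r) {
    for (std::size_t c = 0; c < cols; ++c) {
      const std::size_t i = r * cols + c;
      // Stage 1: bias-addition.
      const std::int32_t biased = accumulators[i] + bias[r];
      // Stage 2: activation, here a clamp to [clamp_min, clamp_max].
      result[i] = std::min(std::max(biased, clamp_min), clamp_max);
    }
  }
  return result;
}
```

The point of the sketch is that both stages run on each value while it is at hand, so the result array is never traversed a second time for the bias or the activation.
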
## Usage

The gemmlowp entry point that accepts an arbitrary output pipeline is
`GemmWithOutputPipeline` in [public/gemmlowp.h](../public/gemmlowp.h).

The output pipeline is specified as a `std::tuple` of "output stages", each of
which defines an elementary arithmetic transformation.

All available output stages are defined in
[public/output_stages.h](../public/output_stages.h).

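The following standalone sketch illustrates how a `std::tuple` of stages can be applied in order to one accumulator value. It is not gemmlowp code: the stage types `AddOffset` and `RoundingRightShift`, the `Eval` method, and the helpers `RunFrom` and `RunPipeline` are hypothetical names chosen for this example, and the final saturating cast to `uint8` is an assumption about what a typical pipeline ends with.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <type_traits>

// Hypothetical stages for illustration; NOT the types defined in
// public/output_stages.h. Each maps an int32 value to an int32 value.
struct AddOffset {
  std::int32_t offset;
  std::int32_t Eval(std::int32_t x) const { return x + offset; }
};

struct RoundingRightShift {
  int shift;  // assumed to be in [1, 31]
  std::int32_t Eval(std::int32_t x) const {
    assert(shift >= 1 && shift <= 31);
    // Overflow is not handled. For negative values this relies on an
    // arithmetic right shift, which mainstream compilers provide.
    return (x + (1 << (shift - 1))) >> shift;
  }
};

// End of the recursion: every stage has been applied.
template <std::size_t I, typename... Stages>
typename std::enable_if<I == sizeof...(Stages), std::int32_t>::type RunFrom(
    const std::tuple<Stages...>&, std::int32_t x) {
  return x;
}

// Applies stage I, then passes the result on to the remaining stages.
template <std::size_t I, typename... Stages>
typename std::enable_if<(I < sizeof...(Stages)), std::int32_t>::type RunFrom(
    const std::tuple<Stages...>& pipeline, std::int32_t x) {
  return RunFrom<I + 1>(pipeline, std::get<I>(pipeline).Eval(x));
}

// Runs all stages in tuple order, then saturates the result to uint8.
template <typename... Stages>
std::uint8_t RunPipeline(const std::tuple<Stages...>& pipeline,
                         std::int32_t accumulator) {
  const std::int32_t lo = 0;
  const std::int32_t hi = 255;
  const std::int32_t v = RunFrom<0>(pipeline, accumulator);
  return static_cast<std::uint8_t>(std::min(std::max(v, lo), hi));
}
```

With this sketch, `RunPipeline(std::make_tuple(AddOffset{10}, RoundingRightShift{4}), acc)` adds an offset, scales down with rounding, and saturates. In gemmlowp itself, the tuple is instead built from the stages in `public/output_stages.h` and passed to `GemmWithOutputPipeline`.
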
## Example usage

The best place to see examples of using various output pipelines is the unit
test,

```
test/test.cc
```

specifically in this function:

```
TestOutputStages
```

Separately, a self-contained example showing how to use gemmlowp to compute a
quantized matrix multiplication with a sound quantization paradigm is here:

[doc/quantization_example.cc](quantization_example.cc)