ARM Neon reference tests
========================
This package contains extensive tests for the ARM/Neon instructions.

It works by building a program which uses all of them, and then
executing it on an actual target or a simulator.

It can be used to validate a simulator against an actual HW target,
or to validate C compilers in the presence of Neon intrinsics calls.

The supplied Makefile can build with both the ARM RVCT compiler and
GNU GCC (for the ARM target), and supports execution with ARM RVDEBUG
on an ARM simulator and with QEMU.

For convenience, the ARM ELF binary file (as compiled with RVCT) is
supplied (compute_ref.axf), as well as the expected output
(ref-rvct.txt).

A second file containing expected output is also supplied:
ref-rvct-neon.txt, which contains only the results of the Neon
intrinsics tests. It is intended for checking GCC's results, since
GCC does not support the integer and DSP builtins whose results are
also present in ref-rvct.txt.

Typical usage when debugging QEMU:
$ make all # to build the test program with ARM RVCT and execute it with QEMU
$ make check # to compare the results with the expected output


Known issues:
-------------
Some tests currently fail to build with GCC/ARM:
- missing include files: dspfns.h, armdsp.h

As GCC/ARM provides no support for the
Neon_Cumulative_Saturation/FPSCR register, auxiliary accessor
functions have been implemented in stm-arm-neon-ref.h.
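
As a rough illustration of what such accessors do, the sketch below
models the flag in portable C. It relies on the QC (cumulative
saturation) flag being bit 27 of the FPSCR. The names fpscr_image,
get_cumulative_sat, set_cumulative_sat and model_qadd_s8 are
hypothetical and are not the ones defined in stm-arm-neon-ref.h; a
software copy of the register stands in for the real one so that the
example also builds on a non-ARM host.

```c
#include <stdint.h>

/* Bit 27 of the ARM FPSCR is the QC (cumulative saturation) flag. */
#define FPSCR_QC_MASK (UINT32_C(1) << 27)

/* Hypothetical software copy of the FPSCR. On real hardware the
   register would be read and written with VMRS/VMSR instead. */
static uint32_t fpscr_image;

/* Hypothetical accessor: returns 1 if the sticky flag is set. */
static int get_cumulative_sat(void)
{
    return (fpscr_image & FPSCR_QC_MASK) != 0;
}

/* Hypothetical accessor: sets or clears the sticky flag. */
static void set_cumulative_sat(int value)
{
    if (value)
        fpscr_image |= FPSCR_QC_MASK;
    else
        fpscr_image &= ~FPSCR_QC_MASK;
}

/* Scalar model of one lane of a saturating signed 8-bit add:
   clamps the result and raises the flag when it saturates. */
static int8_t model_qadd_s8(int8_t a, int8_t b)
{
    int sum = a + b;

    if (sum > INT8_MAX) {
        set_cumulative_sat(1);
        return INT8_MAX;
    }
    if (sum < INT8_MIN) {
        set_cumulative_sat(1);
        return INT8_MIN;
    }
    return (int8_t)sum;
}
```

A test would typically clear the flag with set_cumulative_sat(0), run
the saturating operation, and then print get_cumulative_sat() next to
the result. The flag is sticky: once set, it stays set until it is
cleared explicitly.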

Engineering:
------------
In order to cover all the Neon instructions extensively, these tests
make intensive use of the C preprocessor, to reduce maintenance effort.

Most tests (the more regular ones) share a common basic structure. In
general, variable names are suffixed by their type name, so as to
differentiate variables with the same purpose but of different types.
Hence vector1_int8x8, vector1_int16x4, etc.
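
The naming scheme above lends itself to token pasting. The following
is a minimal sketch of that idea, not the macros used by the tests
themselves: VAR_NAME, ELEM_TYPE, DECLARE_VAR and FOR_EACH_D_VARIANT
are hypothetical names, and plain arrays stand in for the Neon vector
types so that the example builds on any host.

```c
#include <stdint.h>

/* Hypothetical helper: paste a base name and a type description into
   one identifier, e.g. VAR_NAME(vector1, int, 8, 8) -> vector1_int8x8. */
#define VAR_NAME(base, T, W, N) base##_##T##W##x##N

/* Hypothetical helper: build the element type, e.g. int8_t. */
#define ELEM_TYPE(T, W) T##W##_t

/* Declare one array-backed "vector" of N elements of type T<W>_t. */
#define DECLARE_VAR(base, T, W, N) \
    static ELEM_TYPE(T, W) VAR_NAME(base, T, W, N)[N]

/* Apply a macro to a few 64-bit-wide vector shapes. */
#define FOR_EACH_D_VARIANT(MACRO, base) \
    MACRO(base, int, 8, 8);             \
    MACRO(base, int, 16, 4);            \
    MACRO(base, uint, 8, 8);            \
    MACRO(base, uint, 16, 4)

/* Declares vector1_int8x8, vector1_int16x4, vector1_uint8x8 and
   vector1_uint16x4 in a single line. */
FOR_EACH_D_VARIANT(DECLARE_VAR, vector1);
```

Adding a new type then means adding one line to the list macro rather
than editing every test by hand.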

For instance, in ref_vmul.c the layout of the code is as follows:

- declare input and output vectors (named 'vector1', 'vector2' and
  'vector_res') of each possible type (s/u, 8/16/32/64 bits).

- clean the result buffers.

- initialize input vectors 'vector1' and 'vector2'.

- call each variant of the intrinsic and store the result in a buffer
  named 'buffer', whose contents are printed after execution.

One can then compare the actual result with the expected one.
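
The layout described above can be sketched as follows for a single
variant (signed 16-bit, 4 lanes). This is only an illustration under
stated assumptions: model_vmul_s16 is a hypothetical scalar stand-in
for the intrinsic so that the code builds without Neon support, and
exec_vmul_sketch, the input values and the output format are made up
for the example rather than taken from ref_vmul.c.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LANES 4

/* Hypothetical stand-in for one variant of the intrinsic: multiplies
   lane by lane, using a plain loop instead of a Neon instruction. */
static void model_vmul_s16(int16_t *res, const int16_t *a, const int16_t *b)
{
    for (int i = 0; i < LANES; i++)
        res[i] = (int16_t)(a[i] * b[i]);
}

/* Result buffer whose contents are printed after execution. */
static int16_t buffer_int16x4[LANES];

static void exec_vmul_sketch(FILE *out)
{
    /* Declare input and output vectors. */
    int16_t vector1_int16x4[LANES];
    int16_t vector2_int16x4[LANES];
    int16_t vector_res_int16x4[LANES];

    /* Clean the result buffer. */
    memset(buffer_int16x4, 0, sizeof buffer_int16x4);

    /* Initialize the input vectors (arbitrary values). */
    for (int i = 0; i < LANES; i++) {
        vector1_int16x4[i] = (int16_t)(i - 2);
        vector2_int16x4[i] = 17;
    }

    /* Call the operation and store the result in the buffer. */
    model_vmul_s16(vector_res_int16x4, vector1_int16x4, vector2_int16x4);
    memcpy(buffer_int16x4, vector_res_int16x4, sizeof buffer_int16x4);

    /* Print the buffer so it can be compared with a reference file. */
    fprintf(out, "vmul int16x4:");
    for (int i = 0; i < LANES; i++)
        fprintf(out, " %d", buffer_int16x4[i]);
    fprintf(out, "\n");
}
```

Calling exec_vmul_sketch(stdout) from main prints one line per
variant; the same steps would be repeated for every other type.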