//===-- README.txt - Notes for Blackfin Target ------------------*- org -*-===//

* Condition codes
** DONE Problem with asymmetric SETCC operations
The instruction

  CC = R0 < 2

is not symmetric: there is no R0 > 2 instruction. On the other hand, IF CC
JUMP can take both CC and !CC as a condition. We cannot pattern-match (brcond
(not cc), target) because the DAG optimizer removes that kind of thing.

This is handled by creating a pseudo-register NCC that aliases CC. Register
classes JustCC and NotCC are used to control the inversion of CC.
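
The idea of expressing a missing comparison as a native comparison plus an
inverted branch can be modelled in a few lines of Python. This is only a
sketch and not the backend's code: the set of native comparisons and the names
lower_setcc and branch_taken are assumptions made for illustration.

```python
# Assumption for illustration: only these signed comparisons exist natively.
NATIVE = {"eq": "==", "lt": "<", "le": "<="}
# Each missing comparison is the logical negation of a native one.
NEGATION_OF = {"ne": "eq", "ge": "lt", "gt": "le"}


def lower_setcc(cond):
    """Return (native operator, invert) for a comparison code such as "gt"."""
    if cond in NATIVE:
        return NATIVE[cond], False
    return NATIVE[NEGATION_OF[cond]], True


def branch_taken(cond, lhs, rhs):
    """Evaluate the lowered comparison, branching on CC or on !CC."""
    op, invert = lower_setcc(cond)
    cc = {"==": lhs == rhs, "<": lhs < rhs, "<=": lhs <= rhs}[op]
    return (not cc) if invert else cc
```

In this model the invert flag plays the role of picking NotCC instead of
JustCC: R0 > 2 becomes CC = R0 <= 2 followed by a branch on !CC.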

** DONE CC as an i32 register
The AnyCC register class pretends to hold i32 values. It can only represent the
values 0 and 1, but we can copy to and from the D class. This hack makes it
possible to represent the setcc instruction without having i1 as a legal type.

In most cases, the CC register is set by a "CC = .." or BITTST instruction, and
then used in a conditional branch or move. The code generator thinks it is
moving 32 bits, but the value stays in CC. In other cases, the result of a
comparison is actually used as an i32 number, and CC will be copied to a D
register.

* Stack frames
** TODO Use Push/Pop instructions
We should use the push/pop instructions when saving callee-saved
registers. They are smaller, and we may even be able to use the push-multiple
instructions.

** TODO requiresRegisterScavenging
We need more intelligence in determining when the scavenger is needed. We
should keep track of:
- Spilling D16 registers
- Spilling AnyCC registers

* Assembler
** TODO Implement PrintGlobalVariable
** TODO Remove LOAD32sym
It's a hack combining two instructions by concatenation.

* Inline Assembly

These are the GCC constraints from bfin/constraints.md:

| Code  | Register class                            | LLVM |
|-------+-------------------------------------------+------|
| a     | P                                         | C    |
| d     | D                                         | C    |
| z     | Call clobbered P (P0, P1, P2)             | X    |
| D     | EvenD                                     | X    |
| W     | OddD                                      | X    |
| e     | Accu                                      | C    |
| A     | A0                                        | S    |
| B     | A1                                        | S    |
| b     | I                                         | C    |
| v     | B                                         | C    |
| f     | M                                         | C    |
| c     | Circular I, B, L                          | X    |
| C     | JustCC                                    | S    |
| t     | LoopTop                                   | X    |
| u     | LoopBottom                                | X    |
| k     | LoopCount                                 | X    |
| x     | GR                                        | C    |
| y     | RET*, ASTAT, SEQSTAT, USP                 | X    |
| w     | ALL                                       | C    |
| Z     | The FD-PIC GOT pointer (P3)               | S    |
| Y     | The FD-PIC function pointer register (P1) | S    |
| q0-q7 | R0-R7 individually                        |      |
| qA    | P0                                        |      |
|-------+-------------------------------------------+------|
| Code  | Constant                                  |      |
|-------+-------------------------------------------+------|
| J     | 1<<N, N<32                                |      |
| Ks3   | imm3                                      |      |
| Ku3   | uimm3                                     |      |
| Ks4   | imm4                                      |      |
| Ku4   | uimm4                                     |      |
| Ks5   | imm5                                      |      |
| Ku5   | uimm5                                     |      |
| Ks7   | imm7                                      |      |
| KN7   | -imm7                                     |      |
| Ksh   | imm16                                     |      |
| Kuh   | uimm16                                    |      |
| L     | ~(1<<N)                                   |      |
| M1    | 0xff                                      |      |
| M2    | 0xffff                                    |      |
| P0-P4 | 0-4                                       |      |
| PA    | Macflag, not M                            |      |
| PB    | Macflag, only M                           |      |
| Q     | Symbol                                    |      |
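
As a reading aid, the numeric constant constraints above can be written as
range checks. This is a sketch under stated assumptions, not GCC's or LLVM's
code: immN is read as a signed N-bit two's-complement value, uimmN as an
unsigned N-bit value, KN7 as "the negated value fits imm7", and L is evaluated
on a 32-bit word. The function names are made up, and the P0-P4, PA, PB and Q
rows are left out.

```python
SIGNED_BITS = {"Ks3": 3, "Ks4": 4, "Ks5": 5, "Ks7": 7, "Ksh": 16}
UNSIGNED_BITS = {"Ku3": 3, "Ku4": 4, "Ku5": 5, "Kuh": 16}


def fits_signed(value, bits):
    """True if value fits a two's-complement immediate of the given width."""
    half = 1 << (bits - 1)
    return -half <= value < half


def fits_unsigned(value, bits):
    """True if value fits an unsigned immediate of the given width."""
    return 0 <= value < (1 << bits)


def is_single_bit(value):
    """True for 1<<N with N < 32."""
    return 0 < value < (1 << 32) and (value & (value - 1)) == 0


def matches_constant(code, value):
    """Check an integer against one of the constant constraint codes."""
    if code in SIGNED_BITS:
        return fits_signed(value, SIGNED_BITS[code])
    if code in UNSIGNED_BITS:
        return fits_unsigned(value, UNSIGNED_BITS[code])
    if code == "KN7":
        return fits_signed(-value, 7)
    if code == "J":
        return is_single_bit(value)
    if code == "L":
        # ~(1<<N): the complement, taken on a 32-bit word, has one bit set.
        return is_single_bit(~value & 0xFFFFFFFF)
    if code == "M1":
        return value == 0xFF
    if code == "M2":
        return value == 0xFFFF
    raise ValueError("constraint not modelled in this sketch: " + code)
```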

** TODO Support all register classes
* DAG combiner
** Create test case for each Illegal SETCC case
The DAG combiner may sometimes produce illegal i16 SETCC instructions.

*** TODO SETCC (ctlz x), 5) == const
*** TODO SETCC (and load, const) == const
*** DONE SETCC (zext x) == const
*** TODO SETCC (sext x) == const

* Instruction selection
** TODO Better immediate constants
Like ARM, build constants as small imm + shift.
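
A rough illustration of the "small imm + shift" idea follows. It assumes the
small immediate is a signed 7-bit value (the imm7 of the constraint table) and
that a left shift follows; the function name and defaults are hypothetical,
and nothing here reflects an existing implementation.

```python
def split_imm_shift(value, imm_bits=7, word_bits=32):
    """Return (imm, shift) with imm << shift == value, or None if impossible.

    imm must fit a signed immediate of imm_bits bits.
    """
    lo = -(1 << (imm_bits - 1))
    hi = (1 << (imm_bits - 1)) - 1
    for shift in range(word_bits):
        if value % (1 << shift) != 0:
            break  # low bits are set, so a larger shift cannot work either
        imm = value >> shift  # exact, because value is divisible by 1 << shift
        if lo <= imm <= hi:
            return imm, shift
    return None
```

For example, 0x1000 splits into (32, 7) while 0x1001 cannot be split and would
still need the two-instruction load.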

** TODO Implement cycle counter
We have CYCLES and CYCLES2 registers, but the readcyclecounter intrinsic wants
to return i64, and the code generator doesn't know how to legalize that.
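
What the legalized result would have to compute can be sketched as follows.
The sketch assumes CYCLES holds the low 32 bits and CYCLES2 the high 32 bits.
The re-read loop is a generic guard against a carry between the two reads;
these notes do not say whether the hardware needs it. The function names are
hypothetical.

```python
MASK32 = 0xFFFFFFFF


def combine_cycles(cycles, cycles2):
    """Build the 64-bit count from the low (cycles) and high (cycles2) halves."""
    return ((cycles2 & MASK32) << 32) | (cycles & MASK32)


def read_cycles64(read_lo, read_hi):
    """Read both halves, retrying if the high half changed in between."""
    while True:
        hi = read_hi()
        lo = read_lo()
        if read_hi() == hi:
            return combine_cycles(lo, hi)
```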

** TODO Instruction alternatives
Some instructions come in different variants, for example:

  D = D + D
  P = P + P

Cross combinations are not allowed:

  P = D + D (bad)

Similarly for the subreg pseudo-instructions:

  D16L = EXTRACT_SUBREG D16, bfin_subreg_lo16
  P16L = EXTRACT_SUBREG P16, bfin_subreg_lo16

We want to take advantage of the alternative instructions. This could be done by
changing the DAG after instruction selection.
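
One possible shape for such a post-selection choice is sketched below: pick
the variant whose operand classes need the fewest register copies. The cost
model (count of mismatched operands), the variant names add_dd and add_pp, and
the function name are all invented for illustration.

```python
# (operation, result class, lhs class, rhs class) -> hypothetical variant name
VARIANTS = {
    ("add", "D", "D", "D"): "add_dd",  # D = D + D
    ("add", "P", "P", "P"): "add_pp",  # P = P + P
}


def cheapest_variant(op, classes):
    """Return (variant name, copies needed) for the given operand classes.

    classes is (result, lhs, rhs). Returns None if op has no variants.
    On a tie the variant listed first wins.
    """
    best = None
    for (vop, *vclasses), name in VARIANTS.items():
        if vop != op:
            continue
        copies = sum(have != want for have, want in zip(classes, vclasses))
        if best is None or copies < best[1]:
            best = (name, copies)
    return best
```

With this model the illegal P = D + D maps to add_dd plus one copy of the
result, rather than add_pp plus two copies of the inputs.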


** Multipatterns for load/store
We should try to identify multipatterns for load and store instructions. The
available instruction matrix is a bit irregular.

Loads:

| Addr       | D | P | D 16z | D 16s | D16 | D 8z | D 8s |
|------------+---+---+-------+-------+-----+------+------|
| P          | * | * | *     | *     | *   | *    | *    |
| P++        | * | * | *     | *     |     | *    | *    |
| P--        | * | * | *     | *     |     | *    | *    |
| P+uimm5m2  |   |   | *     | *     |     |      |      |
| P+uimm6m4  | * | * |       |       |     |      |      |
| P+imm16    |   |   |       |       |     | *    | *    |
| P+imm17m2  |   |   | *     | *     |     |      |      |
| P+imm18m4  | * | * |       |       |     |      |      |
| P++P       | * |   | *     | *     | *   |      |      |
| FP-uimm7m4 | * | * |       |       |     |      |      |
| I          | * |   |       |       | *   |      |      |
| I++        | * |   |       |       | *   |      |      |
| I--        | * |   |       |       | *   |      |      |
| I++M       | * |   |       |       |     |      |      |

Stores:

| Addr       | D | P | D16H | D16L | D 8 |
|------------+---+---+------+------+-----|
| P          | * | * | *    | *    | *   |
| P++        | * | * |      | *    | *   |
| P--        | * | * |      | *    | *   |
| P+uimm5m2  |   |   |      | *    |     |
| P+uimm6m4  | * | * |      |      |     |
| P+imm16    |   |   |      |      | *   |
| P+imm17m2  |   |   |      | *    |     |
| P+imm18m4  | * | * |      |      |     |
| P++P       | * |   | *    | *    |     |
| FP-uimm7m4 | * | * |      |      |     |
| I          | * |   | *    | *    |     |
| I++        | * |   | *    | *    |     |
| I--        | * |   | *    | *    |     |
| I++M       | * |   |      |      |     |
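
The P+offset rows of the load matrix can be turned into a small lookup to see
how irregular the choice is. This sketch rests on an assumed reading of the
operand names: uimmNmK is an unsigned value that fits in N bits and is a
multiple of K, and immNmK is the signed equivalent. The helper names are
hypothetical, and only the five P+offset load rows are modelled.

```python
def unsigned_multiple(bits, step):
    """Predicate: 0 <= v < 2**bits and v is a multiple of step (assumed)."""
    return lambda v: 0 <= v < (1 << bits) and v % step == 0


def signed_multiple(bits, step):
    """Predicate: v fits a signed bits-wide field and is a multiple of step."""
    half = 1 << (bits - 1)
    return lambda v: -half <= v < half and v % step == 0


# (addressing mode, offset predicate, load kinds marked '*' in the table),
# listed in table order so the short forms are tried first.
LOAD_OFFSET_MODES = [
    ("P+uimm5m2", unsigned_multiple(5, 2), {"D 16z", "D 16s"}),
    ("P+uimm6m4", unsigned_multiple(6, 4), {"D", "P"}),
    ("P+imm16", signed_multiple(16, 1), {"D 8z", "D 8s"}),
    ("P+imm17m2", signed_multiple(17, 2), {"D 16z", "D 16s"}),
    ("P+imm18m4", signed_multiple(18, 4), {"D", "P"}),
]


def pick_load_mode(kind, offset):
    """Return the first P+offset mode usable for this load kind, or None."""
    for name, fits, kinds in LOAD_OFFSET_MODES:
        if kind in kinds and fits(offset):
            return name
    return None
```

A result of None means the address has to be computed separately, as for any
offset load in the D16 column.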

* Workarounds and features
Blackfin CPUs have bugs. Each model comes in a number of silicon revisions with
different bugs. We learn about the CPU model from the -mcpu switch.

** Interpretation of -mcpu value
- -mcpu=bf527 refers to the latest known BF527 revision
- -mcpu=bf527-0.2 refers to silicon rev. 0.2
- -mcpu=bf527-any refers to all known revisions
- -mcpu=bf527-none disables all workarounds

The -mcpu setting affects the __SILICON_REVISION__ macro and enabled workarounds:

| -mcpu      | __SILICON_REVISION__ | Workarounds        |
|------------+----------------------+--------------------|
| bf527      | Def Latest           | Specific to latest |
| bf527-1.3  | Def 0x0103           | Specific to 1.3    |
| bf527-any  | Def 0xffff           | All bf527-x.y      |
| bf527-none | Undefined            | None               |
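
The mapping in the table can be sketched as a small parser. The encoding of
x.y as (x << 8) | y is inferred from the single 1.3 -> 0x0103 row, so treat it
as an assumption. The function names are hypothetical, and the latest revision
is passed in by the caller rather than looked up.

```python
def encode_revision(rev):
    """Encode "x.y" as (x << 8) | y, e.g. "1.3" -> 0x0103 (inferred)."""
    major, minor = rev.split(".")
    return (int(major) << 8) | int(minor)


def silicon_revision(mcpu, latest):
    """Return the __SILICON_REVISION__ value, or None if it stays undefined.

    mcpu is the -mcpu value such as "bf527-0.2"; latest is the newest known
    revision of that model as an "x.y" string.
    """
    model, sep, rev = mcpu.partition("-")
    if not sep:
        return encode_revision(latest)  # bare model name: latest revision
    if rev == "any":
        return 0xFFFF
    if rev == "none":
        return None
    return encode_revision(rev)
```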

These are the known cores and revisions:

| Core        | Silicon            | Processors              |
|-------------+--------------------+-------------------------|
| Edinburgh   | 0.3, 0.4, 0.5, 0.6 | BF531 BF532 BF533       |
| Braemar     | 0.2, 0.3           | BF534 BF536 BF537       |
| Stirling    | 0.3, 0.4, 0.5      | BF538 BF539             |
| Moab        | 0.0, 0.1, 0.2      | BF542 BF544 BF548 BF549 |
| Teton       | 0.3, 0.5           | BF561                   |
| Kookaburra  | 0.0, 0.1, 0.2      | BF523 BF525 BF527       |
| Mockingbird | 0.0, 0.1           | BF522 BF524 BF526       |
| Brodie      | 0.0, 0.1           | BF512 BF514 BF516 BF518 |


** Compiler implemented workarounds
Most workarounds are implemented in header files and source code using the
__ADSPBF527__ macros. A few workarounds require compiler support.

| Anomaly  | Macro                          | GCC Switch       |
|----------+--------------------------------+------------------|
| Any      | __WORKAROUNDS_ENABLED          |                  |
| 05000074 | WA_05000074                    |                  |
| 05000244 | __WORKAROUND_SPECULATIVE_SYNCS | -mcsync-anomaly  |
| 05000245 | __WORKAROUND_SPECULATIVE_LOADS | -mspecld-anomaly |
| 05000257 | WA_05000257                    |                  |
| 05000283 | WA_05000283                    |                  |
| 05000312 | WA_LOAD_LCREGS                 |                  |
| 05000315 | WA_05000315                    |                  |
| 05000371 | __WORKAROUND_RETS              |                  |
| 05000426 | __WORKAROUND_INDIRECT_CALLS    | Not -micplb      |

** GCC feature switches
| Switch                    | Description                            |
|---------------------------+----------------------------------------|
| -msim                     | Use simulator runtime                  |
| -momit-leaf-frame-pointer | Omit frame pointer for leaf functions  |
| -mlow64k                  |                                        |
| -mcsync-anomaly           |                                        |
| -mspecld-anomaly          |                                        |
| -mid-shared-library       |                                        |
| -mleaf-id-shared-library  |                                        |
| -mshared-library-id=      |                                        |
| -msep-data                | Enable separate data segment           |
| -mlong-calls              | Use indirect calls                     |
| -mfast-fp                 |                                        |
| -mfdpic                   |                                        |
| -minline-plt              |                                        |
| -mstack-check-l1          | Do stack checking in L1 scratch memory |
| -mmulticore               | Enable multicore support               |
| -mcorea                   | Build for Core A                       |
| -mcoreb                   | Build for Core B                       |
| -msdram                   | Build for SDRAM                        |
| -micplb                   | Assume ICPLBs are enabled at runtime.  |
