Status
~~~~~~

As of Jan 2014 the trunk contains a port to AArch64 ARMv8 -- loosely,
the 64-bit ARM architecture. Currently it supports integer and FP
instructions and can run anything generated by gcc-4.8.2 -O3. The
port is under active development.

Current limitations, as of mid-May 2014.

* limited support of vector (SIMD) instructions. Initial target is
  support for instructions created by gcc-4.8.2 -O3
  (via autovectorisation). This is complete.

* Integration with the built in GDB server:
   - works ok (breakpoint, attach to a process blocked in a syscall, ...)
   - still to do:
      arm64 xml register description files (allowing shadow registers
      to be looked at).
      cpsr transfer to/from gdb to be looked at (see also arm equivalent code)

* limited syscall support

There has been extensive testing of the baseline simulation of integer
and FP instructions. Memcheck is also believed to work, at least for
small examples. Other tools appear to at least not crash when running
/bin/date.

Enough syscalls and instructions are supported for substantial
programs to work. Firefox 26 is able to start up and quit. The noise
level from Memcheck is low enough to make it practical to use for real
debugging.


Building
~~~~~~~~

You could probably build it directly on a target OS, using the normal
non-cross scheme

  ./autogen.sh ; ./configure --prefix=.. ; make ; make install

Development so far was however done by cross compiling, viz:

  export CC=aarch64-linux-gnu-gcc
  export LD=aarch64-linux-gnu-ld
  export AR=aarch64-linux-gnu-ar

  ./autogen.sh
  ./configure --prefix=`pwd`/Inst --host=aarch64-unknown-linux \
              --enable-only64bit
  make -j4
  make -j4 install

Doing this assumes that the install path (`pwd`/Inst) is valid on
both host and target, which isn't normally the case.
To avoid this limitation, do instead:

  ./configure --prefix=/install/path/on/target \
              --host=aarch64-unknown-linux \
              --enable-only64bit
  make -j4
  make -j4 install DESTDIR=/a/temp/dir/on/host
  # and then copy the contents of DESTDIR to the target.

See README.android for more examples of cross-compile building.


Implementation tidying-up/TODO notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

UnwindStartRegs -- what should that contain?


vki-arm64-linux.h: vki_sigaction_base
I really don't think that __vki_sigrestore_t sa_restorer
should be present. Adding it surely puts sa_mask at a wrong
offset compared to (kernel) reality. But not having it causes
compilation of m_signals.c to fail in hard to understand ways,
so adding it temporarily.


m_trampoline.S: what's the unexecutable-insn value? 0xFFFFFFFF
is there at the moment, but 0x00000000 is probably what it should be.
Also, fix indentation/tab-vs-space stuff


./include/vki/vki-arm64-linux.h: uses __uint128_t. Should change
it to __vki_uint128_t, but what's the defn of that?


m_debuginfo/priv_storage.h: need proper defn of DiCfSI


readdwarf.c: is this correct?
#elif defined(VGP_arm64_linux)
#  define FP_REG 29 //???
#  define SP_REG 31 //???
#  define RA_REG_DEFAULT 30 //???


vki-arm64-linux.h:
re linux-3.10.5/include/uapi/asm-generic/sembuf.h
I'd say the amd64 version has padding it shouldn't have. Check?


syswrap-linux.c run_a_thread_NORETURN assembly sections
seems like tst->os_state.exitcode has word type
in which case the ppc64_linux use of lwz to read it, is wrong


syswrap-linux.c ML_(do_fork_clone)
assuming that VGP_arm64_linux is the same as VGP_arm_linux here


dispatch-arm64-linux.S: FIXME: set up FP control state before
entering generated code. Also fix screwy indentation.
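
Re the vki_sigaction_base note further up: the offset worry can be
checked mechanically. The following is only a sketch. Both structs are
hypothetical stand-ins (not the real vki or kernel definitions), with
field names loosely borrowed from the note. It shows that putting a
restorer field ahead of the mask pushes the mask to a higher offset,
which is the kind of mismatch the note is concerned about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout WITHOUT a restorer field (illustration only). */
struct sketch_sigaction_plain {
    void (*handler)(int);
    uint64_t flags;
    uint64_t mask;
};

/* Hypothetical layout WITH a restorer field placed before the mask. */
struct sketch_sigaction_restorer {
    void (*handler)(int);
    uint64_t flags;
    void (*restorer)(void);
    uint64_t mask;
};

/* Byte offset of the mask field in the layout without a restorer. */
static size_t mask_offset_plain(void)
{
    return offsetof(struct sketch_sigaction_plain, mask);
}

/* Byte offset of the mask field in the layout with a restorer. */
static size_t mask_offset_restorer(void)
{
    return offsetof(struct sketch_sigaction_restorer, mask);
}
```

Comparing such an offset against whatever the kernel headers for the
target actually give would settle whether the field belongs there.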


dispatcher-ery general: what's a good (predictor-friendly) way to
branch to a register?


in vki-arm64-scnums.h
//#if __BITS_PER_LONG == 64 && !defined(__SYSCALL_COMPAT)
Probably want to reenable that and clean up accordingly


putIRegXXorZR: figure out a way that the computed value is actually
used, so as to keep any memory reads that might generate it, alive.
(else the simulation can lose exceptions). At least, for writes to
the zero register generated by loads .. or .. can any other
integer instructions, that write to a register, cause exceptions?


loads/stores: generate stack alignment checks as necessary


fix barrier insns: ISB, DMB


fix atomic loads/stores


FMADD/FMSUB/FNMADD/FNMSUB: generate and use the relevant fused
IROps so as to avoid double rounding


ARM64Instr_Call getRegUsage: re-check relative to what
getAllocableRegs_ARM64 makes available


Make dispatch-arm64-linux.S save any callee-saved Q regs
I think what is required is to save D8-D15 and nothing more than that.


wrapper for __NR3264_fstat -- correct?


PRE(sys_clone): get rid of references to vki_modify_ldt_t and the
definition of it in vki-arm64-linux.h. Ditto for 32 bit arm.


sigframe-arm64-linux.c: build_sigframe: references to nonexistent
siguc->uc_mcontext.trap_no, siguc->uc_mcontext.error_code have been
replaced by zero. Also in synth_ucontext.


m_debugger.c:
uregs.pstate = LibVEX_GuestARM64_get_nzcv(vex); /* is this correct? */
Is that remotely correct?


host_arm64_defs.c: emit_ARM64Instr:
ARM64in_VDfromX and ARM64in_VQfromXX: use simple top-half zeroing
MOVs to vector registers instead of INS Vd.D[0], Xreg, to avoid false
dependencies on the top half of the register. (Or at least check
the semantics of INS Vd.D[0] to see if it zeroes out the top.)
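
On the INS Vd.D[0] question just above: as far as the ARMv8
architecture manual describes it, INS writes only the selected lane and
leaves the rest of the register unchanged, whereas a scalar
FMOV Dd, Xn zeroes the upper 64 bits. The C model below is a sketch of
that difference only. The type and function names are made up for
illustration and are not VEX code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register as two 64-bit lanes. */
typedef struct {
    uint64_t lo;   /* lane D[0] */
    uint64_t hi;   /* lane D[1] */
} V128Model;

/* Models INS Vd.D[0], Xn: lane 0 is replaced and lane 1 is kept, so
   the result depends on the register's previous contents. */
static V128Model model_ins_d0(V128Model old, uint64_t x)
{
    old.lo = x;
    return old;
}

/* Models FMOV Dd, Xn: the low lane is written and the high lane is
   zeroed, so the result does not depend on the previous contents. */
static V128Model model_fmov_d(uint64_t x)
{
    V128Model r = { x, 0 };
    return r;
}
```

In the model, only model_ins_d0 takes the old register value as an
input, which is where the false dependency on the top half comes from.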


preferredVectorSubTypeFromSize: review perf effects and decide
on a types-for-subparts policy


fold_IRExpr_Unop: add a reduction rule for this
1Sto64(CmpNEZ64( Or64(GET:I64(1192),GET:I64(1184)) ))
viz 1Sto64(CmpNEZ64(x)) --> CmpwNEZ64(x)


check insn selection for memcheck-only primops:
Left64 CmpwNEZ64 V128to64 V128HIto64 1Sto64 CmpNEZ64 CmpNEZ32
widen_z_8_to_64 1Sto32 Left32 32HLto64 CmpwNEZ32 CmpNEZ8


isel: get rid of various cases where zero is put into a register
and just use xzr instead. Especially for CmpNEZ64/32. And for
writing zeroes into the CC thunk fields.


/* Keep this list in sync with that in iselNext below */
/* Keep this list in sync with that for Ist_Exit above */
uh .. they are not in sync


very stupid:
imm64 x23, 0xFFFFFFFFFFFFFFA0
17 F4 9F D2 F7 FF BF F2 F7 FF DF F2 F7 FF FF F2
(that is four insns, MOVZ + 3 x MOVK, where a single MOVN would do)


valgrind.h: fix VALGRIND_ALIGN_STACK/VALGRIND_RESTORE_STACK,
also add CFI annotations


could possibly bring r29 into use, which would be useful as it is
callee saved


ubfm/sbfm etc: special case cases that are simple shifts, as iropt
can't always simplify the general-case IR to a shift in such cases.


LDP,STP (immediate, simm7) (FP&VEC)
should zero out hi parts of dst registers in the LDP case


DUP insns: use Iop_Dup8x16, Iop_Dup16x8, Iop_Dup32x4
rather than doing it "by hand"


Any place where ZeroHI64ofV128 is used in conjunction with
FP vector IROps: find a way to make sure that arithmetic on
the upper half of the values is "harmless."


math_MINMAXV: use real Iop_Cat{Odd,Even}Lanes ops rather than
inline scalar code


chainXDirect_ARM64: use direct jump forms when possible
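
For the chainXDirect_ARM64 item, "when possible" comes down to whether
the target is within reach of a single B instruction: a signed 26-bit
word offset, i.e. a 4-byte-aligned displacement in [-128MB, +128MB).
A minimal sketch of that test and of the encoding follows. The function
names are hypothetical and this is not the existing chaining code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a B instruction located at address `from` can branch
   directly to `to`: the byte displacement must be a multiple of 4
   and must fit in a signed 26-bit word offset. */
static bool fits_direct_b(uint64_t from, uint64_t to)
{
    int64_t delta = (int64_t)(to - from);
    return (delta & 3) == 0
        && delta >= -(INT64_C(1) << 27)
        && delta <   (INT64_C(1) << 27);
}

/* Encodes "B <pc + delta>".  The caller is assumed to have checked
   the displacement with fits_direct_b first. */
static uint32_t encode_b(int64_t delta)
{
    /* Shift as unsigned; only the low 26 bits of the word offset are
       kept, so the sign is carried by two's complement truncation. */
    uint32_t imm26 = (uint32_t)((uint64_t)delta >> 2) & UINT32_C(0x03FFFFFF);
    return UINT32_C(0x14000000) | imm26;
}
```

Targets that fail the range test would still need the existing
load-address-and-branch-to-register form.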