1 2 3(* Copyright (c) 2011-2020 Stefan Krah. All rights reserved. *) 4 5 6The Matrix Fourier Transform: 7============================= 8 9In libmpdec, the Matrix Fourier Transform [1] is called four-step transform 10after a variant that appears in [2]. The algorithm requires that the input 11array can be viewed as an R*C matrix. 12 13All operations are done modulo p. For readability, the proofs drop all 14instances of (mod p). 15 16 17Algorithm four-step (forward transform): 18---------------------------------------- 19 20 a := input array 21 d := len(a) = R * C 22 p := prime 23 w := primitive root of unity of the prime field 24 r := w**((p-1)/d) 25 A := output array 26 27 1) Apply a length R FNT to each column. 28 29 2) Multiply each matrix element (addressed by j*C+m) by r**(j*m). 30 31 3) Apply a length C FNT to each row. 32 33 4) Transpose the matrix. 34 35 36Proof (forward transform): 37-------------------------- 38 39 The algorithm can be derived starting from the regular definition of 40 the finite-field transform of length d: 41 42 d-1 43 ,---- 44 \ 45 A[k] = | a[l] * r**(k * l) 46 / 47 `---- 48 l = 0 49 50 51 The sum can be rearranged into the sum of the sums of columns: 52 53 C-1 R-1 54 ,---- ,---- 55 \ \ 56 = | | a[i * C + j] * r**(k * (i * C + j)) 57 / / 58 `---- `---- 59 j = 0 i = 0 60 61 62 Extracting a constant from the inner sum: 63 64 C-1 R-1 65 ,---- ,---- 66 \ \ 67 = | r**k*j * | a[i * C + j] * r**(k * i * C) 68 / / 69 `---- `---- 70 j = 0 i = 0 71 72 73 Without any loss of generality, let k = n * R + m, 74 where n < C and m < R: 75 76 C-1 R-1 77 ,---- ,---- 78 \ \ 79 A[n*R+m] = | r**(R*n*j) * r**(m*j) * | a[i*C+j] * r**(R*C*n*i) * r**(C*m*i) 80 / / 81 `---- `---- 82 j = 0 i = 0 83 84 85 Since r = w ** ((p-1) / (R*C)): 86 87 a) r**(R*C*n*i) = w**((p-1)*n*i) = 1 88 89 b) r**(C*m*i) = w**((p-1) / R) ** (m*i) = r_R ** (m*i) 90 91 c) r**(R*n*j) = w**((p-1) / C) ** (n*j) = r_C ** (n*j) 92 93 r_R := root of the subfield of length R. 94 r_C := root of the subfield of length C. 95 96 97 C-1 R-1 98 ,---- ,---- 99 \ \ 100 A[n*R+m] = | r_C**(n*j) * [ r**(m*j) * | a[i*C+j] * r_R**(m*i) ] 101 / ^ / 102 `---- | `---- 1) transform the columns 103 j = 0 | i = 0 104 ^ | 105 | `-- 2) multiply 106 | 107 `-- 3) transform the rows 108 109 110 Note that the entire RHS is a function of n and m and that the results 111 for each pair (n, m) are stored in Fortran order. 112 113 Let the term in square brackets be f(m, j). Step 1) and 2) precalculate 114 the term for all (m, j). After that, the original matrix is now a lookup 115 table with the mth element in the jth column at location m * C + j. 116 117 Let the complete RHS be g(m, n). Step 3) does an in-place transform of 118 length n on all rows. After that, the original matrix is now a lookup 119 table with the mth element in the nth column at location m * C + n. 120 121 But each (m, n) pair should be written to location n * R + m. Therefore, 122 step 4) transposes the result of step 3). 123 124 125 126Algorithm four-step (inverse transform): 127---------------------------------------- 128 129 A := input array 130 d := len(A) = R * C 131 p := prime 132 d' := d**(p-2) # inverse of d 133 w := primitive root of unity of the prime field 134 r := w**((p-1)/d) # root of the subfield 135 r' := w**((p-1) - (p-1)/d) # inverse of r 136 a := output array 137 138 0) View the matrix as a C*R matrix. 139 140 1) Transpose the matrix, producing an R*C matrix. 141 142 2) Apply a length C FNT to each row. 143 144 3) Multiply each matrix element (addressed by i*C+n) by r**(i*n). 145 146 4) Apply a length R FNT to each column. 147 148 149Proof (inverse transform): 150-------------------------- 151 152 The algorithm can be derived starting from the regular definition of 153 the finite-field inverse transform of length d: 154 155 d-1 156 ,---- 157 \ 158 a[k] = d' * | A[l] * r' ** (k * l) 159 / 160 `---- 161 l = 0 162 163 164 The sum can be rearranged into the sum of the sums of columns. Note 165 that at this stage we still have a C*R matrix, so C denotes the number 166 of rows: 167 168 R-1 C-1 169 ,---- ,---- 170 \ \ 171 = d' * | | a[j * R + i] * r' ** (k * (j * R + i)) 172 / / 173 `---- `---- 174 i = 0 j = 0 175 176 177 Extracting a constant from the inner sum: 178 179 R-1 C-1 180 ,---- ,---- 181 \ \ 182 = d' * | r' ** (k*i) * | a[j * R + i] * r' ** (k * j * R) 183 / / 184 `---- `---- 185 i = 0 j = 0 186 187 188 Without any loss of generality, let k = m * C + n, 189 where m < R and n < C: 190 191 R-1 C-1 192 ,---- ,---- 193 \ \ 194 A[m*C+n] = d' * | r' ** (C*m*i) * r' ** (n*i) * | a[j*R+i] * r' ** (R*C*m*j) * r' ** (R*n*j) 195 / / 196 `---- `---- 197 i = 0 j = 0 198 199 200 Since r' = w**((p-1) - (p-1)/d) and d = R*C: 201 202 a) r' ** (R*C*m*j) = w**((p-1)*R*C*m*j - (p-1)*m*j) = 1 203 204 b) r' ** (C*m*i) = w**((p-1)*C - (p-1)/R) ** (m*i) = r_R' ** (m*i) 205 206 c) r' ** (R*n*j) = r_C' ** (n*j) 207 208 d) d' = d**(p-2) = (R*C) ** (p-2) = R**(p-2) * C**(p-2) = R' * C' 209 210 r_R' := inverse of the root of the subfield of length R. 211 r_C' := inverse of the root of the subfield of length C. 212 R' := inverse of R 213 C' := inverse of C 214 215 216 R-1 C-1 217 ,---- ,---- 2) transform the rows of a^T 218 \ \ 219 A[m*C+n] = R' * | r_R' ** (m*i) * [ r' ** (n*i) * C' * | a[j*R+i] * r_C' ** (n*j) ] 220 / ^ / ^ 221 `---- | `---- | 222 i = 0 | j = 0 | 223 ^ | `-- 1) Transpose input matrix 224 | `-- 3) multiply to address elements by 225 | i * C + j 226 `-- 3) transform the columns 227 228 229 230 Note that the entire RHS is a function of m and n and that the results 231 for each pair (m, n) are stored in C order. 232 233 Let the term in square brackets be f(n, i). Without step 1), the sum 234 would perform a length C transform on the columns of the input matrix. 235 This is a) inefficient and b) the results are needed in C order, so 236 step 1) exchanges rows and columns. 237 238 Step 2) and 3) precalculate f(n, i) for all (n, i). After that, the 239 original matrix is now a lookup table with the ith element in the nth 240 column at location i * C + n. 241 242 Let the complete RHS be g(m, n). Step 4) does an in-place transform of 243 length m on all columns. After that, the original matrix is now a lookup 244 table with the mth element in the nth column at location m * C + n, 245 which means that all A[k] = A[m * C + n] are in the correct order. 246 247 248-- 249 250 [1] Joerg Arndt: "Matters Computational" 251 http://www.jjj.de/fxt/ 252 [2] David H. Bailey: FFTs in External or Hierarchical Memory 253 http://crd.lbl.gov/~dhbailey/dhbpapers/ 254 255 256 257