| | 204 | |
| | 205 | == __fft__ == |
| | 206 | |
| | 207 | This multi-threaded parallel application is the port of the SPLASH FFT benchmark on the GIET_VM OS. |
| | 208 | |
| | 209 | It performs the in-place 1D Fast Fourier Transfom for an array of N complex points, using the Cooley-Tuckey method. |
| | 210 | The N data points are handled as a 2D array (rootN rows * rootN columns). This data array is implemented as |
| | 211 | distributed buffers (one buffer per cluster). |
| | 212 | * The number of points N is defined by the M parameter (N = 1<< M). It must be a power of 4. |
| | 213 | * The number of threads is equal to the number of processors in the hardware architecture. It must be a power of 2. |
| | 214 | * The number of threads must be larger or equal to the number of rows (nthreads >= rootN). |
| | 215 | Each thread is handling a subset of the k of the rows (k = rootN / nthreads), and transpose the data array to benefit of cache locality. |
| | 216 | After the sequential initialization, the parallel processing is done in five steps separated by synchronisation barriers: |
| | 217 | * first data array transpose |
| | 218 | * FFT on the columns |
| | 219 | * second data array transpose |
| | 220 | * FFT on the rows |
| | 221 | * third data array transpose |
| | 222 | The N input data points be initialised in three different modes: |
| | 223 | * CONSTANT : all data points have the same [1,0] value |
| | 224 | * COSIN : data point n has [cos(n/N) , sin(n/N)] value |
| | 225 | * RANDOM : data points have pseudo random values |
| | 226 | The instrumentation results (computation times) are stored in a file. |
| | 227 | |
| | 228 | It requires one TTY terminal shared by all threads. |
| | 229 | |
| | 230 | The source code can be found [source:soft/giet_vm/applications/fft/fft.c here], and the mapping is defined [source:soft/giet_vm/applications/fft/fft.py here]. |
| | 231 | |