Optimized Implementation of the 2-D DFT on Loosely-Coupled Parallel Systems
01 January 1989
The problem of optimal implementation of the 2-D DFT on a very large number of loosely-coupled processors is addressed by means of an accurate, high-level simulation model based on an existing hardware system called Armstrong. Simulations were run for 2-D DFT sizes of 2 x 2 to 2048. A comparison of simulation results with timings on the real Armstrong hardware, for up to 32 processors, showed differences of less than 3%. The simulation results were further validated through a simple analytic model for 2-D DFT performance. It is shown that there exists an optimum number of processors for each DFT size, and that increasing the number of processors beyond this optimum actually decreases system performance.
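The existence of an optimum processor count can be illustrated with a toy cost model. The Python sketch below is not the paper's analytic model; it assumes a row-column 2-D DFT whose arithmetic is divided evenly among P processors, plus a communication term that grows linearly with P. The function names and the constants `t_op` and `t_startup` are hypothetical values chosen only for illustration.

```python
import math


def dft2d_time(n, p, t_op=1e-6, t_startup=1e-3):
    """Estimated run time (seconds) of an n x n 2-D DFT on p processors.

    Assumed toy model, not taken from the paper:
      compute: 2n one-dimensional FFTs of about n*log2(n) operations each,
               shared evenly by p processors;
      comm:    each processor pays one message start-up cost per peer.
    """
    compute = 2 * n * n * math.log2(n) * t_op / p
    comm = (p - 1) * t_startup
    return compute + comm


def optimal_processors(n, **costs):
    """Brute-force the processor count in 1..n that minimizes the estimate."""
    return min(range(1, n + 1), key=lambda p: dft2d_time(n, p, **costs))


if __name__ == "__main__":
    for n in (64, 256, 1024):
        p_opt = optimal_processors(n)
        print(n, p_opt, dft2d_time(n, p_opt), dft2d_time(n, 2 * p_opt))
```

Under these assumptions the compute term falls as 1/P while the communication term rises with P, so the estimated time has a minimum at an intermediate P and grows again beyond it. The location of that minimum depends entirely on the assumed constants.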