Article

PFFT: AN EXTENSION OF FFTW TO MASSIVELY PARALLEL ARCHITECTURES

Journal

SIAM JOURNAL ON SCIENTIFIC COMPUTING
Volume 35, Issue 3, Pages C213-C236

Publisher

SIAM PUBLICATIONS
DOI: 10.1137/120885887

Keywords

parallel fast Fourier transform

Funding

  1. BMBF [01IH08001B]

We present a software library for computing fast Fourier transforms (FFTs) on massively parallel, distributed-memory architectures, built on the Message Passing Interface (MPI) standard. As in established transpose FFT algorithms, our parallel FFT framework combines local FFTs, local data permutations, and global data transpositions. This framework can be generalized to arbitrary multidimensional data and process meshes. All performance-relevant building blocks can be implemented with the help of the FFTW software library, so our library offers great flexibility and portable performance. Like FFTW, it can compute FFTs of complex data, real data, and even- or odd-symmetric real data, and all transforms can be performed completely in place. Furthermore, we propose an algorithm to calculate pruned FFTs more efficiently on distributed-memory architectures. We report performance measurements of FFTs of sizes between 512^3 and 8192^3 on up to 262144 cores of a BlueGene/P architecture, up to 32768 cores of a BlueGene/Q architecture, and up to 4096 cores of the Jülich Research on Petaflop Architectures (JuRoPA).
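
The three building blocks named in the abstract (local FFTs, local data permutations, global data transpositions) can be illustrated with a small single-process sketch of a two-dimensional transpose FFT. This is not the library's code or API: the MPI ranks are simulated by a list of row slabs, a naive DFT stands in for the local FFTW calls, and all function names (`dft`, `split_rows`, `global_transpose`, `transpose_fft2d`, `gather_natural`, `dft2d_reference`) are hypothetical. The sketch assumes both matrix dimensions are divisible by the number of simulated ranks `p`.

```python
import cmath


def dft(x):
    """Naive O(n^2) 1D DFT; a stand-in for a local FFT call."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def split_rows(a, p):
    """Distribute the rows of matrix a in contiguous slabs over p simulated ranks."""
    rows = len(a) // p  # assumes len(a) is divisible by p
    return [a[r * rows:(r + 1) * rows] for r in range(p)]


def global_transpose(slabs):
    """Simulated all-to-all exchange followed by a local permutation.

    Rank r cuts its slab into p column blocks and "sends" block c to rank c.
    Rank c transposes the received blocks and concatenates them, so that it
    ends up owning slab c of the transposed matrix.
    """
    p = len(slabs)
    cb = len(slabs[0][0]) // p  # columns per block; assumes divisibility by p
    # blocks[r][c] is the block that rank r sends to rank c
    blocks = [[[row[c * cb:(c + 1) * cb] for row in slab] for c in range(p)]
              for slab in slabs]
    out = []
    for c in range(p):
        out.append([[blocks[r][c][lj][li]
                     for r in range(p)
                     for lj in range(len(blocks[r][c]))]
                    for li in range(cb)])
    return out


def transpose_fft2d(a, p):
    """2D DFT of a on p simulated ranks; the result is left in transposed layout."""
    slabs = split_rows(a, p)
    slabs = [[dft(row) for row in slab] for slab in slabs]  # local FFTs, last dim
    slabs = global_transpose(slabs)                          # global transposition
    slabs = [[dft(row) for row in slab] for slab in slabs]  # local FFTs, first dim
    return slabs


def gather_natural(slabs):
    """Collect the transposed slabs and return the result in natural order."""
    full = [row for slab in slabs for row in slab]
    return [list(col) for col in zip(*full)]


def dft2d_reference(a):
    """Direct 2D DFT used only to check the distributed sketch."""
    n0, n1 = len(a), len(a[0])
    return [[sum(a[j0][j1] * cmath.exp(-2j * cmath.pi * (j0 * k0 / n0 + j1 * k1 / n1))
                 for j0 in range(n0) for j1 in range(n1))
             for k1 in range(n1)]
            for k0 in range(n0)]
```

Calling `gather_natural(transpose_fft2d(a, p))` on a small matrix `a` should agree with `dft2d_reference(a)` up to rounding for any `p` that divides both dimensions. Leaving the output in transposed layout, as `transpose_fft2d` does, avoids a second global exchange; in a real distributed run the exchange step would be an MPI collective rather than list slicing.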
