Article

Performance of a three-dimensional unstructured mesh compressible flow solver on NVIDIA Fermi-class graphics processing unit hardware

Journal

Publisher

WILEY-BLACKWELL
DOI: 10.1002/fld.3744

Keywords

Eulerian; finite element; partial differential equations; compressible flow; parallelization; explicit

We describe the performance of Chicoma, a 3D unstructured mesh compressible flow solver, on graphics processing unit (GPU) hardware. The approach used to deploy the solver on GPU architectures derives from the threaded multicore execution model used in Chicoma and attempts to improve memory performance by applying graph theory techniques. The result is a scheme that can be deployed on the GPU with high-level programming constructs, for example, compiler directives, rather than low-level programming extensions. With an NVIDIA Fermi-class GPU (NVIDIA Corp., Santa Clara, CA, USA) and double-precision floating point arithmetic, we observe performance gains of 45x on problem sizes of 10^6 to 10^7 tetrahedra. We also compare GPU performance to threaded multicore performance with OpenMP and demonstrate hybrid multicore-GPU calculations with adaptive mesh refinement. Published 2012. This article is a US Government work and is in the public domain in the USA.
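The abstract does not say which graph theory techniques are used to improve memory performance. As an illustration only, the sketch below shows one widely used option for unstructured meshes: a Cuthill-McKee style breadth-first renumbering, which places neighboring mesh entities close together in memory. The function names `cuthill_mckee_order` and `bandwidth` and the toy adjacency graph are assumptions made for this example and are not taken from Chicoma.

```python
from collections import deque


def cuthill_mckee_order(adjacency):
    """Return a Cuthill-McKee style (breadth-first) ordering of the nodes.

    adjacency: dict mapping each node to an iterable of its neighbors
    (undirected graph, e.g. element-to-element connectivity of a mesh).
    Hypothetical helper for illustration; not the solver's actual scheme.
    """
    degree = {n: len(set(nbrs)) for n, nbrs in adjacency.items()}
    visited = set()
    order = []
    # Start each connected component from its lowest-degree unvisited node.
    for start in sorted(adjacency, key=lambda n: (degree[n], n)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            # Enqueue unvisited neighbors in order of increasing degree.
            for nbr in sorted(set(adjacency[node]), key=lambda n: (degree[n], n)):
                if nbr not in visited:
                    visited.add(nbr)
                    queue.append(nbr)
    return order


def bandwidth(adjacency, order):
    """Largest index distance between any two connected nodes under `order`."""
    position = {node: i for i, node in enumerate(order)}
    return max(
        (abs(position[a] - position[b]) for a in adjacency for b in adjacency[a]),
        default=0,
    )


# Toy graph: a chain 0-3-1-5-2-4 whose labels are scrambled in memory.
adjacency = {0: [3], 3: [0, 1], 1: [3, 5], 5: [1, 2], 2: [5, 4], 4: [2]}
natural = sorted(adjacency)
reordered = cuthill_mckee_order(adjacency)
print("natural bandwidth:  ", bandwidth(adjacency, natural))
print("reordered bandwidth:", bandwidth(adjacency, reordered))
```

A smaller bandwidth means that connected entities sit closer together in the renumbered arrays, which tends to help cache reuse on multicore CPUs and memory access locality on GPUs. Whether Chicoma uses this particular reordering is not stated in the abstract.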