Journal
INTERNATIONAL JOURNAL OF QUANTUM CHEMISTRY
Volume 114, Issue 9, Pages 543-552
Publisher
WILEY
DOI: 10.1002/qua.24607
Keywords
electron repulsion integrals; graphics processing unit; density functional theory; parallel processing
The computation of electron repulsion integrals (ERIs) is the most time-consuming step in density functional calculations using Gaussian basis sets. Many temporary ERIs are calculated, and most are stored in slower storage, such as cache or memory, because of the shortage of registers, which are the fastest storage in a central processing unit (CPU). Moreover, the heavy register usage makes it difficult to launch many concurrent threads on a graphics processing unit (GPU) to hide latency. Hence, we propose to optimize the calculation order of one-center ERIs to minimize the number of registers used, and to calculate each ERI with three or six cooperating threads. The performance of this method is measured on a recent CPU and a GPU. The proposed approach is found to be efficient for basis functions of high angular momentum on a GPU. When combined with a recent GPU, it accelerates the computation almost 4-fold. © 2014 Wiley Periodicals, Inc.
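To give a rough sense of why high angular momentum strains register capacity, the sketch below counts the Cartesian Gaussian components in a shell, (l + 1)(l + 2)/2, and the resulting number of integrals in a shell quartet. It is only an illustration under that standard counting rule; the function names `n_cartesian` and `n_quartet` are hypothetical, and the code does not reproduce the paper's calculation order or its thread-cooperation scheme, which the abstract does not specify.

```python
def n_cartesian(l: int) -> int:
    """Number of Cartesian Gaussian components in a shell of angular momentum l."""
    return (l + 1) * (l + 2) // 2


def n_quartet(la: int, lb: int, lc: int, ld: int) -> int:
    """Number of final ERIs in one shell quartet (ab|cd) of Cartesian shells."""
    return n_cartesian(la) * n_cartesian(lb) * n_cartesian(lc) * n_cartesian(ld)


if __name__ == "__main__":
    # Print per-shell component counts and the (ll|ll) quartet size for s, p, d, f.
    for label, l in zip("spdf", range(4)):
        print(label, n_cartesian(l), n_quartet(l, l, l, l))
```

These counts cover only the final integrals of a quartet. The temporary ERIs mentioned in the abstract are additional, so the actual storage demand during evaluation is higher than these figures suggest.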