Journal
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING
Volume 42, Issue 2, Pages 384-404
Publisher
SPRINGER/PLENUM PUBLISHERS
DOI: 10.1007/s10766-013-0252-y
Keywords
Heterogeneous computing; Parallel processing; GPGPU; CUDA
Funding
- Basic Science Research Program through the National Research Foundation (NRF) of Korea
- Ministry of Education, Science and Technology [2009-0070364]
- US National Science Foundation [CCF-1065448]
- National Research Foundation of Korea [2009-0070364] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)
Abstract
This paper presents a cooperative heterogeneous computing framework that lets CUDA kernels, which are designed to run only on the GPU, also make efficient use of the available host CPU cores. At runtime, the proposed system exploits coarse-grain thread-level parallelism across the CPU and GPU without any source recompilation. To this end, the paper describes three features: a work distribution module, a transparent memory space, and a global scheduling queue. With completely automatic runtime workload distribution, the proposed framework achieves speedups of 3.08 in the best case and 1.42 on average over the baseline GPU-only processing.
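The abstract names a work distribution module and a global scheduling queue but does not state the scheduling policy. The sketch below is therefore only a hypothetical illustration of the general idea, not the paper's implementation: it simulates two devices pulling fixed-size chunks of thread blocks from one shared queue whenever they become idle. The function name `simulate_global_queue`, the chunk size, and the throughput figures are all assumptions made for illustration and are not taken from the paper.

```python
import heapq


def simulate_global_queue(num_blocks, chunk_size, blocks_per_tick):
    """Simulate devices draining one shared queue of thread blocks.

    Hypothetical model (not from the paper): each device, when idle,
    takes the next `chunk_size` thread blocks and is busy for
    chunk / throughput time units.

    blocks_per_tick maps a device name to its assumed throughput.
    Returns (blocks assigned per device, overall finish time).
    """
    if num_blocks < 0 or chunk_size <= 0:
        raise ValueError("num_blocks must be >= 0 and chunk_size > 0")

    next_block = 0
    assigned = {dev: 0 for dev in blocks_per_tick}
    # Min-heap of (time at which the device becomes idle, device name).
    idle = [(0.0, dev) for dev in sorted(blocks_per_tick)]
    heapq.heapify(idle)
    finish = 0.0

    while next_block < num_blocks:
        now, dev = heapq.heappop(idle)          # earliest idle device
        take = min(chunk_size, num_blocks - next_block)
        next_block += take
        assigned[dev] += take
        done = now + take / blocks_per_tick[dev]
        finish = max(finish, done)
        heapq.heappush(idle, (done, dev))

    return assigned, finish


if __name__ == "__main__":
    # Illustrative numbers only: the GPU is assumed to be 3x faster.
    speeds = {"cpu": 1.0, "gpu": 3.0}
    assigned, finish = simulate_global_queue(120, 12, speeds)
    gpu_only = 120 / speeds["gpu"]
    print(assigned, finish, gpu_only / finish)
```

In this toy model the faster device naturally ends up with more chunks because it returns to the queue more often, so no static split ratio has to be chosen in advance. The chunk size trades scheduling overhead against load imbalance at the tail of the kernel.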