Article

GPUS AND THE FUTURE OF PARALLEL COMPUTING

IEEE MICRO (2011)

Journal

IEEE MICRO

Volume 31, Issue 5, Pages 7-17

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/MM.2011.89

Categories

Computer Science, Hardware & Architecture; Computer Science, Software Engineering

Funding

US government via DARPA [HR0011-10-9-0008]

Recommended

Article Computer Science, Hardware & Architecture

Parallel border tracking in binary images using GPUs

Victor M. Garcia-Molla, Pedro Alonso-Jorda, Ricardo Garcia-Laguia

Summary: This paper proposes a parallel version of the Suzuki algorithm designed for GPUs, which splits the image into small rectangles and tracks the borders with a dedicated thread for each rectangle. The experimental results show that the proposed parallel algorithm is more than 10 times faster than the sequential CPU routine, using the available GPUs and CPUs.

JOURNAL OF SUPERCOMPUTING (2022)

Article Mathematics, Applied

A massively parallel algorithm for Bordered Almost Block Diagonal Systems on GPUs

M. Dessole, F. Marcuzzi

Summary: The paper presents the PARASOF algorithm for solving linear systems with BABD matrices on massively parallel computing systems, comparing it with the state-of-the-art SOF algorithm in terms of stability. It discusses the design, implementation, and theoretical and experimental performances of PARASOF.

NUMERICAL ALGORITHMS (2021)

Article Environmental Sciences

Hyperspectral Parallel Image Compression on Edge GPUs

Oscar Ferraz, Vitor Silva, Gabriel Falcao

Summary: The study investigates a parallel solution on embedded systems that reduces development effort and power consumption. It uses a low-power GPU for image prediction and exploits multiple CPU cores together with the GPU to perform image entropy encoding in parallel.

REMOTE SENSING (2021)

Article Computer Science, Software Engineering

Fast Parallel Evaluation of Exact Geometric Predicates on GPUs

Marcelo de Matos Menezes, Salles Viana Gomes de Magalhaes, Matheus Aguilar de Oliveira, W. Randolph Franklin, Rodrigo Eduardo de Oliveira Bauer Chichorro

Summary: This paper presents an algorithm that accelerates the evaluation of a large number of 3D geometric predicates by utilizing the strengths of both CPU and GPU. The algorithm progressively eliminates non-intersecting pairs and identifies the actual intersections. It achieves significant parallel speedup and can efficiently find a large number of intersections in a short time.

COMPUTER-AIDED DESIGN (2022)

Article Computer Science, Theory & Methods

Scalable Energy Games Solvers on GPUs

Andrea Formisano, Raffaella Gentilini, Flavio Vella

Summary: This article discusses the importance of modeling the consumption of limited resources in embedded controllers, as well as the challenges in solving certain game instances. Research shows that sequential implementations, as well as multi-core CPU and GPU parallel implementations, are limited in efficiency when solving these problems. By optimizing algorithms and introducing new parallel implementations, the time to solution can be significantly reduced.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2021)

Article Computer Science, Software Engineering

InstantTrace: fast parallel neuron tracing on GPUs

Yuxuan Hou, Zhong Ren, Qiming Hou, Yubo Tao, Yankai Jiang, Wei Chen

Summary: Neuron tracing, or reconstruction, is crucial for studying neuronal circuits and brain mechanisms. We present InstantTrace, a novel framework that utilizes parallel tracing on GPUs and achieves a more than 20x speed boost compared to state-of-the-art methods, while maintaining comparable reconstruction quality. The framework takes advantage of the sparse feature and tree structure of the neuron image and parallelizes all stages of the tracing pipeline on GPU. A test on a whole mouse brain OM Image demonstrated that our framework can achieve a preliminary reconstruction within 1 hour on a single GPU, which is an order of magnitude faster than existing methods. This framework has the potential to significantly improve the efficiency of the neuron tracing process and provide instant preliminary results for manual verification and refinement.

VISUAL COMPUTER (2023)

Article Computer Science, Interdisciplinary Applications

A high-throughput hybrid task and data parallel Poisson solver for large-scale simulations of incompressible turbulent flows on distributed GPUs

Hadi Zolfaghari, Dominik Obrist

Summary: The paper introduces an algebraically simpler yet more advanced parallel implementation for solving the Poisson problem on a large number of distributed GPUs. The combination of data parallelism and task parallelism reduces communication overhead, leading to a significant decrease in time-to-solution and computational cost for the Poisson problem.

JOURNAL OF COMPUTATIONAL PHYSICS (2021)

Article Computer Science, Theory & Methods

An Efficient Parallel Secure Machine Learning Framework on GPUs

Feng Zhang, Zheng Chen, Chenyang Zhang, Amelie Chi Zhou, Jidong Zhai, Xiaoyong Du

Summary: This article introduces a GPU-based framework named ParSecureML to improve the performance of secure machine learning algorithms based on two-party computation. ParSecureML solves challenges including complex computation patterns, frequent data transmission between CPU and GPU, and inter-node data dependence. Compared to state-of-the-art frameworks, ParSecureML achieves an average speedup of 33.8x.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2021)

Article Physics, Multidisciplinary

Acceleration of Approximate Matrix Multiplications on GPUs

Takuya Okuyama, Andre Rohm, Takatomo Mihana, Makoto Naruse

Summary: Matrix multiplication is important for various applications, and reducing computation time is crucial. Despite the potential of GPUs, research has not focused on accelerating AMMs for general matrices. In this paper, we propose a method to improve Monte Carlo AMMs, with optimal values for hyperparameters. The proposed method enhances matrix product approximation without increasing computation time, and is compatible with parallel operations on GPUs, demonstrating halved computation time compared to the conventional power method.

ENTROPY (2023)

Review Biochemical Research Methods

Parallel computing for genome sequence processing

You Zou, Yuejie Zhu, Yaohang Li, Fang-Xiang Wu, Jianxin Wang

Summary: The rapid increase of genome data from gene sequencing technologies presents a massive challenge to data processing. To address this, researchers have proposed methods in big data storage, efficient algorithm design, and parallel computing. This review investigates popular parallel programming technologies for genome sequence processing, discussing models, applications, and limitations.

BRIEFINGS IN BIOINFORMATICS (2021)

Article Computer Science, Theory & Methods

Electrical-Level Attacks on CPUs, FPGAs, and GPUs: Survey and Implications in the Heterogeneous Era

Dina G. Mahmoud, Vincent Lenders, Mirjana Stojilovic

Summary: This article investigates electrical-level attacks on CPUs, FPGAs, and GPUs, and explores their impact on heterogeneous systems. Additionally, it highlights open research directions for ensuring the security of heterogeneous computing systems in the future.

ACM COMPUTING SURVEYS (2023)

Article Computer Science, Hardware & Architecture

GPUs-RRTMG_LW: high-efficient and scalable computing for a longwave radiative transfer model on multiple GPUs

Yuzhu Wang, Mingxin Guo, Yuan Zhao, Jinrong Jiang

Summary: This paper presents an approach to running a large-scale, computationally intensive, longwave radiative transfer model on a GPU cluster. A CUDA-based acceleration algorithm for the RRTMG longwave radiation scheme on multiple GPUs is proposed, and a heterogeneous, hybrid programming paradigm (MPI+CUDA) is utilized with the RRTMG_LW on a GPU cluster. Experimental results show that the multi-GPU acceleration algorithm is valid, scalable, and highly efficient, achieving a 77.78x speedup when compared to a single Intel Xeon E5-2680 CPU core.

JOURNAL OF SUPERCOMPUTING (2021)

Review Energy & Fuels

High-Performance and Parallel Computing Techniques Review: Applications, Challenges and Potentials to Support Net-Zero Transition of Future Grids

Ahmed Al-Shafei, Hamidreza Zareipour, Yankai Cao

Summary: The transition towards net-zero emissions is inevitable for humanity's future. Electrical energy systems produce the most emissions among all sectors. Reducing them requires a transition towards an emission-free smart grid, which involves integrating wind and solar-powered resources and adopting new paradigms such as distributed resources and IoT technologies. However, these changes will pose unprecedented challenges in terms of scale, complexity, and data, making it important to consider high performance computing, parallel computing, and cloud computing in future electrical energy studies.

ENERGIES (2022)

Article Computer Science, Hardware & Architecture

Parallel radiation dose computations with GENOCOP III on GPUs

J. J. Moreno, J. Miroforidis, E. Filatovas, I. Kaliszewski, E. M. Garzon

Summary: This work reports on the authors' attempt to put radiotherapy planning in a 'win-win' situation by exploring unexploited reserves in optimization methods and algorithms, as well as utilizing High Performance Computing resources. By incorporating sparse matrix procedures into optimization algorithms and leveraging graphical processing units, they were able to achieve speedups in optimization computations, as demonstrated in numerical testing on a clinical case.

JOURNAL OF SUPERCOMPUTING (2021)

Review Computer Science, Theory & Methods

Systematic Literature Review on Parallel Trajectory-based Metaheuristics

Andre Luis Barroso Almeida, Joubert de Castro Lima, Marco Antonio M. Carvalho

Summary: In the past 35 years, parallel computing has gained increasing interest in the academic community, particularly in addressing complex optimization problems. This survey focuses on the use of high-performance computing techniques to design and implement trajectory-based metaheuristics. It provides a comprehensive overview of the current state-of-the-art in multi-core and distributed trajectory-based metaheuristics, introducing basic concepts of high-performance computing and reviewing different taxonomies for architectures and metaheuristics. The survey also presents a summary and classification of 127 publications, identifies research gaps, and discusses past and future trends.

ACM COMPUTING SURVEYS (2023)
