Functionality and performance of NVLink with IBM POWER9 processors

Heterogeneous computer systems with multiple types of processing elements (PEs) are becoming a popular design for optimizing performance and efficiency across a wide variety of applications, because each part of an application can be executed on the PE for which it is best suited. In such systems, efficient communication, data movement, and memory sharing across PEs are critical to executing an application across the different PEs with minimal communication and synchronization overhead. The IBM POWER9 processor supports the NVIDIA NVLink interface, a high-performance interconnect with many of these capabilities. In the IBM Power System AC922, IBM POWER9 processors connect directly to multiple NVIDIA GPUs using NVLink. In this paper, we highlight the important functional and performance capabilities of NVLink with the POWER9 processor. These include high bandwidth, hardware cache coherence, fine-grained data movement, and hardware support for atomic operations across all PEs of a compute node. We also present an analysis of how these performance and functional capabilities of POWER9 processors and NVLink are expected to have significant impacts on performance and programmability across a variety of important applications, such as machine learning and domains within high-performance computing.
