Article

GPU-NEST: Characterizing Energy Efficiency of Multi-GPU Inference Servers

Journal

IEEE COMPUTER ARCHITECTURE LETTERS
Volume 19, Issue 2, Pages 139-142

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/LCA.2020.3023723

Keywords

Servers; Graphics processing units; Load modeling; Power measurement; Monitoring; Instruments; Throughput; Multi-GPU; energy efficiency; inference server

Funding

  1. NSF [CC-F-1815643]
  2. University of California, Riverside

Abstract

Cloud inference systems have recently emerged as a solution to the ever-increasing integration of AI-powered applications into the smart devices around us. The wide adoption of GPUs in cloud inference systems has made power consumption a first-order constraint in multi-GPU systems. Managing it well requires better insight into the power and performance behavior of multi-GPU inference systems. To this end, we propose GPU-NEST, an energy-efficiency characterization methodology for multi-GPU inference systems. As case studies, we examine the challenges presented by, and the implications of, multi-GPU scaling, inference scheduling, and non-GPU bottlenecks for the energy efficiency of multi-GPU inference systems. We find that inference scheduling in particular offers great benefits, improving the energy efficiency of multi-GPU inference by as much as 40 percent.
