4.7 Article

Improving I/O Performance for Exascale Applications Through Online Data Layout Reorganization

Journal

IEEE Transactions on Parallel and Distributed Systems

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPDS.2021.3100784

Keywords

Layout; Arrays; Heuristic algorithms; Computational modeling; Performance evaluation; Optimization; Distributed databases; Parallel IO; data layout; IO performance; WarpX; data access optimization

Funding

  1. Exascale Computing Project of the U.S. Department of Energy Office of Science [17-SC-20-SC]
  2. Exascale Computing Project of the National Nuclear Security Administration [17-SC-20-SC]
  3. Center of Advanced Systems Understanding (CASUS), Germany's Federal Ministry of Education and Research (BMBF)
  4. Saxon Ministry for Science, Culture and Tourism (SMWK)
  5. DOE Office of Science User Facility [DE-AC05-00OR22725]

Applications being developed for Exascale computers will produce scientific results with unprecedented accuracy and efficiency. However, their irregular and dynamic data distributions pose new challenges for I/O logic. This paper introduces two online data layout reorganization approaches to balance read and write performance.
The applications being developed within the U.S. Exascale Computing Project (ECP) to run on imminent Exascale computers will generate scientific results with unprecedented fidelity and record turn-around time. Many of these codes are based on particle-mesh methods and use advanced algorithms, especially dynamic load-balancing and mesh-refinement, to achieve high performance on Exascale machines. Yet, as such algorithms improve parallel application efficiency, they raise new challenges for I/O logic due to their irregular and dynamic data distributions. Thus, while the enormous data rates of Exascale simulations already challenge existing file system write strategies, the need for efficient read and processing of generated data introduces additional constraints on the data layout strategies that can be used when writing data to secondary storage. We review these I/O challenges and introduce two online data layout reorganization approaches for achieving good tradeoffs between read and write performance. We demonstrate the benefits of using these two approaches for the ECP particle-in-cell simulation WarpX, which serves as a motif for a large class of important Exascale applications. We show that by understanding application I/O patterns and carefully designing data layouts we can increase read performance by more than 80 percent.
