Article
Thermodynamics
Ao Xu, Bo-Tao Li
Summary: This study evaluates the performance of a hybrid OpenACC and MPI approach for multi-GPU accelerated thermal lattice Boltzmann (LB) simulation. OpenACC is used to accelerate computation on a single GPU, while MPI synchronizes information between multiple GPUs. The results show promising performance, with a single GPU achieving 1.93 GLUPS (giga lattice updates per second) for 2D simulation and 1.04 GLUPS for 3D simulation. With 16 GPUs, the parallel efficiency remains high, reaching 30.42 GLUPS for 2D simulation and 14.52 GLUPS for 3D simulation.
INTERNATIONAL JOURNAL OF HEAT AND MASS TRANSFER
(2023)
Article
Computer Science, Interdisciplinary Applications
Fabio Bonaccorso, Marco Lauricella, Andrea Montessori, Giorgio Amati, Massimo Bernaschi, Filippo Spiga, Adriano Tiribocchi, Sauro Succi
Summary: This paper presents LBcuda, a GPU-accelerated version of LBsoft, an MPI-based software package for simulating multi-component colloidal flows. The design principles, optimization, and performance of LBcuda compared to the CPU version are described, using both a low-cost GPU and high-end NVIDIA GPU cards (V100 and A100). Results show a substantial acceleration for the fluid solver, reaching up to 200 GLUPS on a cluster of 512 NVIDIA A100 cards simulating a grid of eight billion lattice points.
COMPUTER PHYSICS COMMUNICATIONS
(2022)
Article
Thermodynamics
Ao Xu, Bo-Tao Li
Summary: We use OpenACC to accelerate particle-resolved thermal lattice Boltzmann simulation on graphics processing units (GPUs). By adopting the momentum-exchange method for fluid-particle interactions and extending the indirect addressing method to address load imbalance, we achieve improved simulation performance.
INTERNATIONAL JOURNAL OF HEAT AND MASS TRANSFER
(2024)
Article
Materials Science, Multidisciplinary
Yongjia Zhang, Jianxin Zhou, Yajun Yin, Xu Shen, Xiaoyuan Ji
Summary: A parallel 3D cellular automaton model was developed for simulating dendritic growth during solidification of a binary alloy. It is implemented on GPUs using a combination of CUDA and MPI to achieve high parallel performance and reduce communication cost. The model accurately predicts the development of columnar and equiaxed dendrites during directional solidification at different cooling rates.
JOURNAL OF MATERIALS RESEARCH AND TECHNOLOGY-JMR&T
(2021)
Article
Computer Science, Hardware & Architecture
Binbin Zhou, Lu Lu
Summary: This paper introduces an efficient 3D FFT framework for multi-GPU distributed-memory systems, which utilizes a hybrid programming model combining MPI and OpenMP for effective communication, and adopts an asynchronous strategy and fast parallel kernels for acceleration.
JOURNAL OF SUPERCOMPUTING
(2022)
Article
Physics, Multidisciplinary
Zi-Hao Gao, Chang-Sheng Zhu, Cang-Long Wang
Summary: A GPU-parallel-based computational scheme is developed to study the competitive growth process of converging bi-crystals under forced convection conditions. The elimination mechanism of three different conformational schemes under diffusion and forced convection conditions is analyzed. The presence of forced convection leads to an anomalous elimination phenomenon in which unfavorable dendrites eliminate favorable dendrites in the grain boundaries. Parallelization of the multi-phase field-lattice Boltzmann model on the GPU platform is achieved, showing significant parallel acceleration.
Article
Mechanics
Yanfang Lyu, Xiaoyu Zhao, Zhiqiang Gong, Xiao Kang, Wen Yao
Summary: Data-driven prediction of laminar flow and turbulent flow in marine and aerospace engineering has been extensively studied and shown potential in real-time prediction. This work proposes a novel multi-fidelity learning method that combines abundant low-fidelity data and limited high-fidelity data using the Fourier neural operator and transfer learning. The method achieves high modeling accuracy of 99% for selected physical field problems, outperforming other high-fidelity models. The proposed method has the potential to provide a reference for subsequent model construction with its simple structure and high precision for fluid flow problems.
Article
Computer Science, Software Engineering
Adrian Kummerlaender, Marcio Dorn, Martin Frank, Mathias J. Krause
Summary: This article revisits and extends the work on implicit propagation on directly addressed grids by considering them as transformations of the underlying space filling curve. A new periodic shift (PS) pattern is proposed that provides consistent performance across a range of targets. Benchmark results for PS and shift-swap-streaming (SSS) on SIMD CPUs and Nvidia GPUs are provided, and the application of PS as the propagation pattern of the open source LBM framework OpenLB is summarized.
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE
(2023)
Article
Materials Science, Multidisciplinary
Yaqi Guo, Sen Luo, Weiling Wang, Miaoyong Zhu
Summary: A GPU-accelerated 3D PF-LBM model was established to predict multi-dendritic growth of Fe-C binary alloy, overcoming limitations of GPU memory and achieving a speedup ratio of 1700 times. Results showed that under forced fluid flow, solute enrichment at the interdendritic space was hardly washed away, leading to more significant solute enrichment in the downstream region.
JOURNAL OF MATERIALS RESEARCH AND TECHNOLOGY-JMR&T
(2022)
Article
Computer Science, Hardware & Architecture
Andrey Zakirov, Anastasia Perepelkina, Vadim Levchenko, Sergey Khilkov
Summary: This paper explores data flow arrangement possibilities, defines a new propagation scheme, and implements it as GPU program code using a locally recursive non-locally asynchronous algorithm construction method, achieving performance of up to 10 GLUPS on a single NVIDIA GeForce RTX 3090 GPU.
JOURNAL OF SUPERCOMPUTING
(2021)
Article
Mathematical & Computational Biology
Gang Huang, Qianlin Ye, Hao Tang, Zhangrong Qin
Summary: In this paper, a 3D model of aqueous humor (AH) dynamics in the human eye was presented and optimized using GPU technology. The feasibility of the model was demonstrated through validation, and the effect of different factors on AH flow was investigated. Results showed that AH flows more rapidly when standing, that intraocular temperature had the greatest effect on AH flow, and that AH secretion rate and trabecular meshwork (TM) permeability had a greater effect on intraocular pressure (IOP). Corneal indentation and an ovoid anterior chamber (AC) also affected AH flow. Finally, the GPU-based PartSparse algorithm achieved significant memory savings and improved performance.
MATHEMATICAL BIOSCIENCES AND ENGINEERING
(2023)
Article
Mechanics
Jiakun Han, Yongtao Shui, Lu Nie, Gang Chen
Summary: The unsteady flow control of the flexible flap on the bio-inspired wing is studied using the IB-LB-FEM method, revealing the deformation law of the flap with fluid-structure interaction and its influence on unsteady aerodynamics. It is found that as the angle of attack increases, the aerodynamic characteristics transition from periodic to chaotic to quasi-periodic states, which is closely related to flow separation on the wing's surface. The proposed flow control mechanism of the flexible flap provides new design ideas for bio-inspired aircraft.
Article
Environmental Sciences
Yansen Wang, Xiping Zeng, Jonathan Decker
Summary: A three-dimensional radiation model based on the lattice Boltzmann method (LBM) is developed and implemented on GPU to accelerate the simulation of radiative transfer in the atmosphere. The model, named RT-LBM, shows high accuracy and computational efficiency compared to Monte Carlo method (MCM) simulations, running significantly faster on both CPU and GPU platforms.
Article
Computer Science, Information Systems
Benjamin T. Shealy, Mehrdad Yousefi, Ashwin T. Srinath, Melissa C. Smith, Ulf D. Schiller
Summary: This study presents the implementation and scaling analysis of a GPU-accelerated kernel for HemeLB, a high-performance Lattice Boltzmann code designed for sparse complex geometries. The research shows significant speedups in single-GPU performance for HemeLB-GPU compared to a single CPU core, with good scalability up to 32 GPUs. Strategies to improve kernel performance and scalability for a larger number of GPUs are also discussed, aiming to enable better utilization of heterogeneous high-performance computing systems for large-scale lattice Boltzmann simulations.
Article
Computer Science, Interdisciplinary Applications
Vincent Delmas, Azzedine Soulaimani
Summary: This study presents a multi-GPU version of a time-explicit finite volume solver for the Shallow-Water Equations, built with MPI, CUDA-Fortran, and the METIS library. By using multiple GPUs to accelerate message passing and conducting efficiency studies, it was found that efficiencies of over 80% can be achieved.
COMPUTER PHYSICS COMMUNICATIONS
(2022)
Article
Chemistry, Multidisciplinary
Yan Zhao, Kaiyue Jiang, Can Li, Yufeng Liu, Gucheng Zhu, Michele Pizzochero, Efthimios Kaxiras, Dandan Guan, Yaoyi Li, Hao Zheng, Canhua Liu, Jinfeng Jia, Mingpu Qin, Xiaodong Zhuang, Shiyong Wang
Summary: Individual quantum nanomagnets based on metal-free multi-porphyrin systems have been synthesized. The magnetic coupling between porphyrins was tuned by converting specific porphyrin units to their radical or biradical state. The resulting chains exhibit different magnetic properties, with gap excitation in S = 1/2 antiferromagnets and distinct end states in S = 1 antiferromagnets.
Article
Chemistry, Physical
Mingu Kang, Shiang Fang, Jonggyu Yoo, Brenden R. Ortiz, Yuzki M. Oey, Jonghyeok Choi, Sae Hee Ryu, Jimin Kim, Chris Jozwiak, Aaron Bostwick, Eli Rotenberg, Efthimios Kaxiras, Joseph G. Checkelsky, Stephen D. Wilson, Jae-Hoon Park, Riccardo Comin
Summary: The authors use high-resolution angle-resolved photoemission spectroscopy to determine the microscopic structure of three-dimensional charge order in AV3Sb5 (A = K, Rb, Cs) and its interplay with superconductivity. The observed difference in charge order structure between CsV3Sb5 and the other compounds potentially explains the double-dome superconductivity in CsV3(Sb,Sn)5 and the suppression of Tc in KV3Sb5 and RbV3Sb5. These findings provide fresh insights into the phase diagram of AV3Sb5.
Article
Physics, Multidisciplinary
Pablo G. Tello, Donato Bini, Stuart Kauffman, Sauro Succi
Summary: This letter proposes an approach to the vacuum energy and cosmological constant (CC) paradox based on Zel'dovich's ansatz, which states that the observable contribution to the vacuum energy density is given by the gravitational energy of virtual particle-antiparticle pairs. The novelty of this work is the use of an ultraviolet cut-off length based on the holographic principle, which yields current values of the CC in semi-quantitative agreement with experimental observations.
Article
Chemistry, Multidisciplinary
Yeonchoo Cho, Gabriel R. Schleder, Daniel T. Larson, Elise Brutschea, Kyung-Eun Byun, Hongkun Park, Philip Kim, Efthimios Kaxiras
Summary: The researchers propose a solution to the metal-semiconductor contact resistance problem, called modulation doping, by placing a doping layer on the opposite side of the metal-semiconductor interface. By using first-principles calculations, they demonstrate that modulation doping can reduce the Schottky barrier height and contact resistance at the metal-semiconductor interface. The feasibility of this approach is demonstrated for single-layer tungsten diselenide and 2D MXene materials, and it can be generalized for other 2D semiconductors.
Article
Physics, Multidisciplinary
Nikita V. Tepliakov, Johannes Lischner, Efthimios Kaxiras, Arash A. Mostofi, Michele Pizzochero
Summary: In this study, a new perspective on the electronic structure of armchair graphene nanoribbons is presented using simple model Hamiltonians and ab initio calculations. The research demonstrates that the energy-gap opening in these nanoribbons is caused by the breaking of a hidden symmetry through long-ranged hopping of pi electrons and structural distortions at the edges. This hidden symmetry can be restored or manipulated through in-plane lattice strain, enabling continuous energy-gap tuning, the emergence of Dirac points at the Fermi level, and topological quantum phase transitions. This work establishes an original interpretation of the semiconducting properties of armchair graphene nanoribbons and provides guidelines for their rational electronic structure design.
PHYSICAL REVIEW LETTERS
(2023)
Article
Physics, Mathematical
Daniele Simeoni, Alessandro Gabbana, Sauro Succi
Summary: In this work, we provide both analytic and numerical solutions for the Bjorken flow, which is a standard benchmark in relativistic hydrodynamics. It offers a simple model for the macroscopic evolution of matter produced in heavy nucleus collisions. We consider relativistic gases with both massive and massless particles, working in a (2+1) and (3+1) Minkowski spacetime coordinate system. The numerical results obtained from a newly developed lattice kinetic scheme show excellent agreement with the analytic solutions.
COMMUNICATIONS IN COMPUTATIONAL PHYSICS
(2023)
Article
Physics, Mathematical
Giacomo Falcucci, Giorgio Amati, Pierluigi Fanelli, Sauro Succi, Maurizio Porfiri
Summary: This study investigates the flow characteristics of the Hexactinellid Sponge Euplectella aspergillum using large-scale simulations. The findings reveal the evolutionary adaptations of deep-sea sponges to fluid flow and open up new possibilities for interdisciplinary research in physics, engineering, and biology at the ocean interface.
COMMUNICATIONS IN COMPUTATIONAL PHYSICS
(2023)
Article
Chemistry, Physical
Mihir Durve, Sibilla Orsini, Adriano Tiribocchi, Andrea Montessori, Jean-Michel Tucny, Marco Lauricella, Andrea Camposeo, Dario Pisignano, Sauro Succi
Summary: Tracking droplets in microfluidics is a challenging task, and choosing a tool to analyze microfluidic videos is difficult. Here, the YOLO and DeepSORT algorithms are used for droplet identification and tracking: several YOLOv5 and YOLOv7 models and the DeepSORT network were trained for droplet tracking, and YOLOv5 and YOLOv7 were compared in terms of training time and video analysis time. On an RTX 3070 Ti GPU, real-time tracking was achieved with the lighter YOLO models, given the additional droplet tracking cost of the DeepSORT algorithm. This work serves as a benchmark study for YOLOv5 and YOLOv7 networks with DeepSORT for microfluidic droplet analysis.
EUROPEAN PHYSICAL JOURNAL E
(2023)
Article
Chemistry, Physical
Adriano Tiribocchi, Andrea Montessori, Giorgio Amati, Massimo Bernaschi, Fabio Bonaccorso, Sergio Orlandini, Sauro Succi, Marco Lauricella
Summary: A regularized version of the lattice Boltzmann method is proposed for efficient simulation of soft materials. It reconstructs the distribution functions from available hydrodynamic variables without storing the full set of discrete populations, leading to lower memory requirements and data access costs. Benchmark tests validate the method's effectiveness for simulating soft matter systems, particularly on future exascale computers.
JOURNAL OF CHEMICAL PHYSICS
(2023)
Article
Mechanics
A. Tiribocchi, M. Durve, M. Lauricella, A. Montessori, D. Marenduzzo, S. Succi
Summary: Active droplets are artificial microswimmers that exhibit self-propelled motion. The authors study the effect of activity on a droplet containing a contractile polar fluid confined within microfluidic channels of various sizes. They find a range of shapes and dynamic regimes, regulated by contractile stress, droplet elasticity, and microchannel width.
Article
Physics, Particles & Fields
Andrea Solfanelli, Stefano Ruffo, Sauro Succi, Nicolo Defenu
Summary: In this study, we investigate the asymptotic behavior of the entanglement entropy for Kitaev chains with long-range hopping and pairing couplings. We find that the system exhibits an extremely rich phenomenology due to its truly non-local nature. In the strong long-range regime, we observe logarithmic, fractal, or volume-law entanglement scaling depending on the values of the chemical potential and power law decay strength.
JOURNAL OF HIGH ENERGY PHYSICS
(2023)
Article
Chemistry, Multidisciplinary
Nikita V. Tepliakov, Ruize Ma, Johannes Lischner, Efthimios Kaxiras, Arash A. Mostofi, Michele Pizzochero
Summary: In this study, it is predicted that the recently fabricated heterojunctions of zigzag nanoribbons embedded in two-dimensional hexagonal boron nitride exhibit half-semimetallic behavior, with opposite energy shifts of the states residing at the two edges while maintaining their intrinsic antiferromagnetic exchange coupling. These heterojunctions undergo an antiferromagnetic-to-ferrimagnetic phase transition upon doping, where the sign of the excess charge controls the spatial localization of the net magnetic moments. This research holds promise for the development of carbon-based spintronics.
Article
Physics, Multidisciplinary
Ziyan Zhu, Marios Mattheakis, Weiwei Pan, Efthimios Kaxiras
Summary: In this study, we introduce a deep neural network model called HubbardNet for variational determination of the ground-state and excited-state wave functions of the one-dimensional and two-dimensional Bose-Hubbard model. The model demonstrates excellent generalization ability and outperforms traditional methods in terms of computational efficiency and accuracy.
PHYSICAL REVIEW RESEARCH
(2023)
Article
Astronomy & Astrophysics
Donato Bini, Stuart Kauffman, Pablo G. Tello, Sauro Succi
Summary: In this study, we compute the metric fluctuations induced by a turbulent energy-matter tensor within the first order post-Minkowskian approximation. We find that the turbulent energy cascade can interfere with the process of black hole formation and exhibit a potentially strong coupling between these two highly nonlinear phenomena. Furthermore, we discover that the power-law turbulent energy spectrum determines the scaling of metric fluctuations as x^(n-2), with x representing the four-dimensional spacelike distance in Minkowski spacetime, which highlights metric singularities when n < 2. Finally, we discuss the effect of metric fluctuations on the geodesic motion of test particles as a potential technique to extract information on the spectral characteristics of fluctuating spacetime.
Article
Mathematics
Mihir Durve, Adriano Tiribocchi, Andrea Montessori, Marco Lauricella, Sauro Succi
Summary: This work analyzes the trajectories obtained from YOLO and DeepSORT algorithms in dense emulsion systems simulated using lattice Boltzmann methods. The findings reveal that the direction of individual droplets is more influenced by those immediately behind rather than in front of them. The analysis also provides insights into the constraints of a dynamical model for dense emulsions in narrow channels.
COMMUNICATIONS IN APPLIED AND INDUSTRIAL MATHEMATICS
(2022)