Publications

Results 1–25 of 58

Search results

Performance Portable Batched Sparse Linear Solvers

IEEE Transactions on Parallel and Distributed Systems

Liegeois, Kim A.; Rajamanickam, Sivasankaran R.; Berger-Vergiat, Luc B.

Solving a large number of small linear systems is increasingly becoming a bottleneck in computational science applications. While dense linear solvers for such systems have been studied before, batched sparse linear solvers are just starting to emerge. In this paper, we discuss algorithms for solving batched sparse linear systems and their implementation in the Kokkos Kernels library. The new algorithms are performance portable and map well to the hierarchical parallelism available in modern accelerator architectures. The sparse matrix vector product (SPMV) kernel is the main performance bottleneck of the Krylov solvers we implement in this work. The implementation of the batched SPMV and its performance are therefore discussed thoroughly in this paper. The implemented kernels are tested on different Central Processing Unit (CPU) and Graphics Processing Unit (GPU) architectures. We also develop batched Conjugate Gradient (CG) and batched Generalized Minimum Residual (GMRES) solvers using the batched SPMV. Our proposed solver was able to solve 20,000 sparse linear systems on V100 GPUs with mean speedups of 76x and 924x compared, respectively, to a parallel sparse solver applied to a single block diagonal system containing all the small linear systems and to solving the small systems one at a time. We see a mean speedup of 0.51 compared to the dense batched solver of cuSOLVER on V100, while using a lot less memory. A thorough performance evaluation on three different architectures and an analysis of the performance are presented.
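As a rough illustration of the batched SPMV kernel the abstract refers to, the following minimal sketch computes y[b] = A[b] x[b] for every system in a batch. It is not the Kokkos Kernels implementation or API: the function name `batched_spmv`, the CSR layout, and the assumption that all matrices in the batch share one sparsity pattern (so only the values differ per system) are choices made here for illustration. The outer loop over systems and the inner loop over rows are the two levels that a hierarchical-parallel implementation would distribute across teams and threads.

```python
from typing import List, Sequence


def batched_spmv(
    row_ptr: Sequence[int],
    col_idx: Sequence[int],
    values: Sequence[Sequence[float]],
    x: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Return y with y[b] = A[b] @ x[b] for each system b in the batch.

    Assumption (illustrative, not taken from the paper): every matrix is
    stored in CSR form and shares the same row_ptr/col_idx; values[b]
    holds the nonzeros of matrix b in that common pattern.
    """
    n_rows = len(row_ptr) - 1
    y: List[List[float]] = []
    for vals_b, x_b in zip(values, x):      # outer level: one small system per batch entry
        y_b = [0.0] * n_rows
        for i in range(n_rows):             # inner level: rows of that system
            acc = 0.0
            for k in range(row_ptr[i], row_ptr[i + 1]):
                acc += vals_b[k] * x_b[col_idx[k]]
            y_b[i] = acc
        y.append(y_b)
    return y


# Two dense-pattern 2x2 systems stored in CSR with a shared pattern.
row_ptr = [0, 2, 4]
col_idx = [0, 1, 0, 1]
values = [[2.0, -1.0, -1.0, 2.0], [4.0, 1.0, 1.0, 3.0]]
x = [[1.0, 1.0], [1.0, 2.0]]
y = batched_spmv(row_ptr, col_idx, values, x)
```

Storing the pattern once and only the values per system is one way a sparse batched format can use less memory than a dense batched one, which is the trade-off the abstract mentions.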

ExaWind: Then and now

Crozier, Paul C.; Berger-Vergiat, Luc B.; Dement, David C.; deVelder, Nathaniel d.; Hu, Jonathan J.; Knaus, Robert C.; Lee, Dong H.; Matula, Neil M.; Overfelt, James R.; Sakievich, Philip S.; Smith, Timothy A.; Williams, Alan B.; Prokopenko, Andrey; Moser, Robert; Melvin, Jeremy; Sprague, Michael; Bidadi, Shreyas; Brazell, Michael; Brunhart-Lupo, Nicholas; Henry De Frahan, Marc; Rood, Jon; Sharma, Ashesh; Topcuoglu, Ilker; Vijayakumar, Ganesh

Abstract not provided.

ExaWind: Exascale Predictive Wind Plant Flow Physics Modeling

Sprague, Michael A.; Brazell, Michael; Brunhart-Lupo, Nicholas; Mullowney, Paul; Rood, Jon; Sharma, Ashesh; Thomas, Stephen; Vijayakumar, Ganesh; Crozier, Paul C.; Berger-Vergiat, Luc B.; Cheung, Lawrence C.; deVelder, Nathaniel d.; Hu, Jonathan J.; Knaus, Robert C.; Lee, Dong H.; Matula, Neil M.; Overfelt, James R.; Sakievich, Philip S.; Smith, Timothy A.; Williams, Alan B.; Yamazaki, Ichitaro Y.; Turner, John A.; Prokopenko, Andrey; Wilson, Robert; Moser, Robert; Melvin, Jeremy

Abstract not provided.

Half-Precision Scalar Support in Kokkos and Kokkos Kernels: An Engineering Study and Experience Report

Proceedings - 2022 IEEE 18th International Conference on e-Science, eScience 2022

Harvey, Evan C.; Milewicz, Reed M.; Trott, Christian R.; Berger-Vergiat, Luc B.; Rajamanickam, Sivasankaran R.

To keep pace with the demand for innovation through scientific computing, modern scientific software development is increasingly reliant upon a rich and diverse ecosystem of software libraries and toolchains. Research software engineers (RSEs) responsible for that infrastructure perform highly integrative work, acting as a bridge between the hardware, the needs of researchers, and the software layers situated between them; relatively little, however, has been written about the role played by RSEs in that work and what support they need to thrive. To that end, we present a two-part report on the development of half-precision floating point support in the Kokkos Ecosystem. Half-precision computation is a promising strategy for increasing performance in numerical computing and is particularly attractive for emerging application areas (e.g., machine learning), but developing practicable, portable, and user-friendly abstractions is a nontrivial task. In the first half of the paper, we conduct an engineering study on the technical implementation of the Kokkos half-precision scalar feature and showcase experimental results; in the second half, we offer an experience report on the challenges and lessons learned during feature development by the first author. We hope our study provides a holistic view on scientific library development and surfaces opportunities for future studies into effective strategies for RSEs engaged in such work.

Harnessing exascale for whole wind farm high-fidelity simulations to improve wind farm efficiency

Crozier, Paul C.; Adcock, Christiane; Ananthan, Shreyas; Berger-Vergiat, Luc B.; Brazell, Michael; Brunhart-Lupo, Nicholas; Henry De Frahan, Marc T.; Hu, Jonathan J.; Knaus, Robert C.; Melvin, Jeremy; Moser, Bob; Mullowney, Paul; Rood, Jon; Sharma, Ashesh; Thomas, Stephen; Vijayakumar, Ganesh; Williams, Alan B.; Wilson, Robert; Yamazaki, Ichitaro Y.; Sprague, Michael A.

Abstract not provided.

FY2021 Q4: Demonstrate moving-grid multi-turbine simulations primarily run on GPUs and propose improvements for successful KPP-2 [Slides]

Adcock, Christiane; Ananthan, Shreyas; Berger-Vergiat, Luc B.; Brazell, Michael; Brunhart-Lupo, Nicholas; Hu, Jonathan J.; Knaus, Robert C.; Melvin, Jeremy; Moser, Bob; Mullowney, Paul; Rood, Jon; Sharma, Ashesh; Thomas, Stephen; Vijayakumar, Ganesh; Williams, Alan B.; Wilson, Robert; Yamazaki, Ichitaro Y.; Sprague, Michael

Isocontours of Q-criterion with velocity visualized in the wake for two NREL 5-MW turbines operating under uniform-inflow wind speed of 8 m/s. Simulation performed with the hybrid-Nalu-Wind/AMR-Wind solver.

The Kokkos EcoSystem: Comprehensive Performance Portability for High Performance Computing

Computing in Science and Engineering

Trott, Christian R.; Berger-Vergiat, Luc B.; Poliakoff, David Z.; Rajamanickam, Sivasankaran R.; Lebrun-Grandie, Damien; Madsen, Jonathan; Al Awar, Nader; Gligoric, Milos; Shipman, Galen; Womeldorff, Geoff

State-of-the-art engineering and science codes have grown dramatically in complexity over the last two decades. Application teams have adopted more sophisticated development strategies, leveraging third-party libraries, deploying comprehensive testing, and using advanced debugging and profiling tools. In today's environment of diverse hardware platforms, these applications also desire performance portability, avoiding the need to duplicate work for various platforms. The Kokkos EcoSystem provides that portable software stack. Based on the Kokkos Core Programming Model, the EcoSystem provides math libraries, interoperability capabilities with Python and Fortran, and Tools for analyzing, debugging, and optimizing applications. In this article, we give an overview of the components, discuss some specific use cases, and highlight how codesigning these components enables a more developer-friendly experience.

ExaWind: Exascale Predictive Wind Plant Flow Physics Modeling

Sprague, Michael; Ananthan, Shreyas; Binyahib, Roba; Brazell, Michael; De Frahan, Marc H.; King, Ryan A.; Mullowney, Paul; Rood, Jon; Sharma, Ashesh; Thomas, Stephen A.; Vijayakumar, Ganesh; Crozier, Paul C.; Berger-Vergiat, Luc B.; Cheung, Lawrence C.; Dement, David C.; deVelder, Nathaniel d.; Glaze, D.J.; Hu, Jonathan J.; Knaus, Robert C.; Lee, Dong H.; Matula, Neil M.; Okusanya, Tolulope O.; Overfelt, James R.; Rajamanickam, Sivasankaran R.; Sakievich, Philip S.; Smith, Timothy A.; Vo, Johnathan V.; Williams, Alan B.; Yamazaki, Ichitaro Y.; Turner, William J.; Prokopenko, Andrey; Wilson, Robert V.; Moser, Robert; Melvin, Jeremy; Sitaraman, Jay

Abstract not provided.
