GPU Performance of Algebraic Multigrid in Trilinos/MueLu
Abstract not provided.
Near-term solutions are needed to allow for flexible engagement in future nuclear arms control discussions. This project developed a method for implementing an information barrier (IB) on commercial systems, shortening the research and development lifecycle for warhead verification technologies while offering improved and inherently flexible capabilities. The crux of the verification challenge remains the difficulty of developing an authenticatable IB that prevents sensitive host country information from being inadvertently transmitted to an inspector. Many concepts for IBs rely on dedicated “trusted” processor modules developed with dedicated custom radiation detection systems and associated algorithms. Without a priori knowledge of the treaty item, the parameter space for measurements can be nearly infinite, and robustness against spoofing without the ability to view sensitive data is key. This project has produced an unclassified framework capable of ingesting data from common gamma detectors and identifying the presence of weapons grade nuclear material with over 90% accuracy.
The basic building block of a distributed-memory cluster or supercomputer is a node. Each node includes a host, which is a processor (xPU) + memory hierarchy. The host can communicate with other hosts via its NIC (network interface controller). A network connects the nodes. The nodes may be arranged in some topology, which determines the network’s carrying capacity and cost.
International Journal of Impact Engineering
ALEGRA is a multiphysics finite-element shock hydrodynamics code, under development at Sandia National Laboratories since 1990. Fully coupled multiphysics capabilities include transient magnetics, magnetohydrodynamics, electromechanics, and radiation transport. Importantly, ALEGRA is used to study hypervelocity impact, pulsed power devices, and radiation effects. The breadth of physics represented in ALEGRA is outlined here, along with simulated results for a selected hypervelocity impact experiment.
Proceedings - 2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2022
Centered on modern C++ and the SYCL standard for heterogeneous programming, Data Parallel C++ (DPC++) and Intel's oneAPI software ecosystem aim to lower the barrier to entry for the use of accelerators like FPGAs in diverse applications. In this work, we consider the usage of FPGAs for scientific computing, in particular with a multigrid solver, MueLu. We report on early experiences implementing kernels of the solver in DPC++ for execution on Stratix 10 FPGAs, and we evaluate several algorithmic design and implementation choices. These choices not only impact performance, but also shed light on the capabilities and limitations of DPC++ and oneAPI.
This work, building on previous efforts, develops a suite of new graph neural network machine learning architectures that generate data-driven prolongators for use in Algebraic Multigrid (AMG). Algebraic Multigrid is a powerful and common technique for solving large, sparse linear systems. Its effectiveness is problem dependent and heavily depends on the choice of the prolongation operator, which interpolates the coarse mesh results onto a finer mesh. Previous work has used recent developments in graph neural networks to learn a prolongation operator from a given coefficient matrix. In this paper, we expand on previous work by exploring architectural enhancements of graph neural networks. A new method for generating a training set is developed which more closely aligns to the test set. Asymptotic error reduction factors are compared on a test suite of 3-dimensional Poisson problems with varying degrees of element stretching. Results show modest improvements in asymptotic error factor over both commonly chosen baselines and learning methods from previous work.
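The role of the prolongator can be illustrated with a small two-grid experiment. The sketch below is not the learned prolongator from this work: it assumes a 1D Poisson matrix, a fixed linear-interpolation prolongator, a Galerkin coarse operator, and damped Jacobi smoothing, and it estimates the asymptotic error reduction factor from the ratio of successive error norms. All function names are hypothetical and chosen only for illustration.

```python
import random

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def matmul(A, B):
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (coarse-grid solve)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def poisson_1d(n):
    """Tridiagonal (-1, 2, -1) matrix; the 1/h^2 scaling is omitted."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def linear_prolongator(nc):
    """(2*nc+1) x nc linear interpolation; coarse point j sits at fine point 2j+1."""
    P = [[0.0] * nc for _ in range(2 * nc + 1)]
    for j in range(nc):
        P[2 * j][j] = 0.5
        P[2 * j + 1][j] = 1.0
        P[2 * j + 2][j] = 0.5
    return P

def jacobi(A, x, b, omega=2.0 / 3.0):
    """One sweep of damped Jacobi."""
    r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
    return [xi + omega * ri / A[i][i] for i, (xi, ri) in enumerate(zip(x, r))]

def two_grid(A, P, R, Ac, x, b):
    """Pre-smooth, coarse-grid correction through P, post-smooth."""
    x = jacobi(A, x, b)
    r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
    ec = solve(Ac, matvec(R, r))
    x = [xi + ci for xi, ci in zip(x, matvec(P, ec))]
    return jacobi(A, x, b)

def norm(x):
    return sum(xi * xi for xi in x) ** 0.5

nc = 7
n = 2 * nc + 1
A = poisson_1d(n)
P = linear_prolongator(nc)
R = transpose(P)
Ac = matmul(matmul(R, A), P)   # Galerkin coarse operator R A P

# Solve A x = 0 from a random start, so x itself is the error.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
b = [0.0] * n
norms = [norm(x)]
for _ in range(8):
    x = two_grid(A, P, R, Ac, x, b)
    norms.append(norm(x))

factor = norms[-1] / norms[-2]   # estimate of the asymptotic error reduction factor
print("estimated asymptotic error reduction factor:", factor)
```

Swapping `linear_prolongator` for a different interpolation changes only `P`, `R`, and `Ac`, which is why the measured factor is a convenient way to compare prolongators on the same matrix.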
As the number of supported platforms for SNL software increases, so do the testing requirements. This increases the total time between when a developer submits code for testing and when the tests complete. This in turn leads developers to hold off on submitting code for testing, meaning that when code is ready for testing there is a lot more of it. This increases the likelihood of merge conflicts, which the developer must resolve by hand because someone else touched the files near the lines the developer touched. Current text-based diff tools often have trouble resolving conflicts in these cases. Work in Europe and Japan has demonstrated that programming-language-aware diff tools (e.g., tools that use the abstract syntax tree (AST) a compiler might generate) can reduce the manual labor necessary to resolve merge conflicts. These techniques can detect code blocks which have moved, as opposed to current text-based diff tools, which only detect insertions and deletions of text blocks. In this study, we evaluate one such tool, GumTree, and see how effective it is as a replacement for traditional text-based diff approaches.
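The difference between a text-based diff and a syntax-aware comparison can be seen on a toy case. The sketch below is not GumTree and does not reproduce its matching algorithm; it is a minimal illustration that assumes top-level Python functions as the unit of comparison and uses the standard `ast` and `difflib` modules. The sources `OLD` and `NEW` and the helper `function_fingerprints` are hypothetical.

```python
import ast
import difflib

OLD = '''
def alpha(x):
    return x + 1

def beta(y):
    return y * 2
'''

NEW = '''
def beta(y):
    return y * 2

def alpha(x):
    return x + 1
'''

def function_fingerprints(source):
    """Map each top-level function name to (position-free AST dump, line number)."""
    tree = ast.parse(source)
    return {
        node.name: (ast.dump(node), node.lineno)  # dump omits line/column info by default
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }

old_funcs = function_fingerprints(OLD)
new_funcs = function_fingerprints(NEW)

# A function counts as "moved" if its AST is unchanged but its position differs.
moved = sorted(
    name for name in old_funcs.keys() & new_funcs.keys()
    if old_funcs[name][0] == new_funcs[name][0]
    and old_funcs[name][1] != new_funcs[name][1]
)

# A line-based diff of the same inputs reports insertions and deletions instead.
diff_lines = list(difflib.unified_diff(OLD.splitlines(), NEW.splitlines(), lineterm=""))
text_changes = sum(
    1 for line in diff_lines
    if line[:1] in "+-" and not line.startswith(("+++", "---"))
)

print("moved functions:", moved)
print("lines reported as inserted or deleted by the text diff:", text_changes)
```

Here the syntax-aware view sees two unchanged functions in a new order, while the text diff reports a set of unrelated-looking insertions and deletions.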
A verification study is conducted for the ALEGRA software, using the problem of an electrified medium with a spherical inclusion, paying special attention to resistive heating. We do so by extending an existing analytic solution for this problem to include both conducting and insulating inclusions, and we examine the effects of mesh resolution and mesh topology, considering both body-fitted and rectangular meshes containing mixed cells. We present observed rates of convergence with respect to mesh refinement for four electromagnetic quantities: electric potential, electric field, current density and Joule power.
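An observed rate of convergence is commonly computed from errors on two successively refined meshes as p = log(e_coarse / e_fine) / log(h_coarse / h_fine). The sketch below applies that formula to made-up data; the mesh sizes and error values are hypothetical and are not results from the study.

```python
import math

def observed_rates(mesh_sizes, errors):
    """Observed order of convergence between each pair of successive mesh levels."""
    rates = []
    for i in range(len(mesh_sizes) - 1):
        error_ratio = errors[i] / errors[i + 1]
        size_ratio = mesh_sizes[i] / mesh_sizes[i + 1]
        rates.append(math.log(error_ratio) / math.log(size_ratio))
    return rates

# Hypothetical data constructed so that the error scales like h^2.
mesh_sizes = [0.1, 0.05, 0.025]
errors = [4.0e-3, 1.0e-3, 2.5e-4]

rates = observed_rates(mesh_sizes, errors)
print("observed rates:", rates)
```

In a study with several quantities, the same function would be called once per quantity, each with its own error history.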
We show here that Sandia's ALEGRA software can be used to model a permanent magnet in 2D and 3D, with accuracy matching that of the open-source commercial software FEMM. This is done by conducting simulations and experimental measurements for a commercial-grade N42 neodymium alloy ring magnet with a measured magnetic field strength of approximately 0.4 T in its immediate vicinity. Transient simulations using ALEGRA and static simulations using FEMM are conducted. Comparisons are made between simulations and measurements, and amongst the simulations, for sample locations in the steady-state magnetic field. The comparisons show that all models capture the data to within 7%. The FEMM and ALEGRA results agree to within approximately 2%. The most accurate solutions in ALEGRA are obtained using quadrilateral or hexahedral elements. In the case where iron shielding disks are included in the magnetized space, ALEGRA simulations are considerably more expensive because of the increased magnetic diffusion time, but FEMM and ALEGRA results are still in agreement. The magnetic field data are portable to other software interfaces using the Exodus file format.
The eddy current approximation to Maxwell's equations often omits terms associated with magnetization, removing permanent magnets from the domain of validity of the approximation. We show that adding these terms back into the eddy current approximation is relatively straightforward, and demonstrate this using a simple material constitutive model.
This is the official user guide for the MUELU multigrid library in Trilinos version 12.13 (Dev). This guide provides an overview of MUELU, its capabilities, and instructions for new users who want to start using MUELU with a minimum of effort. Detailed information is given on how to drive MUELU through its XML interface. Links to more advanced use cases are given. This guide gives information on how to achieve good parallel performance, as well as how to introduce new algorithms. Finally, readers will find a comprehensive listing of available MUELU options. Any options not documented in this manual should be considered strictly experimental.
Journal of Computational Physics
High resolution simulation of viscous fingering can offer an accurate and detailed prediction for subsurface engineering processes involving fingering phenomena. The fully implicit discontinuous Galerkin (DG) method has been shown to be an accurate and stable method to model viscous fingering with high Peclet number and mobility ratio. In this paper, we present two techniques to speed up large scale simulations of this kind. The first technique relies on a simple p-adaptive scheme in which high order basis functions are employed only in elements near the finger fronts where the concentration has a sharp change. As a result, the number of degrees of freedom is significantly reduced and the simulation yields almost identical results to the more expensive simulation with uniform high order elements throughout the mesh. The second technique improves the solver efficiency. We present an algebraic multigrid (AMG) preconditioner which allows the DG matrix to leverage the robust AMG preconditioner designed for the continuous Galerkin (CG) finite element method. The resulting preconditioner works effectively for fixed order DG as well as p-adaptive DG problems. With the improvements provided by the p-adaptivity and AMG preconditioning, we can perform high resolution three-dimensional viscous fingering simulations required for miscible displacement with high Peclet number and mobility ratio in greater detail than before for well injection problems.
This report documents the outcome from the ASC ATDM Level 2 Milestone 6358: Assess Status of Next Generation Components and Physics Models in EMPIRE. This Milestone is an assessment of the EMPIRE (ElectroMagnetic Plasma In Realistic Environments) application and three software components. The assessment focuses on the electromagnetic and electrostatic particle-in-cell solutions for EMPIRE and its associated solver, time integration, and checkpoint-restart components. This information provides a clear understanding of the current status of the EMPIRE application and will help to guide future work in FY19 in order to ready the application for the ASC ATDM L1 Milestone in FY20. It is clear from this assessment that performance of the linear solver will have to be a focus in FY19.