Lessons Learned in Performance and Portability
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Efficient modeling of low magnetic Reynolds number (low-Rm) magnetohydrodynamics is often challenging and requires the implementation of innovative techniques to avoid key barriers experienced with prior approaches. We detail a new paradigm for first-principles simulation of the solution to the low-Rm governing equations in complex geometries. As a result of a number of innovative numerical advances, the next-generation GPU (graphics processing unit) accelerated physics code LGR has been successfully applied to the modeling of exploding wire problems.
Abstract not provided.
Abstract not provided.
IEEE Transactions on Parallel and Distributed Systems
As the push towards exascale hardware has increased the diversity of system architectures, performance portability has become a critical aspect for scientific software. We describe the Kokkos Performance Portable Programming Model that allows developers to write single source applications for diverse high performance computing architectures. Kokkos provides key abstractions for both the compute and memory hierarchy of modern hardware. Here, we describe the novel abstractions that have been added to Kokkos recently such as hierarchical parallelism, containers, task graphs, and arbitrary-sized atomic operations. We demonstrate the performance of these new features with reproducible benchmarks on CPUs and GPUs.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Tetrahedral finite element workflows have the potential to drastically reduce time to solution for computational solid mechanics simulations when compared to traditional hexahedral finite element analogues. A recently developed, higher-order composite tetrahedral element has shown promise in the space of incompressible computational plasticity. Mesh adaptivity has the potential to increase solution accuracy and increase solution robustness. In this work, we demonstrate an initial strategy to perform conformal mesh adaptivity for this higher-order composite tetrahedral element using well-established mesh modification operations for linear tetrahedra. We propose potential extensions to improve this initial strategy in terms of robustness and accuracy.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This report documents the completion of milestone STPRO4-6 Kokkos Support for ASC applications and libraries. The team provided consultation and support for numerous ASC code projects including Sandias SPARC, EMPIRE, Aria, GEMMA, Alexa, Trilinos, LAMMPS and nimbleSM. Over the year more than 350 Kokkos github issues were resolved, with over 220 requiring fixes and enhancements to the code base. Resolving these requests, with many of them issued by ASC code teams, provided applications with the necessary capabilities in Kokkos to be successful.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Molecular Physics
Simulating energetic materials with complex microstructure is a grand challenge, where until recently, an inherent gap in computational capabilities had existed in modelling grain-scale effects at the microscale. We have enabled a critical capability in modelling the multiscale nature of the energy release and propagation mechanisms in advanced energetic materials by implementing, in the widely used LAMMPS molecular dynamics (MD) package, several novel coarse-graining techniques that also treat chemical reactivity. Our innovative algorithmic developments rooted within the dissipative particle dynamics framework, along with performance optimisations and application of acceleration technologies, have enabled extensions in both the length and time scales far beyond those ever realised by atomistic reactive MD simulations. In this paper, we demonstrate these advances by modelling a shockwave propagating through a microstructured material and comparing performance with the state-of-the-art in atomistic reactive MD techniques. As a result of this work, unparalleled explorations in energetic materials research are now possible.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Supercomputing hardware is undergoing a period of significant change. In order to cope with the rapid pace of hardware and, in many cases, programming model innovation, we have developed the Kokkos Programming Model – a C++-based abstraction that permits performance portability across diverse architectures. Our experience has shown that the abstractions developed can significantly frustrate debugging and profiling activities because they break expected code proximity and layout assumptions. In this paper we present the Kokkos Profiling interface, a lightweight, suite of hooks to which debugging and profiling tools can attach to gain deep insights into the execution and data structure behaviors of parallel programs written to the Kokkos interface.
AIAA Aerospace Sciences Meeting, 2018
Unstructured grid adaptation is a tool to control Computational Fluid Dynamics (CFD) discretization error. However, adaptive grid techniques have made limited impact on production analysis workflows where the control of discretization error is critical to obtaining reliable simulation results. Issues that prevent the use of adaptive grid methods are identified by applying unstructured grid adaptation methods to a series of benchmark cases. Once identified, these challenges to existing adaptive workflows can be addressed. Unstructured grid adaptation is evaluated for test cases described on the Turbulence Modeling Resource (TMR) web site, which documents uniform grid refinement of multiple schemes. The cases are turbulent flow over a Hemisphere Cylinder and an ONERA M6 Wing. Adaptive grid force and moment trajectories are shown for three integrated grid adaptation processes with Mach interpolation control and output error based metrics. The integrated grid adaptation process with a finite element (FE) discretization produced results consistent with uniform grid refinement of fixed grids. The integrated grid adaptation processes with finite volume schemes were slower to converge to the reference solution than the FE method. Metric conformity is documented on grid/metric snapshots for five grid adaptation mechanics implementations. These tools produce anisotropic boundary conforming grids requested by the adaptation process.
Abstract not provided.
Abstract not provided.
This report documents the ASC/ATDM Kokkos deliverable "Production Portable Dy- namic Task DAG Capability." This capability enables applications to create and execute a dynamic task DAG ; a collection of heterogeneous computational tasks with a directed acyclic graph (DAG) of "execute after" dependencies where tasks and their dependencies are dynamically created and destroyed as tasks execute. The Kokkos task scheduler executes the dynamic task DAG on the target execution resource; e.g. a multicore CPU, a manycore CPU such as Intel's Knights Landing (KNL), or an NVIDIA GPU. Several major technical challenges had to be addressed during development of Kokkos' Task DAG capability: (1) portability to a GPU with it's simplified hardware and micro- runtime, (2) thread-scalable memory allocation and deallocation from a bounded pool of memory, (3) thread-scalable scheduler for dynamic task DAG, (4) usability by applications.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.