Publications

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Sandia ATDM Performance Execution Tools & Analysis

Hammond, Simon; Vaughan, Courtenay T.; Dinge, Dennis; Lin, Paul T.; Benner, Robert E.; Hughes, Clayton; Trott, Christian R.; Cook, Jeanine; Hoekstra, Robert J.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Kokkos: Hierarchical Parallelism

Hammond, Simon; Trott, Christian R.; Ibanez-Granados, Daniel A.; Edwards, Harold C.; Sunderland, Daniel; Ellingwood, Nathan D.; Brandt, James M.; Gentile, Ann C.; Cook, Jeanine; Hoekstra, Robert J.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Enhanced Profiling for Kokkos Applications

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Threaded Assembly in Aria Expressions

Clausen, Jonathan; Brunini, Victor; Forster, Chris; Noble, David R.; Hoemmen, Mark F.; Hammond, Simon; Trott, Christian R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Towards Performance Portable Assembly Tools for Multi-Fluid Plasma Simulation

Pawlowski, Roger; Bettencourt, Matthew T.; Cyr, Eric C.; Miller, Sean; Phillips, Edward; Phipps, Eric T.; Shadid, John N.; Trott, Christian R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Profiling and Debugging Support for the Kokkos Programming Model

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Hammond, Simon; Trott, Christian R.; Ibanez-Granados, Daniel A.; Sunderland, Daniel

Supercomputing hardware is undergoing a period of significant change. In order to cope with the rapid pace of hardware and, in many cases, programming model innovation, we have developed the Kokkos Programming Model – a C++-based abstraction that permits performance portability across diverse architectures. Our experience has shown that the abstractions developed can significantly frustrate debugging and profiling activities because they break expected code proximity and layout assumptions. In this paper we present the Kokkos Profiling interface, a lightweight suite of hooks to which debugging and profiling tools can attach to gain deep insights into the execution and data structure behaviors of parallel programs written to the Kokkos interface.

More Details

TYPE Conference Poster YEAR 2018

DOI OSTI Scopus

Multi-threaded Sparse Matrix Sparse Matrix Multiplication for Many-Core and GPU Architectures

Deveci, Mehmet; Trott, Christian R.; Rajamanickam, Sivasankaran

Sparse matrix-matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depends on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two-phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.
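The two-phase idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes CSR inputs stored as plain Python lists and uses a per-row dictionary as a stand-in for a hash-map accumulator. The function names `spgemm_symbolic` and `spgemm_numeric` are hypothetical and are not the kkSpGEMM or KokkosKernels API; the sketch is serial and only shows why separating the pattern computation from the value computation allows the pattern to be reused.

```python
def spgemm_symbolic(n_rows, a_ptr, a_idx, b_ptr, b_idx):
    """Phase 1 (symbolic): compute only the sparsity pattern of C = A * B.

    Returns CSR row pointers and sorted column indices for C.
    """
    c_ptr, c_idx = [0], []
    for i in range(n_rows):
        cols = set()
        # Row i of C touches every column present in the rows of B
        # selected by the nonzero columns of row i of A.
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]
            cols.update(b_idx[b_ptr[k]:b_ptr[k + 1]])
        c_idx.extend(sorted(cols))
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx


def spgemm_numeric(n_rows, a_ptr, a_idx, a_val, b_ptr, b_idx, b_val, c_ptr, c_idx):
    """Phase 2 (numeric): fill the values of C for an already known pattern.

    A dictionary per row plays the role of a hash-map accumulator.
    """
    c_val = [0.0] * len(c_idx)
    for i in range(n_rows):
        acc = {}
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                acc[j] = acc.get(j, 0.0) + a_val[p] * b_val[q]
        # Scatter the accumulated row into the precomputed pattern.
        for p in range(c_ptr[i], c_ptr[i + 1]):
            c_val[p] = acc[c_idx[p]]
    return c_val
```

When the matrix values change but the structure does not, only `spgemm_numeric` needs to be rerun against the stored `c_ptr` and `c_idx`. This is the kind of reuse the abstract argues for.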

More Details

TYPE Other Report YEAR 2017

DOI OSTI

The Kokkos Programming Model

Trott, Christian R.; Bova, Steven W.; Ellingwood, Nathan D.; Ibanez-Granados, Daniel A.; Labreche, Duane A.; Sunderland, Daniel; Edwards, Harold C.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Kokkoskernels: Portable Math and Graph Kernels

Rajamanickam, Sivasankaran; Kim, Kyungjoo; Deveci, Mehmet; Trott, Christian R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

December 2017 ECP ST Project Review ECP Project WBS 2.3.1.04 : SNL ATDM PMR

Wilke, Jeremiah; Trott, Christian R.; Edwards, Harold C.; Glass, Micheal W.; Clay, Robert L.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

A Brief Description of the Kokkos implementation of the SNAP potential in ExaMiniMD

Thompson, A.P.; Trott, Christian R.

Within the EXAALT project, the SNAP [1] approach is being used to develop high accuracy potentials for use in large-scale long-time molecular dynamics simulations of materials behavior. In particular, we have developed a new SNAP potential that is suitable for describing the interplay between helium atoms and vacancies in high-temperature tungsten [2]. This model is now being used to study plasma-surface interactions in nuclear fusion reactors for energy production. The high accuracy of SNAP potentials comes at the price of increased computational cost per atom and increased computational complexity. The increased cost is mitigated by improvements in strong scaling that can be achieved using advanced algorithms [3].

More Details

TYPE Other Report YEAR 2017

DOI OSTI

Applications of Compact Batched Kernels

Rajamanickam, Sivasankaran; Bradley, Andrew M.; Deveci, Mehmet; Kim, Kyungjoo; Trott, Christian R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Kokkoskernels

Rajamanickam, Sivasankaran; Bradley, Andrew M.; Deveci, Mehmet; Kim, Kyungjoo; Trott, Christian R.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

KokkosKernels: Performance-Portable Sparse Dense and Graph Kernels

Rajamanickam, Sivasankaran; Bradley, Andrew M.; Deveci, Mehmet; Hoemmen, Mark F.; Hammond, Simon; Kim, Kyungjoo; Trott, Christian R.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

Kokkos Tutorial

Edwards, Harold C.; Trott, Christian R.; Foertter, Fernanda

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

1541 L2 Milestone: Thread Scalable Expression Assembly in Aria

Clausen, Jonathan; Brunini, Victor; Forster, Christopher J.; Noble, David R.; Trott, Christian R.; Hammond, Simon; Hoemmen, Mark F.; Lin, Paul T.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

Performance Portable Sparse Matrix Matrix Multiplication with Applications in Scientific Computing and Graph Analytics

Deveci, Mehmet; Trott, Christian R.; Hammond, Simon; Wolf, Michael; Berry, Jonathan; Rajamanickam, Sivasankaran

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

Solving the performance portability issue with Kokkos

Trott, Christian R.; Plimpton, Steven J.; Thompson, A.P.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

On the Importance of Faster Atomics

Hammond, Simon; Trott, Christian R.; Edwards, Harold C.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Revisiting Online Autotuning for Sparse-Matrix Vector Multiplication Kernels on Next-Generation Architectures

Garcia De Gonzalo, Simon; Hammond, Simon; Trott, Christian R.; Hwu, Wen-Mei

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Prototyping the Next Generation of Aria

Clausen, Jonathan; Brunini, Victor; Forster, Christopher J.; Noble, David R.; Trott, Christian R.; Hammond, Simon; Hoemmen, Mark F.; Lin, Paul T.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

A Classical MD Primer

Deveci, Mehmet; Trott, Christian R.; Rajamanickam, Sivasankaran

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

Performance-portable sparse matrix-matrix multiplication for many-core architectures

Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017

We consider the problem of writing performance portable sparse matrix-sparse matrix multiplication (SPGEMM) kernel for many-core architectures. We approach the SPGEMM kernel from the perspectives of algorithm design and implementation, and its practical usage. First, we design a hierarchical, memory-efficient SPGEMM algorithm. We then design and implement thread scalable data structures that enable us to develop a portable SPGEMM implementation. We show that the method achieves performance portability on massively threaded architectures, namely Intel's Knights Landing processors (KNLs) and NVIDIA's Graphic Processing Units (GPUs), by comparing its performance to specialized implementations. Second, we study an important aspect of SPGEMM's usage in practice by reusing the structure of input matrices, and show speedups up to 3× compared to the best specialized implementation on KNLs. We demonstrate that the portable method outperforms 4 native methods on 2 different GPU architectures (up to 17× speedup), and it is highly thread scalable on KNLs, in which it obtains 101× speedup on 256 threads.

More Details

TYPE Conference Poster YEAR 2017

DOI OSTI Scopus

OpenACC for Programmers: Concepts and Strategies

Hammond, Simon; Trott, Christian R.

Abstract not provided.

More Details

TYPE Book YEAR 2017

OSTI

Optimizing the Performance of Sparse-Matrix Vector Products on Next-Generation Processors

Matrix-vector products are ubiquitous in high-performance scientific applications and have a growing set of occurrences in advanced data analysis activities. Achieving high performance for these kernels is therefore paramount, in part, because these operations can consume vast amounts of application execution time. In this report we document the development of several sparse-matrix vector product kernel implementations using a variety of programming models and approaches. Each kernel is run on a broad set of matrices selected to demonstrate the wide variety of matrix structure and sparsity that is possible with a single, generic kernel. For benchmarking and performance analysis, we utilize leading computing architectures for the NNSA/ASC program including Intel's Knights Landing processor and IBM's POWER8.

More Details

TYPE SAND Report YEAR 2017

DOI OSTI

Kokkos Tutorial

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Profiling Kokkos Application

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

Kokkos: The C++ Performance Portability Programming Model

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Next Generation Science Applications for the Next Generation of Supercomputing

Vaughan, Courtenay T.; Hammond, Simon; Dinge, Dennis; Lin, Paul T.; Pase, Douglas M.; Cook, Jeanine; Trott, Christian R.; Hughes, Clayton; Hoekstra, Robert J.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Next Generation Science Applications for the Next Generation of Supercomputing

Vaughan, Courtenay T.; Hammond, Simon; Dinge, Dennis; Lin, Paul T.; Pase, Douglas M.; Trott, Christian R.; Cook, Jeanine; Hughes, Clayton; Hoekstra, Robert J.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Enabling Low Mach Fluid Simulations Using Trilinos

Hu, Jonathan J.; Devine, Karen; Hoemmen, Mark F.; Lin, Paul T.; Rajamanickam, Sivasankaran; Roberts, Nathan V.; Siefert, Christopher; Trott, Christian R.; Prokopenko, Andrey

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Kokkos: Performance Portability Status

Sunderland, Daniel; Edwards, Harold C.; Trott, Christian R.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

Revisiting Online Autotuning for Sparse-Matrix Vector Multiplication Kernels on High-Performance Accelerators

Garcia De Gonzalo, Simon; Hwu, Wen-Mei; Hammond, Simon; Trott, Christian R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Extending Kokkos with Task Parallelism

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

KokkosKernels: Compact Layouts for Batched Blas and Sparse Matrix-Matrix multiply

Rajamanickam, Sivasankaran; Bradley, Andrew M.; Kim, Kyungjoo; Deveci, Mehmet; Trott, Christian R.; Hammond, Simon

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Performance Issues for Modeling Materials via MD on Current and Future Hardware

Plimpton, Steven J.; Moore, Stan G.; Trott, Christian R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Preparing Sandia's Application Portfolio for the Future Using Kokkos

Trott, Christian R.; Edwards, Harold C.; Hammond, Simon; Sunderland, Daniel

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Codesign for Production Applications

Hammond, Simon; Trott, Christian R.; Vaughan, Courtenay T.; Dinge, Dennis; Lin, Paul T.; Pase, Douglas M.; Benner, Robert E.; Cook, Jeanine; Hoekstra, Robert J.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2017

OSTI

Kokkos: Performance Portability for C++ Codes

Brunini, Victor; Clausen, Jonathan; Noble, David R.; Forster, Christopher J.; Trott, Christian R.; Hammond, Simon; Hoemmen, Mark F.; Lin, Paul T.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Prototyping the Next-Generation of Aria

Abstract not provided.

More Details

TYPE Presentation YEAR 2016

OSTI

Kokkos: Performance Portability and Productivity for C++ Applications

Edwards, Harold C.; Trott, Christian R.; Sunderland, Daniel

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Early Experience with P100 on Power8

Rajamanickam, Sivasankaran; Deveci, Mehmet; Trott, Christian R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Performance Portable Sparse Matrix-Matrix Multiplication on Intel Knights Landing and NVIDIA GPUs

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

KokkosKernels Introduction: Design API and Performance

Deveci, Mehmet; Rajamanickam, Sivasankaran; Kim, Kyungjoo; Bradley, Andrew M.; Trott, Christian R.; Hoemmen, Mark F.; Boman, Erik G.

Abstract not provided.

More Details

TYPE Presentation YEAR 2016

OSTI

Kokkos Technical Review Slides and Discussion Notes

Edwards, Harold C.; Sunderland, Daniel; Hoemmen, Mark F.; Ellingwood, Nathan D.; Trott, Christian R.; Mackey, Greg E.

Abstract not provided.

More Details

TYPE Presentation YEAR 2016

OSTI

Kokkos Tutorial

Edwards, Harold C.; Trott, Christian R.; Amelang, Jeff

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

ASC Tri-Lab L2 Codesign Milestone 2016 - Update and Milestone Summary

Hammond, Simon; Trott, Christian R.; Lin, Paul T.; Vaughan, Courtenay T.; Cook, Jeanine; Dinge, Dennis; Pase, Douglas M.; Benner, Robert E.; Hoekstra, Robert J.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Pathfinding: From Technology Exploration to Application Support

Trott, Christian R.; Hammond, Simon

Abstract not provided.

More Details

TYPE Presentation YEAR 2016

OSTI

Early Experiences with Trinity - The First Advanced Technology Platform for the ASC Program

Vaughan, Courtenay T.; Dinge, Dennis; Lin, Paul T.; Hammond, Simon; Cook, Jeanine; Trott, Christian R.; Agelastos, Anthony M.; Pase, Douglas M.; Benner, Robert E.; Rajan, Mahesh; Hoekstra, Robert J.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Sustainability and Performance through Kokkos: A Case Study with LAMMPS

Trott, Christian R.; Hammond, Simon; Moore, Stan G.; Shan, Tzu-Ray

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Early Experiences with Trinity - The First Advanced Technology Platform for the ASC Program

Vaughan, Courtenay T.; Dinge, Dennis; Lin, Paul T.; Hammond, Simon; Cook, Jeanine; Trott, Christian R.; Agelastos, Anthony M.; Pase, Douglas M.; Benner, Robert E.; Rajan, Mahesh; Hoekstra, Robert J.; Pierson, Kendall H.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Kokkos - Performance Portability Today

Trott, Christian R.; Hammond, Simon; Edwards, Harold C.; Ellingwood, Nathan D.

Abstract not provided.

More Details

TYPE Presentation YEAR 2016

OSTI

KokkosP: Runtime Hooks for Portable Performance Analysis

Hammond, Simon; Trott, Christian R.; Edwards, Harold C.; Ellingwood, Nathan D.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Multi-Level Memory - The Next Opportunity for Performance?

Hammond, Simon; Voskuilen, Gwendolyn R.; Rodrigues, Arun; Hemmert, Karl S.; Trott, Christian R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Performance Portability for Linear Algebra with Kokkos

Trott, Christian R.; Edwards, Harold C.; Ellingwood, Nathan D.; Hammond, Simon; Deveci, Mehmet; Boman, Erik G.; Bradley, Andrew M.; Hoemmen, Mark F.; Rajamanickam, Sivasankaran

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Kokkos Tutorial

Edwards, Harold C.; Trott, Christian R.; Amelang, Jeff

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Kokkos: Manycore Programmability and Performance Portability

Trott, Christian R.; Edwards, Harold C.; Ellingwood, Nathan D.; Hammond, Simon

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Kokkos -- Portability Performance Productivity [PowerPoint]

Abstract not provided.

More Details

TYPE Presentation YEAR 2016

OSTI

NALU Assembly - Prototyping the NGP Transition

Trott, Christian R.; Domino, Stefan P.; Brunini, Victor; Lin, Paul T.; Agelastos, Anthony M.; Fisher, Travis C.; Hammond, Simon

Abstract not provided.

More Details

TYPE Presentation YEAR 2016

OSTI

Maintainability and Performance for LAMMPS

Trott, Christian R.; Shan, Tzu-Ray; Moore, Stan G.; Thompson, A.P.; Plimpton, Steven J.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

Kokkos Tutorial

Edwards, Harold C.; Trott, Christian R.; Amelang, Jeff

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

Codesign at Sandia: LULESH and MiniAero

Trott, Christian R.; Hammond, Simon

Abstract not provided.

More Details

TYPE Presentation YEAR 2015

OSTI

Maintainability and Performance for LAMMPS

Trott, Christian R.; Thompson, A.P.; Plimpton, Steven J.; Moore, Stan G.; Shan, Tzu-Ray

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

Kokkos: Enabling Performance Portability

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

ASC Trilab L2 Codesign Milestone 2015

Trott, Christian R.; Hammond, Simon; Dinge, Dennis; Lin, Paul T.; Vaughan, Courtenay T.; Cook, Jeanine; Rajan, Mahesh; Edwards, Harold C.; Hoekstra, Robert J.

For the FY15 ASC L2 Trilab Codesign milestone Sandia National Laboratories performed two main studies. The first study investigated three topics (performance, cross-platform portability and programmer productivity) when using OpenMP directives and the RAJA and Kokkos programming models available from LLNL and SNL respectively. The focus of this first study was the LULESH mini-application developed and maintained by LLNL. In the coming sections of the report the reader will find performance comparisons (and a demonstration of portability) for a variety of mini-application implementations produced during this study with varying levels of optimization. Of note is that the implementations utilized include optimizations across a number of programming models to help ensure claims that Kokkos can provide native-class application performance are valid. The second study performed during FY15 is a performance assessment of the MiniAero mini-application developed by Sandia. This mini-application was developed by the SIERRA Thermal-Fluid team at Sandia for the purposes of learning the Kokkos programming model and so is available in only a single implementation. For this report we studied its performance and scaling on a number of machines with the intent of providing insight into potential performance issues that may be experienced when similar algorithms are deployed on the forthcoming Trinity ASC ATS platform.

More Details

TYPE SAND Report YEAR 2015

DOI OSTI

Kokkos: An Introduction

Trott, Christian R.; Edwards, Harold C.; Hammond, Simon

Abstract not provided.

More Details

TYPE Presentation YEAR 2015

OSTI

Proxy App Usecases at Sandia

Abstract not provided.

More Details

TYPE Presentation YEAR 2015

OSTI

Kokkos: Enabling Manycore Performance Portability for C++ Applications and Libraries

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

ASC L2 Trilab Codesign Milestone (Codesign at Sandia: LULESH and MiniAero)

Cook, Jeanine; Edwards, Harold C.; Dinge, Dennis; Glass, Micheal W.; Hammond, Simon; Hoekstra, Robert J.; Lin, Paul T.; Rajan, Mahesh; Trott, Christian R.; Vaughan, Courtenay T.

Abstract not provided.

More Details

TYPE Presentation YEAR 2015

OSTI

Kokkos: Enabling Performance Portability Across Next Generation Platforms

Abstract not provided.

More Details

TYPE Presentation YEAR 2015

OSTI

Use of C++11 Features in Kokkos

Abstract not provided.

More Details

TYPE Presentation YEAR 2015

OSTI

Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials

Journal of Computational Physics

Thompson, A.P.; Swiler, Laura P.; Trott, Christian R.; Foiles, Stephen M.; Tucker, G.J.

We present a new interatomic potential for solids and liquids called Spectral Neighbor Analysis Potential (SNAP). The SNAP potential has a very general form and uses machine-learning techniques to reproduce the energies, forces, and stress tensors of a large set of small configurations of atoms, which are obtained using high-accuracy quantum electronic structure (QM) calculations. The local environment of each atom is characterized by a set of bispectrum components of the local neighbor density projected onto a basis of hyperspherical harmonics in four dimensions. The bispectrum components are the same bond-orientational order parameters employed by the GAP potential [1]. The SNAP potential, unlike GAP, assumes a linear relationship between atom energy and bispectrum components. The linear SNAP coefficients are determined using weighted least-squares linear regression against the full QM training set. This allows the SNAP potential to be fit in a robust, automated manner to large QM data sets using many bispectrum components. The calculation of the bispectrum components and the SNAP potential are implemented in the LAMMPS parallel molecular dynamics code. We demonstrate that a previously unnoticed symmetry property can be exploited to reduce the computational cost of the force calculations by more than one order of magnitude. We present results for a SNAP potential for tantalum, showing that it accurately reproduces a range of commonly calculated properties of both the crystalline solid and the liquid phases. In addition, unlike simpler existing potentials, SNAP correctly predicts the energy barrier for screw dislocation migration in BCC tantalum.
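The linear form and the weighted least-squares fit described in this abstract can be written schematically as follows. This is a simplified sketch in assumed notation, not necessarily the notation of the paper: $B_k^{i}$ is the $k$-th bispectrum component of atom $i$, $\beta_k$ are the linear coefficients, $A$ is the matrix of descriptor-derived rows for the QM training data, $y$ holds the corresponding QM reference values, and $W$ is a diagonal matrix of weights $w_m$.

```latex
% Per-atom energy as a linear function of K bispectrum components
E_i = \beta_0 + \sum_{k=1}^{K} \beta_k \, B_k^{i}

% Weighted least-squares fit of the coefficients to the training data
\hat{\beta} = \arg\min_{\beta} \sum_{m} w_m \left( y_m - a_m^{\mathsf{T}} \beta \right)^2
            = \left( A^{\mathsf{T}} W A \right)^{-1} A^{\mathsf{T}} W \, y
```

The closed-form expression on the last line holds when $A^{\mathsf{T}} W A$ is invertible. Because the model is linear in $\beta$, the fit reduces to one linear solve, which is what makes automated fitting to large QM data sets practical.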

More Details

TYPE Journal Article YEAR 2015

DOI OSTI Scopus

Kokkos: An Introduction

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

Kokkos Manycore Device Performance Portability for C++ HPC Applications

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

ASC-ATDM Performance Portability Requirements for 2015-2019

This report outlines the research, development, and support requirements for the Advanced Simulation and Computing (ASC) Advanced Technology, Development, and Mitigation (ATDM) Performance Portability (a.k.a., Kokkos) project for 2015-2019. The research and development (R&D) goal for Kokkos (v2) has been to create and demonstrate a thread-parallel programming model and standard C++ library-based implementation that enables performance portability across diverse manycore architectures such as multicore CPU, Intel Xeon Phi, and NVIDIA Kepler GPU. This R&D goal has been achieved for algorithms that use data parallel patterns including parallel-for, parallel-reduce, and parallel-scan. Current R&D is focusing on hierarchical parallel patterns such as a directed acyclic graph (DAG) of asynchronous tasks where each task contains nested data parallel algorithms. This five-year plan includes R&D required to fully and performance portably exploit thread parallelism across current and anticipated next generation platforms (NGP). The Kokkos library is being evaluated by many projects exploring algorithms and code design for NGP. Some production libraries and applications such as Trilinos and LAMMPS have already committed to Kokkos as their foundation for manycore parallelism and performance portability. These five-year requirements include support required for current and anticipated ASC projects to be effective and productive in their use of Kokkos on NGP. The greatest risk to the success of Kokkos and ASC projects relying upon Kokkos is a lack of staffing resources to support Kokkos to the degree needed by these ASC projects. This support includes up-to-date tutorials, documentation, multi-platform (hardware and software stack) testing, minor feature enhancements, thread-scalable algorithm consulting, and managing collaborative R&D.

More Details

TYPE SAND Report YEAR 2015

DOI OSTI

A study of the viability of exploiting memory content similarity to improve resilience to memory errors

International Journal of High Performance Computing Applications

Levy, Scott; Ferreira, Kurt; Bridges, Patrick G.; Thompson, A.P.; Trott, Christian R.

Building the next generation of extreme-scale distributed systems will require overcoming several challenges related to system resilience. As the number of processors in these systems grows, the failure rate increases proportionally. One of the most common sources of failure in large-scale systems is memory. In this paper, we propose a novel runtime for transparently exploiting memory content similarity to improve system resilience by reducing the rate at which memory errors lead to node failure. We evaluate the viability of this approach by examining memory snapshots collected from eight high-performance computing (HPC) applications and two important HPC operating systems. Based on the characteristics of the similarity uncovered, we conclude that our proposed approach shows promise for addressing system resilience in large-scale systems.

More Details

TYPE Journal Article YEAR 2015

Scopus OSTI DOI

Algorithms and Abstractions for Assembly in PDE Codes: Workshop Report

Cyr, Eric C.; Phipps, Eric T.; Heroux, Michael A.; Brown, Jed; Coon, Ethan T.; Hoemmen, Mark F.; Kirby, Robert C.; Kolev, Tzanio V.; Sutherland, James C.; Trott, Christian R.

The emergence of high-concurrency architectures offering unprecedented performance has brought many high-performance partial differential equation (PDE) discretization codes to the precipice of a major refactor. To help address this challenge a workshop titled "Algorithms and Abstractions for Assembly in PDE Codes" was held in the Computer Science Research Institute at Sandia National Laboratories on May 12th-14th, 2014. This document summarizes the goals of the workshop and the results of the presentations and subsequent discussions.

More Details

TYPE SAND Report YEAR 2014

DOI OSTI

Designing the Future: How Successful Codesign Helps Shape Hardware and Software Development

Demeshko, Irina; Edwards, Harold C.; Heroux, Michael A.; Pawlowski, Roger; Phipps, Eric T.; Salinger, Andrew G.; Trott, Christian R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2014

OSTI

Towards Architecture Aware Performance Portable Finite Element Code

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2014

OSTI

Trinity Benchmarks on Xeon Phi (Knights Corner)

Rajan, Mahesh; Doerfler, Douglas W.; Hammond, Simon; Trott, Christian R.; Barrett, Richard F.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2014

OSTI

Kokkos update: Memory Spaces Execution Spaces Execution Policies Defaults and C++11

Rajamanickam, Sivasankaran; Edwards, Harold C.; Trott, Christian R.; Sunderland, Daniel

Abstract not provided.

More Details

TYPE Presentation YEAR 2014

OSTI

Automated Algorithms for Quantum-Level Accuracy in Atomistic Simulations: LDRD Final Report

Thompson, A.P.; Schultz, Peter A.; Crozier, Paul; Moore, Stan G.; Swiler, Laura P.; Stephens, John A.; Trott, Christian R.; Foiles, Stephen M.; Tucker, Garritt J.

This report summarizes the result of LDRD project 12-0395, titled "Automated Algorithms for Quantum-level Accuracy in Atomistic Simulations." During the course of this LDRD, we have developed an interatomic potential for solids and liquids called Spectral Neighbor Analysis Potential (SNAP). The SNAP potential has a very general form and uses machine-learning techniques to reproduce the energies, forces, and stress tensors of a large set of small configurations of atoms, which are obtained using high-accuracy quantum electronic structure (QM) calculations. The local environment of each atom is characterized by a set of bispectrum components of the local neighbor density projected onto a basis of hyperspherical harmonics in four dimensions. The SNAP coefficients are determined using weighted least-squares linear regression against the full QM training set. This allows the SNAP potential to be fit in a robust, automated manner to large QM data sets using many bispectrum components. The calculation of the bispectrum components and the SNAP potential are implemented in the LAMMPS parallel molecular dynamics code. Global optimization methods in the DAKOTA software package are used to seek out good choices of hyperparameters that define the overall structure of the SNAP potential. FitSnap.py, a Python-based software package interfacing to both LAMMPS and DAKOTA is used to formulate the linear regression problem, solve it, and analyze the accuracy of the resultant SNAP potential. We describe a SNAP potential for tantalum that accurately reproduces a variety of solid and liquid properties. Most significantly, in contrast to existing tantalum potentials, SNAP correctly predicts the Peierls barrier for screw dislocation motion. We also present results from SNAP potentials generated for indium phosphide (InP) and silica (SiO2). We describe efficient algorithms for calculating SNAP forces and energies in molecular dynamics simulations using massively parallel computers and advanced processor architectures. Finally, we briefly describe the MSM method for efficient calculation of electrostatic interactions on massively parallel computers.

More Details

TYPE SAND Report YEAR 2014

DOI OSTI

Kokkos a Manycore Device Performance Portability Library for C++ HPC Applications

Abstract not provided.

More Details

TYPE Presentation YEAR 2014

OSTI

Kokkos: Enabling manycore performance portability through polymorphic memory access patterns

Journal of Parallel and Distributed Computing