Publications Search

Programmable accelerators have become commonplace in modern computing systems. Advances in programming models and the availability of unprecedented amounts of data have created a space for massively parallel accelerators capable of maintaining context for thousands of concurrent threads resident on-chip. These threads are grouped and interleaved on a cycle-by-cycle basis among several massively parallel computing cores. One path for the design of future supercomputers relies on an ability to model the performance of these massively parallel cores at scale. The SST framework has been proven to scale up to run simulations containing tens of thousands of nodes. A previous report described the initial integration of the open-source, execution-driven GPU simulator, GPGPU-Sim, into the SST framework. This report discusses the results of the integration and how to use the new GPU component in SST. It also provides examples of what it can be used to analyze and a correlation study showing how closely the execution matches that of an NVIDIA V100 GPU when running kernels and mini-apps.

TYPE SAND Report YEAR 2021

DOI OSTI

Review of the Carbon Capture Multidisciplinary Science Center (CCMSC) at the University of Utah (2017)

Hoekstra, Robert J.; Malone, C.M.; Montoya, D.R.; Ferencz, M.R.; Kuhl, A.L.; Wagner, J.

The review was conducted on May 8-9, 2017 at the University of Utah. Overall the review team was impressed with the work presented and found that the CCMSC had met or exceeded the Year 3 milestones. Specific details, comments, and recommendations are included in this document.

TYPE Other Report YEAR 2020

DOI OSTI

Chronicles of Astra: Challenges and lessons from the first petascale Arm supercomputer

International Conference for High Performance Computing, Networking, Storage and Analysis, SC

Foulk, James W.; Younge, Andrew J.; Hammond, Simon; Foulk, James W.; Curry, Matthew; Aguilar, Michael J.; Hoekstra, Robert J.; Brightwell, Ronald B.

Arm processors have been explored in HPC for several years; however, there has not yet been a demonstration of viability for supporting large-scale production workloads. In this paper, we offer a retrospective on the process of bringing up Astra, the first petascale supercomputer based on 64-bit Arm processors, and validating its ability to run production HPC applications. Through this process several immature technology gaps were addressed, including software stack enablement, Linux bugs at scale, thermal management issues, power management capabilities, and advanced container support. From this experience, several lessons learned are formulated that contributed to the successful deployment of Astra. These insights can help accelerate the deployment and maturation of other first-seen HPC technologies. With Astra now supporting many users running a diverse set of production applications at multi-thousand-node scales, we believe this constitutes strong supporting evidence that Arm is a viable technology for even the largest-scale supercomputer deployments.

TYPE Conference Poster YEAR 2020

OSTI Scopus

CDFG Extraction Tool for LLVM

Hughes, Clayton; Hammond, Simon; Hoekstra, Robert J.

With the dawn of the exascale era, computer scientists and engineers are faced with tremendous challenges across all facets of the HPC system - scalability, performance, reliability, and power consumption. In particular, the power-performance benefit from one processor generation to the next is seeing ever-diminishing returns and will require fundamental changes in the way we approach computation. In fact, it is likely that different applications will require different types of accelerators in order to meet power, performance, and reliability requirements at scale. One potential type of accelerator, a dataflow architecture, diverges from the traditional sequentially executed instruction model into one that reflects the inherent instruction-level parallelism in a program. This work presents the initial steps toward a tool that can extract the control-dataflow graph from an application.

TYPE SAND Report YEAR 2020

DOI OSTI

Improving the mission impact of HPC systems through CO-DESIGN

Hoekstra, Robert J.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Predictive Science ASC Alliance Program (PSAAP) II 2016 Review of the Carbon Capture Multidisciplinary Science Center (CCMSC) at the University of Utah

Hoekstra, Robert J.; Ruggirello, Kevin P.

The review was conducted on May 9-10, 2016 at the University of Utah. Overall the review team was impressed with the work presented and found that the CCMSC had met or exceeded the Year 2 milestones. Specific details, comments and recommendations are included in this document.

TYPE Other Report YEAR 2019

DOI OSTI

Abstract Machine Models and Proxy Architectures for Exascale Computing

Ang, James A.; Barrett, Richard F.; Benner, Robert E.; Burke, Daniel; Chan, Cy; Cook, Jeanine; Daley, Christopher S.; Donofrio, David; Hammond, Simon; Hemmert, Karl S.; Hoekstra, Robert J.; Ibrahim, Khaled; Kelly, Suzanne M.; Le, Hoang; Leung, Vitus J.; Michelogiannakis, George; Resnick, David R.; Rodrigues, Arun; Shalf, John; Stark, Dylan; Unat, D.; Wright, Nick J.; Voskuilen, Gwendolyn R.

To achieve exascale computing, fundamental hardware architectures must change. The most significant consequence of this assertion is the impact on the scientific and engineering applications that run on current high performance computing (HPC) systems, many of which codify years of scientific domain knowledge and refinements for contemporary computer systems. In order to adapt to exascale architectures, developers must be able to reason about new hardware and determine what programming models and algorithms will provide the best blend of performance and energy efficiency into the future. While many details of the exascale architectures are undefined, an abstract machine model is designed to allow application developers to focus on the aspects of the machine that are important or relevant to performance and code structure. These models are intended as communication aids between application developers and hardware architects during the co-design process. We use the term proxy architecture to describe a parameterized version of an abstract machine model, with the parameters added to elucidate potential speeds and capacities of key hardware components. These more detailed architectural models are formulated to enable discussion between the developers of analytic models and simulators and computer hardware architects. They allow for application performance analysis and hardware optimization opportunities. In this report our goal is to provide the application development community with a set of models that can help software developers prepare for exascale. In addition, through the use of proxy architectures, we can enable a more concrete exploration of how well new and evolving application codes map onto future architectures. This second version of the document addresses system scale considerations and provides a system-level abstract machine model with proxy architecture information.

TYPE SAND Report YEAR 2019

DOI OSTI

Balar: A SST GPU Component for Performance Modeling and Profiling

Hughes, Clayton; Hammond, Simon; Khairy, Mahmoud; Zhang, Mengchi; Green, Roland; Rogers, Timothy; Hoekstra, Robert J.

Programmable accelerators have become commonplace in modern computing systems. Advances in programming models and the availability of massive amounts of data have created a space for massively parallel accelerators capable of maintaining context for thousands of concurrent threads resident on-chip. These threads are grouped and interleaved on a cycle-by-cycle basis among several massively parallel computing cores. One path for the design of future supercomputers relies on an ability to model the performance of these massively parallel cores at scale. The SST framework has been proven to scale up to run simulations containing tens of thousands of nodes. A previous report described the initial integration of the open-source, execution-driven GPU simulator, GPGPU-Sim, into the SST framework. This report discusses the results of the integration and how to use the new GPU component in SST. It also provides examples of what it can be used to analyze and a correlation study showing how closely the execution matches that of an NVIDIA V100 GPU when running kernels and mini-apps.

TYPE SAND Report YEAR 2019

DOI OSTI

ASC CSSE Milestone 6812: SST-GPGPU

Hughes, Clayton; Hammond, Simon; Voskuilen, Gwendolyn R.; Rodrigues, Arun; Hemmert, Karl S.; Hoekstra, Robert J.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Vanguard Astra Application Experience

Hammond, Simon; Foulk, James W.; Foulk, James W.; Younge, Andrew J.; Vaughan, Courtenay T.; Lin, Paul T.; Hoekstra, Robert J.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Vanguard Astra - Petascale ARM Platform for U.S. DOE/ASC Supercomputing

Hoekstra, Robert J.; Foulk, James W.; Hammond, Simon; Foulk, James W.; Younge, Andrew J.; Lin, Paul T.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

ECP HE Node Simulation - SNL

Hughes, Clayton; Rodrigues, Arun; Voskuilen, Gwendolyn R.; Hemmert, Karl S.; Hammond, Simon; Hoekstra, Robert J.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

ASC CSSE Level 2 Milestone Briefing: SST-GPU

Hughes, Clayton; Hammond, Simon; Voskuilen, Gwendolyn R.; Rodrigues, Arun; Hemmert, Karl S.; Hoekstra, Robert J.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

SST-GPU: An Execution-Driven CUDA Kernel Scheduler and Streaming-Multiprocessor Compute Model

Khairy, Mahmoud; Zhang, Mengchi; Green, Roland; Hammond, Simon; Hoekstra, Robert J.; Rogers, Timothy; Hughes, Clayton

Programmable accelerators have become commonplace in modern computing systems. Advances in programming models and the availability of massive amounts of data have created a space for massively parallel acceleration where the context for thousands of concurrent threads is resident on-chip. These threads are grouped and interleaved on a cycle-by-cycle basis among several massively parallel computing cores. The design of future supercomputers relies on an ability to model the performance of these massively parallel cores at scale. To address the need for a scalable, decentralized GPU model that can model large GPUs, chiplet-based GPUs, and multi-node GPUs, this report details the first steps in integrating the open-source, execution-driven GPGPU-Sim into the SST framework. The first stage of this project creates two elements: a kernel scheduler SST element that accepts work from SST CPU models and schedules it to an SM-collection element, which performs cycle-by-cycle timing using SST's MemHierarchy to model a flexible memory system.

TYPE SAND Report YEAR 2019

DOI OSTI

Vanguard Astra: A Prototype Petascale Arm Supercomputer

Hughes, Clayton; Foulk, James W.; Foulk, James W.; Hammond, Simon; Younge, Andrew J.; Hoekstra, Robert J.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Sandia ATDM DevOps and Performance Analysis

Hoekstra, Robert J.; Bartlett, Roscoe; Hammond, Simon; Cook, Jeanine; Dinge, Dennis; Frye, Joseph R.; Hughes, Clayton; Lin, Paul T.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Vanguard Astra: A Prototype Petascale Arm Supercomputer

Hughes, Clayton; Foulk, James W.; Foulk, James W.; Hammond, Simon; Younge, Andrew J.; Hoekstra, Robert J.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

Update on Crossroads and Astra Systems

Alvin, Kenneth F.; Foulk, James W.; Hoekstra, Robert J.; Collis, Samuel S.; Lujan, Jim

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

The Astra Supercomputer

Hammond, Simon; Foulk, James W.; Younge, Andrew J.; Foulk, James W.; Hoekstra, Robert J.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

FY18 L2 Milestone #6360 Report: Initial Capability of an Arm-based Advanced Architecture Prototype System and Software Environment

Foulk, James W.; Foulk, James W.; Hammond, Simon; Aguilar, Michael J.; Curry, Matthew L.; Grant, Ryan; Hoekstra, Robert J.; Klundt, Ruth A.; Monk, Stephen T.; Ogden, Jeffry B.; Olivier, Stephen L.; Scott, Randall D.; Ward, Harry L.; Younge, Andrew J.

The Vanguard program informally began in January 2017 with the submission of a white paper entitled "Sandia's Vision for a 2019 Arm Testbed" to NNSA headquarters. The program proceeded in earnest in May 2017 with an announcement by Doug Wade (Director, Office of Advanced Simulation and Computing and Institutional R&D at NNSA) that Sandia National Laboratories (Sandia) would host the first Advanced Architecture Prototype platform based on the Arm architecture. In August 2017, Sandia formed a Tri-lab team chartered to develop a robust HPC software stack for Astra to support the Vanguard program goal of demonstrating the viability of Arm in supporting ASC production computing workloads. This document describes the high-level Vanguard program goals, the Vanguard-Astra project acquisition plan and procurement up to contract placement, the initial software stack environment planned for the Vanguard-Astra platform (Astra), how the communities of users will utilize the platform during the transition from the open network to the classified network, and initial performance results.

TYPE SAND Report YEAR 2018

DOI OSTI

Vanguard Astra and ATSE – an ARM-based Advanced Architecture Prototype System and Software Environment (FY18 L2 Milestone #8759 Report)

Foulk, James W.; Foulk, James W.; Hammond, Simon; Aguilar, Michael J.; Curry, Matthew L.; Grant, Ryan; Hoekstra, Robert J.; Klundt, Ruth A.; Monk, Stephen T.; Ogden, Jeffry B.; Olivier, Stephen L.; Scott, Randall D.; Ward, Harry L.; Younge, Andrew J.

The Vanguard program informally began in January 2017 with the submission of a white paper entitled "Sandia's Vision for a 2019 Arm Testbed" to NNSA headquarters. The program proceeded in earnest in May 2017 with an announcement by Doug Wade (Director, Office of Advanced Simulation and Computing and Institutional R&D at NNSA) that Sandia National Laboratories (Sandia) would host the first Advanced Architecture Prototype platform based on the Arm architecture. In August 2017, Sandia formed a Tri-lab team chartered to develop a robust HPC software stack for Astra to support the Vanguard program goal of demonstrating the viability of Arm in supporting ASC production computing workloads.

TYPE SAND Report YEAR 2018

DOI OSTI

Analyzing Build System Pressure for the ASC Program

Hammond, Simon; Hoekstra, Robert J.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI
