Publications Search

The Portals 4.2 Network Programming Interface

Barrett, Brian W.; Brightwell, Ronald B.; Grant, Ryan; Hemmert, Karl S.; Bays, Nathan R.; Wheeler, Kyle; Riesen, Rolf; Hoefler, Torsten; Maccabe, Arthur B.; Hudson, Trammell

This report presents a specification for the Portals 4 network programming interface. Portals 4 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4 is well suited to massively parallel processing and embedded systems. Portals 4 represents an adaptation of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.

TYPE SAND Report YEAR 2018

DOI OSTI

The Astra Supercomputer

Hammond, Simon; Bays, Nathan R.; Younge, Andrew J.; Bays, Nathan R.; Hoekstra, Robert J.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Uncovering signatures of preheat performance in MagLIF experiments using stimulated Raman and Brillouin backscatter spectra

Fein, Jeffrey R.; Bliss, David E.; Geissel, Matthias; Harvey-Thompson, Adam J.; Awe, Thomas J.; Ampleford, David; Glinsky, Michael E.; Bays, Nathan R.; Harding, Eric; Macrunnels, Keven A.; Patel, Sonal G.; Ruiz, Daniel E.; Scoglietti, Daniel J.; Smith, Ian C.; Weis, Matthew R.; Peterson, Kara J.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Energy and Power Aware Job Scheduling and Resource Management: Global Survey

Koenig, Gregory; Maiterth, Matthias; Jana, Siddhartha; Bates, Natalie; Bays, Nathan R.; Puzovic, Milos; Borghesi, Andrea; Bartolini, Andrea; Montoya, David K.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Sparse Matrix-Matrix Multiplication: An MPI+X Story

Siefert, Christopher; Bays, Nathan R.; Luchini, Christopher B.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

Data Analysis for the Born Qualified Grand LDRD Project

Swiler, Laura P.; Van Bloemen Waanders, Bart; Jared, Bradley H.; Koepke, Joshua R.; Whetten, Shaun R.; Madison, Jonathan D.; Ivanoff, Thomas; Bays, Nathan R.; Cook, Adam W.; Brown-Shaklee, Harlan J.; Kammler, Daniel; Johnson, Kyle L.; Ford, Kurtis; Bishop, Joseph E.; Roach, Robert A.

This report summarizes the data analysis activities that were performed under the Born Qualified Grand Challenge Project from 2016 to 2018. It is meant to document the characterization of additively manufactured parts and processes for this project, and to demonstrate and identify further analyses and data science that could be done relating material processes to microstructure to properties to performance.

TYPE SAND Report YEAR 2018

DOI OSTI

Low Thread-count Gustavson: A multithreaded algorithm for sparse matrix-matrix multiplication using perfect hashing

Bays, Nathan R.; Siefert, Christopher

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Large-Scale System Monitoring Experiences and Recommendations

Ahlgren, V.; Andersson, S.; Brandt, James M.; Cardo, N.; Chunduri, S.; Enos, J.; Fields, P.; Gentile, Ann C.; Gerber, R.; Gienger, M.; Greenseid, J.; Greiner, A.; Hadri, B.; He, Y.; Hoppe, D.; Kaila, U.; Kelly, K.; Klein, M.; Kristiansen, A.; Leak, S.; Mason, M.; Bays, Nathan R.; Piccinali, J-G; Repik, Jason J.; Rogers, J.; Salminen, S.; Showerman, M.; Whitney, C.; Williams, J.

Abstract not provided.

TYPE Conference Poster YEAR 2018

DOI OSTI

Characterizing MPI matching via trace-based simulation

Parallel Computing

Ferreira, Kurt B.; Levy, Scott; Bays, Nathan R.; Grant, Ryan

With the increased scale expected on future leadership-class systems, detailed information about the resource usage and performance of MPI message matching provides important insights into how to maintain application performance on next-generation systems. However, obtaining MPI message matching performance data is often not possible without significant effort. A common approach is to instrument an MPI implementation to collect relevant statistics. While this approach can provide important data, collecting matching data at runtime perturbs the application's execution, including its matching performance, and is highly dependent on the MPI library's matchlist implementation. In this paper, we introduce a trace-based simulation approach to obtain detailed MPI message matching performance data for MPI applications without perturbing their execution. Using a number of key parallel workloads and microbenchmarks, we demonstrate that this simulator approach can rapidly and accurately characterize matching behavior. Specifically, we use our simulator to collect several important statistics about the operation of the MPI posted and unexpected queues. For example, we present data about search lengths and the duration that messages spend in the queues waiting to be matched. Data gathered using this simulation-based approach have significant potential to aid hardware designers in determining resource allocation for MPI matching functions and provide application and middleware developers with insight into the scalability issues associated with MPI message matching.
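As a rough illustration of the kind of replay the abstract above describes, the sketch below walks a tiny hand-written trace through a posted-receive queue and an unexpected-message queue and records how many entries each search inspects. It is not the authors' simulator: the names `Event`, `matches`, `search`, and `replay`, the trace format, and the use of plain linear lists are all assumptions made for illustration only.

```python
from collections import namedtuple

ANY = -1  # hypothetical stand-in for the MPI_ANY_SOURCE / MPI_ANY_TAG wildcards

# kind is "recv" (a receive is posted) or "msg" (a message arrives)
Event = namedtuple("Event", "kind source tag")


def matches(recv, msg):
    """A receive matches a message when source and tag agree or are wildcards."""
    return recv.source in (ANY, msg.source) and recv.tag in (ANY, msg.tag)


def search(queue, pred):
    """Linear search from the head; return (index or None, entries inspected)."""
    for i, entry in enumerate(queue):
        if pred(entry):
            return i, i + 1
    return None, len(queue)


def replay(trace):
    """Replay a trace and collect search lengths for both queues."""
    posted, unexpected = [], []
    posted_searches, unexpected_searches = [], []
    for ev in trace:
        if ev.kind == "msg":
            # An arriving message searches the posted-receive queue first.
            idx, inspected = search(posted, lambda r: matches(r, ev))
            posted_searches.append(inspected)
            if idx is None:
                unexpected.append(ev)   # no receive yet: park the message
            else:
                posted.pop(idx)         # matched: retire the receive
        else:
            # A newly posted receive searches the unexpected-message queue first.
            idx, inspected = search(unexpected, lambda m: matches(ev, m))
            unexpected_searches.append(inspected)
            if idx is None:
                posted.append(ev)       # nothing waiting: enqueue the receive
            else:
                unexpected.pop(idx)     # matched: retire the message
    return posted_searches, unexpected_searches, len(posted), len(unexpected)


if __name__ == "__main__":
    trace = [
        Event("recv", 0, 1),
        Event("recv", ANY, 2),
        Event("msg", 3, 2),
        Event("msg", 5, 9),
        Event("recv", 5, ANY),
    ]
    print(replay(trace))
```

Because the replay works offline on recorded events, statistics such as search length can be gathered without perturbing the application that produced the trace. A fuller model would also track timestamps to measure how long entries wait in each queue.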

TYPE Conference Poster YEAR 2018

OSTI Scopus

Vanguard Astra and ATSE – an ARM-based Advanced Architecture Prototype System and Software Environment (FY18 L2 Milestone #8759 Report)

Bays, Nathan R.; Bays, Nathan R.; Hammond, Simon; Aguilar, Michael J.; Curry, Matthew L.; Grant, Ryan; Hoekstra, Robert J.; Klundt, Ruth A.; Monk, Stephen T.; Ogden, Jeffry B.; Olivier, Stephen L.; Scott, Randall D.; Ward, Harry L.; Younge, Andrew J.

The Vanguard program informally began in January 2017 with the submission of a white paper entitled "Sandia's Vision for a 2019 Arm Testbed" to NNSA headquarters. The program proceeded in earnest in May 2017 with an announcement by Doug Wade (Director, Office of Advanced Simulation and Computing and Institutional R&D at NNSA) that Sandia National Laboratories (Sandia) would host the first Advanced Architecture Prototype platform based on the Arm architecture. In August 2017, Sandia formed a Tri-lab team chartered to develop a robust HPC software stack for Astra to support the Vanguard program goal of demonstrating the viability of Arm in supporting ASC production computing workloads.

TYPE SAND Report YEAR 2018

DOI OSTI

Vanguard L2 Milestone Review

Bays, Nathan R.; Bays, Nathan R.; Hammond, Simon

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

Vanguard Astra - Petascale ARM Platform for U.S. DOE/ASC Supercomputing

Younge, Andrew J.; Bays, Nathan R.; Hammond, Simon; Bays, Nathan R.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

FY18 L2 Milestone #6360 Report: Initial Capability of an Arm-based Advanced Architecture Prototype System and Software Environment

Bays, Nathan R.; Bays, Nathan R.; Hammond, Simon; Aguilar, Michael J.; Curry, Matthew L.; Grant, Ryan; Hoekstra, Robert J.; Klundt, Ruth A.; Monk, Stephen T.; Ogden, Jeffry B.; Olivier, Stephen L.; Scott, Randall D.; Ward, Harry L.; Younge, Andrew J.

The Vanguard program informally began in January 2017 with the submission of a white paper entitled "Sandia's Vision for a 2019 Arm Testbed" to NNSA headquarters. The program proceeded in earnest in May 2017 with an announcement by Doug Wade (Director, Office of Advanced Simulation and Computing and Institutional R&D at NNSA) that Sandia National Laboratories (Sandia) would host the first Advanced Architecture Prototype platform based on the Arm architecture. In August 2017, Sandia formed a Tri-lab team chartered to develop a robust HPC software stack for Astra to support the Vanguard program goal of demonstrating the viability of Arm in supporting ASC production computing workloads. This document describes the high-level Vanguard program goals, the Vanguard-Astra project acquisition plan and procurement up to contract placement, the initial software stack environment planned for the Vanguard-Astra platform (Astra), a description of how the communities of users will utilize the platform during the transition from the open network to the classified network, and initial performance results.

TYPE SAND Report YEAR 2018

DOI OSTI

Recent Diagnostic Platform Accomplishments for Studying Vacuum Power Flow Physics at the Sandia Z Accelerator

Laity, George R.; Aragon, Carlos; Bennett, Nichelle L.; Bliss, David E.; Bays, Nathan R.; Fierro, Andrew S.; Gomez, Matthew R.; Hess, Mark H.; Hutsel, Brian T.; Jennings, Christopher A.; Johnston, Mark D.; Kossow, Michael R.; Lamppa, Derek C.; Martin, Matthew R.; Patel, Sonal G.; Porwitzky, A.; Robinson, Allen C.; Rose, David; Vandevender, Pace; Waisman, Eduardo M.; Webb, Timothy J.; Welch, Dale; Rochau, Gregory A.; Savage, Mark E.; Stygar, William; White, William M.; Sinars, Daniel; Cuneo, Michael E.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Open science on Trinity's Knights Landing partition: An analysis of user job data

ACM International Conference Proceeding Series

Levy, Scott; Bays, Nathan R.; Ferreira, Kurt B.

High-performance computing (HPC) systems are critically important to the objectives of universities, national laboratories, and commercial companies. Because of the cost of deploying and maintaining these systems, ensuring their efficient use is imperative. Job scheduling and resource management are critically important to the efficient use of HPC systems. As a result, significant research has been conducted on how to effectively schedule user jobs on HPC systems. Developing and evaluating job scheduling algorithms, however, requires a detailed understanding of how users request resources on HPC systems. In this paper, we examine a corpus of job data that was collected on Trinity, a leadership-class supercomputer. During the stabilization period of its Intel Xeon Phi (Knights Landing) partition, it was made available to users outside of a classified environment for the Trinity Open Science Phase 2 campaign. We collected information from the resource manager about each user job that was run during this Open Science period. In this paper, we examine the jobs contained in this dataset. Our analysis reveals several important characteristics of the jobs submitted during the Open Science period and provides critical insight into the use of one of the most powerful supercomputers in existence. Specifically, these data provide important guidance for the design, development, and evaluation of job scheduling and resource management algorithms.

TYPE Conference Poster YEAR 2018

DOI OSTI Scopus

A comparison of power management mechanisms: P-States vs. node-level power cap control

Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018

Bays, Nathan R.; Grant, Ryan; Bays, Nathan R.; Levenhagen, Michael; Olivier, Stephen L.; Ward, Harry L.; Younge, Andrew J.

Large-scale HPC systems increasingly incorporate sophisticated power management control mechanisms. While these mechanisms are potentially useful for performing energy and/or power-aware job scheduling and resource management (EPA JSRM), greater understanding of their operation and performance impact on real-world applications is required before they can be applied effectively in practice. In this paper, we compare static p-state control to static node-level power cap control on a Cray XC system. Empirical experiments are performed to evaluate node-to-node performance and power usage variability for the two mechanisms. We find that static p-state control produces more predictable and higher performance characteristics than static node-level power cap control at a given power level. However, this performance benefit is at the cost of less predictable power usage. Static node-level power cap control produces predictable power usage but with more variable performance characteristics. Our results are not intended to show that one mechanism is better than the other. Rather, our results demonstrate that the mechanisms are complementary to one another and highlight their potential for combined use in achieving effective EPA JSRM solutions.

TYPE Conference Poster YEAR 2018

DOI OSTI Scopus
