Publications Search

The Vanguard program informally began in January 2017 with the submission of a white paper entitled "Sandia's Vision for a 2019 Arm Testbed" to NNSA headquarters. The program proceeded in earnest in May 2017 with an announcement by Doug Wade (Director, Office of Advanced Simulation and Computing and Institutional R&D at NNSA) that Sandia National Laboratories (Sandia) would host the first Advanced Architecture Prototype platform based on the Arm architecture. In August 2017, Sandia formed a Tri-lab team chartered to develop a robust HPC software stack for Astra to support the Vanguard program goal of demonstrating the viability of Arm in supporting ASC production computing workloads.

More Details

TYPE SAND Report YEAR 2018

DOI OSTI

FY18 L2 Milestone #6360 Report: Initial Capability of an Arm-based Advanced Architecture Prototype System and Software Environment

Foulk, James W.; Foulk, James W.; Hammond, Simon; Aguilar, Michael J.; Curry, Matthew L.; Grant, Ryan; Hoekstra, Robert J.; Klundt, Ruth A.; Monk, Stephen T.; Ogden, Jeffry B.; Olivier, Stephen L.; Scott, Randall D.; Ward, Harry L.; Younge, Andrew J.

The Vanguard program informally began in January 2017 with the submission of a white paper entitled "Sandia's Vision for a 2019 Arm Testbed" to NNSA headquarters. The program proceeded in earnest in May 2017 with an announcement by Doug Wade (Director, Office of Advanced Simulation and Computing and Institutional R&D at NNSA) that Sandia National Laboratories (Sandia) would host the first Advanced Architecture Prototype platform based on the Arm architecture. In August 2017, Sandia formed a Tri-lab team chartered to develop a robust HPC software stack for Astra to support the Vanguard program goal of demonstrating the viability of Arm in supporting ASC production computing workloads. This document describes the high-level Vanguard program goals, the Vanguard-Astra project acquisition plan and procurement up to contract placement, the initial software stack environment planned for the Vanguard-Astra platform (Astra), a description of how the communities of users will utilize the platform during the transition from the open network to the classified network, and initial performance results.

More Details

TYPE SAND Report YEAR 2018

DOI OSTI

Investigating Factors Impacting Exascale Performance of ASC Codes: A Co-design Effort

Olivier, Stephen L.

Abstract not provided.

More Details

TYPE Presentation YEAR 2018

OSTI

A comparison of power management mechanisms: P-States vs. node-level power cap control

Proceedings 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops Ipdpsw 2018

Foulk, James W.; Grant, Ryan; Foulk, James W.; Levenhagen, Michael; Olivier, Stephen L.; Ward, Harry L.; Younge, Andrew J.

Large-scale HPC systems increasingly incorporate sophisticated power management control mechanisms. While these mechanisms are potentially useful for performing energy and/or power-aware job scheduling and resource management (EPA JSRM), greater understanding of their operation and performance impact on real-world applications is required before they can be applied effectively in practice. In this paper, we compare static p-state control to static node-level power cap control on a Cray XC system. Empirical experiments are performed to evaluate node-to-node performance and power usage variability for the two mechanisms. We find that static p-state control produces more predictable and higher performance characteristics than static node-level power cap control at a given power level. However, this performance benefit is at the cost of less predictable power usage. Static node-level power cap control produces predictable power usage but with more variable performance characteristics. Our results are not intended to show that one mechanism is better than the other. Rather, our results demonstrate that the mechanisms are complementary to one another and highlight their potential for combined use in achieving effective EPA JSRM solutions.

More Details

TYPE Conference Poster YEAR 2018

DOI OSTI Scopus

A comparison of power management mechanisms: P-States vs. node-level power cap control

Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018

Foulk, James W.; Grant, Ryan; Foulk, James W.; Levenhagen, Michael; Olivier, Stephen L.; Ward, Harry L.; Younge, Andrew J.

Large-scale HPC systems increasingly incorporate sophisticated power management control mechanisms. While these mechanisms are potentially useful for performing energy and/or power-aware job scheduling and resource management (EPA JSRM), greater understanding of their operation and performance impact on real-world applications is required before they can be applied effectively in practice. In this paper, we compare static p-state control to static node-level power cap control on a Cray XC system. Empirical experiments are performed to evaluate node-to-node performance and power usage variability for the two mechanisms. We find that static p-state control produces more predictable and higher performance characteristics than static node-level power cap control at a given power level. However, this performance benefit is at the cost of less predictable power usage. Static node-level power cap control produces predictable power usage but with more variable performance characteristics. Our results are not intended to show that one mechanism is better than the other. Rather, our results demonstrate that the mechanisms are complementary to one another and highlight their potential for combined use in achieving effective EPA JSRM solutions.

More Details

TYPE Conference Poster YEAR 2018

DOI OSTI Scopus

Optimizing for KNL Usage Modes When Data Doesn?t Fit in MCDRAM

Berry, Jonathan; Butcher, Neil; Olivier, Stephen L.; Hammond, Simon; Kogge, Peter

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Optimizing for KNL Usage Models When Data Doesn't Fit in MCDRAM

Berry, Jonathan; Butcher, Neil; Olivier, Stephen L.; Hammond, Simon; Kogge, Peter

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Advanced Power Measurement and Control for the Trinity Supercomputer

Younge, Andrew J.; Grant, Ryan; Foulk, James W.; Levenhagen, Michael; Olivier, Stephen L.; Foulk, James W.; Ward, Harry L.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Multi-threaded Sparse Matrix Matrix Multiplication with Applications in Scientific Computing and Graph Analytics

Deveci, Mehmet; Wolf, Michael; Berry, Jonathan; Rajamanickam, Sivasankaran; Boman, Erik G.; Trott, Christian R.; Hammond, Simon; Olivier, Stephen L.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2018

OSTI

Recent Developments in the OpenMP Specification

Olivier, Stephen L.

Abstract not provided.

More Details

TYPE Presentation YEAR 2018

OSTI

Memory Management Extensions for OpenMP 5.0

Olivier, Stephen L.

Abstract not provided.

More Details

TYPE Presentation YEAR 2018

OSTI

Assessing task-to-data affinity in the LLVM OpenMP runtime

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Klinkenberg, Jannis; Samfass, Philipp; Terboven, Christian; Duran, Alejandro; Klemm, Michael; Teruel, Xavier; Mateo, Sergi; Olivier, Stephen L.; Muller, Matthias S.

In modern shared-memory NUMA systems which typically consist of two or more multi-core processor packages with local memory, affinity of data to computation is crucial for achieving high performance with an OpenMP program. OpenMP* 3.0 introduced support for task-parallel programs in 2008 and has continued to extend its applicability and expressiveness. However, the ability to support data affinity of tasks is missing. In this paper, we investigate several approaches for task-to-data affinity that combine locality-aware task distribution and task stealing. We introduce the task affinity clause that will be part of OpenMP 5.0 and provide the reasoning behind its design. Evaluation with our experimental implementation in the LLVM OpenMP runtime shows that task affinity improves execution performance up to 4.5x on an 8-socket NUMA machine and significantly reduces runtime variability of OpenMP tasks. Our results demonstrate that a variety of applications can benefit from task affinity and that the presented clause is closing the gap of task-to-data affinity in OpenMP 5.0.

More Details

TYPE Conference Poster YEAR 2018

DOI OSTI Scopus

Enhancing Qthreads for ECP Science and Energy Impact

Brightwell, Ronald B.; Olivier, Stephen L.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

ATDM Operating Systems and On-Node Runtime

Olivier, Stephen L.; Foulk, James W.; Brightwell, Ronald B.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

December 2017 ECP ST Project Review: ECP Project WBS 2.3.1.15 (Qthreads)

Brightwell, Ronald B.; Olivier, Stephen L.

Abstract not provided.

More Details

TYPE Presentation YEAR 2017

OSTI

OpenMPIR: Implementing OpenMP tasks with tapir

Proceedings of LLVM-HPC 2017: 4th Workshop on the LLVM Compiler Infrastructure in HPC - Held in conjunction with SC 2017: The International Conference for High Performance Computing, Networking, Storage and Analysis

Stelle, George; Moses, William S.; Olivier, Stephen L.; Mccormick, Patrick

Optimizing compilers for task-level parallelism are still in their infancy. This work explores a compiler front end that translates OpenMP tasking semantics to Tapir, an extension to LLVM IR that represents fork-join parallelism. This enables analyses and optimizations that were previously inaccessible to OpenMP codes, as well as the ability to target additional runtimes at code generation. Using a Cilk runtime back end, we compare results to existing OpenMP implementations. Initial performance results for the Barcelona OpenMP task suite show performance improvements over existing implementations.

More Details

TYPE Conference Poster YEAR 2017

DOI OSTI Scopus