Publications Search

Standardizing Power Monitoring and Control at Exascale

Computer

Grant, Ryan; Levenhagen, Michael; Olivier, Stephen L.; Debonis, David; Foulk, James W.; Foulk, James W.

Power API - the result of collaboration among national laboratories, universities, and major vendors - provides a range of standardized power management functions, from application-level control and measurement to facility-level accounting, including real-time and historical statistics gathering. Support is already available for Intel and AMD CPUs and standalone measurement devices.

More Details

TYPE Journal Article YEAR 2016

DOI OSTI Scopus

Program optimizations: The interplay between power, performance, and energy

Parallel Computing

Leon, Edgar A.; Karlin, Ian; Grant, Ryan; Dosanjh, Matthew G.

Practical considerations for future supercomputer designs will impose limits on both instantaneous power consumption and total energy consumption. Working within these constraints while providing the maximum possible performance, application developers will need to optimize their code for speed alongside power and energy concerns. This paper analyzes the effectiveness of several code optimizations including loop fusion, data structure transformations, and global allocations. A per component measurement and analysis of different architectures is performed, enabling the examination of code optimizations on different compute subsystems. Using an explicit hydrodynamics proxy application from the U.S. Department of Energy, LULESH, we show how code optimizations impact different computational phases of the simulation. This provides insight for simulation developers into the best optimizations to use during particular simulation compute phases when optimizing code for future supercomputing platforms. We examine and contrast both x86 and Blue Gene architectures with respect to these optimizations.

More Details

TYPE Journal Article YEAR 2016

DOI OSTI Scopus

MPI Sessions: Leveraging runtime infrastructure to increase scalability of applications at exascale

ACM International Conference Proceeding Series

Holmes, Daniel; Mohror, Kathryn; Grant, Ryan; Skjellum, Anthony; Schulz, Martin; Bland, Wesley; Squyres, Jeffrey M.

MPI includes all processes in MPI COMM WORLD; this is untenable for reasons of scale, resiliency, and overhead. This paper offers a new approach, extending MPI with a new concept called Sessions, which makes two key contributions: a tighter integration with the underlying runtime system; and a scalable route to communication groups. This is a fundamental change in how we organise and address MPI processes that removes well-known scalability barriers by no longer requiring the global communicator MPI COMM - WORLD.

More Details

TYPE Conference Poster YEAR 2016

DOI OSTI Scopus

Stalled Active and Idle (SAI): Characterizing Large-scale Dragonfly Networks

Groves, Taylor L.; Hammond, Simon; Hemmert, Karl S.; Grant, Ryan; Levenhagen, Michael; Arnold, Dorian

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

NiMC: Characterizing and Eliminating Network-Induced Memory Contention

Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016

Groves, Taylor L.; Grant, Ryan; Arnold, Dorian

Remote Direct Memory Access (RDMA) is expected to be an integral communication mechanism for future exascale systems - enabling asynchronous data transfers, so that applications may fully utilize all CPU resources while simultaneously sharing data amongst remote nodes. We examined this network-induced memory contention (NiMC), the interactions between RDMA and the memory subsystem when applications and out-of-band services compete for memory resources, and NiMC's resulting impact on application-level performance. For a range of hardware technologies and HPC workloads, we quantified NiMC and show that NiMC's impact grows with scale resulting in up to 3X performance degradation at scales as small as 8K processes even in applications that previously have been shown to be performance resilient in the presence of noise. We also evaluated three potential techniques to reduce NiMC's performance impact, namely hardware offloading, core reservation and software-based network throttling. While all three of these solutions show promise, we provide guidelines that help select the best solution for a given environment.

More Details

TYPE Conference Poster YEAR 2016

DOI OSTI Scopus

(SAI) Stalled Active and Idle: Characterizing Power and Performance of Large-Scale Dragonfly Networks

Groves, Taylor L.; Grant, Ryan; Hemmert, Karl S.; Hammond, Simon; Levenhagen, Michael; Arnold, Dorian

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

DOI OSTI

Extreme Computing: Pushing the Frontiers of Science

Grant, Ryan

Abstract not provided.

More Details

TYPE Presentation YEAR 2016

OSTI

High Performance Computing: Power Application Programming Interface Specification (V.1.3)

Foulk, James W.; Kelly, Suzanne M.; Foulk, James W.; Grant, Ryan; Olivier, Stephen L.; Levenhagen, Michael; Debonis, David

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

More Details

TYPE SAND Report YEAR 2016

DOI OSTI

An Overview of Sandia National Laboratory?s High Performance Computing Power Application Programming Interface (API) Specification

Foulk, James W.; Foulk, James W.; Grant, Ryan; Olivier, Stephen L.; Levenhagen, Michael; Debonis, David

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Overcoming Challenges in Scalable Power Monitoring with the Power API

Grant, Ryan

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

DOI OSTI

ACES and Cray Collaborate on Advanced Power Management for Trinity

Foulk, James W.; Foulk, James W.; Grant, Ryan; Olivier, Stephen L.; Levenhagen, Michael

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Data Movement with MPI in a Multi-Threaded World

Grant, Ryan

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Overcoming Challenges in Scalable Power Monitoring with the Power API

Grant, Ryan; Levenhagen, Michael; Olivier, Stephen L.; Debonis, David; Foulk, James W.; Foulk, James W.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

DOI OSTI

RMA-MT: A Benchmark Suite for Assessing MPI Multi-threaded RMA Performance

Dosanjh, Matthew G.; Groves, Taylor L.; Grant, Ryan; Brightwell, Ronald B.; Bridges, Patrick G.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

DOI OSTI

Simplifying MPI Threading Levels

Grant, Ryan

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

SHMEM-MT: A benchmark suite for assessing multi-threaded SHMEM performance

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Weeks, Hans; Dosanjh, Matthew G.; Bridges, Patrick G.; Grant, Ryan

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

DOI OSTI Scopus

NiMC: Characterizing and Eliminating Network-Induced Memory Contention

Groves, Taylor L.; Grant, Ryan; Arnold, Dorian

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

DOI OSTI

An Overview of Sandia National Laboratory?s High Performance Computing Power Application Programming Interface (API) Specification

Foulk, James W.; Foulk, James W.; Grant, Ryan; Olivier, Stephen L.; Levenhagen, Michael; Debonis, David

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

Overtime: A Tool for Analyzing Performance Variation due to Network Interference

Grant, Ryan

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

DOI OSTI

Lightweight Threading with MPI Using Persistent Communcations Semantics

Grant, Ryan

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

Power API for HPC: Standardizing Power Measurement and Control

Foulk, James W.; Foulk, James W.; Kelly, Suzanne M.; Levenhagen, Michael; Debonis, David; Olivier, Stephen L.; Grant, Ryan

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

Lightweight threading with MPI using Persistent Communications Semantics

Grant, Ryan; Skjellum, Anthony; Bangalore, Purushotham V.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

Preparing for Exascale: Modeling MPI for Many-Core Systems using Fine-Grain Queues

Bridges, Patrick G.; Dosanjh, Matthew G.; Grant, Ryan; Farmer, Shane; Skjellum, Anthony; Brightwell, Ronald B.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

DOI OSTI

Power Aware Dynamic Provisioning of HPC Networks

Groves, Taylor L.; Grant, Ryan

Future exascale systems are under increased pressure to find power savings. The network, while it consumes a considerable amount of power is often left out of the picture when discussing total system power. Even when network power is being considered, the references are frequently a decade or older and rely on models that lack validation on modern inter- connects. In this work we explore how dynamic mechanisms of an Infiniband network save power and at what granularity we can engage these features. We explore this within the context of the host controller adapter (HCA) on the node and for the fabric, i.e. switches, using three different mechanisms of dynamic link width, frequency and disabling of links for QLogic and Mellanox systems. Our results show that while there is some potential for modest power savings, real world systems need to improved responsiveness to adjustments in order to fully leverage these savings. This page intentionally left blank.

More Details

TYPE SAND Report YEAR 2015

DOI OSTI

Re-evaluating Network Onload vs. Offload for the Many-Core Era

Dosanjh, Matthew G.; Grant, Ryan; Bridges, Patrick G.; Brightwell, Ronald B.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

DOI OSTI

Publications

Search results