Publications Search

Abstract Machine Models and Proxy Architectures for Exascale Computing

Ang, James A.; Barrett, Richard F.; Benner, R.E.; Burke, Daniel; Chan, Cy; Cook, Jeanine C.; Daley, Christopher S.; Donofrio, David; Hammond, Simon D.; Hemmert, Karl S.; Hoekstra, Robert J.; Ibrahim, Khaled; Kelly, Suzanne M.; Le, Hoang; Leung, Vitus J.; Michelogiannakis, George; Resnick, David R.; Rodrigues, Arun; Shalf, John; Stark, Dylan; Unat, D.; Wright, Nick J.; Voskuilen, Gwendolyn R.

To achieve exascale computing, fundamental hardware architectures must change. The most significant consequence of this assertion is the impact on the scientific and engineering applications that run on current high performance computing (HPC) systems, many of which codify years of scientific domain knowledge and refinements for contemporary computer systems. In order to adapt to exascale architectures, developers must be able to reason about new hardware and determine what programming models and algorithms will provide the best blend of performance and energy efficiency into the future. While many details of the exascale architectures are undefined, an abstract machine model is designed to allow application developers to focus on the aspects of the machine that are important or relevant to performance and code structure. These models are intended as communication aids between application developers and hardware architects during the co-design process. We use the term proxy architecture to describe a parameterized version of an abstract machine model, with the parameters added to elucidate potential speeds and capacities of key hardware components. These more detailed architectural models are formulated to enable discussion between the developers of analytic models and simulators and computer hardware architects. They allow for application performance analysis and hardware optimization opportunities. In this report our goal is to provide the application development community with a set of models that can help software developers prepare for exascale. In addition, through the use of proxy architectures, we can enable a more concrete exploration of how well new and evolving application codes map onto future architectures. This second version of the document addresses system scale considerations and provides a system-level abstract machine model with proxy architecture information.

TYPE SAND Report YEAR 2019

DOI OSTI

High Performance Computing - Power Application Programming Interface Specification Version 1.4

Laros, James H.; DeBonis, David D.; Grant, Ryan E.; Kelly, Suzanne M.; Levenhagen, Michael J.; Olivier, Stephen L.; Laros, James H.

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

TYPE SAND Report YEAR 2016

DOI OSTI

High Performance Computing: Power Application Programming Interface Specification (V.1.3)

Laros, James H.; Kelly, Suzanne M.; Laros, James H.; Grant, Ryan E.; Olivier, Stephen L.; Levenhagen, Michael J.; DeBonis, David D.

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

TYPE SAND Report YEAR 2016

DOI OSTI

Power API for HPC: Standardizing Power Measurement and Control

Laros, James H.; Laros, James H.; Kelly, Suzanne M.; Levenhagen, Michael J.; DeBonis, David D.; Olivier, Stephen L.; Grant, Ryan E.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Advanced Simulation and Computing Co-Design Strategy

Ang, James A.; Hoang, Thuc T.; Kelly, Suzanne M.; Mcpherson, Allen; Neely, Rob

This ASC Co-design Strategy lays out the full continuum and components of the co-design process, based on what we have experienced thus far and what we wish to do more of in the future to meet the program’s mission of providing high performance computing (HPC) and simulation capabilities for NNSA to carry out its stockpile stewardship responsibility.

TYPE Other Report YEAR 2015

DOI OSTI

High Performance Computing - Power Application Programming Interface Specification

Laros, James H.; Kelly, Suzanne M.; Laros, James H.; Grant, Ryan E.; Olivier, Stephen L.; Levenhagen, Michael J.; DeBonis, David D.

Achieving practical exascale supercomputing will require massive increases in energy efficiency. The bulk of this improvement will likely be derived from hardware advances such as improved semiconductor device technologies and tighter integration, hopefully resulting in more energy efficient computer architectures. Still, software will have an important role to play. With every generation of new hardware, more power measurement and control capabilities are exposed. Many of these features require software involvement to maximize feature benefits. This trend will allow algorithm designers to add power and energy efficiency to their optimization criteria. Similarly, at the system level, opportunities now exist for energy-aware scheduling to meet external utility constraints such as time of day cost charging and power ramp rate limitations. Finally, future architectures might not be able to operate all components at full capability for a range of reasons including temperature considerations or power delivery limitations. Software will need to make appropriate choices about how to allocate the available power budget given many, sometimes conflicting considerations.

TYPE SAND Report YEAR 2015

DOI OSTI

High Performance Computing - Power Application Programming Interface Specification. Version 1.1 [DRAFT]

Laros, James H.; Kelly, Suzanne M.; Laros, James H.; Grant, Ryan E.; Olivier, Stephen L.; Levenhagen, Michael J.; DeBonis, David D.

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

TYPE SAND Report YEAR 2015

DOI OSTI

A Power Application Programming Interface (API) Specification for High Performance Computers (HPC)

Laros, James H.; Laros, James H.; Grant, Ryan E.; Levenhagen, Michael J.; DeBonis, David D.; Olivier, Stephen L.; Kelly, Suzanne M.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Sandia's Advanced Architecture Test Beds

Laros, James H.; Ang, James A.; Hammond, Simon D.; Kelly, Suzanne M.

Abstract not provided.

TYPE Conference Poster YEAR 2014

OSTI

Overview of HPC Power Use Cases

Kelly, Suzanne M.; Laros, James H.; Elmore, Ryan; Hammond, Steven; Munch, Kris

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

A Power API for the HPC Community

DeBonis, David D.; Grant, Ryan E.; Olivier, Stephen L.; Levenhagen, Michael J.; Kelly, Suzanne M.; Laros, James H.; Laros, James H.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

High Performance Computing - Power Application Programming Interface Specification

Laros, James H.; Kelly, Suzanne M.; Laros, James H.; Grant, Ryan E.; Olivier, Stephen L.; Levenhagen, Michael J.; DeBonis, David D.

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

TYPE SAND Report YEAR 2014

DOI OSTI

Version 1 of Exascale Abstract Machine Models and Associated Proxy Architectures

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

An Evaluation of BitTorrent's Performance In HPC Environments

Dosanjh, Matthew D.; Kelly, Suzanne M.; Laros, James H.; Vaughan, Courtenay T.; Bridges, Patrick

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Abstract machine models and proxy architectures for exascale computing

Ang, James A.; Barrett, Richard F.; Benner, R.E.; Burke, D.; Chan, C.; Donofrio, David; Hammond, Simon D.; Hemmert, Karl S.; Kelly, Suzanne M.; Le, H.; Leung, Vitus J.; Resnick, David R.; Rodrigues, Arun; Shalf, John; Stark, Dylan S.; Unat, Didem; Wright, N.J.

Abstract not provided.

TYPE Conference Poster YEAR 2014

DOI OSTI

Addressing Power/Energy Challenges for Extreme Scale HPC

Laros, James H.; Kelly, Suzanne M.; Pedretti, Kevin P.; Grant, Ryan E.; Levenhagen, Michael J.; Olivier, Stephen L.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

An Evaluation of BitTorrent's Performance In HPC Environments

Dosanjh, Matthew D.; Kelly, Suzanne M.; Laros, James H.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2014

OSTI DOI

A Use Case Approach to Deriving Power API Requirements

Kelly, Suzanne M.; Laros, James H.

Abstract not provided.

TYPE Conference YEAR 2014

OSTI

An Evaluation of BitTorrent's Performance In HPC Environments

Dosanjh, Matthew D.; Kelly, Suzanne M.; Laros, James H.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2014

OSTI DOI

Power/energy use cases for high performance computing

Laros, James H.; Kelly, Suzanne M.

Power and energy have been identified as a first-order challenge for future extreme scale high performance computing (HPC) systems. In practice the breakthroughs will need to be provided by the hardware vendors. But making the best use of those solutions in an HPC environment will likely require periodic tuning by facility operators and software components. This document describes the actions and interactions needed to maximize power resources. It strives to cover the entire operational space that an HPC system occupies. The descriptions are presented as formal use cases, as documented in the Unified Modeling Language Specification [1]. The document is intended to provide a common understanding to the HPC community of the necessary management and control capabilities. Assuming a common understanding can be achieved, the next step will be to develop a set of Application Programming Interfaces (APIs) that hardware vendors and software developers could use to steer power consumption.

TYPE SAND Report YEAR 2013

DOI OSTI

A Use Case Approach to Deriving Power API Requirements

Kelly, Suzanne M.; Laros, James H.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

Performance on Advanced Systems Test Beds

Trott, Christian R.; Hammond, Simon D.; Kelly, Suzanne M.; Laros, James H.; Ang, James A.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

An Evaluation of BitTorrent's Performance In HPC Environments

Kelly, Suzanne M.; Laros, James H.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

I don't wanna grow up... stuck at predictive capability maturity model level zero!

Rider, William J.; Kelly, Suzanne M.; Barrett, Richard F.; Hammond, Simon D.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

NNSA/ASC Test Bed Update

Hammond, Simon D.; Barrett, Richard F.; Vaughan, Courtenay T.; Trott, Christian R.; Laros, James H.; Kelly, Suzanne M.; Ang, James A.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

SST and Test-Bed Hack-a-thon

Hammond, Simon D.; Rodrigues, Arun; Kelly, Suzanne M.; Ang, James A.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

I Don't Wanna Grow Up...Stuck at Predictive Capability Maturity Model Level Zero

Kelly, Suzanne M.; Rider, William J.; Barrett, Richard F.; Hammond, Simon D.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Reliable Computation Using Unreliable Components

Ballance, Robert A.; Noe, John P.; Kelly, Suzanne M.; Stearley, Jon S.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Reliable Computation Using Unreliable Components

Ballance, Robert A.; Noe, John P.; Kelly, Suzanne M.; Stearley, Jon S.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

SST (micro) Introduction - Presentation to SST Hack-a-thon Attendees

Rodrigues, Arun; Hammond, Simon D.; Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

Advanced Systems Technology Test Beds Overview - Presentation to SST Hack-a-thon Attendees

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

SST and ExMatEx Update

Hammond, Simon D.; Rodrigues, Arun; Kelly, Suzanne M.; Vandyke, John P.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

March 2013 ASC Newsletter

Kelly, Suzanne M.; Fang, H.E.; Wagner, Gregory J.; Templeton, Jeremy A.

TYPE Presentation YEAR 2013

OSTI

Early Experiences with Intel MIC Architecture

Ang, James A.; Kelly, Suzanne M.; Hammond, Simon D.; Barrett, Richard F.; Levenhagen, Michael J.; Rodrigues, Arun; Pedretti, Kevin P.; Laros, James H.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Sep 2012 ASC Newsletter Items - Sandia National Laboratories

Kelly, Suzanne M.; Bond, Ryan B.

TYPE Presentation YEAR 2012

OSTI

June 2012 ASC Newsletter

Weaver, Karla W.; Singer, Neal E.; Crowell, Jeffrey A.; Kelly, Suzanne M.

TYPE Presentation YEAR 2012

OSTI

Early Experiences with Intel MIC Architecture

Ang, James A.; Hammond, Simon D.; Barrett, Richard F.; Levenhagen, Michael J.; Rodrigues, Arun; Pedretti, Kevin; Laros, James H.; Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Sandia Advanced Architecture Testbeds

Ang, James A.; Laros, James H.; Kelly, Suzanne M.; Pedretti, Kevin

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Energy Based Performance Tuning for Large Scale High Performance Computing Systems

Laros, James H.; Pedretti, Kevin; Kelly, Suzanne M.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Sandia Advanced Technology Test Bed Project

Laros, James H.; Ang, James A.; Kelly, Suzanne M.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Reliable Computation Using Unpredictable Components

Wilke, Jason W.; Ballance, Robert A.; Rajan, Mahesh R.; Kelly, Suzanne M.; Noe, John P.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Providing Test Beds for Exascale Explorations - Intro Slides

Kelly, Suzanne M.; Ang, James A.; Laros, James H.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Providing Test Beds for Exascale Explorations - Technical Talk Slides

Kelly, Suzanne M.; Ang, James A.; Laros, James H.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Out with the old and in with the New: APIs for Exascale - Will new mean better?

Kelly, Suzanne M.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

A peer-to-peer architecture for supporting dynamic shared libraries in large-scale systems

Kelly, Suzanne M.; Laros, James H.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Towards High Performance Computing Application Energy Efficiency

Laros, James H.; Pedretti, Kevin; Kelly, Suzanne M.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Report of experiments and evidence for ASC L2 milestone 4467 : demonstration of a legacy application's path to exascale

Barrett, Brian B.; Kelly, Suzanne M.; Klundt, Ruth A.; Laros, James H.; Leung, Vitus J.; Levenhagen, Michael J.; Lofstead, Gerald F.; Moreland, Kenneth D.; Oldfield, Ron A.; Pedretti, Kevin T.T.; Rodrigues, Arun; Barrett, Richard F.; Thompson, David C.; Ward, Harry L.; Vandyke, John P.; Vaughan, Courtenay T.; Wheeler, Kyle B.; Brandt, James M.; Brightwell, Ronald B.; Curry, Matthew L.; Fabian, Nathan D.; Ferreira, Kurt; Gentile, Ann C.; Hemmert, Karl S.

This report documents thirteen of Sandia's contributions to the Computational Systems and Software Environment (CSSE) within the Advanced Simulation and Computing (ASC) program between fiscal years 2009 and 2012. It describes their impact on ASC applications. Most contributions are implemented in lower software levels, allowing for application improvement without source code changes. Improvements are identified in such areas as reduced run time, characterizing power usage, and Input/Output (I/O). Other experiments are more forward looking, demonstrating potential bottlenecks using mini-application versions of the legacy codes and simulating their network activity on exascale-class hardware. The purpose of this report is to prove that the team has completed milestone 4467, Demonstration of a Legacy Application's Path to Exascale. Cielo is expected to be the last capability system on which existing ASC codes can run without significant modifications. This assertion will be tested to determine where the breaking point is for an existing highly scalable application. The goal is to stretch the performance boundaries of the application by applying recent CSSE R&D in areas such as resilience, power, I/O, visualization services, SMARTMAP, lightweight kernels (LWKs), virtualization, simulation, and feedback loops. Dedicated system time reservations and/or CCC allocations will be used to quantify the impact of system-level changes to extend the life and performance of the ASC code base. Finally, a simulation of anticipated exascale-class hardware will be performed using SST to supplement the calculations. Determine where the breaking point is for an existing highly scalable application: Chapter 15 presented the CSSE work that sought to identify the breaking point in two ASC legacy applications, Charon and CTH. Their mini-app versions were also employed to complete the task. There is no single breaking point, as more than one issue was found with the two codes. The results were that applications can expect to encounter performance issues related to the computing environment, system software, and algorithms. Careful profiling of runtime performance will be needed to identify the source of an issue, in strong combination with knowledge of system software and application source code.

TYPE SAND Report YEAR 2012

DOI OSTI

Advanced Architecture Test Beds

Kelly, Suzanne M.; Ang, James A.; Laros, James H.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Demonstration of a Legacy Application's Path to Exascale - ASC L2 Milestone 4467

Barrett, Brian B.; Kelly, Suzanne M.; Klundt, Ruth A.; Laros, James H.; Leung, Vitus J.; Levenhagen, Michael J.; Lofstead, Gerald F.; Moreland, Kenneth D.; Oldfield, Ron A.; Pedretti, Kevin P.; Rodrigues, Arun; Barrett, Richard F.; Ward, Harry L.; Vandyke, John P.; Vaughan, Courtenay T.; Wheeler, Kyle B.; Brandt, James M.; Brightwell, Ronald B.; Curry, Matthew L.; Fabian, Nathan D.; Ferreira, Kurt; Gentile, Ann C.; Hemmert, Karl S.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Enhancements to Red Storm and Catamount to Increase Power Efficiency During Application Execution

Laros, James H.; Pedretti, Kevin; Kelly, Suzanne M.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Principles of Scalable HPC System Design

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

System Software Working Group

Kelly, Suzanne M.; Ballance, Robert A.; Brightwell, Ronald B.; Laros, James H.; Minnich, Ronald G.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Results of Software Threading Experiments in ASC Codes

Kelly, Suzanne M.; Lindblad, Alex L.; Drake, Richard R.; Quadros, William R.; Staten, Matthew L.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Results of Software Threading Experiments in ASC Codes

Kelly, Suzanne M.; Lindblad, Alex L.; Drake, Richard R.; Quadros, William R.; Staten, Matthew L.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Energy Based Performance Tuning for Large Scale High Performance Computing Systems

Laros, James H.; Pedretti, Kevin P.; Kelly, Suzanne M.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Shared Libraries on a Capability Class Computer Presentation

Kelly, Suzanne M.; Laros, James H.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Shared Libraries on a Capability Class Computer

Kelly, Suzanne M.; Laros, James H.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Energy Based Performance Tuning for Large Scale High Performance Computing Systems

Pedretti, Kevin P.; Kelly, Suzanne M.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Systems Software

Kelly, Suzanne M.; Brightwell, Ronald B.; Ballance, Robert A.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

System Software Report from ASC/CSSE-FOUS Exascale Planning

Minnich, Ronald G.; Brightwell, Ronald B.; Kelly, Suzanne M.; Ballance, Robert A.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

Use of shared libraries on a capability class supercomputer : ASC booth talk

Kelly, Suzanne M.

Abstract not provided.

TYPE Conference YEAR 2010

OSTI

LDRD final report : a lightweight operating system for multi-core capability class supercomputers

Pedretti, Kevin T.T.; Levenhagen, Michael J.; Ferreira, Kurt; Brightwell, Ronald B.; Kelly, Suzanne M.; Bridges, Patrick G.

The two primary objectives of this LDRD project were to create a lightweight kernel (LWK) operating system (OS) designed to take maximum advantage of multi-core processors, and to leverage the virtualization capabilities in modern multi-core processors to create a more flexible and adaptable LWK environment. The most significant technical accomplishments of this project were the development of the Kitten lightweight kernel, the co-development of the SMARTMAP intra-node memory mapping technique, and the development and demonstration of a scalable virtualization environment for HPC. Each of these topics is presented in this report by the inclusion of a published or submitted research paper. The results of this project are being leveraged by several ongoing and new research projects.

TYPE SAND Report YEAR 2010

DOI OSTI

Supercomputers: What are they and why are they important?

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2010

OSTI

System Software Research for Extreme-Scale Computing

Oldfield, Ron A.; Brightwell, Ronald B.; Pedretti, Kevin P.; Riesen, Rolf; Ferreira, Kurt; Kelly, Suzanne M.; Laros, James H.

Abstract not provided.

TYPE Presentation YEAR 2010

OSTI

Investigating methods of supporting dynamically linked executables on high performance computing platforms

Laros, James H.; Kelly, Suzanne M.; Levenhagen, Michael J.; Pedretti, Kevin T.T.

Shared libraries have become ubiquitous and are used to achieve great resource efficiencies on many platforms. The same properties that enable efficiencies on time-shared computers and convenience on small clusters prove to be great obstacles to scalability on large clusters and High Performance Computing platforms. In addition, Light Weight operating systems such as Catamount have historically not supported the use of shared libraries specifically because they hinder scalability. In this report we outline the methods we investigated for supporting shared libraries on High Performance Computing platforms using Light Weight kernels. The considerations necessary to evaluate utility in this area are many and sometimes conflicting. While our initial path forward has been determined based on this evaluation, we consider this effort ongoing and remain prepared to re-evaluate any technology that might provide a scalable solution. This report is an evaluation of a range of possible methods of supporting dynamically linked executables on capability class High Performance Computing platforms. Efforts are ongoing and extensive testing at scale is necessary to evaluate performance. While performance is a critical driving factor, supporting whatever method is used in a production environment is an equally important and challenging task.

TYPE SAND Report YEAR 2009

DOI OSTI

Catamount N-Way Performance on XT5

Brightwell, Ronald B.; Kelly, Suzanne M.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Investigating Real Power Usage on High Performance Computing Platforms

Pedretti, Kevin P.; Kelly, Suzanne M.; Vandyke, John P.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

HPC Architecture Research Presentations for Kansas State University

Doerfler, Douglas W.; Hemmert, Karl S.; Barrett, Brian B.; Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2008

OSTI

Summary of multi-core hardware and programming model investigations

Pedretti, Kevin T.T.; Kelly, Suzanne M.; Levenhagen, Michael J.

This report summarizes our investigations into multi-core processors and programming models for parallel scientific applications. The motivation for this study was to better understand the landscape of multi-core hardware, future trends, and the implications for system software on capability supercomputers. The results of this study are being used as input into the design of a new open-source light-weight kernel operating system targeted at future capability supercomputers made up of multi-core processors. A goal of this effort is to create an agile system that is able to adapt to and efficiently support whatever multi-core hardware and programming models gain acceptance by the community.

TYPE SAND Report YEAR 2008

DOI OSTI

Application Performance under Different XT Operating Systems

Vaughan, Courtenay T.; Vandyke, John P.; Kelly, Suzanne M.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

Application Performance under Different XT Operating Systems

Vaughan, Courtenay T.; Vandyke, John P.; Kelly, Suzanne M.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

The Red Storm High Performance Computer

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2008

OSTI

The Design of the Red Storm High Performance Computer

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2008

OSTI

Red Storm - poster

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2007

OSTI

Red Storm IO Performance Analysis

Laros, James H.; Ward, Harry L.; Kelly, Suzanne M.; Kellogg, Brian R.; Tomkins, James

Abstract not provided.

TYPE Conference YEAR 2007

OSTI

Extending catamount for multi-core processors

Vandyke, John P.; Vaughan, Courtenay T.; Kelly, Suzanne M.

Abstract not provided.

TYPE Conference YEAR 2007

OSTI

Extending catamount for multi-core processors

Vandyke, John P.; Vaughan, Courtenay T.; Kelly, Suzanne M.

Abstract not provided.

TYPE Conference YEAR 2007

OSTI

Sandia Scalable System Software Architecture

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2007

OSTI

IO Testing Results

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2006

OSTI

Red Storm: an Update on the Upgrade

Ballance, Robert A.; Doerfler, Douglas W.; Kelly, Suzanne M.; Tomkins, James; Stevenson, Joel O.

Abstract not provided.

TYPE Presentation YEAR 2006

OSTI

Accelerated Portals - Integration into XT3 Code Base

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2006

OSTI

Making Red Storm a Success: It Takes a Village to Build a Supercomputer

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2006

OSTI

A Light Weight Kernel (LWK) Operating System for Massively Parallel Processor (MPP) Systems

Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2006

OSTI

Red Storm Capabilities

Ballance, Robert A.; Kelly, Suzanne M.

Abstract not provided.

TYPE Presentation YEAR 2006

OSTI

Software architecture of the light weight kernel, catamount

Kelly, Suzanne M.

Catamount is designed to be a low overhead operating system for a parallel computing environment. Functionality is limited to the minimum set needed to run a scientific computation. The design choices and implementations will be presented. A massively parallel processor (MPP), high performance computing (HPC) system is particularly sensitive to operating system overhead. Traditional, multi-purpose operating systems are designed to support a wide range of usage models and requirements. To support the range of needs, a large number of system processes are provided, and these are often interdependent. The overhead of these processes leads to an unpredictable amount of processor time available to a parallel application. Except in the case of the most embarrassingly parallel of applications, an MPP application must share interim results with its peers before it can make further progress. These synchronization events are made at specific points in the application code. If one processor takes longer to reach that point than all the other processors, everyone must wait. The overall finish time is increased. Sandia National Laboratories began addressing this problem more than a decade ago with an architecture based on node specialization. Sets of nodes in an MPP are designated to perform specific tasks, each running an operating system best suited to the specialized function. Sandia chose to not use a multi-purpose operating system for the computational nodes and instead began developing its first light weight operating system, SUNMOS, which ran on the compute nodes on the Intel Paragon system. Based on its viability, the architecture evolved into the PUMA operating system. Intel ported PUMA to the ASCI Red TFLOPS system, thus creating the Cougar operating system. Most recently, Cougar has been ported to Cray's XT3 system and renamed Catamount. As the references indicate, there are a number of descriptions of the predecessor operating systems. While the majority of those discussions still apply to Catamount, this paper takes a fresh look at the architecture as it is currently implemented.

TYPE Conference YEAR 2005

Early experience with Red Storm

Kelly, Suzanne M.; Ballance, Robert A.

Red Storm is a massively parallel processor. The Red Storm design goals are: (1) Balanced system performance - CPU, memory, interconnect, and I/O; (2) Usability - functionality of hardware and software meets needs of users for Massively Parallel Computing; (3) Scalability - system hardware and software scale, single cabinet system to approximately 30,000 processor system; (4) Reliability - machine stays up long enough between interrupts to make real progress on completing application run (at least 50 hours MTBI), requires full system RAS capability; (5) Upgradability - system can be upgraded with a processor swap and additional cabinets to 100T or greater; (6) Red/black switching - capability to switch major portions of the machine between classified and unclassified computing environments; (7) Space, power, cooling - high density, low power system; and (8) Price/performance - excellent performance per dollar, use high volume commodity parts where feasible.

TYPE Conference YEAR 2005

ASCI Red for dummies : a recipe book for easy use of the ASCI Red platform

McAllister, Paula L.; Sault, Allen G.; Kelly, Suzanne M.; Miller, Joel D.; Quinlan, Gerald F.

It has been recognized that documentation for new customers of ASCI Red, aka Janus or the Intel Teraflops at Sandia National Laboratories, has been sadly lacking. This document has been prepared by a team of subject matter experts to fill that void and to serve as a starting point for a similar document for ASCI Red Storm in the future. This document is intended for SNL users who need to jumpstart their use of Janus and Janus-s.

TYPE Report YEAR 2003

An Investigation into Reliability, Availability, and Serviceability (RAS) Features for Massively Parallel Processor Systems

Kelly, Suzanne M.; Ogden, Jeffry B.

A study has been completed into the RAS features necessary for Massively Parallel Processor (MPP) systems. As part of this research, a use case model was built of how RAS features would be employed in an operational MPP system. Use cases are an effective way to specify requirements so that all involved parties can easily understand them. This technique is in contrast to laundry lists of requirements that are subject to misunderstanding as they are without context. As documented in the use case model, the study included a look at incorporating system software and end-user applications, as well as hardware, into the RAS system.

TYPE Report YEAR 2002

A Configurable, Object-Oriented, Transportation System Software Framework

Kelly, Suzanne M.; Myre, John W.; Price, Mark H.; Russell, Eric D.

The Transportation Surety Center, 6300, has been conducting continuing research into and development of information systems for the Configurable Transportation Security and Information Management System (CTSS) project. CTSS is an Object-Oriented Framework approach that uses Component-Based Software Development to facilitate rapid deployment of new systems while improving software cost containment, development reliability, compatibility, and extensibility. The direction has been to develop a Fleet Management System (FMS) framework using object-oriented technology. The goal for the current development is to provide a software and hardware environment that will demonstrate and support object-oriented development commonly in the FMS Central Command Center and Vehicle domains.

TYPE Report YEAR 2000
