Publications

3 Results
FY18 L2 Milestone #8759 Report: Vanguard Astra and ATSE - an ARM-based Advanced Architecture Prototype System and Software Environment

Laros, James H.; Pedretti, Kevin P.; Hammond, Simon D.; Aguilar, Michael J.; Curry, Matthew L.; Grant, Ryan E.; Hoekstra, Robert J.; Klundt, Ruth A.; Monk, Stephen T.; Ogden, Jeffry B.; Olivier, Stephen L.; Scott, Randall D.; Ward, Harry L.; Younge, Andrew J.

The Vanguard program informally began in January 2017 with the submission of a white paper entitled "Sandia's Vision for a 2019 Arm Testbed" to NNSA headquarters. The program proceeded in earnest in May 2017 with an announcement by Doug Wade (Director, Office of Advanced Simulation and Computing and Institutional R&D at NNSA) that Sandia National Laboratories (Sandia) would host the first Advanced Architecture Prototype platform based on the Arm architecture. In August 2017, Sandia formed a Tri-lab team chartered to develop a robust HPC software stack for Astra to support the Vanguard program goal of demonstrating the viability of Arm in supporting ASC production computing workloads. This document describes the high-level Vanguard program goals, the Vanguard-Astra project acquisition plan and procurement up to contract placement, the initial software stack environment planned for the Vanguard-Astra platform (Astra), how the communities of users will use the platform during the transition from the open network to the classified network, and initial performance results.


FY18 L2 Milestone #6360 Report: Initial Capability of an Arm-based Advanced Architecture Prototype System and Software Environment

Laros, James H.; Pedretti, Kevin P.; Hammond, Simon D.; Aguilar, Michael J.; Curry, Matthew L.; Grant, Ryan E.; Hoekstra, Robert J.; Klundt, Ruth A.; Monk, Stephen T.; Ogden, Jeffry B.; Olivier, Stephen L.; Scott, Randall D.; Ward, Harry L.; Younge, Andrew J.

The Vanguard program informally began in January 2017 with the submission of a white paper entitled "Sandia's Vision for a 2019 Arm Testbed" to NNSA headquarters. The program proceeded in earnest in May 2017 with an announcement by Doug Wade (Director, Office of Advanced Simulation and Computing and Institutional R&D at NNSA) that Sandia National Laboratories (Sandia) would host the first Advanced Architecture Prototype platform based on the Arm architecture. In August 2017, Sandia formed a Tri-lab team chartered to develop a robust HPC software stack for Astra to support the Vanguard program goal of demonstrating the viability of Arm in supporting ASC production computing workloads. This document describes the high-level Vanguard program goals, the Vanguard-Astra project acquisition plan and procurement up to contract placement, the initial software stack environment planned for the Vanguard-Astra platform (Astra), how the communities of users will use the platform during the transition from the open network to the classified network, and initial performance results.


HPC top 10 InfiniBand Machine: a 3D Torus IB interconnect on Red Sky

Naegle, John H.; Monk, Stephen T.; Schutt, James A.; Doerfler, Douglas W.; Rajan, Mahesh R.

This presentation discusses the following topics: (1) Red Sky Background; (2) 3D Torus Interconnect Concepts; (3) Difficulties of Torus in IB; (4) New Routing Code for an IB 3D Torus; (5) Red Sky 3D Torus Implementation; and (6) Managing a Large IB Machine. Computing at Sandia falls into two categories: (1) capability computing, which is designed for scaling of single large runs and is usually proprietary for maximum performance, with Red Storm as Sandia's current capability machine; and (2) capacity computing, which is computing for the masses (100s of jobs and 100s of users) and requires extreme reliability and flexibility for a changing workload. Thunderbird will be decommissioned this quarter, Red Sky is our future capacity computing platform, and the Red Mesa machine is for the National Renewable Energy Lab. The main themes of Red Sky are: (1) cheaper: 5X the capacity of Tbird at 2/3 the cost, and substantially cheaper per flop than our last large capacity machine purchase; (2) leaner: lower operational costs, three security environments via a modular fabric, expandable, upgradeable, and extensible, and designed for a 6-year life cycle; and (3) greener: 15% less power (1/6th the power per flop), 40% less water (5M gallons saved annually), 10X better cooling efficiency, and a 4X denser footprint.
