Publications

Results 1–25 of 266

Enabling power measurement and control on Astra: The first petascale Arm supercomputer

Concurrency and Computation: Practice and Experience

Grant, Ryan E.; Hammond, Simon D.; Laros, James H.; Levenhagen, Michael J.; Olivier, Stephen L.; Pedretti, Kevin P.; Ward, Harry L.; Younge, Andrew J.

Astra, deployed in 2018, was the first petascale supercomputer to utilize processors based on the ARM instruction set. The system was also the first under Sandia's Vanguard program which seeks to provide an evaluation vehicle for novel technologies that with refinement could be utilized in demanding, large-scale HPC environments. In addition to ARM, several other important first-of-a-kind developments were used in the machine, including new approaches to cooling the datacenter and machine. This article documents our experiences building a power measurement and control infrastructure for Astra. While this is often beyond the control of users today, the accurate measurement, cataloging, and evaluation of power, as our experiences show, is critical to the successful deployment of a large-scale platform. While such systems exist in part for other architectures, Astra required new development to support the novel Marvell ThunderX2 processor used in compute nodes. In addition to documenting the measurement of power during system bring up and for subsequent on-going routine use, we present results associated with controlling the power usage of the processor, an area which is becoming of progressively greater interest as data centers and supercomputing sites look to improve compute/energy efficiency and find additional sources for full system optimization.

TYPE Conference Paper YEAR 2023

Scopus OSTI

StressBench: A Configurable Full System Network and I/O Benchmark Framework

Chester, D.G.C.; Groves, T.G.; Hammond, Simon D.; Law, Timothy L.; Wright, S.A.W.; Smedley-Stevenson, R.S.; Fahmy, S.A.; Mudalige, G.R.M.; Jarvis, S.A.J.

Abstract not provided.

TYPE Conference Proceeding YEAR 2021

OSTI DOI

A-SST Initial Specification

Rodrigues, Arun; Hammond, Simon D.; Hemmert, Karl S.; Hughes, Clayton H.; Kenny, Joseph P.; Voskuilen, Gwendolyn R.

The U.S. Army Research Office (ARO), in partnership with IARPA, is investigating innovative, efficient, and scalable computer architectures that are capable of executing next-generation large-scale data-analytic applications. These applications are increasingly sparse, unstructured, non-local, and heterogeneous. Under the Advanced Graphic Intelligence Logical computing Environment (AGILE) program, Performer teams will be asked to design computer architectures to meet the future needs of the DoD and the Intelligence Community (IC). This design effort will require flexible, scalable, and detailed simulation to assess the performance, efficiency, and validity of their designs. To support AGILE, Sandia National Labs will be providing the AGILE-enhanced Structural Simulation Toolkit (A-SST). This toolkit is a computer architecture simulation framework designed to support fast, parallel, and multi-scale simulation of novel architectures. This document describes the A-SST framework, some of its library of simulation models, and how it may be used by AGILE Performers.

TYPE SAND Report YEAR 2021

OSTI DOI

Towards an Extensible Framework for Accelerated System Simulation

Voskuilen, Gwendolyn R.; Rodrigues, Arun; Hughes, Clayton H.; Hemmert, Karl S.; Hammond, Simon D.

Abstract not provided.

TYPE Conference Poster YEAR 2021

OSTI DOI

Integrated System and Application Continuous Performance Monitoring and Analysis Capability

Aaziz, Omar R.; Allan, Benjamin A.; Brandt, James M.; Cook, Jeanine C.; Devine, Karen D.; Elliott, James E.; Gentile, Ann C.; Hammond, Simon D.; Kelley, Brian M.; Lopatina, Lena L.; Moore, Stan G.; Olivier, Stephen L.; Pedretti, Kevin P.; Poliakoff, David Z.; Pawlowski, Roger P.; Regier, Phillip A.; Schmitz, Mark E.; Schwaller, Benjamin S.; Surjadidjaja, Vanessa S.; Swan, Matthew S.; Tucker, Nick T.; Tucker, Tom T.; Vaughan, Courtenay T.; Walton, Sara P.

Scientific applications run on high-performance computing (HPC) systems are critical for many national security missions within Sandia and the NNSA complex. However, these applications often face performance degradation and even failures that are challenging to diagnose. To provide unprecedented insight into these issues, the HPC Development, HPC Systems, Computational Science, and Plasma Theory & Simulation departments at Sandia crafted and completed their FY21 ASC Level 2 milestone entitled "Integrated System and Application Continuous Performance Monitoring and Analysis Capability." The milestone created a novel integrated HPC system and application monitoring and analysis capability by extending Sandia's Kokkos application portability framework, Lightweight Distributed Metric Service (LDMS) monitoring tool, and scalable storage, analysis, and visualization pipeline. The extensions to Kokkos and LDMS enable collection and storage of application data during run time, as it is generated, with negligible overhead. This data is combined with HPC system data within the extended analysis pipeline to present relevant visualizations of derived system and application metrics that can be viewed at run time or post run. This new capability was evaluated using several week-long, 290-node runs of Sandia's ElectroMagnetic Plasma In Realistic Environments (EMPIRE) modeling and design tool and resulted in 1 TB of application data and 50 TB of system data. EMPIRE developers remarked that this capability was incredibly helpful for quickly assessing application health and performance alongside system state. In short, this milestone work built the foundation for an expansive HPC system and application data collection, storage, analysis, visualization, and feedback framework that will increase the total scientific output of Sandia's HPC users.

TYPE SAND Report YEAR 2021

OSTI DOI

StressBench: A Configurable Full System Network and I/O Benchmark Framework

Chester, D.C.; Groves, T.G.; Hammond, Simon D.; Law, Tim L.; Wright, S.A.W.; Smedley-Stevenson, R.; Fahmy, S.A.F.; Mudalige, G.R.; Jarvis, S.A.J.

Abstract not provided.

TYPE Conference Paper YEAR 2021

OSTI DOI

SST-Explorer: Enabling System-level Performance and Reliability Analysis for Designs with Real-World IPs

Rodrigues, Arun; Awad, Amro A.; Hughes, Clayton H.; Agarwal, Sapan A.; Skoufis, Michael S.; Voskuilen, Gwendolyn R.; Nema, Shubham N.; Razdan, Rohin R.; Gardner, Alan G.; Hemmert, Karl S.; Hammond, Simon D.

Abstract not provided.

TYPE Conference Poster YEAR 2021

OSTI DOI

ERAS: Enabling the Integration of Real-World Intellectual Properties (IPs) in Architectural Simulators

Nema, Shubham N.; Razdan, Rohin R.; Rodrigues, Arun; Hemmert, Karl S.; Voskuilen, Gwendolyn R.; Adak, Debratim A.; Hammond, Simon D.; Awad, Amro A.; Hughes, Clayton H.

Sandia National Laboratories is investigating scalable architectural simulation capabilities with a focus on simulating and evaluating highly scalable supercomputers for high performance computing applications. There is a growing demand for RTL model integration to provide the capability to simulate customized node architectures and heterogeneous systems. This report describes the first steps integrating the ESSENTial Signal Simulation Enabled by Netlist Transforms (ESSENT) tool with the Structural Simulation Toolkit (SST). ESSENT can emit C++ models from models written in FIRRTL to automatically generate components. The integration workflow will automatically generate the SST component and necessary interfaces to ’plug’ the ESSENT model into the SST framework.

TYPE SAND Report YEAR 2021

OSTI DOI

SST-Explorer: Enabling System-level Performance and Reliability Analysis for Designs with Real-World IPs

Rodrigues, Arun; Awad, Amro A.; Hughes, Clayton H.; Agarwal, Sapan A.; Skoufis, Michael S.; Voskuilen, Gwendolyn R.; Nema, Shubham N.; Razdan, Rohin R.; Gardner, Alan G.; Hemmert, Karl S.; Hammond, Simon D.

Abstract not provided.

TYPE Conference Presentation YEAR 2021

OSTI DOI

Integrated System and Application Continuous Performance Monitoring and Analysis Capability

Brandt, James M.; Cook, Jeanine C.; Aaziz, Omar R.; Allan, Benjamin A.; Devine, Karen D.; Elliott, James J.; Gentile, Ann C.; Hammond, Simon D.; Kelley, Brian M.; Lopatina, Lena L.; Moore, Stan G.; Olivier, Stephen L.; Pedretti, Kevin P.; Poliakoff, David Z.; Pawlowski, Roger P.; Regier, Phillip A.; Schmitz, Mark E.; Schwaller, Benjamin S.; Surjadidjaja, Vanessa S.; Swan, Matthew S.; Tucker, Tom T.; Tucker, Nick T.; Vaughan, Courtenay T.; Walton, Sara P.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

Vanguard-II Application Evaluation Thrust

Laros, James H.; Hammond, Simon D.; Pedretti, Kevin P.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

Sandia's Experiences with Arm

Younge, Andrew J.; Hammond, Simon D.; Pedretti, Kevin P.; Laros, James H.

Abstract not provided.

TYPE Conference Presentation YEAR 2021

OSTI DOI

Kokkos Tools for Productive Development

Poliakoff, David Z.; Hammond, Simon D.; Lewis, Cannada L.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

Fugaku and A64FX Update - April 2021

Hammond, Simon D.; Curry, Matthew J.; Davis, Kevin D.; Dang, Vinh Q.; Guba, Oksana G.; Hoekstra, Robert J.; Laros, James H.; Pedretti, Kevin P.; Poliakoff, David Z.; Rajamanickam, Sivasankaran R.; Trott, Christian R.; Younge, Andrew J.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

Experiences with Arm

Hammond, Simon D.; Laros, James H.; Pedretti, Kevin P.; Younge, Andrew J.; Hoekstra, Robert J.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

Enabling Application and System Data Fusion

Gentile, Ann C.; Brandt, James M.; Cook, Jeanine C.; Hammond, Simon D.; Poliakoff, David Z.; Schwaller, Benjamin S.; Surjadidjaja, Vanessa S.; Tucker, Tom

Abstract not provided.

TYPE Conference Presentation YEAR 2021

OSTI DOI

Codesign for the Masses

Lewis, Cannada L.; Hammond, Simon D.; Wilke, Jeremiah J.

In this position paper we address challenges and opportunities relating to the design and codesign of application-specific circuits. Given our background as computational scientists, our perspective is that of a highly motivated application developer rather than that of a career computer architect.

TYPE Other Report YEAR 2021

OSTI DOI

Using MLIR Framework for Codesign of ML Architectures Algorithms and Simulation Tools

Lewis, Cannada L.; Hughes, Clayton H.; Hammond, Simon D.; Rajamanickam, Sivasankaran R.

MLIR (Multi-Level Intermediate Representation) is an extensible compiler framework that supports high-level data structures and operation constructs. These higher-level code representations are particularly applicable to the artificial intelligence and machine learning (AI/ML) domain, allowing developers to more easily support upcoming heterogeneous AI/ML accelerators and develop flexible domain-specific compilers/frameworks with higher-level intermediate representations (IRs) and advanced compiler optimizations. Using MLIR within the LLVM compiler framework is expected to yield significant improvement in the quality of generated machine code, which in turn will result in improved performance and hardware efficiency.

TYPE Other Report YEAR 2021

OSTI DOI

DeACT: Architecture-Aware Virtual Memory Support for Fabric Attached Memory Systems

Proceedings - International Symposium on High-Performance Computer Architecture

Kommareddy, Vamsee R.; Hughes, Clayton H.; Hammond, Simon D.; Awad, Amro

The exponential growth of data has driven technology providers to develop new protocols, such as cache coherent interconnects and memory semantic fabrics, to help users and facilities leverage advances in memory technologies to satisfy these growing memory and storage demands. Using these new protocols, fabric-attached memories (FAM) can be directly attached to a system interconnect and be easily integrated with a variety of processing elements (PEs). Moreover, systems that support FAM can be smoothly upgraded and allow multiple PEs to share the FAM memory pools using well-defined protocols. The sharing of FAM between PEs allows efficient data sharing, improves memory utilization, reduces cost by allowing flexible integration of different PEs and memory modules from several vendors, and makes it easier to upgrade the system. One promising use-case for FAMs is in High-Performance Compute (HPC) systems, where the underutilization of memory is a major challenge. However, adopting FAMs in HPC systems brings new challenges. In addition to cost, flexibility, and efficiency, one particular problem that requires rethinking is virtual memory support for security and performance. To address these challenges, this paper presents decoupled access control and address translation (DeACT), a novel virtual memory implementation that supports HPC systems equipped with FAM. Compared to the state-of-the-art two-level translation approach, DeACT achieves speedup of up to 4.59x (1.8x on average) without compromising security. (Part of this work was done when Vamsee was working under the supervision of Amro Awad at UCF. Amro Awad is now with the ECE Department at NC State.)

TYPE Conference Paper YEAR 2021

Scopus OSTI DOI

Stealth-Persist: Architectural Support for Persistent Applications in Hybrid Memory Systems

Proceedings - International Symposium on High-Performance Computer Architecture

Alwadi, Mazen; Kommareddy, Vamsee R.; Hughes, Clayton H.; Hammond, Simon D.; Awad, Amro

Non-volatile memories (NVMs) have the characteristics of both traditional storage systems (persistent) and traditional memory systems (byte-addressable). However, they suffer from high write latency and have limited write endurance. Researchers have proposed hybrid memory systems that combine DRAM and NVM, utilizing the lower latency of the DRAM to hide some of the shortcomings of the NVM and improving system performance by caching resident NVM data in the DRAM. However, this can nullify the persistency of the cached pages, leading to a question of trade-offs in terms of performance and reliability. In this paper, we propose Stealth-Persist, a novel architecture support feature that allows applications that need persistence to run in the DRAM while maintaining the persistency features provided by the NVM. Stealth-Persist creates the illusion of a persistent memory for the application to use, while utilizing the DRAM for performance optimizations. Our experimental results show that Stealth-Persist improves the performance by 42.02% for persistent applications.

TYPE Conference Paper YEAR 2021

Scopus OSTI DOI

Vision for Co-designing a Unified-Memory Centric Heterogeneous Node Architecture

Rajamanickam, Sivasankaran R.; Krishna, Tushar K.; Hammond, Simon D.

Abstract not provided.

TYPE Conference Paper YEAR 2021

OSTI

SST-GPU: A Scalable SST GPU Component for Performance Modeling and Profiling

Hughes, Clayton H.; Hammond, Simon D.; Zhang, Mengchi Z.; Liu, Yechen L.; Rogers, Tim R.; Hoekstra, Robert J.

Programmable accelerators have become commonplace in modern computing systems. Advances in programming models and the availability of unprecedented amounts of data have created a space for massively parallel accelerators capable of maintaining context for thousands of concurrent threads resident on-chip. These threads are grouped and interleaved on a cycle-by-cycle basis among several massively parallel computing cores. One path for the design of future supercomputers relies on an ability to model the performance of these massively parallel cores at scale. The SST framework has been proven to scale up to run simulations containing tens of thousands of nodes. A previous report described the initial integration of the open-source, execution-driven GPU simulator, GPGPU-Sim, into the SST framework. This report discusses the results of the integration and how to use the new GPU component in SST. It also provides examples of what the component can be used to analyze and a correlation study showing how closely the simulated execution matches that of an NVIDIA V100 GPU when running kernels and mini-apps.

TYPE SAND Report YEAR 2021

OSTI DOI

Compute Memory Trends: from Application Requirements to Architectural Needs

Hammond, Simon D.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Vanadis - a Configurable Processor Core Model for SST

Hammond, Simon D.

Abstract not provided.

TYPE Presentation YEAR 2020

OSTI

Early Experiences with A64FX

Hammond, Simon D.; Younge, Andrew J.; Pedretti, Kevin P.; Laros, James H.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI
