Center for Computing Research (CCR)

The Portals 4.3 Network Programming Interface

Schonbein, William W.; Barrett, Brian W.; Brightwell, Ronald B.; Grant, Ryan G.; Hemmert, Karl S.; Pedretti, Kevin P.; Underwood, Keith U.; Riesen, Rolf R.; Hoefler, Torsten H.; Barbe, Mathieu B.; Filho, Luiz H.; Ratchov, Alexandre R.; Maccabe, Arthur B.

This report presents a specification for the Portals 4 network programming interface. Portals 4 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4 is well suited to massively parallel processing and embedded systems. Portals 4 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.

TYPE SAND Report YEAR 2022

OSTI DOI

SNL ATDM Software Ecosystem Operating Systems and On-Node Runtime

Olivier, Stephen L.; Brightwell, Ronald B.; Ferreira, Kurt B.; Grant, Ryan E.; Levy, Scott L.; Pedretti, Kevin P.; Younge, Andrew J.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

HPC Operating System Research Areas and Challenges

Pedretti, Kevin P.; Brightwell, Ronald B.; Younge, Andrew J.; Lange, Jack L.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

The Hardware of Smaller Clusters (V.3.0)

Lacy, Susan L.; Brightwell, Ronald B.

Chris Saunders and three technologists are in high demand from Sandia’s deep learning teams, and they’re kept busy by building new clusters of computer nodes for researchers who need the power of supercomputing on a smaller scale. Sandia researchers working on Laboratory Directed Research & Development (LDRD) projects, or innovative ideas for solutions on short timeframes, formulate new ideas on old themes and frequently rely on smaller cluster machines to help solve problems before introducing their code to larger HPC resources. These research teams need an agile hardware and software environment where nascent ideas can be tested and cultivated on a smaller scale.

TYPE Other Report YEAR 2020

OSTI DOI

Chronicles of Astra: Challenges and lessons from the first petascale Arm supercomputer

International Conference for High Performance Computing, Networking, Storage and Analysis, SC

Pedretti, Kevin P.; Younge, Andrew J.; Hammond, Simon D.; Laros, James H.; Curry, Matthew J.; Aguilar, Michael J.; Hoekstra, Robert J.; Brightwell, Ronald B.

Arm processors have been explored in HPC for several years; however, there has not yet been a demonstration of viability for supporting large-scale production workloads. In this paper, we offer a retrospective on the process of bringing up Astra, the first Petascale supercomputer based on 64-bit Arm processors, and validating its ability to run production HPC applications. Through this process several immature technology gaps were addressed, including software stack enablement, Linux bugs at scale, thermal management issues, power management capabilities, and advanced container support. From this experience, several lessons learned are formulated that contributed to the successful deployment of Astra. These insights can be helpful to accelerate deploying and maturing other first-seen HPC technologies. With Astra now supporting many users running a diverse set of production applications at multi-thousand node scales, we believe this constitutes strong supporting evidence that Arm is a viable technology for even the largest-scale supercomputer deployments.

TYPE Conference Poster YEAR 2020

Scopus OSTI

ALAMO: Autonomous Lightweight Allocation Management and Optimization

Brightwell, Ronald B.; Ferreira, Kurt B.; Grant, Ryan E.; Levy, Scott L.; Lofstead, Gerald F.; Olivier, Stephen L.; Pedretti, Kevin P.; Younge, Andrew J.; Gentile, Ann C.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

September 2019 ECP ST Project Review

Trujillo, Gabrielle T.; Turner, Daniel Z.; Brightwell, Ronald B.; Oldfield, Ron A.; Clay, Robert L.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Meeting the Future Needs of HPC with MPI

Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Thoughts on Autonomous Resource Management for HPC

Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Opportunities and Challenges for Accelerated Network Interfaces in HPC

Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Memory Technology Impacts on Current Near-Term and Future Systems

Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Meeting the Future Needs of HPC with MPI

Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Finepoints: Partitioned multithreaded MPI communication

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Grant, Ryan E.; Dosanjh, Matthew D.; Levenhagen, Michael J.; Brightwell, Ronald B.; Skjellum, Anthony

The MPI multithreading model has been historically difficult to optimize; the interface that it provides for threads was designed as a process-level interface. This model has led to implementations that treat function calls as critical regions and protect them with locks to avoid race conditions. We hypothesize that an interface designed specifically for threads can provide superior performance than current approaches and even outperform single-threaded MPI. In this paper, we describe a design for partitioned communication in MPI that we call finepoints. First, we assess the existing communication models for MPI two-sided communication and then introduce finepoints as a hybrid of MPI models that has the best features of each existing MPI communication model. In addition, “partitioned communication” created with finepoints leverages new network hardware features that cannot be exploited with current MPI point-to-point semantics, making this new approach both innovative and useful both now and in the future. To demonstrate the validity of our hypothesis, we implement a finepoints library and show improvements against a state-of-the-art multithreaded optimized Open MPI implementation on a Cray XC40 with an Aries network. Our experiments demonstrate up to a 12× reduction in wait time for completion of send operations. This new model is shown working on a nuclear reactor physics neutron-transport proxy-application, providing up to 26.1% improvement in communication time and up to 4.8% improvement in runtime over the best performing MPI communication mode, single-threaded MPI.

TYPE Conference Poster YEAR 2019

Scopus OSTI DOI

SNL ATDM Software Ecosystem

Olivier, Stephen L.; Brightwell, Ronald B.; Pedretti, Kevin P.; Younge, Andrew J.; Evans, Noah; Levy, Scott L.; Ferreira, Kurt B.; Grant, Ryan E.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

The Portals 4.2 Network Programming Interface

Barrett, Brian W.; Brightwell, Ronald B.; Grant, Ryan E.; Hemmert, Karl S.; Pedretti, Kevin P.; Wheeler, Kyle W.; Riesen, Rolf R.; Hoefler, Torsten H.; Maccabe, Arthur B.; Hudson, Trammell H.

This report presents a specification for the Portals 4 network programming interface. Portals 4 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4 is well suited to massively parallel processing and embedded systems. Portals 4 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.

TYPE SAND Report YEAR 2018

OSTI DOI

System Software Perspective on Resilience

Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Vanguard Astra: Maturing the ARM Software Ecosystem for U.S. DOE/ASC Supercomputing

Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Hardware/Software Co-Design for High Performance Interconnects for Extreme-Scale Systems

Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

Portals 4: Status of Specification and Implementation

Younge, Andrew J.; Grant, Ryan E.; Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

Improving MPI Multi-threaded RMA Performance

Hjelm, Nathan T.; Dosanjh, Matthew D.; Groves, Taylor G.; Grant, Ryan E.; Brightwell, Ronald B.; Bridges, Patrick B.; Arnold, Dorian A.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI DOI

Architectural Convergence of Big Data and Extreme-Scale Computing: Marriage of Convenience or Conviction

Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Resource Management Challenges in the Era of Extreme Heterogeneity

Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

A Tale of Two Systems: Using Containers to Deploy HPC Applications on Supercomputers and Clouds

Younge, Andrew J.; Pedretti, Kevin P.; Grant, Ryan E.; Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI DOI

ATDM Operating Systems and On-Node Runtime

Olivier, Stephen L.; Pedretti, Kevin P.; Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

Enhancing Qthreads for ECP Science and Energy Impact

Brightwell, Ronald B.; Olivier, Stephen L.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

December 2017 ECP ST Project Review: ECP Project WBS 2.3.1.15 (Qthreads)

Brightwell, Ronald B.; Olivier, Stephen L.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

December 2017 ECP ST Project Review: ECP Project WBS 2.3.5.04 (SNL ATDM Software Ecosystem)

Brightwell, Ronald B.

Abstract not provided.

TYPE Other Report YEAR 2017

OSTI DOI

sPIN: High-performance streaming processing in the network

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017

Hoefler, Torsten; Di Girolamo, Salvatore; Taranov, Konstantin; Grant, Ryan E.; Brightwell, Ronald B.

Optimizing communication performance is imperative for large-scale computing because communication overheads limit the strong scalability of parallel applications. Today's network cards contain rather powerful processors optimized for data movement. However, these devices are limited to fixed functions, such as remote direct memory access. We develop sPIN, a portable programming model to offload simple packet processing functions to the network card. To demonstrate the potential of the model, we design a cycle-accurate simulation environment by combining the network simulator LogGOPSim and the CPU simulator gem5. We implement offloaded message matching, datatype processing, and collective communications and demonstrate transparent full-application speedups. Furthermore, we show how sPIN can be used to accelerate redundant in-memory filesystems and several other use cases. Our work investigates a portable packet-processing network acceleration model similar to compute acceleration with CUDA or OpenCL. We show how such network acceleration enables an ecosystem that can significantly speed up applications and system services.

TYPE Conference Poster YEAR 2017

Scopus OSTI

A Tale of Two Systems: Using Containers to Deploy HPC Applications on Supercomputers and Clouds

Younge, Andrew J.; Pedretti, Kevin P.; Grant, Ryan E.; Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI DOI

Enabling Diverse Software Stacks on Supercomputers Using High Performance Virtual Clusters

Proceedings - IEEE International Conference on Cluster Computing, ICCC

Younge, Andrew J.; Pedretti, Kevin P.; Grant, Ryan E.; Gaines, Brian G.; Brightwell, Ronald B.

While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component to large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed computing models. However, system software for such capabilities on the latest extreme-scale DOE supercomputing needs to be enhanced to more appropriately support these types of emerging software ecosystems. In this paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifically, we have deployed the KVM hypervisor within Cray's Compute Node Linux on an XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead of our solution using HPC benchmarks, both evaluating single-node performance as well as weak scaling of a 32-node virtual cluster. Overall, we find single node performance of our solution using KVM on a Cray is very efficient with near-native performance. However, overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, effectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.

TYPE Conference Poster YEAR 2017

Scopus OSTI

What Will Determine the Future Success of MPI?

Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Sandia's ARM-centric Co-Design Strategy: Introduction to the NNSA/ASC Vanguard Project

Ang, James A.; Brightwell, Ronald B.; Hammond, Simon D.; Hemmert, Karl S.; Hoekstra, Robert J.; Laros, James H.; Pedretti, Kevin P.; Rodrigues, Arun

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters

Younge, Andrew J.; Pedretti, Kevin P.; Grant, Ryan E.; Gaines, Brian G.; Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Challenges and Opportunities for HPC Interconnects and MPI

Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

Preparing MPI for Exascale

Grant, Ryan E.; Dosanjh, Matthew D.; Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

HPC Co-Design

Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

Embracing Diversity: OS Support for Integrating High-Performance Computing and Data Analytics

Brightwell, Ronald B.; Pedretti, Kevin P.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

The Portals 4.1 Network Programming Interface

Barrett, Brian W.; Brightwell, Ronald B.; Grant, Ryan E.; Hemmert, Karl S.; Pedretti, Kevin P.; Wheeler, Kyle W.; Underwood, Keith; Riesen, Rolf R.; Maccabe, Arthur B.; Hudson, Trammell H.

This report presents a specification for the Portals 4 network programming interface. Portals 4 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Portals 4 is well suited to massively parallel processing and embedded systems. Portals 4 represents an adaption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4 is targeted to the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.

TYPE SAND Report YEAR 2017

OSTI DOI

Qthreads and On-Node Runtime Coordination

Olivier, Stephen L.; Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

ATDM Operating System Project: A Multi-Stack Approach for Application Composition and Performance

Pedretti, Kevin P.; Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Enhancing Qthreads for ECP Science and Energy Impact And Sandia ATDM On-Node Runtime Coordination

Brightwell, Ronald B.; Olivier, Stephen L.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

Modeling Concurrent Point-to-Point Communication Cost in MPI Performance Models

Farmer, Shane F.; Skjellum, Anthony S.; Bridges, Patrick G.; Dosanjh, Matthew D.; Grant, Ryan E.; Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI

Embracing Diversity: OS Support for Integrating High-Performance Computing and Data Analytics

Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2016

OSTI

RMA-MT: A Benchmark Suite for Assessing MPI Multi-threaded RMA Performance

Proceedings - 2016 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2016

Dosanjh, Matthew D.; Groves, Taylor G.; Grant, Ryan E.; Brightwell, Ronald B.; Bridges, Patrick G.

Reaching Exascale will require leveraging massive parallelism while potentially leveraging asynchronous communication to help achieve scalability at such large levels of concurrency. MPI is a good candidate for providing the mechanisms to support communication at such large scales. Two existing MPI mechanisms are particularly relevant to Exascale: multi-threading, to support massive concurrency, and Remote Memory Access (RMA), to support asynchronous communication. Unfortunately, multi-threaded MPI RMA code has not been extensively studied. Part of the reason for this is that no public benchmarks or proxy applications exist to assess its performance. The contributions of this paper are the design and demonstration of the first available proxy applications and micro-benchmark suite for multi-threaded RMA in MPI, a study of multi-threaded RMA performance of different MPI implementations, and an evaluation of how these benchmarks can be used to test development for both performance and correctness.

TYPE Conference Poster YEAR 2016

Scopus OSTI DOI

Qthreads: Run Time Library Support for Task Parallel Programming

Brightwell, Ronald B.; Olivier, Stephen L.

Abstract not provided.

TYPE Presentation YEAR 2016

OSTI

Hobbes: A Multi‐Stack Approach for Application Composition and Performance Isolation

Pedretti, Kevin P.; Brightwell, Ronald B.; Mukherjee, Shyamali M.; Evans, Noah; Kocoloski, Brian; Ouyang, Jiannan O.; Peter, Dinda P.; Hale, Kyle H.; Bridges, Patrick B.; Mondragon, Oscar H.; Lang, Michael L.

Abstract not provided.

TYPE Presentation YEAR 2016

OSTI

OS/Runtime Abstractions and Interfaces for Managing the Memory Hierarchy

Brightwell, Ronald B.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI

XPRESS: eXascale PRogramming Environment and System Software

Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2016

OSTI

XPRESS: eXascale Programming Environment and System Software

Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

Practical resilient cases for FA-MPI, a transactional fault-tolerant MPI

Proceedings of the 3rd ExaMPI Workshop at the International Conference on High Performance Computing, Networking, Storage and Analysis, SC 2015

Hassani, Amin; Skjellum, Anthony; Bangalore, Purushotham V.; Brightwell, Ronald B.

MPI is insufficient when confronting failures. FA-MPI (Fault-Aware MPI) provides extensions to the MPI standard designed to enable data-parallel applications to achieve resilience without sacrificing scalability. FA-MPI introduces transactions as a novel extension to the MPI message-passing model. Transactions support failure detection, isolation, mitigation, and recovery via application-driven policies. To achieve maximum achievable performance of modern machines, overlapping communication and I/O with computation through non-blocking operations is of growing importance. Therefore, we emphasize fault-tolerant, non-blocking communication operations plus a set of nestable lightweight transactional TryBlock API extensions able to exploit system and application hierarchy. This strategy enables applications to run to completion with higher probability than nominally. We modified two proxy applications, MiniFE and LULESH, by adding FA-MPI semantics to them. Finally, we present performance and overhead results for 1K MPI processes.

TYPE Conference Poster YEAR 2015

Scopus OSTI
