Publications Search

Proceedings of PMBS 2019: Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis

Aaziz, Omar R.; Vaughan, Courtenay T.; Cook, Jonathan; Cook, Jeanine; Kuehn, Jeffery; Richards, David

In this work we investigate the dynamic communication behavior of parent and proxy applications and examine whether the dynamic communication behavior of each proxy matches that of its parent application. Proxy applications are meant to match their parents closely, exercising the hardware and performing similarly, so that lessons learned from them show how the HPC system and the application can best be utilized. We show that some proxy/parent pairs do not need the extra detail of dynamic behavior analysis, while others benefit from it; through this analysis we also identified a parent/proxy mismatch and improved the proxy application.

TYPE Conference Poster YEAR 2019

OSTI Scopus

Advanced Data Structures for National Cyber Security

Phillips, Cynthia A.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Optimization-Based Design

Valentin, Miguel A.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Modeling Assisted Atomic Precision Advanced Manufacturing (APAM) Towards Room Temperature Operation

Gao, Xujiao; Mamaluy, Denis; Anderson, Evan; Campbell, Deanna M.; Grine, Albert; Katzenmeyer, Aaron M.; Lu, T.M.; Schmucker, Scott W.; Tracy, Lisa A.; Ward, Daniel R.; Misra, Shashank

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

How to Write Code You're Not Embarrassed to Share

Milewicz, Reed M.

TYPE Presentation YEAR 2019

OSTI

How Robust Are Graph Neural Networks to Structural Noise?

Fox, James S.; Rajamanickam, Sivasankaran

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Kokkos Kernels: Library Based Approach for Performance Portable Sparse/Dense linear algebra and Graph Kernels

Rajamanickam, Sivasankaran

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

A Portable SIMD Primitive in Kokkos for Heterogeneous Architectures

Sahasrabudhe, Damodar; Phipps, Eric T.; Rajamanickam, Sivasankaran; Berzins, Martin

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

User Level Threading: Qthreads and OpenMP

Olivier, Stephen L.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

mdspan in C++: A Case Study in the Integration of Performance Portable Features into International Language Standards

Hollman, David S.; Lelbach, Bryce A.; Edwards, H.C.; Hoemmen, Mark F.; Sunderland, Daniel; Trott, Christian R.

Abstract not provided.

TYPE Presentation YEAR 2019

DOI OSTI

Predicting Molecular Toxicity from Structural Identity

Wright, Catherine; Rajamanickam, Sivasankaran

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Benchmarking quantum computers with robust phase estimation of molecular hydrogen

Russo, Antonio E.; Baczewski, Andrew D.; Morrison, Benjamin; Rudinger, Kenneth M.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

From First-Principles toward atomistic understanding of mechanical properties in High-Entropy Alloys

Schultz, Peter A.; Tranchida, Julien; Chandross, Michael E.; Thompson, Aidan P.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

INCA: In-Network Compute Assistance

Schonbein, Whit; Grant, Ryan; Dosanjh, Matthew G.F.; Arnold, Dorian

Abstract not provided.

TYPE Conference Poster YEAR 2019

DOI OSTI

RAPDARTS: Resource-Aware Progressive Differentiable Architecture Search

Vineyard, Craig; Helinski, Ryan; Koc, Cetin; Green, Sam

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

The Case for Modular Generalizable Proxy Applications for Systems Software Research

Marts, William P.; Dosanjh, Matthew G.F.; Grant, Ryan; Bridges, Patrick

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

The Kokkos C++ Performance Portability Programming Model

Trott, Christian R.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Evaluation of Programming Models to Address Load Imbalance on Distributed Multi-Core CPUs: A Case Study with Block Low-Rank Factorization

Proceedings of PAW-ATM 2019: Parallel Applications Workshop, Alternatives to MPI+X, Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis

Pei, Yu; Bosilca, George; Yamazaki, Ichitaro; Ida, Akihiro; Dongarra, Jack

To minimize data movement, many parallel applications statically distribute computational tasks among the processes. However, modern simulations often encounter irregular computational tasks whose computational loads change dynamically at runtime or are data dependent. As a result, load imbalance among the processes at each step of the simulation is a natural situation that must be dealt with at the programming level. The de facto parallel programming approach, flat MPI (one process per core), is hardly suitable for managing this imbalance: it imposes significant idle time on the simulation, because processes have to wait for the slowest process at each step. One critical application for many domains is the LU factorization of a large dense matrix stored in the Block Low-Rank (BLR) format. Using the low-rank format can significantly reduce the cost of factorization in many scientific applications, including the boundary element analysis of electrostatic fields. However, partitioning the matrix based on the underlying geometry leads to matrix blocks of different sizes whose numerical ranks change at each step of factorization, which causes load imbalance among the processes at each step. We use BLR LU factorization as a test case to study the programmability and performance of five different programming approaches: (1) flat MPI, (2) Adaptive MPI (Charm++), (3) MPI + OpenMP, (4) parameterized task graph (PTG), and (5) dynamic task discovery (DTD). The last two versions use a task-based paradigm to express the algorithm; we rely on the PaRSEC runtime system to execute the tasks. We first point out programming features needed to efficiently solve this category of problems, hinting at possible alternatives to the MPI+X programming paradigm. We then evaluate the programmability of the different approaches, detailing our experience implementing the algorithm using each of the models. Finally, we show performance results on the Intel Haswell-based Bridges system at the Pittsburgh Supercomputing Center (PSC) and analyze the effectiveness of the implementations in addressing the load imbalance.

TYPE Conference Poster YEAR 2019

OSTI Scopus
