Publications
Publication | Type | Year |
---|---|---|
Understanding Memory Failures on a Petascale Arm System. The 31st International Symposium on High-Performance Parallel and Distributed Computing | Conference Paper | 2022 |
SNL ATDM Software Ecosystem Operating Systems and On-Node Runtime. 2022 Exascale Computing Project Annual Meeting (Virtual) | Display or Poster (non-conference) | 2022 |
Characterizing Failures in HPC Using Benford's Law. The SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP22) | Conference Presentation | 2022 |
"Smarter" NICs for Faster Molecular Dynamics: A Case Study. 36th IEEE International Parallel & Distributed Processing Symposium | Conference Proceeding | 2022 |
Characterizing Per-node Memory Failures Using Benford's Law. FTXS 2021 Workshop on Fault Tolerance for HPC at eXtreme Scale held in conjunction with SC21 | Abstract | 2021 |
A Benchmark to Understand Communication Performance in Hybrid MPI and GPU Applications. ExaMPI21 Workshop on Exascale MPI | Conference Paper | 2021 |
Characterizing Memory Failures Using Benford's Law. 14th Workshop on Resiliency in High Performance Computing (Resilience) in Clusters, Clouds, and Grids | Conference Paper | 2021 |
Characterizing Per-node Memory Failures Using Benford's Law. Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS 2021) | Conference Paper | 2021 |
Evaluating MPI Resource Usage Summary Statistics. Journal of Parallel Computing | Journal Article | 2021 |
Understanding the Effects of DRAM Correctable Error Logging at Scale. IEEE Cluster Conference | Conference Paper | 2021 |
MiniMod: A Modular Miniapplication Benchmarking Framework for HPC. IEEE Cluster 2021 | Conference Paper | 2021 |
pMEMCPY: a simple, lightweight, and portable I/O library for storing data in persistent memory. REX-IO at IEEE Cluster 2021 | Conference Paper | 2021 |
An Initial Examination of the Effect of Container Resource Constraints on Application Perturbation. Workshop on Resource Arbitration for Dynamic Runtimes (RADR) | Conference Presentation | 2021 |
SNL ATDM Software Ecosystem Operating Systems and On-Node Runtime. 2021 Exascale Computing Project Annual Meeting (Virtual) | Display or Poster (non-conference) | 2021 |
Co-design of System Software for Compute Accelerators and SmartNICs. ASCR Workshop on Reimagining Codesign | Conference Paper | 2021 |
Examining the Impact of Approximate Coordination on Checkpoint/Restart. https://ckpt-symposium.lbl.gov/home | Abstract | 2020 |
Low-cost MPI Multithreaded Message Matching Benchmarking. International Conference on High Performance Computing and Communications (HPCC) | Conference Presentation | 2020 |
Low-cost MPI Multithreaded Message Matching Benchmarking. International Conference on High Performance Computing and Communications (HPCC) | Conference Paper | 2020 |
RaDD Runtimes: Radical and Different Distributed Runtimes with SmartNICs. International Conference for High Performance Computing, Networking, Storage and Analysis (SC) | Conference Presentation | 2020 |
RaDD Runtimes: Radical and Different Distributed Runtimes with SmartNICs. Fourth Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware | Conference Paper | 2020 |
Evaluating MPI Message Size Summary Statistics. EuroMPI/USA '20 | Conference Proceeding | 2020 |
FY20 CSSE L2 Milestone 7186. Completion of L2 Milestone 7186 | Presentation (non-conference) | 2020 |
Data Services for Visualization and Analysis – ASC Level II Milestone (7186) | SAND Report | 2020 |
ALAMO: Autonomous Lightweight Allocation, Management and Optimization. Smoky Mountains Computational Sciences and Engineering Conference | Conference Paper | 2020 |