Center for Computing Research (CCR)

Using Tpetra without CUDA UVM

Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

Trilinos Users Group Data Services Update

Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

Integrating PGAS and MPI-based Graph Analysis

McCrary, Trevor M.; Devine, Karen D.; Younge, Andrew J.

This project demonstrates that Chapel programs can interface with MPI-based libraries written in C++ without storing multiple copies of shared data. Chapel is a language for productive parallel computing that uses a partitioned global address space (PGAS). We identified two approaches to interface Chapel code with the MPI-based Grafiki and Trilinos libraries. The first uses a single Chapel executable to call a C function that interacts with the C++ libraries. The second uses the mmap function to allow separate executables to read and write to the same block of memory on a node. We also encapsulated the second approach in Docker/Singularity containers to maximize ease of use. Comparisons of the two approaches using shared and distributed memory installations of Chapel show that both approaches provide similar scalability and performance.
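The mmap-based approach can be illustrated with a minimal Python sketch. It is not the project's Chapel/C++ code: the backing file name, the block size, and the helper names `write_values` and `read_values` are hypothetical, and both "sides" run in one process here for brevity. The point is only that two mappings of the same file see the same bytes.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical backing file; the report does not name the one it used.
path = os.path.join(tempfile.mkdtemp(), "shared_block.bin")
N = 4  # number of 64-bit integers in the shared block

# Create and size the file so that it can be mapped.
with open(path, "wb") as f:
    f.write(b"\x00" * (8 * N))

def write_values(values):
    """Writer side: map the file and store integers in place."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 8 * N) as m:
        m[:] = struct.pack(f"{N}q", *values)
        m.flush()

def read_values():
    """Reader side: map the same file and decode the integers."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 8 * N) as m:
        return list(struct.unpack(f"{N}q", m[:]))

write_values([10, 20, 30, 40])
shared = read_values()
print(shared)
```

In a real deployment the writer and reader would be separate executables on the same node, each opening and mapping the same file.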

TYPE Other Report YEAR 2021

OSTI DOI

Integrated System and Application Continuous Performance Monitoring and Analysis Capability

Aaziz, Omar R.; Allan, Benjamin A.; Brandt, James M.; Cook, Jeanine C.; Devine, Karen D.; Elliott, James J.; Gentile, Ann C.; Hammond, Simon D.; Kelley, Brian M.; Lopatina, Lena L.; Moore, Stan G.; Olivier, Stephen L.; Pedretti, Kevin P.; Poliakoff, David Z.; Pawlowski, Roger P.; Regier, Phillip A.; Schmitz, Mark E.; Schwaller, Benjamin S.; Surjadidjaja, Vanessa S.; Swan, Matthew S.; Tucker, Nick T.; Tucker, Tom T.; Vaughan, Courtenay T.; Walton, Sara P.

Scientific applications run on high-performance computing (HPC) systems are critical for many national security missions within Sandia and the NNSA complex. However, these applications often face performance degradation and even failures that are challenging to diagnose. To provide unprecedented insight into these issues, the HPC Development, HPC Systems, Computational Science, and Plasma Theory & Simulation departments at Sandia crafted and completed their FY21 ASC Level 2 milestone entitled "Integrated System and Application Continuous Performance Monitoring and Analysis Capability." The milestone created a novel integrated HPC system and application monitoring and analysis capability by extending Sandia's Kokkos application portability framework, Lightweight Distributed Metric Service (LDMS) monitoring tool, and scalable storage, analysis, and visualization pipeline. The extensions to Kokkos and LDMS enable collection and storage of application data during run time, as it is generated, with negligible overhead. This data is combined with HPC system data within the extended analysis pipeline to present relevant visualizations of derived system and application metrics that can be viewed at run time or post run. This new capability was evaluated using several week-long, 290-node runs of Sandia's ElectroMagnetic Plasma In Realistic Environments (EMPIRE) modeling and design tool and resulted in 1 TB of application data and 50 TB of system data. EMPIRE developers remarked that this capability was incredibly helpful for quickly assessing application health and performance alongside system state. In short, this milestone work built the foundation for an expansive HPC system and application data collection, storage, analysis, visualization, and feedback framework that will increase the total scientific output of Sandia's HPC users.

TYPE SAND Report YEAR 2021

OSTI DOI

Integrated System and Application Continuous Performance Monitoring and Analysis Capability

Brandt, James M.; Cook, Jeanine C.; Aaziz, Omar R.; Allan, Benjamin A.; Devine, Karen D.; Elliott, James J.; Gentile, Ann C.; Hammond, Simon D.; Kelley, Brian M.; Lopatina, Lena L.; Moore, Stan G.; Olivier, Stephen L.; Pedretti, Kevin P.; Poliakoff, David Z.; Pawlowski, Roger P.; Regier, Phillip A.; Schmitz, Mark E.; Schwaller, Benjamin S.; Surjadidjaja, Vanessa S.; Swan, Matthew S.; Tucker, Tom T.; Tucker, Nick T.; Vaughan, Courtenay T.; Walton, Sara P.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

Integrating PGAS and MPI-Based Graph Analysis

McCrary, Trevor M.; Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

ExaGraph: Partitioning and Coloring

Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.; Acer, Seher A.; Bogle, Ian A.; Slota, George M.; Madduri, Kamesh M.; Gilbert, Michael S.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

A Career of Load Balancing

Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

Removal of the UVM Requirement from Tpetra: MultiVector and BlockMultiVector

Devine, Karen D.; Danielson, Geoffrey C.; Fuller, Timothy J.; Hu, Jonathan J.; Kelley, Brian M.; Kim, Kyungjoo K.; Siefert, Christopher S.; Smith, Timothy A.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

Advanced Partitioning Strategies for Scalable Remapping in Climate Models

Grindeanu, Iulian G.; Mahadevan, Vijay S.; Devine, Karen D.

Abstract not provided.

TYPE Conference Poster YEAR 2021

OSTI DOI

Distributed Memory Graph Coloring Algorithms for Multiple GPUs

Proceedings of IA3 2020: 10th Workshop on Irregular Applications: Architectures and Algorithms, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis

Bogle, Ian; Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.; Slota, George M.

Graph coloring is often used in parallelizing scientific computations that run in distributed and multi-GPU environments; it identifies sets of independent data that can be updated in parallel. Many algorithms exist for graph coloring on a single GPU or in distributed memory, but hybrid MPI+GPU algorithms have been unexplored until this work, to the best of our knowledge. We present several MPI+GPU coloring approaches that use implementations of the distributed coloring algorithms of Gebremedhin et al. and the shared-memory algorithms of Deveci et al. The on-node parallel coloring uses implementations in KokkosKernels, which provide parallelization for both multicore CPUs and GPUs. We further extend our approaches to solve for distance-2 coloring, giving the first known distributed and multi-GPU algorithm for this problem. In addition, we propose novel methods to reduce communication in distributed graph coloring. Our experiments show that our approaches operate efficiently on inputs too large to fit on a single GPU and scale up to graphs with 76.7 billion edges running on 128 GPUs.
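A common pattern in distributed coloring is to color speculatively with stale neighbor information, detect conflicts, and recolor the losers. The serial Python sketch below shows that pattern for distance-1 coloring. It is not the paper's MPI+GPU implementation: the `parts` list merely stands in for per-process ownership, and the function names and the rule that the higher-numbered endpoint of a conflicting edge is recolored are assumptions made for illustration.

```python
def first_fit(vertex, adj, color):
    """Smallest color not used by any already-colored neighbor."""
    used = {color[n] for n in adj[vertex] if color[n] is not None}
    c = 0
    while c in used:
        c += 1
    return c

def speculative_coloring(adj, parts):
    """Color parts independently, then repair conflicts until none remain.

    adj:   dict mapping each integer vertex to its neighbor list (undirected)
    parts: list of vertex lists, standing in for per-process ownership
    """
    owner = {v: i for i, p in enumerate(parts) for v in p}
    color = {v: None for v in adj}
    to_color = [list(p) for p in parts]
    while any(to_color):
        # Speculative phase: each part sees colors kept from earlier rounds
        # plus its own new choices, but not other parts' new choices.
        snapshot = dict(color)
        for part in to_color:
            local = dict(snapshot)
            for v in part:
                local[v] = first_fit(v, adj, local)
                color[v] = local[v]
        # Conflict phase: on a same-colored edge, the higher-numbered
        # endpoint gives up its color and is recolored next round.
        to_color = [[] for _ in parts]
        for v in adj:
            if any(color[n] == color[v] and n < v for n in adj[v]):
                to_color[owner[v]].append(v)
        for part in to_color:
            for v in part:
                color[v] = None
    return color

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
coloring = speculative_coloring(adj, parts=[[0, 3], [1, 2]])
print(coloring)
```

The loop terminates because the lowest-numbered vertex recolored in a round never loses a conflict, so the set of vertices awaiting recoloring shrinks every round.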

TYPE Conference Paper YEAR 2020

Scopus OSTI

Distributed Graph Coloring on Multiple GPUs

Bogle, Ian A.; Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.; Slota, George M.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Distributed Memory Graph Coloring Algorithms for Multiple GPUs

Bogle, Ian A.; Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.; Slota, George M.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

Attributing Performance Variation from Integrated Application and System Data

Aaziz, Omar R.; Allan, Benjamin A.; Brandt, James M.; Cook, Jeanine C.; Devine, Karen D.; Elliott, James J.; Gentile, Ann C.; Olivier, Stephen L.; Pedretti, Kevin P.; Tucker, Tom T.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

Distributed Biconnectivity

Bogle, Ian A.; Slota, George M.; Rajamanickam, Sivasankaran R.; Devine, Karen D.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

State of the Tpetra Linear Solver Stack

Siefert, Christopher S.; Devine, Karen D.; Hoemmen, Mark F.; Hu, Jonathan J.; Kelley, Brian M.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

Task Placement for Improvement of Parallel Scalability

Ellis, John E.; Devine, Karen D.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Trilinos Data Services 2019

Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Geometric mapping of tasks to processors on parallel computers with mesh or torus networks

IEEE Transactions on Parallel and Distributed Systems

Deveci, Mehmet; Devine, Karen D.; Pedretti, Kevin P.; Taylor, Mark A.; Rajamanickam, Sivasankaran R.; Çatalyürek, Ümit V.

We present a new method for reducing parallel applications’ communication time by mapping their MPI tasks to processors in a way that lowers the distance messages travel and the amount of congestion in the network. Assuming geometric proximity among the tasks is a good approximation of their communication interdependence, we use a geometric partitioning algorithm to order both the tasks and the processors, assigning task parts to the corresponding processor parts. In this way, interdependent tasks are assigned to “nearby” cores in the network. We also present a number of algorithmic optimizations that exploit specific features of the network or application to further improve the quality of the mapping. We specifically address the case of sparse node allocation, where the nodes assigned to a job are not necessarily located in a contiguous block nor within close proximity to each other in the network. However, our methods generalize to contiguous allocations as well, and results are shown for both contiguous and non-contiguous allocations. We show that, for the structured finite difference mini-application MiniGhost, our mapping methods reduced communication time by up to 75% relative to MiniGhost’s default mapping on 128K cores of a Cray XK7 with sparse allocation. For the atmospheric modeling code E3SM/HOMME, our methods reduced communication time by up to 31% on 16K cores of an IBM BlueGene/Q with contiguous allocation.
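The core idea, ordering tasks and processors with the same geometric partitioner and pairing them by position, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's method in full: a simple recursive coordinate bisection stands in for the geometric partitioner, the function names are hypothetical, and none of the network-specific optimizations are reproduced.

```python
def rcb_order(points, ids=None):
    """Order point ids by recursive bisection along the longest dimension."""
    if ids is None:
        ids = list(range(len(points)))
    if len(ids) <= 1:
        return list(ids)

    def extent(d):
        vals = [points[i][d] for i in ids]
        return max(vals) - min(vals)

    # Cut perpendicular to the dimension with the largest extent.
    d = max(range(len(points[ids[0]])), key=extent)
    ids = sorted(ids, key=lambda i: (points[i][d], i))
    mid = len(ids) // 2
    return rcb_order(points, ids[:mid]) + rcb_order(points, ids[mid:])

def map_tasks(task_coords, core_coords):
    """Pair the k-th task in geometric order with the k-th core in geometric order."""
    return dict(zip(rcb_order(task_coords), rcb_order(core_coords)))

# Four tasks on a 2x2 grid; four cores at scattered positions on a 1D network,
# as might happen with a sparse (non-contiguous) allocation.
task_coords = [(0, 0), (0, 1), (1, 0), (1, 1)]
core_coords = [(5,), (0,), (9,), (1,)]
mapping = map_tasks(task_coords, core_coords)
print(mapping)
```

In this toy case, tasks 0 and 1, which are neighbors in the task grid, land on the cores at positions 0 and 1, and tasks 2 and 3 land on the cores at positions 5 and 9.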

TYPE Journal Article YEAR 2019

Scopus OSTI DOI

A parallel graph algorithm for detecting mesh singularities in distributed memory ice sheet simulations

ACM International Conference Proceeding Series

Bogle, Ian; Devine, Karen D.; Perego, Mauro P.; Rajamanickam, Sivasankaran R.; Slota, George M.

We present a new, distributed-memory parallel algorithm for detection of degenerate mesh features that can cause singularities in ice sheet mesh simulations. Identifying and removing mesh features such as disconnected components (icebergs) or hinge vertices (peninsulas of ice detached from the land) can significantly improve the convergence of iterative solvers. Because the ice sheet evolves during the course of a simulation, it is important that the detection algorithm can run in situ with the simulation, running in parallel and taking a negligible amount of computation time, so that degenerate features (e.g., calving icebergs) can be detected as they develop. We present a distributed memory, BFS-based label-propagation approach to degenerate feature detection that is efficient enough to be called at each step of an ice sheet simulation, while correctly identifying all degenerate features of an ice sheet mesh. Our method finds all degenerate features in a mesh with 13 million vertices in 0.0561 seconds on 1536 cores in the MPAS Albany Land Ice (MALI) model. Compared to the previously used serial pre-processing approach, we observe a 46,000x speedup for our algorithm, and provide additional capability to do dynamic detection of degenerate features in the simulation.
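A serial Python sketch of the iceberg part of this idea follows. It assumes, for illustration only, that a mesh component is kept when it contains at least one grounded vertex; the function name, the `grounded` input, and the toy mesh are hypothetical, and hinge-vertex detection and the distributed-memory aspects are not shown.

```python
from collections import deque

def find_icebergs(adj, grounded):
    """Return the vertices that no grounded vertex can reach through mesh edges."""
    reached = set(grounded)
    frontier = deque(grounded)
    # Breadth-first propagation of the "connected to grounded ice" label.
    while frontier:
        v = frontier.popleft()
        for n in adj[v]:
            if n not in reached:
                reached.add(n)
                frontier.append(n)
    return sorted(set(adj) - reached)

# Vertices 0-2 form a chain attached to grounded vertex 0;
# vertices 3 and 4 form a detached component.
adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
icebergs = find_icebergs(adj, grounded=[0])
print(icebergs)
```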

TYPE Conference Poster YEAR 2019

Scopus OSTI DOI

A Parallel Graph Algorithm for Detecting Mesh Singularities in Distributed Memory Ice Sheet Simulations

Bogle, Ian A.; Devine, Karen D.; Perego, Mauro P.; Rajamanickam, Sivasankaran R.; Slota, George M.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI DOI

FASTMath: Frameworks, Algorithms, and Scalable Technologies for Mathematics

Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

FASTMath: Kokkos Kernels and Linear Solvers

Rajamanickam, Sivasankaran R.; Bogle, Ian A.; Hu, Jonathan J.; Devine, Karen D.; Slota, George M.; Perego, Mauro P.; Kim, Kyungjoo K.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Reducing E3SM Communication through Task Mapping

Ellis, John E.; Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Parallel Sparse Tensor Decomposition with the Trilinos Parallel Linear Algebra Framework

Devine, Karen D.; Kolda, Tamara G.; Phipps, Eric T.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Trilinos Framework and Solvers

Rajamanickam, Sivasankaran R.; Hu, Jonathan J.; Devine, Karen D.; Wolf, Michael W.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

Exploiting Scientific Software to Solve Problems in Data Analytics

Devine, Karen D.; Boman, Erik G.; Dunlavy, Daniel D.; Kolda, Tamara G.; Wolf, Michael W.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Tpetra and Data Services

Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

Exploiting Geometric Partitioning in Task Mapping for Parallel Computers

Deveci, Mehmet D.; Devine, Karen D.; Pedretti, Kevin P.; Taylor, Mark A.; Rajamanickam, Sivasankaran R.; Catalyurek, Umit V.

Abstract not provided.

TYPE Other Report YEAR 2018

OSTI DOI

ExaGraph at Sandia: Graph Coloring, Clustering, and Partitioning for Exascale Computing

Boman, Erik G.; Deveci, Mehmet D.; Devine, Karen D.; Rajamanickam, Sivasankaran R.; Wolf, Michael W.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

ExaGraph: Combinatorial Methods for Enabling Exascale Applications

Halappanavar, Mahantesh H.; Buluc, Aydin B.; Boman, Erik G.; Pothen, Alex P.; Tumeo, Antonino T.; Azad, Ariful A.; Khan, Arif K.; Ferdous, SM F.; Rajamanickam, Sivasankaran R.; Wolf, Michael W.; Deveci, Mehmet D.; Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

Trilinos Data Services

Devine, Karen D.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Task Placement to Reduce Application Communication Costs

Devine, Karen D.; Brandt, James M.; Deveci, Mehmet D.; Gentile, Ann C.; Leung, Vitus J.; Olivier, Stephen L.; Pedretti, Kevin P.; Rajamanickam, Sivasankaran R.; Taylor, Mark A.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

Parallel Tensor Decompositions for Massive Heterogeneous Incomplete Data

Phipps, Eric T.; Kolda, Tamara G.; Anderson-Bergman, Clifford I.; Devine, Karen D.; Dunlavy, Daniel D.; Hong, David H.; Vuduc, Richard V.; Li, Jiajia L.; Young, Jeff Y.; Ballard, Grey B.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

Partitioning Trillion-Edge Graphs in Minutes

Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium, IPDPS 2017

Slota, George M.; Rajamanickam, Sivasankaran R.; Devine, Karen D.; Madduri, Kamesh

We introduce XtraPuLP, a new distributed-memory graph partitioner designed to process trillion-edge graphs. XtraPuLP is based on the scalable label propagation community detection technique, which has been demonstrated as a viable means to produce high quality partitions with minimal computation time. On a collection of large sparse graphs, we show that XtraPuLP partitioning quality is comparable to state-of-the-art partitioning methods. We also demonstrate that XtraPuLP can produce partitions of real-world graphs with billion+ vertices in minutes. Further, we show that using XtraPuLP partitions for distributed-memory graph analytics leads to significant end-to-end execution time reduction.
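Label propagation for partitioning can be sketched as follows in Python. This is a simplified serial illustration, not XtraPuLP itself: the round-robin initial assignment, the single `max_size` balance cap, the tie-breaking rule, and the function name are all assumptions chosen to keep the example small.

```python
from collections import Counter

def label_prop_partition(adj, num_parts, max_size, sweeps=10):
    """Assign vertices to parts by label propagation under a part-size cap."""
    # Initial assignment: round-robin by vertex order (an arbitrary choice).
    part = {v: i % num_parts for i, v in enumerate(sorted(adj))}
    size = Counter(part.values())
    for _ in range(sweeps):
        moved = False
        for v in sorted(adj):
            counts = Counter(part[n] for n in adj[v])
            # Try parts in order of how many neighbors they hold (ties by part id).
            for p, _cnt in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
                if p == part[v]:
                    break  # the current part is the best feasible choice
                if size[p] < max_size:
                    size[part[v]] -= 1
                    size[p] += 1
                    part[v] = p
                    moved = True
                    break
        if not moved:
            break  # converged: a full sweep changed nothing
    return part

# Two triangles joined by the single edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
part = label_prop_partition(adj, num_parts=2, max_size=4)
print(part)
```

On this toy graph the vertices settle into one part per triangle, leaving only the bridging edge cut.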

TYPE Conference Poster YEAR 2017

Scopus OSTI DOI

A computational spectral graph theory tutorial

Lehoucq, Richard B.; Boman, Erik G.; Devine, Karen D.; Berry, Jonathan W.; Dunlavy, Daniel D.; Wolf, Michael W.; Henson, Van H.; Sanders, Geoff S.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Enabling Low Mach Fluid Simulations Using Trilinos

Hu, Jonathan J.; Devine, Karen D.; Hoemmen, Mark F.; Lin, Paul L.; Rajamanickam, Sivasankaran R.; Roberts, Nathan V.; Siefert, Christopher S.; Trott, Christian R.; Prokopenko, Andrey P.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Load Balancing throughout My Career

Devine, Karen D.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Distributing linear systems for parallel computation

Devine, Karen D.; Boman, Erik G.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI

The Zoltan2 Toolkit: Partitioning, Task Placement, Coloring, and Ordering

Devine, Karen D.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI

Trilinos NGP Planning

Rajamanickam, Sivasankaran R.; Devine, Karen D.; Hu, Jonathan J.; Hoemmen, Mark F.

Abstract not provided.

TYPE Presentation YEAR 2016

OSTI

Smart HPC Centers: data analysis, feedback, and response

Brandt, James M.; Gentile, Ann C.; Martin, C. M.; Allan, Benjamin A.; Devine, Karen D.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI

Partitioning and Task Placement with Zoltan2

Deveci, Mehmet D.; Devine, Karen D.; Boman, Erik G.; Leung, Vitus J.; Rajamanickam, Sivasankaran R.; Taylor, Mark A.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI

Multi-Jagged: A Scalable Parallel Spatial Partitioning Algorithm

IEEE Transactions on Parallel and Distributed Systems

Deveci, Mehmet; Rajamanickam, Sivasankaran R.; Devine, Karen D.; Catalyurek, Umit V.

Geometric partitioning is fast and effective for load-balancing dynamic applications, particularly those requiring geometric locality of data (particle methods, crash simulations). We present, to our knowledge, the first parallel implementation of a multidimensional-jagged geometric partitioner. In contrast to the traditional recursive coordinate bisection algorithm (RCB), which recursively bisects subdomains perpendicular to their longest dimension until the desired number of parts is obtained, our algorithm does recursive multi-section with a given number of parts in each dimension. By computing multiple cut lines concurrently and intelligently deciding when to migrate data while computing the partition, we minimize data movement compared to efficient implementations of recursive bisection. We demonstrate the algorithm's scalability and quality relative to the RCB implementation in Zoltan on both real and synthetic datasets. Our experiments show that the proposed algorithm performs and scales better than RCB in terms of run-time without degrading the load balance. Our implementation partitions 24 billion points into 65,536 parts within a few seconds and exhibits near perfect weak scaling up to 6K cores.
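The contrast with recursive bisection can be illustrated with a small serial Python sketch of multisection: each dimension is cut into several parts in one step, and the cuts within each strip are chosen independently, which is what makes the result "jagged". The function names and the equal-count cut rule are assumptions for illustration; the parallel implementation and its data-migration decisions are not shown.

```python
def multisection(ids, coords, dim, k):
    """Split ids into k nearly equal groups by position along one dimension."""
    ordered = sorted(ids, key=lambda i: (coords[i][dim], i))
    n = len(ordered)
    # All k-1 cut positions are computed together rather than by repeated halving.
    bounds = [(j * n) // k for j in range(k + 1)]
    return [ordered[bounds[j]:bounds[j + 1]] for j in range(k)]

def multi_jagged(coords, parts_per_dim):
    """Recursive multisection with a given number of parts in each dimension."""
    groups = [list(range(len(coords)))]
    for dim, k in enumerate(parts_per_dim):
        groups = [g for group in groups for g in multisection(group, coords, dim, k)]
    return groups

# Twelve points on a 4x3 grid, split 2 ways in x and then 3 ways in y.
coords = [(x, y) for x in range(4) for y in range(3)]
groups = multi_jagged(coords, parts_per_dim=(2, 3))
print(groups)
```

Six parts are produced here in two multisection steps, whereas bisection would need three levels of recursion to reach at least six parts.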

TYPE Journal Article YEAR 2016

Scopus OSTI DOI

Parallel Graph Coloring for Manycore Architectures

Deveci, Mehmet D.; Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI DOI

Infrastructure for In Situ System Monitoring and Application Data Analysis

Brandt, James M.; Devine, Karen D.; Gentile, Ann C.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI DOI

Meshes, Load Balancing, Graph Algorithms

Devine, Karen D.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

The Zoltan2 Toolkit: Partitioning, Task Placement, Coloring, and Ordering

Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

Architecture-aware Task Placement

Deveci, Mehmet D.; Devine, Karen D.; Leung, Vitus J.; Prokopenko, Andrey V.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

Distributing Linear Systems for Parallel Computation

Devine, Karen D.; Boman, Erik G.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Demonstrating Improved Application Performance Using Dynamic Monitoring and Task Mapping

Brandt, James M.; Devine, Karen D.; Gentile, Ann C.; Pedretti, Kevin P.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

The Zoltan2 Toolkit: Partitioning, Task Placement, Coloring, and Ordering

Devine, Karen D.; Boman, Erik G.; Rajamanickam, Sivasankaran R.; Leung, Vitus J.; Riesen, Lee A.; Deveci, Mehmet D.; Catalyurek, Umit V.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Demonstrating improved application performance using dynamic monitoring and task mapping

2014 IEEE International Conference on Cluster Computing, CLUSTER 2014

Brandt, James M.; Devine, Karen D.; Gentile, Ann C.; Pedretti, Kevin P.

This work demonstrates the integration of monitoring, analysis, and feedback to perform application-to-resource mapping that adapts to both static architecture features and dynamic resource state. In particular, we present a framework for mapping MPI tasks to compute resources based on run-time analysis of system-wide network data, architecture-specific routing algorithms, and application communication patterns. We address several challenges. Within each node, we collect local utilization data. We consolidate that information to form a global view of system performance, accounting for system-wide factors including competing applications. We provide an interface for applications to query the global information. Then we exploit the system information to change the mapping of tasks to nodes so that system bottlenecks are avoided. We demonstrate the benefit of this monitoring and feedback by remapping MPI tasks based on route-length, bandwidth, and credit-stalls metrics for a parallel sparse matrix-vector multiplication kernel. In the best case, remapping based on dynamic network information in a congested environment recovered 48.9% of the time lost to congestion, reducing matrix-vector multiplication time by 7.8%. Our experiments focus on the Cray XE/XK platform, but the integration concepts are generally applicable to any platform for which applicable metrics and route knowledge can be obtained.
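The two best-case percentages can be related with a short calculation. The sketch below assumes, as an interpretation rather than a stated result, that both figures describe the same congested run: the 48.9% is measured against the time lost to congestion and the 7.8% against the congested run time. Under that assumption their ratio gives the share of the congested run time that was attributable to congestion.

```python
recovered_fraction = 0.489  # share of the congestion-induced loss that remapping recovered
time_reduction = 0.078      # reduction relative to the congested run time

# With T_c the congested run time and `lost` the time lost to congestion:
#   saved = recovered_fraction * lost   and   saved = time_reduction * T_c,
# so lost / T_c is the ratio of the two reported figures.
lost_share = time_reduction / recovered_fraction
print(f"Implied congestion share of run time: {lost_share:.1%}")
```

Under this reading, roughly one sixth of the congested run time was due to congestion.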

TYPE Conference Poster YEAR 2014

Scopus OSTI DOI

2D Partitioning for Scalable Matrix Computations on Scale-Free Graphs

Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Meshes, Geometry, and Load Balancing Capability Area

Devine, Karen D.

Abstract not provided.

TYPE Conference Poster YEAR 2014

OSTI

Albany on Next-Generation Systems

Devine, Karen D.; Salinger, Andrew G.; Demeshko, Irina D.; Hansen, Glen H.; Edwards, Harold C.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Using architecture information and real-time resource state to reduce power consumption and communication costs in parallel applications

Brandt, James M.; Devine, Karen D.; Gentile, Ann C.; Leung, Vitus J.; Olivier, Stephen L.; Pedretti, Kevin P.; Rajamanickam, Sivasankaran R.; Bunde, David P.; Deveci, Mehmet D.; Catalyurek, Umit V.

As computer systems grow in both size and complexity, the need for applications and run-time systems to adjust to their dynamic environment also grows. The goal of the RAAMP LDRD was to combine static architecture information and real-time system state with algorithms to conserve power, reduce communication costs, and avoid network contention. We developed new data collection and aggregation tools to extract static hardware information (e.g., node/core hierarchy, network routing) as well as real-time performance data (e.g., CPU utilization, power consumption, memory bandwidth saturation, percentage of used bandwidth, number of network stalls). We created application interfaces that allowed this data to be used easily by algorithms. Finally, we demonstrated the benefit of integrating system and application information for two use cases. The first used real-time power consumption and memory bandwidth saturation data to throttle concurrency to save power without increasing application execution time. The second used static or real-time network traffic information to reduce or avoid network congestion by remapping MPI tasks to allocated processors. Results from our work are summarized in this report; more details are available in our publications [2, 6, 14, 16, 22, 29, 38, 44, 51, 54].

TYPE SAND Report YEAR 2014

OSTI DOI

Demonstrating Improved Application Performance Using Dynamic Monitoring and Task Mapping

Brandt, James M.; Devine, Karen D.; Gentile, Ann C.; Pedretti, Kevin P.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI DOI

Installing the Anasazi eigensolver package with application to some graph eigenvalue problems

Lehoucq, Richard B.; Boman, Erik G.; Devine, Karen D.; Thornquist, Heidi K.; Slattengren, Nicole S.

The purpose of this report is to document a basic installation of the Anasazi eigensolver package and provide a brief discussion on the numerical solution of some graph eigenvalue problems.

TYPE SAND Report YEAR 2014

OSTI DOI

FASTMath Partitioning and Task Placement

Devine, Karen D.; Diamond, Gerrett D.; Ibanez, Dan I.; Leung, Vitus J.; Prokopenko, Andrey V.; Rajamanickam, Sivasankaran R.; Shephard, Mark S.; Smith, Cameron S.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Zoltan Three-Slide Overview for ATPESC 2014

Devine, Karen D.; Rajamanickam, Sivasankaran R.; Prokopenko, Andrey V.; Boman, Erik G.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Exploiting geometric partitioning in task mapping for parallel computers

Proceedings of the International Parallel and Distributed Processing Symposium, IPDPS

Deveci, Mehmet; Rajamanickam, Sivasankaran R.; Leung, Vitus J.; Pedretti, Kevin P.; Olivier, Stephen L.; Bunde, David P.; Catalyurek, Umit V.; Devine, Karen D.

We present a new method for mapping applications' MPI tasks to cores of a parallel computer such that communication and execution time are reduced. We consider the case of sparse node allocation within a parallel machine, where the nodes assigned to a job are not necessarily located within a contiguous block nor within close proximity to each other in the network. The goal is to assign tasks to cores so that interdependent tasks are performed by 'nearby' cores, thus lowering the distance messages must travel, the amount of congestion in the network, and the overall cost of communication. Our new method applies a geometric partitioning algorithm to both the tasks and the processors, and assigns task parts to the corresponding processor parts. We show that, for the structured finite difference mini-app MiniGhost, our mapping method reduced execution time by 34% on average on 65,536 cores of a Cray XE6. In a molecular dynamics mini-app, MiniMD, our mapping method reduced communication time by 26% on average on 6144 cores. We also compare our mapping with graph-based mappings from the LibTopoMap library and show that our mappings reduced the communication time on average by 15% in MiniGhost and 10% in MiniMD.

TYPE Conference YEAR 2014

Scopus OSTI

Scalable Matrix Computations on Large Scale-Free Graphs Using 2D Graph Partitioning

Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI DOI

Using 2D Matrix Distributions in Trilinos

Devine, Karen D.; Boman, Erik G.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

A computational spectral graph theory tutorial

Boman, Erik G.; Devine, Karen D.; Lehoucq, Richard B.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Exploiting Geometric Partitioning in Task Mapping for Parallel Computers

Rajamanickam, Sivasankaran R.; Leung, Vitus J.; Pedretti, Kevin P.; Olivier, Stephen L.; Devine, Karen D.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

The Zoltan Toolkits: Parallel Partitioning, Load Balancing, Coloring, and Ordering

Devine, Karen D.; Boman, Erik G.; Rajamanickam, Sivasankaran R.; Leung, Vitus J.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

Scalable Matrix Computations on Large Scale-Free Graphs Using 2D Graph Partitioning

Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI DOI

Multi-jagged: A Scalable Multi-section based Spatial Partitioning Algorithm

Rajamanickam, Sivasankaran R.; Devine, Karen D.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Combinatorial Scientific Computing for Exascale Systems and Applications

Devine, Karen D.; Rajamanickam, Sivasankaran R.; Boman, Erik G.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Using the Cray Gemini Performance Counters

Pedretti, Kevin P.; Vaughan, Courtenay T.; Barrett, Richard F.; Devine, Karen D.; Hemmert, Karl S.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Scalable Matrix Computations on Large Scale-Free Graphs Using 2D Graph Partitioning

Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI DOI

Trilinos-based Software for Eigenanalysis of Graphs

Boman, Erik G.; Devine, Karen D.; Lehoucq, Richard B.; Slattengren, Nicole S.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

Efficient Computation of Eigenpairs for Large Scale-free Graphs

Boman, Erik G.; Devine, Karen D.; Lehoucq, Richard B.; Slattengren, Nicole S.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Multi-jagged: A Scalable Multi-section based Spatial Partitioning Algorithm

Rajamanickam, Sivasankaran R.; Devine, Karen D.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Meshes, Geometry, and Load Balancing Capability Area

Devine, Karen D.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Zoltan2: Next-Generation Combinatorial Toolkit

Boman, Erik G.; Devine, Karen D.; Leung, Vitus J.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Data Partitioning for Scientific Applications and Emerging Architectures

Devine, Karen D.; Leung, Vitus J.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Data Distribution for HPC Applications

Devine, Karen D.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Eigensolvers on HPC Platforms

Boman, Erik G.; Devine, Karen D.; Lehoucq, Richard B.; Slattengren, Nicole S.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

The Trilinos Project - Enabling predictive science and engineering through software libraries for scalable computing

Willenbring, James M.; Heroux, Michael A.; Devine, Karen D.; Boman, Erik G.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Exploiting Geometry and Adjacencies in Mesh Partitioning

Devine, Karen D.; Boman, Erik G.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Meshes Load Balancing and Geometry Capability Area

Devine, Karen D.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Architecture-aware Load Balancing and Ordering

Devine, Karen D.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Tutorial: The Zoltan Toolkit

Rajamanickam, Sivasankaran R.; Boman, Erik G.; Devine, Karen D.; Leung, Vitus J.; Riesen, Lee A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

TUG 2010 meshes, geometry and load balancing capability area

Devine, Karen D.; Copps, Kevin D.; Ebeida, Mohamed S.; Hensinger, David M.; Knupp, Patrick K.; Sjaardema, Gregory D.; Williams, Alan B.

Abstract not provided.

TYPE Conference YEAR 2010

OSTI

Parallel mesh management using interoperable tools

Devine, Karen D.

This presentation discussed challenges arising in parallel mesh management and demonstrated solutions to them. It also described the broad range of software for mesh management and modification developed by the Interoperable Technologies for Advanced Petascale Simulations (ITAPS) team, and highlighted applications successfully using the ITAPS tool suite.

TYPE Conference YEAR 2010

OSTI

Exploring Feasibility of 2D Sparse Matrix Partitioning: Background

Wolf, Michael W.; Boman, Erik G.; Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2010

OSTI

MapReduce in MPI for Large-Scale Graph Algorithms

Parallel Computing

Plimpton, Steven J.; Devine, Karen D.

Abstract not provided.

TYPE Journal Article YEAR 2010

OSTI

Improved parallel mesh partitioning with hypergraphs and Zoltan

Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2010

OSTI

Comparing Programming Paradigms for Graph Algorithms

Devine, Karen D.; Plimpton, Steven J.; Bayer, Gregory B.; Barrett, Brian B.; Berry, Jonathan W.

Abstract not provided.

TYPE Conference YEAR 2010

OSTI

Parallel Partitioning Coloring and Ordering for Scientific Computing

Boman, Erik G.; Devine, Karen D.

Abstract not provided.

TYPE Presentation YEAR 2010

OSTI DOI

Interoperable mesh components for large-scale, distributed-memory simulations

Journal of Physics: Conference Series

Devine, Karen D.; Diachin, L.; Kraftcheck, J.; Jansen, K.E.; Leung, Vitus J.; Luo, X.; Miller, M.; Ollivier-Gooch, C.; Ovcharenko, A.; Sahni, O.; Shephard, M.S.; Tautges, T.; Xie, T.; Zhou, M.

SciDAC applications have a demonstrated need for advanced software tools to manage the complexities associated with sophisticated geometry, mesh, and field manipulation tasks, particularly as computer architectures move toward the petascale. In this paper, we describe a software component - an abstract data model and programming interface - designed to provide support for parallel unstructured mesh operations. We describe key issues that must be addressed to successfully provide high-performance, distributed-memory unstructured mesh services and highlight some recent research accomplishments in developing new load balancing and MPI-based communication libraries appropriate for leadership class computing. Finally, we give examples of the use of parallel adaptive mesh modification in two SciDAC applications. © 2009 IOP Publishing Ltd.

TYPE Conference YEAR 2009

Scopus OSTI

TUG 2008 Meshes Geometry and Load Balancing Capability Area

Devine, Karen D.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

Distributed micro-releases of bioterror pathogens: threat characterizations and epidemiology from uncertain patient observables

Adams, Brian M.; Devine, Karen D.; Najm, H.N.; Marzouk, Youssef M.

Terrorist attacks using an aerosolized pathogen preparation have gained credibility as a national security concern since the anthrax attacks of 2001. The ability to characterize the parameters of such attacks, i.e., to estimate the number of people infected, the time of infection, the average dose received, and the rate of disease spread in contemporary American society (for contagious diseases), is important when planning a medical response. For non-contagious diseases, we address the characterization problem by formulating a Bayesian inverse problem predicated on a short time-series of diagnosed patients exhibiting symptoms. To keep the approach relevant for response planning, we limit ourselves to 3.5 days of data. In computational tests performed for anthrax, we usually find these observation windows sufficient, especially if the outbreak model employed in the inverse problem is accurate. For contagious diseases, we formulated a Bayesian inversion technique to infer both pathogenic transmissibility and the social network from outbreak observations, ensuring that the two determinants of spreading are identified separately. We tested this technique on data collected from a 1967 smallpox epidemic in Abakaliki, Nigeria. We inferred, probabilistically, different transmissibilities in the structured Abakaliki population, the social network, and the chain of transmission. Finally, we developed an individual-based epidemic model to realistically simulate the spread of a rare (or eradicated) disease in a modern society. This model incorporates the mixing patterns observed in an (American) urban setting and accepts, as model input, pathogenic transmissibilities estimated from historical outbreaks that may have occurred in socio-economic environments with little resemblance to contemporary society. 
Techniques were also developed to simulate disease spread on static and sampled network reductions of the dynamic social networks originally in the individual-based model, yielding faster, though approximate, network-based epidemic models. These reduced-order models are useful in scenario analysis for medical response planning, as well as in computationally intensive inverse problems.

TYPE SAND Report YEAR 2008

OSTI DOI