Publications

6 Results
Skip to search filters

Optimizing Distributed Load Balancing for Workloads with Time-Varying Imbalance

Proceedings - IEEE International Conference on Cluster Computing, ICCC

Lifflander, Jonathan; Slattengren, Nicole S.; Pébaÿ, Philippe P.; Miller, Phil; Rizzi, Francesco N.; Bettencourt, Matthew T.

This paper explores dynamic load balancing algorithms used by asynchronous many-task (AMT), or 'taskbased', programming models to optimize task placement for scientific applications with dynamic workload imbalances. AMT programming models use overdecomposition of the computational domain. Overdecompostion provides a natural mechanism for domain developers to expose concurrency and break their computational domain into pieces that can be remapped to different hardware. This paper explores fully distributed load balancing strategies that have shown great promise for exascalelevel computing but are challenging to theoretically reason about and implement effectively. We present a novel theoretical analysis of a gossip-based load balancing protocol and use it to build an efficient implementation with fast convergence rates and high load balancing quality. We demonstrate our algorithm in a nextgeneration plasma physics application (EMPIRE) that induces time-varying workload imbalance due to spatial non-uniformity in particle density across the domain. Our highly scalable, novel load balancing algorithm, achieves over a 3x speedup (particle work) compared to a bulk-synchronous MPI implementation without load balancing.

More Details

Design and Implementation Techniques for an MPI-Oriented AMT Runtime

Proceedings of ExaMPI 2020: Exascale MPI Workshop, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis

Lifflander, Jonathan; Miller, Phil; Slattengren, Nicole S.; Morales, Nicolas M.; Stickney, Paul; Pébaÿ, Philippe P.

We present the execution model of Virtual Transport (VT) a new, Asynchronous Many-Task (AMT) runtime system that provides unprecedented integration and interoperability with MPI. We have developed VT in conjunction with large production applications to provide a highly incremental, high-value path to AMT adoption in the dominant ecosystem of MPI applications, libraries, and developers. Our aim is that the'MPI+X' model of hybrid parallelism can smoothly extend to become'MPI+VT +X'. We illustrate a set of design and implementation techniques that have been useful in building VT. We believe that these ideas and the code embodying them will be useful to others building similar systems, and perhaps provide insight to how MPI might evolve to better support them. We motivate our approach with two applications that are adopting VT and have begun to benefit from increased asynchrony and dynamic load balancing.

More Details
6 Results
6 Results