A Communication- and Memory-Aware Model for Load Balancing Tasks
Abstract not provided.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Contact mechanics, the modeling of the impenetrability of solid objects, is fundamental to computational solid mechanics (CSM) applications, yet it is often the most challenging component in terms of computational efficiency and performance. These challenges arise from the irregular and highly dynamic nature of contact simulation, particularly with algorithms designed for distributed-memory architectures. First among these challenges is the inherent load imbalance when distributing contact load across compute nodes. This imbalance is highly problem dependent and relates to the surface area of contact manifolds and the volume around them, rather than to the distribution of the mesh over compute nodes, so the application load can vary drastically between phases of a simulation. The dynamic nature of contact problems motivates the use of distributed asynchronous many-tasking (AMT) frameworks to handle irregular workloads efficiently. In this paper, we present our work on distBVH, a distributed contact solution that uses the DARMA/vt library for asynchronous tasking and is also capable of running on-node Kokkos-based kernels. We explore how distBVH addresses the various challenges of CSM contact problems. We evaluate many of DARMA/vt’s dynamic load balancers and demonstrate how our load balancing approach can provide significant performance improvements on various computational solid mechanics benchmarks. Additionally, we show how our approach combines DARMA/vt for tasking with efficient on-node Kokkos kernels to scale over hundreds of processing elements.
This report presents our work to model the workloads of a linear electromagnetic application based on the method of moments in the frequency domain, with the goal of effectively load balancing the matrix assembly. This application is particularly challenging to load balance due to its lack of persistent iterative behavior, its operation under tight memory constraints (the matrix may fill 80% of memory on each node), and the algorithmic complexity of the computational method. This report describes the first step in our work to apply an inspector-executor approach to load balancing such workloads: key parameters are exposed during the inspector phase, and a pre-trained model is applied to predict relative task weights for the load balancer.
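The inspector-executor idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the report: the `Task` fields stand in for whatever parameters the inspector phase exposes, the linear `predict_weight` function stands in for the pre-trained model, and the greedy longest-task-first placement with a per-rank memory cap is a simple substitute for a real load balancer such as those in DARMA/vt.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical inspector-phase record for one matrix-assembly task."""
    task_id: int
    n_rows: int      # assumed parameter exposed by the inspector
    n_cols: int      # assumed parameter exposed by the inspector
    mem_bytes: int   # memory the task's block would occupy


def predict_weight(task, coeffs=(1.0e-6, 0.0)):
    """Stand-in for a pre-trained model: cost grows with block area."""
    slope, intercept = coeffs
    return slope * task.n_rows * task.n_cols + intercept


def assign_tasks(tasks, n_ranks, mem_limit):
    """Greedy placement: heaviest predicted task first, onto the least
    loaded rank that still has room under the memory limit."""
    loads = [0.0] * n_ranks
    mem_used = [0] * n_ranks
    placement = {}
    for task in sorted(tasks, key=predict_weight, reverse=True):
        candidates = [r for r in range(n_ranks)
                      if mem_used[r] + task.mem_bytes <= mem_limit]
        if not candidates:
            raise MemoryError(f"task {task.task_id} fits on no rank")
        rank = min(candidates, key=lambda r: loads[r])
        loads[rank] += predict_weight(task)
        mem_used[rank] += task.mem_bytes
        placement[task.task_id] = rank
    return placement, loads
```

In this sketch the executor phase would then run each task on the rank recorded in `placement`. Because only relative weights matter for the placement, the model's absolute scale (the assumed `coeffs`) does not affect which rank a task lands on.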
The goal of this report is to provide insight into the development of vt-tv, a C++ HPC visualization tool designed for analyzing load-balancing metrics in the DARMA toolkit. In particular, the report covers the tool's modular data model and diverse usage scenarios, emphasizing adaptability and efficiency.