4.2.4. Rebalance

Achieving an even distribution of work between MPI ranks is essential to effective parallel scalability. The decomposition methods supported in the Finite Element Assembly block assume that the cost of computation is roughly equal for all elements in the mesh, and seek to evenly balance the number of elements between MPI ranks. In many cases this is a good assumption, and nothing more is needed to achieve good parallel scalability with Aria. However, in some more complicated models this assumption breaks down, and Aria’s parallel performance can be improved significantly by performing one or more rebalance operations, which use timing information gathered during the simulation to improve the parallel decomposition. This is particularly common in models containing features like:

  • Different equations solved on different mesh blocks

  • Chemistry calculations present on only a subset of the mesh, or with a reaction front that moves during the run

  • Multiple equation systems

  • Mesh adaptivity (either standard AMR or CDFEM)
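To illustrate why an equal element count does not guarantee an equal workload, the following Python sketch compares per-rank cost for a hypothetical four-rank decomposition. The 5x chemistry cost weight and the imbalance metric (maximum rank cost divided by mean rank cost) are illustrative assumptions, not Aria's internal definitions.

```python
# Hypothetical example: 4 MPI ranks, each owning exactly 1000 elements.
# Assumption: an element with chemistry costs 5x as much as a plain element.

def rank_cost(n_plain, n_chem, chem_weight=5.0):
    """Relative cost of one rank, in units of one plain element."""
    return n_plain + chem_weight * n_chem

def imbalance(costs):
    """Assumed metric: max rank cost / mean rank cost (1.0 means perfectly balanced)."""
    return max(costs) / (sum(costs) / len(costs))

# Element counts are balanced, but only rank 0 holds chemistry elements.
costs = [
    rank_cost(500, 500),   # rank 0: half of its elements carry chemistry
    rank_cost(1000, 0),    # rank 1
    rank_cost(1000, 0),    # rank 2
    rank_cost(1000, 0),    # rank 3
]

print(costs)
print(imbalance(costs))
```

Under these assumptions, rank 0 carries three times the work of each other rank even though every rank owns the same number of elements, so the other ranks would sit idle waiting for it.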

In some cases, speedups of more than 2x can be achieved via appropriate use of dynamic rebalance. The recommended options for enabling rebalance depend on the features in a model:

  1. For cases without transient adaptivity, CDFEM, or chemistry calculations with a moving reaction front, a single rebalance operation run after 3-5 time steps is often sufficient. This can be achieved with a few line commands at the Aria Region scope:

    Begin Aria Region myRegion
      Enable Rebalance
      Rebalance Time Step Frequency = 3
      Maximum Number of Rebalances = 1
    
      ...
    End
    
  2. For more dynamic cases with transient adaptivity, CDFEM, or some chemistry calculations, it can be advantageous to rebalance multiple times during the run. Note that each rebalance operation has an associated cost, so rebalancing too frequently can hurt performance. For example, the input below checks every 10 time steps whether a rebalance is needed, and performs the rebalance if the mesh imbalance exceeds 1.2. If the "with threshold = X" argument is omitted, the default threshold is 1.25:

    Begin Aria Region myRegion
      Enable Rebalance with threshold = 1.2
      Rebalance Time Step Frequency = 10
    
      ...
    End
    
  3. When using adaptivity, you can also enable a rebalance after every mesh adaptation step using:

    Begin Aria Region myRegion
      Rebalance after adaptivity
    
      ...
    End
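
The interaction of the frequency, threshold, and maximum-count settings above can be sketched as a small decision function. This is a hypothetical Python model of the behavior described in the text, not Aria's implementation. In particular, the function name, its arguments, and the assumption that the default threshold also applies when a maximum number of rebalances is set are illustrative only.

```python
def should_rebalance(step, imbalance, frequency, threshold=1.25,
                     rebalances_done=0, max_rebalances=None):
    """Return True if a rebalance would be triggered at this time step.

    Hypothetical model of the line commands above:
      frequency       ~ Rebalance Time Step Frequency
      threshold       ~ Enable Rebalance with threshold = X (1.25 if omitted)
      max_rebalances  ~ Maximum Number of Rebalances (None means no limit)
    """
    # Stop once the allowed number of rebalances has been used up.
    if max_rebalances is not None and rebalances_done >= max_rebalances:
        return False
    # Only check on multiples of the time step frequency.
    if step == 0 or step % frequency != 0:
        return False
    # Rebalance only when the measured imbalance exceeds the threshold.
    return imbalance > threshold


# Settings from option 2: check every 10 steps, threshold of 1.2.
# The imbalance values per time step are made up for illustration.
measured = {10: 1.1, 20: 1.3, 25: 1.5, 30: 1.15}
triggered = [step for step, imb in measured.items()
             if should_rebalance(step, imb, frequency=10, threshold=1.2)]
print(triggered)
```

With these made-up measurements only step 20 triggers a rebalance: step 25 is not a multiple of the frequency, and steps 10 and 30 fall below the threshold. This reflects the trade-off noted above, where a higher threshold or lower frequency avoids paying the rebalance cost for small imbalances.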
    

A full listing of the commands for controlling rebalance is provided here.

Note

The issue with creating multiple output files that applies to transient adaptivity also applies to rebalance.