4.7. Solving the problem

4.7.1. Nonlinear Solution Strategy

4.7.1.1. Equation System Overview

In coupled physics problems one is tasked with solving a set of nonlinear coupled equations, and ideally one would solve these equations as a fully-coupled monolithic system. In practice, solving the set of equations monolithically may prove impractical owing to the memory requirements of the large linear system and to a small radius of convergence. In such cases one may instead wish to perform a segregated solve of the equations, where each set of equations can be solved in a slightly different manner. This ability to solve subsets of the entire equation set is supported through a feature called Equation System, by which criteria for solving a set of nonlinear equations can be applied to a select group of equations.

By default a single set of coupled equations is treated internally as an Equation System without the need for additional input syntax. In this case the solution strategy for the Equation System is determined solely by a collection of nonlinear solution specifications. A typical setup for a single equation system might be

Begin Aria Region myRegion
  .
  Use Linear Solver solve_temperature
  .
  eq energy for temperature On block_1 using q1 with lumped_mass diff
  .
  nonlinear solution strategy = newton
  use dof averaged nonlinear residual
  maximum nonlinear iterations = 20
  nonlinear residual tolerance = 1.e-6
  .
  .
End Aria Region myRegion

Simulations in which different subsets of the coupled equations are solved in a segregated manner require that each subset be explicitly defined as an Equation System with a corresponding solution strategy which may differ for each equation system. When multiple Equation Systems are present, the overall segregated solution is governed by segregated iteration and convergence criteria at an outer level.

A typical setup for segregated solution of multiple equation systems might be

Begin Aria Region myRegion
  .
  Segregated global nonlinear residual tolerance = 6.0e-9
  .
  Begin Equation System temperature
    .
    eq energy for temperature On block_1 using q1 with lumped_mass diff
    Use Linear Solver solve_temperature
    .
    nonlinear solution strategy = newton
    use dof averaged nonlinear residual
    maximum nonlinear iterations = 20
    nonlinear residual tolerance = 1.e-6
    .
  End Equation System temperature
  .
  Begin Equation System species
    .
    eq species for species_0 On block_1 using q1 with diff
    .
    Use Linear Solver solve_species
    .
    nonlinear solution strategy = newton
    use dof averaged nonlinear residual
    maximum nonlinear iterations = 10
    nonlinear residual tolerance = 1.e-8
    .
  End Equation System species
  .
End Aria Region myRegion

4.7.1.2. Nonlinear Solution Specifications

The general solution strategy used in Aria is to solve the requested system of equations for the solution vector u incrementally. In the context of a Newton formulation, within each Newton step one solves

J \, \Delta u = -R(u)

where J is the Jacobian matrix, \Delta u is the solution increment, and R(u) is the system residual for the governing equations. See the theory section for more information on the solution strategy. This chapter describes the basic nonlinear solution command line settings that influence how a converged solution is obtained. For transient solutions these settings are used in combination with the time integration settings for advancing the solution to the next time step.

It is instructive here to review a typical log file output of the Aria solution summary in order to establish how a user might influence the solution procedure.

Equation System AriaRegion->main:
* Step  : Transient, Strategy: NEWTON, Time: 2.20e+03, Step: 5.29e+00
* Matrix: Solver: "Solve_Temperature", Unknowns: 1360, Nonzeros: 11578
* Mesh  : Processor 0 of 1: 1253 of 1253 elems, 1363 of 1363 nodes
* Computing View Factors for enclosure space
  N O N L I N E A R                L I N E A R
----------------------   ------------------------------------
Step  Resid    Delta     Itns Status  Resid    Asm/Slv Time
---- -------- --------   ---- ------ -------- ---------------
1    6.37e-02 1.10e+01   59    ok    8.23e-07 7.0e-03/2.0e-03
2    1.10e-04 8.18e-03   58    ok    1.19e-09 7.0e-03/2.0e-03
3    3.06e-07 8.81e-05   61    ok    3.68e-12 6.0e-03/2.0e-03
Termination reason: 8.80515e-05 < nonlinear_correction_tolerance(0.001)

In the log file output shown above, the columns under the N O N L I N E A R heading are related to the command lines described in this chapter. The Resid and Delta columns report L2 norms of the aggregate residual R^{*}(u) and of the solution increment vector \Delta u, while the Step column gives the Newton iteration count. Under the L I N E A R heading, the Itns column gives the number of linear solver iterations and the Resid column gives the linear system residual R^{*}(\Delta u). Here R^{*}(\Delta u) = J \, \Delta u + R(u), the residual of the assembled system of equations, which is solved for \Delta u to the tolerance requested in the Linear Solver specifications. The Status column indicates whether the linear solve completed successfully, ok or fail, within the allowable number of iterations defined in the Linear Solver specifications. The Asm/Slv Time column contains the assembly and solution times for the system of equations.

Given a current value of u, within each Newton step the residual R^{*}(\Delta u) is first evaluated using \Delta u = 0 and tested for convergence. The convergence conditions are satisfied when either the NONLINEAR RESIDUAL TOLERANCE or the NONLINEAR CORRECTION TOLERANCE is achieved. If the solution is not yet converged, the system of equations is handed to the equation solver, which solves for a new \Delta u subject to the Linear Solver specifications. The solver solution is then used to update the nonlinear solution, u = u_{old} + \Delta u, and the process begins anew, subject to the user-supplied MAXIMUM NONLINEAR ITERATIONS.
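
For reference, the nonlinear iteration described above can be summarized schematically as follows (a restatement of the procedure in the present notation, not a literal description of Aria internals):

\begin{aligned}
&\text{for } k = 0, 1, \ldots, \text{MAXIMUM NONLINEAR ITERATIONS}-1: \\
&\quad \text{if } \lVert R(u^{k}) \rVert_{2} < \text{NONLINEAR RESIDUAL TOLERANCE}
  \ \text{ or } \ \left(k > 0 \ \text{and} \ \lVert \Delta u^{k-1} \rVert_{2} < \text{NONLINEAR CORRECTION TOLERANCE}\right): \ \text{stop (converged)} \\
&\quad \text{solve } J(u^{k}) \, \Delta u^{k} = -R(u^{k}) \ \text{subject to the Linear Solver specifications} \\
&\quad u^{k+1} = u^{k} + \Delta u^{k}
\end{aligned}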

Within the Newton step a solution is said to FAIL if it does not converge within MAXIMUM NONLINEAR ITERATIONS, and this occurrence will be noted in the log file. For transient simulations one will normally repeat the solution procedure with a smaller time step as described in Time Step Selection. In some cases the user may choose to override the FAIL condition using the ACCEPT SOLUTION AFTER MAXIMUM NONLINEAR ITERATIONS command line, but this option should be used with discretion as it may discredit any expectation of accuracy in transient results. For completeness, a summary of passed, failed, and failed-but-accepted steps is included at the end of the log file.

For most applications the default NONLINEAR RESIDUAL TOLERANCE (1e-6) is adequate when accompanied by a Linear Solver specification with a RESIDUAL NORM TOLERANCE minimum of 1.0e-8. Inclusion of the command line USE DOF AVERAGED NONLINEAR RESIDUAL is highly recommended, particularly when performing uniform mesh refinement studies. A summary of these and other nonlinear solution related command lines is given in what follows.

4.7.2. Linear Solution Strategy

4.7.2.1. Overview

Aria supports several choices of linear solver packages. Starting with the 5.0 release, the Tpetra-based linear solvers are the only linear solvers supported.

Within the input file, linear solver command blocks appear at the Domain scope as illustrated below. Note that different linear solvers can be used for different equation systems. Furthermore, a solver command block can appear in the input but not be used.

Begin Sierra
  .
  .
  Begin Tpetra Equation Solver third_solver
    .
    Tpetra linear solver commands
    .
  End
  .
  Begin Procedure My_Aria_Procedure
     .
     Begin Aria Region My_Region
       .
       use linear solver first_solver
       .
     End

     Begin Aria Region My_Other_Region
       .
       use linear solver third_solver
       .
     End

  End
  .
End Sierra

See Solver Selection Guidelines for suggestions on how to select a type of linear solver for different kinds of problems Aria can solve.

4.7.3. Tpetra Solver Overview

The syntax for specifying a Tpetra-based solver uses a nested command block structure. The outermost level at the Sierra scope is the BEGIN TPETRA EQUATION SOLVER block. This block must contain a single nested solver block corresponding to your desired solver type. Several types of direct and iterative linear solvers are provided through the Amesos2 and Belos packages of Trilinos. The supported solver blocks are listed in Solvers. If one of the iterative solvers is selected, it must also have a nested preconditioner block. A variety of preconditioners are supported through the Ifpack2 and MueLu Trilinos packages. See Preconditioners for details on the supported preconditioners and their available options.
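
As a concrete sketch of this nesting (the solver name my_gmres_ddilu and the specific values are illustrative only; consult Solvers and Preconditioners for the supported blocks and options), an iterative solver block might look like

BEGIN TPETRA EQUATION SOLVER my_gmres_ddilu
  BEGIN GMRES SOLVER
    BEGIN DD-ILU PRECONDITIONER
    END
    MAXIMUM ITERATIONS = 1000
    CONVERGENCE TOLERANCE = 1.0E-8
  END
END TPETRA EQUATION SOLVER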

4.7.4. Solver Error Codes

If the linear solve fails it will report an error code. While these codes are not universally standardized across linear solvers, they can generally be interpreted with the following rules:

Error 1 (or -1)

The linear solver reached the maximum number of linear solver iterations and did not reach a solution. Check the final linear solver residual to see if this is serious or not. Even if the error is not serious (residuals are small) this usually means the solve is much more expensive than necessary and a better solver/preconditioner combination should be used.

Error 2 (or -2, 4, or -4)

The linear solver encountered a fatal internal error. This is often because of a singular matrix or a NaN or Infinite term. This is almost always a serious error.

Error 3 (or -3)

The linear solver had an internal loss of precision. This is often not a serious error, but may indicate the need for a better solver/preconditioner combination. As always, check the final linear solver residual reported in the log file to determine if it is serious or not.

4.7.5. Expert Syntax

The standard solver and preconditioner blocks only support a subset of the commonly used options that are supported by the underlying Trilinos packages. The subset was chosen to cover the vast majority of linear solver use cases within Aria simulations, but an alternative approach is provided for expert users who would like to pass parameters directly to the underlying Belos, Ifpack2, and MueLu packages. This expert interface is used by providing, in one of the BEGIN EXPERT BELOS SOLVER, BEGIN EXPERT IFPACK2 PRECONDITIONER, or BEGIN MUELU PRECONDITIONER blocks as desired, the path to an XML file containing the definition of a Teuchos::ParameterList supported by the underlying Trilinos package.

Sample XML for expert Belos solver

<ParameterList>
  <Parameter name="solution method" type="string" value="CG"/>
  <ParameterList name="Belos-Tpetra">
    <Parameter name="Convergence Tolerance" type="double" value="1.e-12"/>
    <Parameter name="Maximum Iterations" type="int" value="1000"/>
  </ParameterList>
</ParameterList>

The solution method parameter is required. Any additional parameters that are not specified in the Belos-Tpetra sublist will use their default values as defined by Belos.

Sample XML for expert Ifpack2 preconditioner

<ParameterList>
  <Parameter name="preconditioner type:" type="string" value="ILUT"/>
  <Parameter name="fact: ilut level-of-fill" type="double" value="1.0"/>
  <Parameter name="fact: drop tolerance" type="double" value="0.0"/>
</ParameterList>

The preconditioner type: parameter is required. Any additional parameters that are not specified will use their default values as defined by Ifpack2.

Sample XML for expert MueLu preconditioner

<ParameterList name="MueLu-Tpetra">
  <Parameter        name="verbosity"                        type="string"   value="high"/>
  <Parameter        name="coarse: type"                     type="string"   value="ILUT"/>
  <Parameter        name="coarse: max size"                 type="int"      value="1000"/>
  <Parameter        name="max levels"                       type="int"      value="4"/>
  <Parameter        name="smoother: type"                   type="string"   value="CHEBYSHEV"/>
  <ParameterList    name="smoother: params">
     <Parameter name="chebyshev: degree"                    type="int"      value="2"/>
     <Parameter name="chebyshev: ratio eigenvalue"          type="double"   value="20"/>
     <Parameter name="chebyshev: min eigenvalue"            type="double"   value="1.0"/>
     <Parameter name="chebyshev: zero starting solution"    type="bool"     value="true"/>
     <Parameter name="chebyshev: eigenvalue max iterations" type="int"      value="15"/>
  </ParameterList>
  <Parameter        name="aggregation: type"                type="string"   value="uncoupled"/>
  <Parameter        name="aggregation: drop tol"            type="double"   value="0.02"/>

  <Parameter        name="repartition: enable"              type="bool"     value="true"/>
  <Parameter        name="repartition: min rows per proc"   type="int"      value="1000"/>
  <Parameter        name="repartition: start level"         type="int"      value="2"/>
  <Parameter        name="repartition: max imbalance"       type="double"   value="1.327"/>
  <Parameter        name="repartition: partitioner"         type="string"   value="zoltan2"/>
</ParameterList>

Unspecified parameters will use their default values as defined by MueLu.

For further documentation of the parameters supported by each Trilinos package please see the corresponding documentation on trilinos.org.

4.7.6. Solver Selection Guidelines

Selecting an appropriate linear solver that is both robust and efficient for a particular Aria simulation can be confusing. In general there is a tradeoff between solver speed and robustness, and this can significantly affect the total time to solution. The following guidelines for choosing a linear solver may be useful:

  1. Preset solvers are provided for several common types of problems and are recommended as the starting point for any analysis.

  • The thermal preset solver type is recommended as the starting point for general thermal analysis problems. It uses a GMRES solver with Jacobi preconditioning and is a good balance between runtime per linear solver iteration and robustness for most thermal problems.

  • The thermal_symmetric and thermal_bicgstab presets can be slightly faster for thermal analysis problems that do not involve contact. They are CG and BiCGSTAB solvers respectively, each with Jacobi preconditioning. These trade some robustness for shorter runtime per linear iteration when compared with the GMRES solver used by the thermal preset.

  • The thermal_multigrid preset is effective for general scalar equation analysis, including thermal problems and other scalar problems that would otherwise often require direct solves.

  • For fluids problems the continuity preset is recommended for the continuity equation system, and the scalar_transport preset is recommended for the momentum equation system, as well as for any additional scalar advection-diffusion equations. The continuity preset is a GMRES solver with an algebraic multigrid preconditioner provided through the MueLu Trilinos package. The scalar_transport preset is a GMRES solver with an SGS preconditioner.

  • For more general multiphysics problems involving multiple coupled equations in a single equation system the multiphysics or teko_multiphysics preset solver is the recommended starting point. The former is a GMRES solver with a DD-ILU preconditioner. The latter is a GMRES solver with a heuristically determined Teko preconditioner.

  • When using any of the preset solver types, an equivalent solver block is written to the log file and can be used as a starting point for modifying additional solver or preconditioner settings. This solver block will contain the preset preconditioner along with default solver settings. When reusing this block, users often choose to increase MAXIMUM ITERATIONS in hopes of lowering the solution time, but increases beyond roughly three times the default setting rarely help in obtaining a solution.

  2. If the suggested preset solver shows evidence in the log file of struggling to solve a particular problem (for example, frequently approaching or reaching the maximum number of linear iterations), consider using a stronger solver and/or preconditioner. For the solvers, move from CG to BiCGSTAB to GMRES. If already using GMRES, try increasing the number of RESTART ITERATIONS to match the MAXIMUM ITERATIONS (see the sketch following this list). For the preconditioners, move from Jacobi to SGS to DD-ILU or DD-ILUT. Initially leave all preconditioner-specific options at their default values.

  3. If GMRES/DD-ILU still shows evidence of struggling to solve the problem, consider increasing the ILU fill and overlap levels. First trying overlap of 1, then fill of 1, then both is a good balance between the cost and robustness of the preconditioner. See Preconditioners for setting the fill and overlap.

  4. If GMRES/DD-ILU with fill and overlap is still insufficient, consider:

  • Trying one of the direct solvers (KLU2, SuperLU, or UMFPACK) if your problem is small. These are the most robust option available to Aria, but significantly more expensive in both memory usage and runtime than the iterative solver options. They also do not scale in parallel effectively, and should not be used for problems requiring more than a single compute node on HPC systems.

  • For multiphysics problems involving many equations in a single monolithic equation system consider moving to a segregated solution scheme involving multiple smaller equation systems. This will make the linear solves easier, but can make nonlinear convergence of the overall problem more challenging. Loosely coupled equations are more likely to be successfully split into separate equation systems without causing nonlinear convergence problems.

  • For multiphysics problems where splitting the monolithic equation system into separate equation systems is infeasible, block preconditioning strategies through Teko may prove effective. This provides extensive flexibility in constructing physics-based preconditioners tailored to the multiphysics problem; as a consequence, setting up block preconditioners is challenging. Fortunately, the teko_multiphysics preset solver can automatically construct a Teko preconditioner based on the methods described in Teko Heuristic Multiphysics Preconditioner. If you need to manually construct a Teko preconditioner, consult Block Preconditioning for additional information and tips on setting up block preconditioners for multiphysics problems.

  5. Multigrid preconditioning can be extremely effective and exhibits excellent parallel scalability; however, setup of multigrid preconditioners is an extremely complex topic that is both problem type and problem size dependent. Aria supports general algebraic multigrid preconditioning through the MueLu package from Trilinos, and can read MueLu XML files for defining the multigrid setup. Please consult the MueLu documentation for suggestions on selecting a multigrid preconditioner. The default MUELU preconditioner is not recommended for general use; however, it is sometimes useful for steady state scalar equation simulations, such as Poisson problems that are nearly singular and difficult to solve with the preset solvers.
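
As an illustration of guideline 2 above, the sketch below sets the GMRES restart length equal to the maximum iteration count (the solver name my_gmres_restarts and the specific values are illustrative only):

BEGIN TPETRA EQUATION SOLVER my_gmres_restarts
  BEGIN GMRES SOLVER
    BEGIN DD-ILU PRECONDITIONER
    END
    MAXIMUM ITERATIONS = 300
    RESTART ITERATIONS = 300
  END
END TPETRA EQUATION SOLVER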

4.7.7. Linear Solvers on the GPU

Performant linear solvers for GPUs in general are an open research question. This section contains some guidelines for users in selecting a linear solver (and more importantly preconditioner) when using Aria on GPU platforms.

  • All solvers and preconditioners are expected to run correctly when Aria is compiled for GPU platforms; however, not all of them are able to use the GPU, and those that cannot will have poor performance.

  • All of the iterative solvers (CG, BiCGSTAB, and GMRES) can run with good performance on GPUs.

  • Jacobi preconditioning has excellent GPU performance when it is an effective preconditioner.

  • The DD-ILU preconditioner is capable of running on the GPU, but performance can be limited by inherent data dependencies in the ILU factorization and solve algorithms. By default Aria will use an alternative matrix reordering algorithm (metis) when run on the GPU that reduces those data dependencies, however it typically leads to slightly higher linear iteration counts than the RCM reordering method that is the CPU default. Despite increased iteration counts, users can typically expect comparable or slightly improved linear solver runtimes for DD-ILU when compared to a similar CPU machine.

  • The SGS preconditioner is an inherently serial algorithm that does not run on the GPU. There is an SGS2 preconditioner that uses a modified algorithm to expose parallelism to run on the GPU that users can try, but increasing the number of linear solver iterations and using Jacobi may result in a lower time to solution.

  • The DD-ILUT preconditioner is also an inherently serial algorithm that does not run on the GPU.

  • There are experimental FastILU and DD-PARILUT preconditioners available that use iterative approximations to the classical DD-ILU and DD-ILUT algorithms to expose parallelism for improved GPU performance. The development team is still evaluating these preconditioner options and they are not currently recommended for production use by users.

  • Teko block preconditioners can be efficient on the GPU, provided each of the sub-block solvers is also performant on the GPU. For example, an iterative sub-block solver with Jacobi preconditioning outperforms DD-ILU, provided the iterative solver converges in sufficiently few iterations.

4.7.8. Tolerance Settings

In addition to selecting the solver and preconditioner type, selecting appropriate convergence tolerance and scaling settings is also important. The default settings (R0 scaling and 1e-6 tolerance) are a reasonable starting place, but may need to be adjusted depending on the problem and nonlinear solver settings. Typically setting the RESIDUAL SCALING to either NONE or R0 is most useful.

  • NONE applies no scaling to the residual for determining linear solver convergence. If the problem has a low residual to begin with, the solver will only take a few iterations; however, if the initial residual is large this option may take more linear solver iterations than is ideal. The linear solver tolerance selected with this option should be at least an order of magnitude lower than the nonlinear residual tolerance used in the Newton solve, or the nonlinear problem may have trouble converging.

  • R0 scales the residual by the initial residual. This is useful if you want a looser acceptance tolerance when the problem starts with high residual, however if the problem starts with a small residual this may cause the solver to take an unreasonably large number of iterations. If the initial residual is large and the linear solver tolerance is too loose this can result in the linear solve producing bad/poorly solved results which may cause convergence problems.
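
For comparison with the RESIDUAL SCALING = NONE example shown below, a minimal sketch of a solver block using R0 scaling might look like the following (the solver name my_gmres_r0 and the values shown are illustrative only):

BEGIN TPETRA EQUATION SOLVER my_gmres_r0
  BEGIN GMRES SOLVER
    BEGIN DD-ILU PRECONDITIONER
    END
    MAXIMUM ITERATIONS = 1000
    RESIDUAL SCALING = R0
    CONVERGENCE TOLERANCE = 1.0E-6
  END
END TPETRA EQUATION SOLVER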

An autonomous linear convergence tolerance can be computed within Aria, and is enabled within the Tpetra solvers by setting the parameter CONVERGENCE TOLERANCE = AUTOMATIC in the Tpetra solver block. This option computes a linear convergence tolerance for each nonlinear iteration based on the initial linear residual and the user-specified nonlinear residual tolerance. The user can also specify an optional minimum linear residual tolerance (with a default value of 1.0e-6) to ensure that the computed linear convergence tolerance is not used if it is too small. Below is an example of using this autonomous setting with a specified minimum:

BEGIN TPETRA EQUATION SOLVER  GMRES_DDILU
  BEGIN GMRES SOLVER
    BEGIN DD-ILU PRECONDITIONER
    END
    MAXIMUM ITERATIONS = 1000
    RESIDUAL SCALING = NONE
    CONVERGENCE TOLERANCE = AUTOMATIC MINIMUM = 1.0E-8
  END
END TPETRA EQUATION SOLVER

4.7.9. Block Preconditioning

The advice contained in this section is relevant to users trying to manually construct a Teko-based preconditioner. The teko_multiphysics preset solver provides an automated method of generating a Teko-based preconditioner based on the methods described in Teko Heuristic Multiphysics Preconditioner.

4.7.9.1. Overview

Aria supports block preconditioning through the Teko package [33]. Teko support in Aria is currently an expert user feature. Aria currently offers two modes of interacting with Teko. The first is provided through the TEKO PRECONDITIONER block, which interfaces directly with the TPETRA EQUATION SOLVER blocks in an Aria input deck. The second, provided through the EXPERT TEKO PRECONDITIONER block, offers users a way to use Teko through an expert interface that uses an XML file to specify the relevant Teko parameters. Please see https://trilinos.github.io/teko.html or Expert Teko Syntax for more information on how to structure this XML.

Note

The TEKO PRECONDITIONER and EXPERT TEKO PRECONDITIONER blocks currently only support using the GMRES SOLVER block.

4.7.9.2. Block Gauss-Seidel

Teko provides the capability to construct block-based preconditioners using the Block Gauss-Seidel method. For many multiphysics problems this is a reasonable first choice. Aria requires a mapping from DOFs to sub-blocks in the Teko preconditioner. An example input file block is shown below.

BEGIN TPETRA EQUATION SOLVER GMRES_TEKO_BGS
  BEGIN GMRES SOLVER
    BEGIN TEKO PRECONDITIONER
      DEFINE MATRIX SUBBLOCK NUMBER=1 DOF=VELOCITY
      DEFINE MATRIX SUBBLOCK NUMBER=2 DOF=PRESSURE

      BEGIN TEKO SUBBLOCK PRECONDITIONER DDILU_PC
        BEGIN DD-ILU PRECONDITIONER
        END
      END

      BEGIN TEKO SUBBLOCK PRECONDITIONER AMG_PC
        BEGIN MUELU PRECONDITIONER
        END
      END

      BEGIN BLOCK GAUSS-SEIDEL SOLVER
        USE INVERSE DDILU_PC FOR SUBBLOCK 1
        USE INVERSE AMG_PC FOR SUBBLOCK 2
      END
    END
  END
END TPETRA EQUATION SOLVER

The Teko input block starts by defining the DOF-to-block mapping through the DEFINE MATRIX SUBBLOCK command. Note that the DOF to sub-block relationship need not be one-to-one, as many DOFs can be mapped to the same sub-block. This gives the user tremendous flexibility in customizing the block preconditioner. For additional guidance on determining good DOF to sub-block mappings, refer to the advice in Road-map for Constructing Teko Solvers. Once the DOF-to-block mapping is specified, the means to invert the block system and each sub-block are required. These are both specified through the block solver specification inside the TEKO PRECONDITIONER scope. Currently, BLOCK GAUSS-SEIDEL, BLOCK JACOBI, and SIMPLE are supported. Each of these block solvers provides a USE INVERSE <METHOD> FOR SUBBLOCK <N> command to describe how to solve each sub-block, where the method name refers to a solver or preconditioner defined in a separate TEKO SUBBLOCK SOLVER or TEKO SUBBLOCK PRECONDITIONER block.

In the example above, BLOCK GAUSS-SEIDEL is the main preconditioning method. Sub-block 1 (velocity) uses a DD-ILU preconditioner to solve the sub-block. Sub-block 2 (pressure), on the other hand, uses MueLu to construct an algebraic multigrid preconditioner to solve the sub-block. The Teko preconditioning feature in Aria allows the specification of sub-block solvers and preconditioners through the TEKO SUBBLOCK SOLVER and TEKO SUBBLOCK PRECONDITIONER block commands, respectively. Each resembles the syntax of the TPETRA EQUATION SOLVER blocks used to solve linear systems in Aria. For advice on choosing appropriate sub-block preconditioners, refer to Road-map for Constructing Teko Solvers.
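
To make the roles of the block solver and the sub-block inverses concrete, the action of this Block Gauss-Seidel preconditioner on the two-block velocity/pressure system can be written as a single forward sweep, where \tilde{A}_{11}^{-1} denotes the DD-ILU approximation for the velocity block and \tilde{A}_{22}^{-1} the MueLu approximation for the pressure block:

\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
=
\begin{bmatrix} b_{1} \\ b_{2} \end{bmatrix},
\qquad
x_{1} = \tilde{A}_{11}^{-1} b_{1},
\qquad
x_{2} = \tilde{A}_{22}^{-1} \left( b_{2} - A_{21} x_{1} \right)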

4.7.9.4. Hierarchical Block Gauss-Seidel

Note

Hierarchical blocks with a single sub-block must use a preconditioner or solver provided by the TEKO SUBBLOCK PRECONDITIONER or TEKO SUBBLOCK SOLVER blocks, respectively. For hierarchical blocks with more than one sub-block, the TEKO BLOCK PRECONDITIONER or TEKO BLOCK SOLVER blocks must be used.

As noted in Navier–Stokes and SIMPLE, many simulations may include coupling between the Navier–Stokes equations and some other physics. For example, the velocity and pressure components in the Navier–Stokes equations may be coupled with temperature as in the Boussinesq approximation for buoyancy-driven flows. However, it is still possible to use a Teko-based preconditioner that employs the SIMPLE or SIMPLEC methods for the velocity-pressure sub-blocks. This is accomplished by hierarchically decomposing the block-system into smaller blocks, then stitching together the various smaller blocks through a hierarchical block Gauss-Seidel approach. Example syntax for a Teko preconditioner using the hierarchical block Gauss-Seidel is shown below.

BEGIN TPETRA EQUATION SOLVER GMRES_TEKO_SIMPLE
  BEGIN GMRES SOLVER
    BEGIN TEKO PRECONDITIONER
      DEFINE MATRIX SUBBLOCK NUMBER=1 DOF=VELOCITY
      DEFINE MATRIX SUBBLOCK NUMBER=2 DOF=PRESSURE
      DEFINE MATRIX SUBBLOCK NUMBER=3 DOF=TEMPERATURE

      BEGIN HIERARCHICAL BLOCK GAUSS-SEIDEL SOLVER
        USE SUBBLOCKS 1,2 FOR HIERARCHICAL BLOCK 1
        USE SUBBLOCKS 3 FOR HIERARCHICAL BLOCK 2
        USE BLOCK INVERSE SIMPLE_SOLVER FOR HIERARCHICAL BLOCK 1
        USE BLOCK INVERSE DDILU_TEMPERATURE_PC FOR HIERARCHICAL BLOCK 2
      END

      BEGIN TEKO BLOCK SOLVER SIMPLE_SOLVER
        BEGIN GMRES SOLVER
          BEGIN TEKO PRECONDITIONER
            DEFINE MATRIX SUBBLOCK NUMBER=1 DOF=VELOCITY
            DEFINE MATRIX SUBBLOCK NUMBER=2 DOF=PRESSURE

            BEGIN TEKO SUBBLOCK PRECONDITIONER DDILU_PC
              BEGIN DD-ILU PRECONDITIONER
              END
            END

            BEGIN TEKO SUBBLOCK PRECONDITIONER AMG_PC
              BEGIN MUELU PRECONDITIONER
              END
            END

            BEGIN SIMPLE SOLVER
              USE INVERSE DDILU_PC FOR SUBBLOCK 1
              USE INVERSE AMG_PC FOR SUBBLOCK 2
              EXPLICIT VELOCITY INVERSE TYPE = ABSROWSUM # or LUMPED, DIAGONAL
              ALPHA = 0.9 # Pressure Schur-Complement (under/over) relaxation, > 0.0
            END
          END
        END
      END TEKO BLOCK SOLVER

      BEGIN TEKO SUBBLOCK PRECONDITIONER DDILU_TEMPERATURE_PC
        BEGIN DD-ILU PRECONDITIONER
        END
      END

    END
  END
END TPETRA EQUATION SOLVER

In the above example, the HIERARCHICAL BLOCK GAUSS-SEIDEL SOLVER block is used to decompose the matrix into two main sub-systems: one containing velocity and pressure, the other containing temperature. The velocity and pressure terms are solved via the SIMPLE method defined in a TEKO BLOCK SOLVER block. The temperature, on the other hand, is only a single block. In this case, the TEKO SUBBLOCK PRECONDITIONER block is used to specify a DD-ILU preconditioner to approximate the inverse of the temperature block. Note, however, that if we had lumped an additional block with temperature, either the TEKO BLOCK SOLVER or TEKO BLOCK PRECONDITIONER block would be required to invert the given block.
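
Schematically, the hierarchical splitting in this example first partitions the matrix into a velocity/pressure hierarchical block and a temperature hierarchical block, and the hierarchical Gauss-Seidel sweep then operates on this 2x2 system of blocks exactly as in the two-block sweep shown earlier:

A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},
\qquad
A_{11} = \begin{bmatrix} A_{vv} & A_{vp} \\ A_{pv} & A_{pp} \end{bmatrix} \ \text{(velocity/pressure, inverted via SIMPLE)},
\qquad
A_{22} = A_{TT} \ \text{(temperature, approximated with DD-ILU)}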

4.7.9.5. Preconditioned Sub-block Solvers

In many scenarios there may exist a sub-block that is difficult to solve. This is especially problematic when one sub-block is significantly tougher than the others, as the difficult sub-block may lead to stagnation in the overall linear solver convergence. (For information on detecting this scenario, refer to Diagnostic Sub-block Solvers.) Fortunately, Teko provides options to use any preconditioned iterative solver defined in a TPETRA EQUATION SOLVER block for the sub-block solvers to mitigate this issue.

An example input deck expanding on our previous example input for Block Gauss-Seidel is provided below.

BEGIN TPETRA EQUATION SOLVER GMRES_TEKO_BGS
  BEGIN GMRES SOLVER
    BEGIN TEKO PRECONDITIONER
      DEFINE MATRIX SUBBLOCK NUMBER=1 DOF=VELOCITY
      DEFINE MATRIX SUBBLOCK NUMBER=2 DOF=PRESSURE

      BEGIN TEKO SUBBLOCK SOLVER DDILU_SOLVER
        BEGIN GMRES SOLVER
          CONVERGENCE TOLERANCE = 1e-4
          MAXIMUM ITERATIONS = 30
          RESTART ITERATIONS = 30
          BEGIN DD-ILU PRECONDITIONER
          END
        END
      END

      BEGIN TEKO SUBBLOCK SOLVER AMG_SOLVER
        BEGIN GMRES SOLVER
          CONVERGENCE TOLERANCE = 1e-4
          MAXIMUM ITERATIONS = 30
          RESTART ITERATIONS = 30
          BEGIN MUELU PRECONDITIONER
          END
        END
      END

      BEGIN BLOCK GAUSS-SEIDEL SOLVER
        USE INVERSE DDILU_SOLVER FOR SUBBLOCK 1
        USE INVERSE AMG_SOLVER FOR SUBBLOCK 2
      END
    END
  END
END TPETRA EQUATION SOLVER

In the above example, a GMRES iteration seeking to reduce the residual by four orders of magnitude, with a maximum of 30 iterations, is used as the solver for the velocity and pressure sub-blocks. The velocity solve is preconditioned using DD-ILU from Ifpack2, while the pressure is preconditioned using algebraic multigrid from MueLu. The use of an iterative solver for the velocity and pressure sub-blocks means that the Block Gauss-Seidel preconditioner does not remain constant from iteration to iteration, which requires the use of a flexible solver such as flexible GMRES [46]. Aria automatically switches from GMRES to flexible GMRES when using a Teko-based preconditioner.

The use of preconditioned sub-block solves is not limited to Block Gauss-Seidel. An iterative solver can be used to invert the velocity and Schur complement pressure blocks in the SIMPLE and SIMPLEC methods from Navier–Stokes and SIMPLE. This is easy to accomplish, as it requires only altering the sub-block solver specification. Extending our example from the Navier–Stokes and SIMPLE section, we have:

BEGIN TPETRA EQUATION SOLVER GMRES_TEKO_SIMPLE
  BEGIN GMRES SOLVER
    BEGIN TEKO PRECONDITIONER
      # NOTE: velocity must be assigned to block 1 for SIMPLE
      DEFINE MATRIX SUBBLOCK NUMBER=1 DOF=VELOCITY
      DEFINE MATRIX SUBBLOCK NUMBER=2 DOF=PRESSURE

      BEGIN TEKO SUBBLOCK SOLVER DDILU_SOLVER
        BEGIN GMRES SOLVER
          CONVERGENCE TOLERANCE = 1e-4
          MAXIMUM ITERATIONS = 30
          RESTART ITERATIONS = 30
          BEGIN DD-ILU PRECONDITIONER
          END
        END
      END

      BEGIN TEKO SUBBLOCK SOLVER AMG_SOLVER
        BEGIN GMRES SOLVER
          CONVERGENCE TOLERANCE = 1e-4
          MAXIMUM ITERATIONS = 30
          RESTART ITERATIONS = 30
          BEGIN MUELU PRECONDITIONER
          END
        END
      END

      BEGIN SIMPLE SOLVER
        USE INVERSE DDILU_SOLVER FOR SUBBLOCK 1
        USE INVERSE AMG_SOLVER FOR SUBBLOCK 2
        EXPLICIT VELOCITY INVERSE TYPE = ABSROWSUM # or LUMPED, DIAGONAL
        ALPHA = 0.9 # Pressure Schur-Complement (under/over) relaxation, > 0.0
      END
    END
  END
END TPETRA EQUATION SOLVER

4.7.9.6. Expert Teko Syntax

While the syntax provided through the TEKO PRECONDITIONER is meant to cover most use-cases, it is nevertheless helpful to have access to the underlying Teko XML specification through the EXPERT TEKO PRECONDITIONER block. For example, the Diagnostic Sub-block Solvers currently require the use of the EXPERT TEKO PRECONDITIONER block. To demonstrate the EXPERT TEKO PRECONDITIONER syntax, we consider converting the Block Gauss-Seidel based Teko solver in Preconditioned Sub-block Solvers. An example input deck is provided below.

BEGIN TPETRA EQUATION SOLVER GMRES_TEKO_BGS
  BEGIN GMRES SOLVER
    BEGIN EXPERT TEKO PRECONDITIONER
      DEFINE MATRIX SUBBLOCK NUMBER=1 DOF=VELOCITY
      DEFINE MATRIX SUBBLOCK NUMBER=2 DOF=PRESSURE

      INVERSE METHOD = BGS
      XML FILE = teko.xml
    END
  END
END TPETRA EQUATION SOLVER

In exchange for removing the block solver specification, the EXPERT TEKO PRECONDITIONER block syntax requires specifying the block inverse through the INVERSE METHOD command and the input XML file through the XML FILE command. Recall that the solver specified in the Preconditioned Sub-block Solvers example uses a Block Gauss-Seidel solver with a DD-ILU preconditioned GMRES solver for the velocity sub-block and a MueLu preconditioned GMRES solver for the pressure sub-block. The equivalent Teko XML that implements these solver settings is defined below:

<ParameterList>
  <ParameterList name="BGS">
    <Parameter name="Type" type="string" value="Block Gauss-Seidel"/>

    <!-- Velocity -->
    <Parameter name="Inverse Type 1" type="string" value="gmres"/>
    <Parameter name="Preconditioner Type 1" type="string" value="ddilu"/>

    <!-- Pressure -->
    <Parameter name="Inverse Type 2" type="string" value="gmres"/>
    <Parameter name="Preconditioner Type 2" type="string" value="amg"/>
  </ParameterList>

  <ParameterList name="ddilu">
    <Parameter name="Type" type="string" value="Ifpack2"/>
    <Parameter name="Prec Type" type="string" value="SCHWARZ"/>
    <ParameterList name="Ifpack2 Settings">
      <Parameter name="schwarz: use reordering" type="bool" value="true"/>
      <Parameter name="subdomain solver name" type="string" value="RILUK"/>
      <ParameterList name="subdomain solver parameters">
       <Parameter name="fact: iluk level-of-fill" type="int" value="1"/>
      </ParameterList>
    </ParameterList>
  </ParameterList>

  <ParameterList name="amg">
    <Parameter        name="Type"                             type="string"   value="MueLu"/>
    <Parameter        name="verbosity"                        type="string"   value="high"/>
    <Parameter        name="coarse: max size"                 type="int"      value="1000"/>
    <Parameter        name="max levels"                       type="int"      value="4"/>
    <Parameter        name="smoother: type"                   type="string"   value="CHEBYSHEV"/>
    <ParameterList    name="smoother: params">
       <Parameter name="chebyshev: degree"                    type="int"      value="4"/>
       <Parameter name="chebyshev: ratio eigenvalue"          type="double"   value="20"/>
       <Parameter name="chebyshev: min eigenvalue"            type="double"   value="1.0"/>
       <Parameter name="chebyshev: zero starting solution"    type="bool"     value="true"/>
       <Parameter name="chebyshev: eigenvalue max iterations" type="int"      value="15"/>
    </ParameterList>
    <Parameter        name="aggregation: type"                type="string"   value="uncoupled"/>
    <Parameter        name="aggregation: drop scheme"         type="string"   value="distance laplacian"/>
    <Parameter        name="aggregation: drop tol"            type="double"   value="0.01"/>

    <Parameter        name="repartition: enable"              type="bool"     value="true"/>
    <Parameter        name="repartition: min rows per proc"   type="int"      value="1000"/>
    <Parameter        name="repartition: start level"         type="int"      value="2"/>
    <Parameter        name="repartition: max imbalance"       type="double"   value="1.327"/>
    <Parameter        name="repartition: partitioner"         type="string"   value="zoltan2"/>
  </ParameterList>

  <ParameterList name="gmres">
    <Parameter name="Type" type="string" value="Belos"/>
    <Parameter name="Solver Type" type="string" value="Pseudo Block GMRES"/>
    <ParameterList name="Solver Types">
      <ParameterList name="Pseudo Block GMRES">
        <Parameter name="Maximum Iterations" type="int" value="30"/>
        <Parameter name="Num Blocks" type="int" value="30"/>
        <Parameter name="Convergence Tolerance" type="double" value="1e-4"/>
      </ParameterList>
    </ParameterList>
  </ParameterList>
</ParameterList>

For using SIMPLE, on the other hand, the input XML may resemble the following:

<ParameterList>
  <!-- NOTE: must alter Aria input to specify INVERSE METHOD = SIMPLEC -->
  <ParameterList name="SIMPLEC">
    <Parameter name="Type" type="string" value="NS SIMPLE"/>

    <!-- Switch to "Diagonal" for SIMPLE -->
    <Parameter name="Explicit Velocity Inverse Type" type="string" value="AbsRowSum"/>
    <Parameter name="Alpha" type="double" value="0.9"/>

    <!-- Velocity -->
    <Parameter name="Inverse Velocity Type" type="string" value="gmres"/>
    <Parameter name="Preconditioner Velocity Type" type="string" value="ddilu"/>

    <!-- Pressure -->
    <Parameter name="Inverse Pressure Type" type="string" value="gmres"/>
    <Parameter name="Preconditioner Pressure Type" type="string" value="amg"/>
  </ParameterList>

  <ParameterList name="ddilu">
    <Parameter name="Type" type="string" value="Ifpack2"/>
    <Parameter name="Prec Type" type="string" value="SCHWARZ"/>
    <ParameterList name="Ifpack2 Settings">
      <Parameter name="schwarz: use reordering" type="bool" value="true"/>
      <Parameter name="subdomain solver name" type="string" value="RILUK"/>
      <ParameterList name="subdomain solver parameters">
       <Parameter name="fact: iluk level-of-fill" type="int" value="1"/>
      </ParameterList>
    </ParameterList>
  </ParameterList>

  <ParameterList name="amg">
    <Parameter        name="Type"                             type="string"   value="MueLu"/>
    <Parameter        name="verbosity"                        type="string"   value="high"/>
    <Parameter        name="coarse: max size"                 type="int"      value="1000"/>
    <Parameter        name="max levels"                       type="int"      value="4"/>
    <Parameter        name="smoother: type"                   type="string"   value="CHEBYSHEV"/>
    <ParameterList    name="smoother: params">
       <Parameter name="chebyshev: degree"                    type="int"      value="4"/>
       <Parameter name="chebyshev: ratio eigenvalue"          type="double"   value="20"/>
       <Parameter name="chebyshev: min eigenvalue"            type="double"   value="1.0"/>
       <Parameter name="chebyshev: zero starting solution"    type="bool"     value="true"/>
       <Parameter name="chebyshev: eigenvalue max iterations" type="int"      value="15"/>
    </ParameterList>
    <Parameter        name="aggregation: type"                type="string"   value="uncoupled"/>
    <Parameter        name="aggregation: drop scheme"         type="string"   value="distance laplacian"/>
    <Parameter        name="aggregation: drop tol"            type="double"   value="0.01"/>

    <Parameter        name="repartition: enable"              type="bool"     value="true"/>
    <Parameter        name="repartition: min rows per proc"   type="int"      value="1000"/>
    <Parameter        name="repartition: start level"         type="int"      value="2"/>
    <Parameter        name="repartition: max imbalance"       type="double"   value="1.327"/>
    <Parameter        name="repartition: partitioner"         type="string"   value="zoltan2"/>
  </ParameterList>

  <ParameterList name="gmres">
    <Parameter name="Type" type="string" value="Belos"/>
    <Parameter name="Solver Type" type="string" value="Pseudo Block GMRES"/>
    <ParameterList name="Solver Types">
      <ParameterList name="Pseudo Block GMRES">
        <Parameter name="Maximum Iterations" type="int" value="30"/>
        <Parameter name="Num Blocks" type="int" value="30"/>
        <Parameter name="Convergence Tolerance" type="double" value="1e-4"/>
      </ParameterList>
    </ParameterList>
  </ParameterList>
</ParameterList>

4.7.9.7. Diagnostic Sub-block Solvers

To get additional diagnostics about the residual of the individual inverse approximations, Teko allows for the injection of a diagnostic inverse type. This feature, however, requires the use of the EXPERT TEKO PRECONDITIONER syntax as described in Expert Teko Syntax. For example, the SIMPLEC XML from Navier–Stokes and SIMPLE is modified to inject diagnostic blocks inside the pressure and velocity sub-block solvers below.

<ParameterList name="SIMPLEC">
  <Parameter name="Type" type="string" value="NS SIMPLE"/>
  <Parameter name="Inverse Velocity Type" type="string" value="diagnostic_velocity_block"/>
  <Parameter name="Inverse Pressure Type" type="string" value="diagnostic_pressure_block"/>
  <Parameter name="Explicit Velocity Inverse Type" type="string" value="AbsRowSum"/>
</ParameterList>

<!--
  See sample XML in 'EXPERT TEKO SYNTAX' section for definitions of the ddilu, amg sub-block solvers
-->
<ParameterList name="diagnostic_pressure_block">
  <Parameter name="Type" type="string" value="Diagnostic Inverse"/>
  <Parameter name="Inverse Factory" type="string" value="amg"/>
  <Parameter name="Descriptive Label" type="string" value="Pressure"/>
  <Parameter name="Print Residual" type="bool" value="true"/>
</ParameterList>

<ParameterList name="diagnostic_velocity_block">
  <Parameter name="Type" type="string" value="Diagnostic Inverse"/>
  <Parameter name="Inverse Factory" type="string" value="ddilu"/>
  <Parameter name="Descriptive Label" type="string" value="Velocity"/>
  <Parameter name="Print Residual" type="bool" value="true"/>
</ParameterList>

The above XML file will output the relative residual of the sub-block solvers, for example:

DiagnosticLinearOp "Velocity": residual = [ 2.0547e-03 ]
DiagnosticLinearOp "Pressure": residual = [ 3.8366e-01 ]

Injecting diagnostic inverses, like in the example above, is helpful for determining which portion of a block preconditioning scheme is failing to converge. For blocks that are not able to converge or do not converge as quickly, consider using an iterative solver for the block as outlined in Preconditioned Sub-block Solvers. Road-map for Constructing Teko Solvers offers further advice on debugging block preconditioner convergence.

4.7.9.8. Road-map for Constructing Teko Solvers

As we have seen from the previous sections, there is a lot of upfront work in setting up a Teko preconditioner. Constructing a Teko preconditioner for a multiphysics simulation requires the combination of a good block-splitting, a block inverse method, and good sub-block solvers. This section offers advice on how to construct a Teko preconditioner.

The following steps provide a road-map to constructing a Teko solver.

  1. As a starting point, gather the names of the DOFs in the linear system you wish to solve.

  • Check what EQs are defined in the equation system where you want to use a Teko solver.

  • If you are still unsure what DOFs are in the linear system, guess! The error message in the logfile will provide the names for each DOF in the linear system.

  2. Set up the block preconditioner.

  • Start with mapping each DOF to a unique sub-block.

  • If possible, use a direct solver for each sub-block to start.

  • For Navier–Stokes, start with the SIMPLE or SIMPLEC method from Navier–Stokes and SIMPLE.

  • For general multiphysics problems with only 2 sub-blocks, consider using SIMPLE or SIMPLEC. Note that the order of the sub-blocks matters, as sub-block 2 is used to form the approximate Schur complement solve.

  • For general multiphysics problems with 3 or more sub-blocks, use Block Gauss-Seidel.

  3. Find a good block-splitting by altering the DOF-to-block mapping.

  • Start with mapping each DOF to a unique sub-block.

  • If possible, use a direct solver for the sub-blocks to remove the concern for finding good sub-block solvers. We will tackle that issue in step 4. Note that the block-splitting with direct solvers forms a lower bound on the iteration count. If the iterations are relatively high – even 10 to 20 iterations is a lot here – it is unlikely the solver will perform well once we get to step 4.

  • Use physical intuition to lump DOFs to the same block. DOFs with tight-coupling are best put in the same sub-block. A reasonable starting point is to lump ‘like-with-like’. For example, try assigning species equations in the same phase to the same sub-block. Lumping temperature equations across phases tends to do well.

  • This step will almost always require some amount of trial-and-error.

  4. Find good sub-block solvers.

  • We used direct solvers for steps 2 and 3 as a stepping stone. Now, we should try to remove their usage.

  • Sub-blocks encompassing several bits of physics often require DD-ILU.

  • Sub-blocks that only comprise a Poisson or elasticity equation are good candidates for algebraic multigrid and MueLu.

  • Sub-blocks with solid phase densities or solid mass fractions may be solved using Jacobi.

  • If in doubt, DD-ILU is a reasonable first choice for a sub-block solver.

  • Use the diagnostic sub-block solvers from Diagnostic Sub-block Solvers to monitor convergence. You generally want each sub-block to reach at least 1-2 digits of residual reduction. Be on the lookout for any sub-blocks that exhibit significantly worse convergence. These may lead to stagnation in the linear solver.

  • Blocks that exhibit significantly worse convergence may benefit from using a Belos solver as in Preconditioned Sub-block Solvers.

  • If all else fails, leave the direct solver in place for a sub-block.

Following the advice from steps 1-4, you should now be able to construct a Teko preconditioner. For an example demonstrating setting up a Teko solver, see Teko Solver for OMD with Porous Flows Example.