.. _nonlinear_solve:

##########################
Nonlinear Equation Solving
##########################

************
Introduction
************

This chapter discusses nonlinear equation solving methods, specifically the use of iterative
algorithms for problems in solid mechanics.  Although some of this work has taken place over
many years at Sandia National Labs and elsewhere, recent efforts have significantly added to the
functionality and robustness of these algorithms. This chapter primarily documents these recent
efforts. Some historical development is covered for context and completeness, with the aim of
presenting a complete picture of the current status of iterative solution algorithms
for nonlinear solid mechanics in Sierra/SolidMechanics.

Iterative algorithms have seen a resurgence of interest, possibly due to the
advancement of parallel computing platforms.  Increases in computational speed and available
memory have raised expectations on model fidelity and problem size. Increased problem size has
sparked interest in iterative solvers because the direct solution strategy becomes increasingly
inefficient as problem size grows.  A traditional implicit global solution strategy is
typically based on Newton's method, generating fully coupled linearized equations that are
often solved using a direct method.  In many applications in solid mechanics this procedure
poses no particular problem for modern computing platforms with sufficient memory.  However,
for large three-dimensional models of interest, the cost of direct equation solving becomes
prohibitive on all but the largest supercomputers. This motivates the use of iterative
solution strategies that do not require the direct solution of linearized global equations.

Application of purely iterative solvers to the broad, general area of nonlinear finite element
solid mechanics problems has seen only modest success.  Certain classes of problems have
remained notoriously difficult to solve.  Examples of these include problems that are strongly
geometrically nonlinear, problems with nearly incompressible material response, and problems
with frictional sliding.  Thus, much of this chapter is devoted to examining and discussing an
implementation of a multi-level solution strategy, where the nonlinear iterative solver is
asked to solve simplified **model problems** from which the real solution to these
difficult problems is accumulated.  This strategy has greatly contributed to the functionality and
robustness of the nonlinear iterative solver.

.. %Sierra/SolidMechanics began with a concerted effort on a mixed direct/iterative
.. %method in the framework of a multilevel solver as its overall solution strategy.  This,
.. %combined with a legacy traced to JAC3D  [:footcite:`nonlinear_equation_solving:ref:JAC3D`]
.. %[:footcite:`JAC3D`], covers a wide spectrum of solution options available to the user.

************
The Residual
************

.. % We establish a notational foundation for the discussion of the alternatives available for
.. % solving the nonlinear discrete equations associated with the computation of an unknown state at
.. % :math:`t_{n+1}`, in the context of either a quasistatic or an implicit dynamics formulation. 

Recall that the quasistatic problem, :eq:`quasistatics:eq:8`, is written as

.. math::
   :label: nonlinear_equation_solving:eq:01

   \mathbf{r} \left( \mathbf{d}_{n+1} \right) = \mathbf{F}^{\mathrm{ext}} \left( \mathbf{d}_{n+1}\right) - \mathbf{F}^{\mathrm{int}} \left( \mathbf{d}_{n+1} \right) = \mathbf{0}

and the implicit dynamics problem using the trapezoidal time integration rule,
:eq:`dynamics:eq:117`, is written as

.. math::
   :label: nonlinear_equation_solving:eq:02

   \mathbf{r} ( \mathbf{d}_{n+1}) = \left[ \mathbf{F}^{\mathrm{ext}}(\mathbf{d}_{n+1}) + \mathbf{M} \left( \mathbf{a}_n + \Delta t ~ \mathbf{v}_n + \frac{4}{\Delta t^2} \mathbf{d}_n \right) - \left( \frac{4}{\Delta t^2} \mathbf{M} \mathbf{d}_{n+1} + \mathbf{F}^{\mathrm{int}}(\mathbf{d}_{n+1}) \right) \right] = \mathbf{0} .


In either case, the equation to be solved takes the form

.. math::
   :label: nonlinear_equation_solving:eq:03

   \mathbf{r} \left( \mathbf{d}_{n+1} \right) =  \mathbf{0} ,

where the residual :math:`\mathbf{r} \left( \mathbf{d}_{n+1} \right)` is, in general, a nonlinear function of the solution vector :math:`\mathbf{d}_{n+1}`.
This form allows us to consider the topic of nonlinear equation solving in its most general form, with the introduction of iterations, :math:`j=0,1,2, ...`, as

.. math::
   :label: nonlinear_equation_solving:eq:04

   \mathbf{r} \left( (\mathbf{d}_{n+1})_j\right) =  \mathbf{0} ,

or simply

.. math::
   :label: nonlinear_equation_solving:eq:05

   \mathbf{r}_j = \mathbf{r} \left( \mathbf{d}_j \right) =  \mathbf{0}.

In implicit Sierra/SM simulations, whether quasistatic or dynamic, each load step from time :math:`n` to :math:`n+1` requires a
new nonlinear solve with sub-iterations :math:`j=0,1,2,...`.  Here we have omitted the references to
the load step; it is understood that, e.g., :math:`\mathbf{d}_j` is evaluated at :math:`n+1`.

We can rewrite :eq:`nonlinear_equation_solving:eq:01` and :eq:`nonlinear_equation_solving:eq:02` as

.. math::
   :label: nonlinear_equation_solving:eq:06

   \mathbf{r}_j = \mathbf{F}^{\mathrm{ext}}_j - \mathbf{F}^{\mathrm{int}}_j = \mathbf{0}

and

.. math::
   :label: nonlinear_equation_solving:eq:07

   \mathbf{r}_j = \mathbf{F}^{\mathrm{ext}}_j - \frac{4}{\Delta t^2} \mathbf{M} \mathbf{d}_j -  \mathbf{F}^{\mathrm{int}}_j + \tilde{\mathbf{F}} = \mathbf{0},


where :math:`\mathbf{F}^{\mathrm{int}}_j = \mathbf{F}^{\mathrm{int}} \left( \mathbf{d}_j \right) = \mathbf{F}^{\mathrm{int}} \left( (\mathbf{d}_{n+1})_j \right)` and :math:`\tilde{\mathbf{F}}` is the constant portion of the residual, defined as

.. math::

   \tilde{\mathbf{F}} = \mathbf{M} \left( \mathbf{a}_n + \Delta t ~ \mathbf{v}_n + \frac{4}{\Delta t^2} \mathbf{d}_n \right).

The task for any nonlinear equation solution technique is to improve the iterate (or
guess) for the solution vector :math:`\mathbf{d}_j` such that the residual :math:`\mathbf{r}_{j}` is close
enough to :math:`\mathbf{0}`.  How that is done depends on the method employed.
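
In code, every method discussed below shares this same outer structure: evaluate the residual, test its norm, and improve the iterate.  A minimal sketch follows; the two-degree-of-freedom cubic-spring model problem and the scaled fixed-point update are illustrative assumptions only, not the Sierra/SM implementation.

```python
import numpy as np

def residual(d):
    # Hypothetical model problem: internal force with a cubic hardening
    # term, constant external load (an assumption for illustration).
    f_ext = np.array([1.0, 0.5])
    f_int = np.array([2.0 * d[0] + 0.1 * d[0]**3,
                      1.0 * d[1] + 0.1 * d[1]**3])
    return f_ext - f_int                     # r(d) = F_ext - F_int

def solve_load_step(d0, update, tol=1.0e-10, max_iter=200):
    """Generic nonlinear solve: improve the iterate d_j until ||r_j|| ~ 0."""
    d = d0.copy()
    for j in range(max_iter):
        r = residual(d)
        if np.linalg.norm(r) < tol:
            return d, j                      # converged at iteration j*
        d = update(d, r)                     # method-specific improvement
    raise RuntimeError("no convergence")

# A deliberately naive update (a scaled fixed-point step) just to exercise
# the loop; the following sections replace it with Newton, steepest
# descent, or conjugate gradient updates.
d_star, j_star = solve_load_step(np.zeros(2), lambda d, r: d + 0.4 * r)
```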

:numref:`nonlinear_equation_solving-fig-nonlinear_iterations` depicts a generalized
nonlinear loadstep solution with solution iterates :math:`j=0,1,2, ..., j^*`, where the iterates
converge when :math:`\left\lVert\mathbf{r}_{j^*}\right\rVert \approx 0` at iteration :math:`j^*`.

In (a) of :numref:`nonlinear_equation_solving-fig-nonlinear_iterations`, the solution starts
with iterate :math:`\mathbf{d}_0` taken as the solution of the prior load step from :math:`n-1` to :math:`n`.
(Note that the zero iterate is not always taken to be the prior solution.  See
:numref:`predictors` on predictors for more details.)  Iterate :math:`\mathbf{d}_0`
results in a residual of :math:`\mathbf{r}_0`, which then informs the next iterate :math:`\mathbf{d}_1`
such that :math:`\left\lVert\mathbf{r}_1\right\rVert < \left\lVert\mathbf{r}_0\right\rVert`.  For
details on how :math:`\mathbf{d}_1` is formed, see the following
Sections :numref:`GradientPropertyResidual` through
:numref:`nonlinear_equation_solving-constraints`. In (b) and (c) this procedure from iteration
:math:`j` to :math:`j+1` is depicted, and in (d) the solution procedure has converged at iteration :math:`j^*`
with iterate :math:`\mathbf{d}_{j^*}`.  The load step from :math:`n` to :math:`n+1` is then solved, and the
solution procedure for the next load step from :math:`n+1` to :math:`n+2` starts over in (a).

.. _nonlinear_equation_solving-fig-nonlinear_iterations:

.. figure:: ../_static/figures/nonlinear_solve-fig01.png
   :align: center
   :scale: 75 %

   Graphical depiction of nonlinear iterations.

.. _GradientPropertyResidual:

*********************************
Gradient Property of the Residual
*********************************

The residual has the very important property that it points in the steepest descent or gradient direction of the function :math:`f`:

.. math::
   :label: nonlinear_equation_solving:eq:08

   f \left( \mathbf{d}_j \right) = \frac{1}{2} \left( \mathbf{d}_j - \mathbf{d}^{*} \right) ^T \mathbf{r} (\mathbf{d}_j),

which is the **energy error of the residual**.
Solving for :math:`\mathbf{d}_j = \mathbf{d}^{*}` is equivalent to minimizing the energy error of the residual, :math:`f \left( \mathbf{d}_j \right)`.

The importance of this property cannot be overstated. Any iterative solver makes use of it
in some way or another.  Even though the solution :math:`\mathbf{d}^{*}` is not known, a non-zero
residual points the way to improving the guess.  Mathematically, our nonlinear solid
mechanics problem looks like a minimization problem discussed at length in the optimization
literature, see e.g. [:footcite:`luenberger`].  It is from this viewpoint that the remainder of the
nonlinear solution methods will be discussed.  The concept of the energy error of the residual
reveals important physical insights into how iterative algorithms are expected to perform on
particular classes of problems.

An example of the energy error of the residual providing physical insight into a problem is
demonstrated in :numref:`nonlinear_equation_solving-fig-two_beams1`.

.. _nonlinear_equation_solving-fig-two_beams1:

.. figure:: ../_static/figures/nonlinear_solve-fig02.png
   :align: center
   :scale: 75 %

   Energy error example: two beams with large and small cross-sectional moments of inertia.

Two beams, one thick and one thin, are subjected to a uniform pressure load causing a downward
deflection to the equilibrium point :math:`(d_1,d_2)` indicated by the blue dot. If we think of
*modes of deformation* rather than the nodal degrees of freedom :math:`(d_1 ,d_2 )`, two modes
of deformation come to mind: a bending mode and an axial mode.

.. _nonlinear_equation_solving-fig-two_beams2:

.. figure:: ../_static/figures/nonlinear_solve-fig03.png
   :align: center
   :scale: 75 %

   Energy error example: modes of deformation for two beams.

For the thick beam in :numref:`nonlinear_equation_solving-fig-two_beams2`, the red dashed
line is the locus of points :math:`(d_1,d_2)` that induce only bending stresses in the beam and is
therefore called a bending mode.  In contrast, the blue dashed line is the locus of points
:math:`(d_1,d_2)` that induce only axial stresses in the beam and is therefore called an axial mode.
These bending and axial modes are characterized by the eigenvectors :math:`q_b` and :math:`q_a`,
respectively.

Eigenvectors are typically written as linear combinations of the nodal degrees of freedom. The
bending mode, for example, can be written as :math:`q_b = a_1 d_1 + a_2 d_2`.  However, since we are
dealing with a nonlinear problem in our simple example (and in general), the coefficients :math:`a_1`
and :math:`a_2` vary with the deformation of the beam, which is precisely why the dashed red line is
curved. The energy error contours can thus be displayed, as shown in
:numref:`nonlinear_equation_solving-fig-two_beams3`.  Any displacement away from the
equilibrium point :math:`(d_1,d_2)^{*}` produces a nonzero residual and consequently requires work.

.. _nonlinear_equation_solving-fig-two_beams3:

.. figure:: ../_static/figures/nonlinear_solve-fig04-2.png
   :align: center
   :scale: 25 %

   Energy error example: energy error contours for two beams.

Now we compare moving the tip of the beam along the red dashed line, which invokes a bending
mode of deformation of the beam versus moving the tip along the blue dashed line, which
invokes an axial mode of deformation.  The larger modal stiffness (eigenvalue) corresponding to
the axial mode induces a greater energy penalty for a given amount of displacement compared to
the bending mode.  This produces the stretched energy contours shown.  Since the ratio of
stiffness between the axial and bending modes is much larger for the thin beam than the thick beam, the
stretching of the energy error contours is more pronounced for the thin beam.
Mathematically, these contours are a graphical representation of the gradient of the residual,
:math:`\nabla \mathbf{r}(\mathbf{d})`.

.. %Notationally, the gradient of the residual, :math:`\nabla \mathbf{r} (\mathbf{d}_j)`, 
.. %which is also the **Hessian** of :math:`f`, i.e.,

.. %\label{nonlinear_equation_solving:eq:09}
.. %\nabla \mathbf{r} (\mathbf{d}_j) = \nabla ^2 f( \mathbf{d}_j).
.. %

The beam example is chosen for its simplicity; however, it also poses a non-trivial nonlinear problem. Experience has shown that the thinner the beam becomes, the more difficult it is to solve. In fact, convergence investigations reveal that it is the ratio of the maximum to minimum eigenvalue of :math:`\nabla \mathbf{r}(\mathbf{d})` that is critical to the performance of iterative methods.
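
This eigenvalue ratio is easy to probe numerically.  The following sketch, for an assumed two-degree-of-freedom system with one stiff and one soft mode (not the beam model itself), approximates :math:`\nabla \mathbf{r}` by finite differences and reports the ratio:

```python
import numpy as np

def residual(d):
    # Hypothetical 2-dof model: a stiff "axial" dof and a soft "bending"
    # dof, each with a mild cubic nonlinearity (illustrative assumption).
    return np.array([-100.0 * d[0] - 5.0 * d[0]**3,
                     -1.0 * d[1] - 0.5 * d[1]**3])

def grad_residual(d, eps=1.0e-6):
    """Finite-difference approximation of the gradient (Jacobian) of r."""
    n = d.size
    J = np.zeros((n, n))
    r0 = residual(d)
    for k in range(n):
        dp = d.copy()
        dp[k] += eps
        J[:, k] = (residual(dp) - r0) / eps
    return J

d = np.array([0.1, 0.1])
eigs = np.abs(np.linalg.eigvals(grad_residual(d)))
condition_ratio = eigs.max() / eigs.min()    # lambda_max / lambda_min
```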

.. _nonlinear_solve-newtons_method:

***********************************************
Newton's Method for Solving Nonlinear Equations
***********************************************

In this context, the idea embodied in classical Newton's method is simple.  Replacing the nonlinear residual :math:`\mathbf{r}(\mathbf{d}_j)` with its local tangent approximation :math:`\mathbf{y}(\mathbf{d})` gives

.. math::
   :label: nonlinear_equation_solving:eq:10

   \mathbf{y} (\mathbf{d}) = \mathbf{r}(\mathbf{d}_j) + \nabla \mathbf{r} (\mathbf{d}_j) (\mathbf{d}-\mathbf{d}_j),


which is linear in the vector of unknowns :math:`\mathbf{d}`.
Setting :math:`\mathbf{y}(\mathbf{d}) = \mathbf{y}(\mathbf{d}_{j+1}) = \mathbf{0}` and solving :eq:`nonlinear_equation_solving:eq:10` gives the iterative update for Newton's method,

.. math::
   :label: nonlinear_equation_solving:eq:11

   \mathbf{d}_{j+1} = \mathbf{d}_j - \nabla \mathbf{r}^{-1} (\mathbf{d}_j) \mathbf{r} (\mathbf{d}_j).


The structural mechanics community commonly refers to the **tangent stiffness matrix** in the context of geometrically nonlinear problems. From :eq:`nonlinear_equation_solving:eq:10`, the tangent stiffness matrix is identified as the gradient of the residual:

.. math::
   :label: nonlinear_equation_solving:eq:12

   \mathbf{K}_T = \nabla \mathbf{r} (\mathbf{d}_j).


Then :eq:`nonlinear_equation_solving:eq:11` can be written as

.. math::
   :label: nonlinear_equation_solving:eq:12a

   \mathbf{d}_{j+1} = \mathbf{d}_j -  \left[ \mathbf{K}_T \right]^{-1} \mathbf{r} (\mathbf{d}_j) ,


the solution of which requires the inverse of :math:`\mathbf{K}_T`.
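
For a small model problem, the update :eq:`nonlinear_equation_solving:eq:12a` can be transcribed almost literally.  The two-degree-of-freedom system with a cubic term below is an assumed illustration, with the tangent :math:`\nabla \mathbf{r}` coded analytically:

```python
import numpy as np

K = np.array([[4.0, -1.0], [-1.0, 2.0]])     # assumed linear stiffness
f_ext = np.array([1.0, 1.0])                 # assumed external load

def residual(d):
    return f_ext - (K @ d + 0.1 * d**3)      # r(d) = F_ext - F_int

def tangent(d):
    # K_T = grad r(d), exact for this model problem.
    return -(K + np.diag(0.3 * d**2))

d = np.zeros(2)
for j in range(20):
    r = residual(d)
    if np.linalg.norm(r) < 1.0e-12:
        break
    # d_{j+1} = d_j - K_T^{-1} r(d_j)
    d = d - np.linalg.solve(tangent(d), r)
```

The quadratic convergence of Newton's method is visible here: only a handful of iterations are needed to drive the residual to machine precision.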

A conceptual view of Newton's method applied to our two beam example is shown in :numref:`nonlinear_equation_solving-fig-two_beams4`.

.. _nonlinear_equation_solving-fig-two_beams4:

.. figure:: ../_static/figures/nonlinear_solve-fig05.png
   :align: center
   :scale: 75 %

   Energy error example: Newton's method applied to two beams.

Newton's method is generally considered to be the most robust of the nonlinear equation solution techniques, albeit at the cost of generating the tangent stiffness matrix :eq:`nonlinear_equation_solving:eq:12` and solving the *linear* system of equations with :math:`n_{dof}` unknowns:

.. math::
   :label: nonlinear_equation_solving:eq:13

   \left[ \mathbf{K}_T \right] \left( \mathbf{d}_{j+1} - \mathbf{d}_j \right) = - \mathbf{r} (\mathbf{d}_j).


There are a number of linear equation solution techniques available, and Sierra/SolidMechanics has the ability to apply a linear equation solution approach available in the FETI library (discussed briefly in section :numref:`nonlinear_equation_solving-feti`).

As mentioned, Newton's method relies on computing the tangent stiffness matrix, which, by examination of :eq:`nonlinear_equation_solving:eq:14`, requires the partial derivatives (with respect to the unknowns) of the external and internal force vectors,

.. math::
   :label: nonlinear_equation_solving:eq:14

   \mathbf{K}_T = \frac{\partial}{\partial \mathbf{d}} \left[ \mathbf{F}^{\mathrm{ext}}_j - \mathbf{F}^{\mathrm{int}}_j \right].


In practice, for all but the simplest of material models, the exact tangent cannot be
computed. Thus Sierra/SolidMechanics computes a secant approximation with the property

.. math::
   :label: nonlinear_equation_solving:tangent

   \mathbf{K}_{\tilde{T}} \cdot \left( \delta \mathbf{d} \right) = \mathbf{r} (\mathbf{d}_j + \delta \mathbf{d}) - \mathbf{r} (\mathbf{d}_{j})


by simply probing the nonlinear system via perturbations :math:`\delta \mathbf{d}`. In :eq:`nonlinear_equation_solving:tangent`, the notation :math:`\mathbf{K}_{\tilde{T}}` is used to indicate that the probed tangent is an approximation of the exact tangent.
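
The probing idea can be sketched directly: perturb one degree of freedom at a time and difference the residual.  The toy two-degree-of-freedom model is an assumption; production implementations amortize the probes (for example, with coloring strategies) rather than probing every column separately.

```python
import numpy as np

K = np.array([[4.0, -1.0], [-1.0, 2.0]])     # assumed model stiffness

def residual(d):
    return np.array([1.0, 1.0]) - (K @ d + 0.1 * d**3)

def probed_tangent(d, eps=1.0e-7):
    """Secant approximation of the tangent, built one column at a time
    by probing r with perturbations delta_d = eps * e_k."""
    n = d.size
    Kt = np.zeros((n, n))
    r0 = residual(d)
    for k in range(n):
        delta = np.zeros(n)
        delta[k] = eps
        Kt[:, k] = (residual(d + delta) - r0) / eps
    return Kt

d = np.array([0.2, -0.1])
Kt = probed_tangent(d)

# The defining property K_T~ . delta_d ~= r(d + delta_d) - r(d),
# checked for an arbitrary small perturbation:
delta_d = 1.0e-7 * np.array([0.3, -0.4])
lhs = Kt @ delta_d
rhs = residual(d + delta_d) - residual(d)
```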

.. %\subsection{Secant approximation of the Tangent Stiffness via probing}
.. %\subsubsection{Coloring algorithm}

***********************
Steepest Descent Method
***********************

As mentioned in section :numref:`GradientPropertyResidual`, the steepest descent iteration
takes steps in the direction of the gradient of the energy error of the residual.  On its own,
it would not be considered a viable solver for solid mechanics because of its generally poor
performance compared to Newton-based methods.  However, algorithmic elements of this
method are conceptually important for understanding nonlinear iterative solvers such as the
method of conjugate gradients, and are in fact used in their construction.

The idea behind the steepest descent method is to construct a sequence of search directions, :math:`\mathbf{s}_j`,

.. math::
   :label: nonlinear_equation_solving:eq:16

   \mathbf{s}_j = -\mathbf{M}^{-1} \mathbf{g}_j = -\mathbf{M}^{-1} \mathbf{r} (\mathbf{d}_j),


in which the energy decreases, thus producing a new guess of the solution vector

.. math::
   :label: nonlinear_equation_solving:eq:17

   \mathbf{d}_{j+1} = \mathbf{d}_j + \alpha \mathbf{s}_j.


The minimization is accomplished by taking a step of length :math:`\alpha` along :math:`\mathbf{s}_j`,
where :math:`\alpha` is called the line search parameter:

.. math::
   :label: nonlinear_equation_solving:eq:18

   \frac{\mathrm{d}}{\mathrm{d} \alpha} f(\mathbf{d}_j + \alpha \mathbf{s}_j ) \approx \left[
   \mathbf{r} (\mathbf{d}_j ) \right] ^T \mathbf{s}_j + \alpha \left[ \mathbf{r} (\mathbf{d}_j +
   \mathbf{s}_j ) - \mathbf{r} (\mathbf{d}_j ) \right] ^T \mathbf{s}_j = 0 ,


which, after simplification, gives

.. math::
   :label: nonlinear_equation_solving:eq:19

   \alpha = - \frac{ \left[ \mathbf{r} (\mathbf{d}_j ) \right] ^T \mathbf{s}_j } { \left[ \mathbf{r}
    (\mathbf{d}_j + \mathbf{s}_j) - \mathbf{r} (\mathbf{d}_j ) \right] ^T \mathbf{s}_j }.


The **preconditioner** matrix :math:`\mathbf{M}` is included in
:eq:`nonlinear_equation_solving:eq:16` to accelerate the convergence rate of the
steepest descent method. Note that in this case :math:`\mathbf{M}` is not meant to be the mass
matrix.
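
Collecting :eq:`nonlinear_equation_solving:eq:16` and :eq:`nonlinear_equation_solving:eq:17` for a small linear model problem gives the following sketch.  The 2x2 stiffness and the Jacobi (diagonal) preconditioner are assumptions for illustration, and the line-search parameter is obtained by solving the secant condition of :eq:`nonlinear_equation_solving:eq:18` for :math:`\alpha`:

```python
import numpy as np

K = np.array([[10.0, 2.0], [2.0, 1.0]])      # assumed linear stiffness
b = np.array([1.0, 1.0])                     # assumed external load

def residual(d):
    return b - K @ d                         # r(d) = F_ext - F_int

Minv = np.diag(1.0 / np.diag(K))             # Jacobi preconditioner M^-1

d = np.zeros(2)
for j in range(500):
    r = residual(d)
    if np.linalg.norm(r) < 1.0e-10:
        break
    s = -Minv @ r                            # search direction
    # Line-search parameter from r(d)^T s + alpha [r(d+s) - r(d)]^T s = 0:
    alpha = -(r @ s) / ((residual(d + s) - r) @ s)
    d = d + alpha * s                        # d_{j+1} = d_j + alpha s_j
```

For this linear problem the secant line search is exact, so the iteration reduces the energy error monotonically, but many iterations are still required relative to Newton's method.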

:numref:`nonlinear_equation_solving-fig-two_beams5` through
:numref:`nonlinear_equation_solving-fig-two_beams7` all show high aspect ratio ellipses.  It turns
out that the ideal preconditioner would transform these ellipses into circles, which corresponds
to the choice :math:`\mathbf{M} = \mathbf{K}_T`.  In this sense, the ideal steepest descent method
is Newton's method.  However, the steepest descent framework gives us a way to use approximations
of :math:`\mathbf{K}_{T}`.

A conceptual view of the steepest descent method applied to our two beam example is shown in
:numref:`nonlinear_equation_solving-fig-two_beams5`.  As indicated in the figure, the
thinner beam would require more steepest descent iterations to obtain convergence compared to
the thicker beam.

.. _nonlinear_equation_solving-fig-two_beams5:

.. figure:: ../_static/figures/nonlinear_solve-fig06.png
   :align: center
   :scale: 75 %

   Energy error example: Steepest descent method applied to two beams.

It is instructive to consider whether the large number of iterations is due to the nonlinearities in this model problem.
For this purpose, we construct the two beam model problem in linearized form.
:numref:`nonlinear_equation_solving-fig-two_beams6` shows the first iterations of the steepest descent method for the linearized problem.
The immediate difference between the linearized version and the nonlinear problem is the elliptic form of the energy error contours.
However, the contours are still stretched, reflecting the relative modal stiffnesses of the axial and bending modes. Thus, from the same starting point, :math:`d_1 = d_2 = 0`, the initial search direction is composed of different amounts of :math:`d_1` and :math:`d_2`.
This is also apparent in all subsequent iterations.
:numref:`nonlinear_equation_solving-fig-two_beams7` shows the completed iterations for both the thick and thin beams.

.. _nonlinear_equation_solving-fig-two_beams6:

.. figure:: ../_static/figures/nonlinear_solve-fig07.png
   :align: center
   :scale: 75 %

   Energy error example: First two iterations of the steepest descent method applied to linearized version of the two beam problem.

.. _nonlinear_equation_solving-fig-two_beams7:

.. figure:: ../_static/figures/nonlinear_solve-fig08.png
   :align: center
   :scale: 75 %

   Energy error example: Steepest descent method applied to linearized version of the two beam problem.

Even for the linearized problem, there is a large difference in the number of iterations
required for the steepest descent method to converge for the two beams.  This is because the
slope of the search directions is shallower for the thin beam, so each iteration makes less
progress toward the solution.

In general, the convergence rate of the steepest descent method is directly related to the
spread of the eigenvalues in the problem. In our conceptual beam example, the ratio
:math:`\lambda_{\mathrm{max}} / \lambda_{\mathrm{min}} = \lambda_a / \lambda_b` (often called the
condition number) is larger for the thin beam.  It can be shown that in the worst case, the
steepest descent iterations reduce the energy error of the residual according to

.. math::
   :label: nonlinear_equation_solving:eq:23

   f( \mathbf{d}_{j+1}) = \left( \frac {\lambda_{\mathrm{max}} / \lambda_{\mathrm{min}} - 1} {\lambda_{\mathrm{max}} / \lambda_{\mathrm{min}} + 1} \right)^2 f( \mathbf{d}_j ) .

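This worst-case reduction is easy to check numerically.  The sketch below runs steepest descent with an exact line search on an assumed diagonal (linearized, :math:`\mathbf{M}=\mathbf{I}`) stiffness whose entries stand in for :math:`\lambda_{\mathrm{max}}` and :math:`\lambda_{\mathrm{min}}`, monitoring the quadratic energy error :math:`\frac{1}{2}\mathbf{e}^T\mathbf{K}\mathbf{e}`; every per-iteration reduction should satisfy the bound:

```python
import numpy as np

lam_max, lam_min = 50.0, 2.0                 # assumed modal stiffnesses
K = np.diag([lam_max, lam_min])
d_star = np.array([1.0, 1.0])                # chosen exact solution
b = K @ d_star

kappa = lam_max / lam_min
bound = ((kappa - 1.0) / (kappa + 1.0)) ** 2

def energy_error(d):
    e = d - d_star
    return 0.5 * e @ K @ e

d = np.zeros(2)
ratios = []
for j in range(40):
    r = b - K @ d
    if np.linalg.norm(r) < 1.0e-12:
        break
    f0 = energy_error(d)
    alpha = (r @ r) / (r @ K @ r)            # exact line search (quadratic f)
    d = d + alpha * r                        # steepest descent step
    ratios.append(energy_error(d) / f0)      # per-iteration reduction
```
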
*****************************
Method of Conjugate Gradients
*****************************

With the foundation provided by the steepest descent method, application of a conjugate
gradient algorithm to :eq:`nonlinear_equation_solving:eq:06` or
:eq:`nonlinear_equation_solving:eq:07` follows in a straightforward manner.  Like
the steepest descent method, the important feature of the conjugate gradient (CG) algorithm is
that it only needs to compute the nodal residual vectors element by element and, as a result,
does not need the large amount of storage required for Newton's method.

The method of conjugate gradients is a well-developed algorithm for solving linear equations.
Much of the original work can be found in the articles [:footcite:`flre:64,dani:67,dani:67a`] and the
books [:footcite:`hest:80, noce:91`].  A convergence proof of CG with inexact line searches can be
found in [:footcite:`grippo`], and a well-presented tutorial of linear CG can be found in
[:footcite:`shewchuck`].  The goal here is to review the method of conjugate gradients to understand
the benefits and potential difficulties encountered when applying it to the solution of the
nonlinear equations in solid mechanics problems.

Linear CG
=========

The CG algorithm also uses the gradient, :math:`\mathbf{g}_j`, to generate a sequence of search
directions :math:`\mathbf{s}_j` for iterations :math:`j=1,2,...`:

.. math::
   :label: nonlinear_equation_solving:eq:24

   \mathbf{s}_{j} = -\mathbf{M}^{-1} \mathbf{r}(\mathbf{d}_j) + \beta_j \mathbf{s}_{j-1} .

Note the additional (rightmost) term in :eq:`nonlinear_equation_solving:eq:24`
relative to the steepest descent algorithm of
:eq:`nonlinear_equation_solving:eq:16`.  The scalar :math:`\beta_j` is chosen such that
:math:`\mathbf{s}_j` and :math:`\mathbf{s}_{j-1}` are :math:`\mathbf{K}`-conjugate; this property is key to the
success of the CG algorithm.  Vectors :math:`\mathbf{s}_j` and :math:`\mathbf{s}_{j-1}` are
:math:`\mathbf{K}`-conjugate if

.. math::
   :label: nonlinear_equation_solving:eq:26

   \mathbf{s}_{j}^T \mathbf{K} \mathbf{s}_{j-1} = 0,


where :math:`\mathbf{K}` is the stiffness matrix. For a linear problem, :math:`\mathbf{K}` is a constant
positive definite matrix (assuming the internal force of
:eq:`nonlinear_equation_solving:eq:14` is linear).  Combining
:eq:`nonlinear_equation_solving:eq:24` and :eq:`nonlinear_equation_solving:eq:26`
gives the following expression for the scalar :math:`\beta_j`:

.. math::
   :label: nonlinear_equation_solving:eq:29

   \beta_{j} = \frac{\mathbf{g}_{j}^T \mathbf{K} \mathbf{s}_{j-1}} {\mathbf{s}_{j-1}^T \mathbf{K} \mathbf{s}_{j-1} }.


Effective progress toward the solution requires minimizing the energy error of the residual along proposed search directions.
As with the steepest descent method, the line search performs this function.
Minimizing the energy error of the residual along the search direction occurs where the inner product of the gradient and the search direction is zero:

.. math::
   :label: nonlinear_equation_solving:eq:30

   \begin{aligned}
   \mathbf{g}_{j}^T (\Delta \mathbf{d}_j + \alpha \mathbf{s}_j ) \mathbf{s}_j & = \left[ \Delta \mathbf{F}^{\mathrm{ext}}(t) - \mathbf{K} \cdot (\Delta \mathbf{d}_j + \alpha \mathbf{s}_j ) \right] ^T \mathbf{M}^{-1} \mathbf{s}_j \\
      & = \left[ \left( \Delta \mathbf{F}^{\mathrm{ext}}(t) - \mathbf{K} \cdot \Delta \mathbf{d}_j \right)^T - ( \mathbf{K} \cdot \alpha \mathbf{s}_j ) ^T \right]  \mathbf{M}^{-1} \mathbf{s}_j \\
      & = \mathbf{g}_j^T \mathbf{s}_j - \alpha_j \mathbf{s}_j^T \mathbf{K}^T \mathbf{M}^{-1} \mathbf{s}_j \\
      & = 0.
   \end{aligned}


Solving :eq:`nonlinear_equation_solving:eq:30` gives an exact expression for the line search parameter :math:`\alpha`,

.. math::
   :label: nonlinear_equation_solving:eq:31

   \alpha_{j} = \frac{\mathbf{g}_{j}^T \mathbf{s}_j } {\mathbf{s}_{j}^T \mathbf{M}^{-1} \mathbf{K} \mathbf{s}_{j} },


due to the inherent symmetry of :math:`\mathbf{K}`.

The essential feature of the method of conjugate gradients is that once a search direction contributes to the solution, it need never be considered again.
As a result, the :math:`\mathbf{K}`-inner product of the error :math:`\mathbf{e}` changes from iteration to iteration in the following manner

.. math::
   :label: nonlinear_equation_solving:eq:32

   \begin{gathered}
   \mathbf{e}_{j+1}^T \mathbf{K} \mathbf{e}_{j+1} - \mathbf{e}_{j}^T \mathbf{K} \mathbf{e}_{j}  =  \\
   \left[ \sum_{i=j+1}^{n-1} \delta_i \mathbf{s}_i \right]^T  \mathbf{K} \left[ \sum_{i=j+1}^{n-1} \delta_i \mathbf{s}_i \right]  - \left[ \delta_j \mathbf{s}_j + \sum_{i=j+1}^{n-1} \delta_i \mathbf{s}_i \right]^T \mathbf{K} \left[ \delta_j \mathbf{s}_j + \sum_{i=j+1}^{n-1} \delta_i \mathbf{s}_i \right] \\
   = -\left[ \delta_j \mathbf{s}_j \right]^T \mathbf{K} \left[ \delta_j \mathbf{s}_j \right].
   \end{gathered}


Since :math:`\mathbf{K}` is constant and positive definite, the energy error of the residual decreases monotonically as the iterations proceed.
Choosing :math:`\beta_j` such that the property in :eq:`nonlinear_equation_solving:eq:26` holds gives the important result that the sequence of search directions :math:`\mathbf{s}_1 , \mathbf{s}_2, ...` spans the solution space in at most :math:`n_{eq}` iterations.
Furthermore, :eq:`nonlinear_equation_solving:eq:32` reveals that the search directions :math:`\mathbf{s}_1 , \mathbf{s}_2, ...` reduce the error in the highest eigenvalue mode shapes first and progressively move to lower ones.

An additional important numerical property of CG is that it can tolerate some inexactness in the line search as discussed in 
[:footcite:`gree:89`] and still maintain its convergence properties.
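
The linear CG recursion can be sketched compactly.  The version below is the standard unpreconditioned form (:math:`\mathbf{M}=\mathbf{I}`) with the equivalent :math:`\beta` expression commonly used for linear problems; the 3x3 system is an assumed test problem.  In exact arithmetic it terminates in at most :math:`n_{eq}` iterations:

```python
import numpy as np

def linear_cg(K, b, tol=1.0e-12, max_iter=None):
    """Textbook linear CG for K d = b, with K symmetric positive definite."""
    n = b.size
    d = np.zeros(n)
    r = b - K @ d                     # residual
    s = r.copy()                      # first direction: steepest descent
    iters = 0
    for _ in range(max_iter or n):
        if np.linalg.norm(r) < tol:
            break
        Ks = K @ s
        alpha = (r @ r) / (s @ Ks)    # exact line search along s
        d = d + alpha * s
        r_new = r - alpha * Ks
        beta = (r_new @ r_new) / (r @ r)   # keeps s_j K-conjugate to s_{j-1}
        s = r_new + beta * s
        r = r_new
        iters += 1
    return d, iters

K = np.array([[10.0, 2.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])
d_cg, iters = linear_cg(K, b)
```

Note that only residual evaluations and matrix-vector products appear; no factorization or matrix storage beyond :math:`\mathbf{K}` itself is required, which is the memory advantage cited above.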


Applying linear CG to our simple *linearized* beam model problem generates the comparison depicted in :numref:`nonlinear_equation_solving-fig-beams_SDCG`. The fact that the linear CG algorithm converges in exactly two iterations demonstrates the significance of the orthogonalization with respect to the previous search direction.

.. _nonlinear_equation_solving-fig-beams_SDCG:

.. figure:: ../_static/figures/nonlinear_solve-fig09.png
   :align: center
   :scale: 75 %

   A comparison of steepest descent and linear CG methods applied to the linearized beam example.

Nonlinear CG
============

For fully nonlinear problems, the kinematics of the system are not confined to small strains, the material response is potentially nonlinear and inelastic, and the contact interactions may feature large relative motions between surfaces with frictional response.  The residual is then a function of the unknown configuration at :math:`(n+1)`, as indicated in :eq:`nonlinear_equation_solving:eq:01` and :eq:`nonlinear_equation_solving:eq:02`.

Nonetheless, in application of linear CG concepts, it is typical to proceed with the requirement that the new search direction satisfy

.. math::
   :label: nonlinear_equation_solving:eq:34

   \mathbf{s}_j^T ( \mathbf{g}_j - \mathbf{g}_{j-1} ) = 0.


A comparison of :eq:`nonlinear_equation_solving:eq:34` and :eq:`nonlinear_equation_solving:eq:26` reveals that :math:`(\mathbf{g}_j - \mathbf{g}_{j-1} )` can be interpreted as an instantaneous representation of :math:`\mathbf{K}^{NL} \mathbf{s}_{j-1}`, to the extent that the incremental solution, and therefore its influence on the residual, is known.

Combining :eq:`nonlinear_equation_solving:eq:24` and :eq:`nonlinear_equation_solving:eq:34` gives the following result for the search direction

.. math::
   :label: nonlinear_equation_solving:eq:35

   \mathbf{s}_j = \beta_j \mathbf{s}_{j-1} - \mathbf{g}_{j}  = \left( \frac{\mathbf{g}_j^T ( \mathbf{g}_j - \mathbf{g}_{j-1} ) }{\mathbf{s}_{j-1}^T ( \mathbf{g}_j - \mathbf{g}_{j-1} ) }\right) \mathbf{s}_{j-1} - \mathbf{g}_j.


Use of :math:`\beta_j` as implied in :eq:`nonlinear_equation_solving:eq:35` is proposed in the 
nonlinear CG algorithm in [:footcite:`Hestenes:1952:CG`]. Alternatives to :math:`\beta_j`
have also been proposed. For example,
simplification of :eq:`nonlinear_equation_solving:eq:35` is possible if it can be assumed that previous line searches were exact, in which case

.. math::
   :label: nonlinear_equation_solving:eq:35a

   \mathbf{s}_{j-1}^T \mathbf{g}_j = \mathbf{s}_{j-2}^T \mathbf{g}_{j-1}  = 0 .


The orthogonality implied in :eq:`nonlinear_equation_solving:eq:35a` allows the following simplification of the expression for :math:`\beta_j`:

.. math::
   :label: nonlinear_equation_solving:eq:36

   \beta_j = \left( \frac{\mathbf{g}_j^T ( \mathbf{g}_j - \mathbf{g}_{j-1} ) }{-\mathbf{s}_{j-1}^T \mathbf{g}_{j-1}  }\right) = \left( \frac{\mathbf{g}_j^T ( \mathbf{g}_j - \mathbf{g}_{j-1} ) }{-(\beta_{j-1}\mathbf{s}_{j-2} - \mathbf{g}_{j-1})^T \mathbf{g}_{j-1}  }\right) = \left( \frac{\mathbf{g}_j^T ( \mathbf{g}_j - \mathbf{g}_{j-1} ) } {\mathbf{g}_{j-1}^T \mathbf{g}_{j-1}  }\right) .


Use of the result in :eq:`nonlinear_equation_solving:eq:36` to define the search directions is recommended in the nonlinear CG algorithm of [:footcite:`grippo`].
Sierra/SolidMechanics adopts this approach because it has performed better overall.
There are, however, instances when the condition implied in :eq:`nonlinear_equation_solving:eq:35a` is not satisfied, due to either highly nonlinear response or significantly inexact line searches.

The orthogonality ratio is computed every iteration to determine the nonlinearity of the problem and/or the inexactness of the *previous* line search.
When the orthogonality ratio exceeds a nominal value (default is 0.1), the nonlinear CG algorithm is **reset** by setting

.. math::
   :label: nonlinear_equation_solving:eq:36a

   \mathbf{s}_{j} = -\mathbf{g}_{j} .


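The reset logic can be sketched in a few lines of Python. This is a minimal sketch; in particular, the specific orthogonality ratio used here, :math:`|\mathbf{g}_j^T \mathbf{g}_{j-1}| / \mathbf{g}_j^T \mathbf{g}_j`, is an illustrative assumption, as the text does not spell out the code's internal definition of the ratio:

```python
def dot(u, v):
    """Inner product of two vectors stored as plain lists."""
    return sum(a * b for a, b in zip(u, v))

def cg_beta_with_reset(g_new, g_old, ortho_tol=0.1):
    """Return the CG coefficient beta_j of Eq. (36), or 0.0 when the
    orthogonality-ratio reset of Eq. (36a) fires.

    Returning beta = 0 collapses the next search direction
    s_j = beta * s_{j-1} - g_j to steepest descent, -g_j.
    """
    # Assumed definition of the orthogonality ratio: the component of the
    # new gradient along the old one, relative to its own magnitude.
    ortho_ratio = abs(dot(g_new, g_old)) / dot(g_new, g_new)
    if ortho_ratio > ortho_tol:
        return 0.0  # reset: discard the previous search direction
    # Formula of Eq. (36), valid when previous line searches were exact
    diff = [a - b for a, b in zip(g_new, g_old)]
    return dot(g_new, diff) / dot(g_old, g_old)
```

For successive gradients that are orthogonal (exact previous line search), the ratio is zero and the Eq. (36) formula is used; for nearly parallel gradients the ratio exceeds the tolerance and the reset is triggered.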
We recognize that the line search must be more general to account for potential nonlinearities.
The minimum along the search direction :math:`\mathbf{s}_j` still occurs where the gradient :math:`\mathbf{g}(\mathbf{d}_j + \alpha_j \mathbf{s}_j)` is orthogonal to :math:`\mathbf{s}_j`, but an exact expression for :math:`\alpha_j` can no longer be obtained.
Instead, a secant method is employed to estimate the rate of change of the gradient along :math:`\mathbf{s}_j`.
Linearizing the orthogonality condition in :math:`\alpha` and setting the result to zero yields the value of :math:`\alpha_j` that (approximately) makes the gradient orthogonal to the search direction:

.. math::
   :label: nonlinear_equation_solving:eq:37

   \mathbf{g}^T \left(\mathbf{d}_j + \alpha_j \mathbf{s}_j \right) \mathbf{s}_{j} \approx \mathbf{g}^T \left(\mathbf{d}_j \right) \mathbf{s}_j + \alpha_j \mathbf{s}_j^T \left[ \frac{\mathrm{d}}{\mathrm{d}\alpha} [\mathbf{g}^T \left(\mathbf{d}_j +\alpha \mathbf{s}_j \right) ] _{\alpha=0} \right] \mathbf{s}_j = 0,


where :math:`\left[ \frac{\mathrm{d}}{\mathrm{d}\alpha} [\mathbf{g}^T (\mathbf{d}_j +\alpha \mathbf{s}_j ) ] _{\alpha=0} \right]` is the instantaneous representation of the tangent stiffness matrix.
In order to preserve the memory-efficient attribute of nonlinear CG, a secant approximation of the tangent stiffness is obtained by evaluating the gradient at the distinct points :math:`\alpha=0` and :math:`\alpha=\epsilon`,

.. math::
   :label: nonlinear_equation_solving:eq:38

   \left[ \frac{\mathrm{d}}{\mathrm{d}\alpha} [\mathbf{g}^T (\mathbf{d}_j +\alpha \mathbf{s}_j ) ] _{\alpha=0} ^{\epsilon} \right] = \frac{1}{\epsilon} \left( \left[ \mathbf{g}^T (\mathbf{d}_j +\alpha \mathbf{s}_j )\right] _{\alpha=\epsilon} - \left[ \mathbf{g}^T (\mathbf{d}_j +\alpha \mathbf{s}_j ) \right] _{\alpha=0} \right).


Substituting :eq:`nonlinear_equation_solving:eq:38` into :eq:`nonlinear_equation_solving:eq:37`  and taking :math:`\epsilon = 1` yields the following result for the value of the line search parameter :math:`\alpha_j`:

.. math::
   :label: nonlinear_equation_solving:eq:39

   \alpha_j = \frac{-\mathbf{g}^T ( \mathbf{d}_j )\,\mathbf{s}_j } {\mathbf{g}^T (\mathbf{d}_j + \mathbf{s}_j )\,\mathbf{s}_j - \mathbf{g}^T (\mathbf{d}_j )\, \mathbf{s}_j }.

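A minimal sketch of this secant line search (with :math:`\epsilon = 1`), assuming only a user-supplied gradient function. For a quadratic objective the secant estimate is exact, which makes the sketch easy to verify; the sample matrix and vectors below are illustrative assumptions:

```python
def dot(u, v):
    """Inner product of two vectors stored as plain lists."""
    return sum(a * b for a, b in zip(u, v))

def secant_alpha(grad, d, s):
    """Line-search parameter alpha_j of Eq. (39): a secant estimate
    (with epsilon = 1) of the step that makes the gradient orthogonal
    to the search direction s.  `grad` returns the gradient vector at
    a given configuration."""
    g0_s = dot(grad(d), s)                                   # g(d)^T s
    g1_s = dot(grad([di + si for di, si in zip(d, s)]), s)   # g(d+s)^T s
    return -g0_s / (g1_s - g0_s)

# For a quadratic objective the secant estimate is exact.  Take
# f(d) = 1/2 d^T A d - b^T d with A = diag(2, 5) and b = (1, 1):
A = [2.0, 5.0]
b = [1.0, 1.0]
grad = lambda d: [a * di - bi for a, di, bi in zip(A, d, b)]

d = [0.0, 0.0]
s = [1.0, 1.0]                      # a descent direction
alpha = secant_alpha(grad, d, s)    # 2/7 for this data
d_new = [di + alpha * si for di, si in zip(d, s)]
# The gradient at d_new is orthogonal to s: an exact line search.
```

Note that only two gradient evaluations are needed per line search; no tangent stiffness matrix is ever formed, which is the memory-efficiency argument made above.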

Applying nonlinear CG to our simple beam model problem would conceptually generate the iterations depicted in :numref:`nonlinear_equation_solving-fig-beams_CG`.

.. _nonlinear_equation_solving-fig-beams_CG:

.. figure:: ../_static/figures/nonlinear_solve-fig10.png
   :align: center
   :scale: 75 %

   Nonlinear conjugate gradient method applied to the two beam problem.

Convergence Properties of CG
============================

It is well known that the convergence rates of iterative, matrix-free solution algorithms such as CG are highly dependent on the eigenvalue spectrum of the underlying equations.
In the case of linear systems of equations, where the gradient direction varies linearly with the solution error, the number of iterations required for convergence is bounded by the number of degrees of freedom.
Unfortunately, no such guarantee exists for nonlinear equations.
In practice, it is observed that convergence is unpredictable.
Depending on the nonlinearities, a solution may be obtained in surprisingly few iterations, or the solution may be intractable even with innumerable iterations and the reset strategy of :eq:`nonlinear_equation_solving:eq:36a`, where the search direction is reset to the steepest descent direction (the negative of the current gradient).

As a practical matter, for all but the smallest problems there is an expectation that convergence will be obtained in far fewer iterations than the number of degrees of freedom. In order to understand the conditions under which this is even possible, we summarize here a standard bound (which can be found in many texts) on the convergence rate of the method of conjugate gradients applied to a linear system:

.. math::
   :label: nonlinear_equation_solving:eq:40

   f( \mathbf{d}_{j}) \leq 2 \left( \frac {\sqrt{\lambda_{\mathrm{max}} / \lambda_{\mathrm{min}}} - 1} {\sqrt{\lambda_{\mathrm{max}} / \lambda_{\mathrm{min}}} + 1} \right)^j f( \mathbf{d}_0 ) .


Several important conclusions can be drawn from this analysis.
First, convergence of linear CG as a whole is only as fast as the worst eigenmode.
Second, it is not only the spread between the maximum and minimum eigenvalues that is important but also the number of distinct eigenvalues in the spectrum.
Finally, the starting value of the residual can influence the convergence path to the solution.

These conclusions hold for the case of CG applied to the linear equations, yet they remain an important reminder of what should be expected in the nonlinear case. They can provide guidance when the convergence behavior deteriorates.
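
The second conclusion is easy to demonstrate numerically. The following sketch runs linear CG on a diagonal system (so the matrix-vector product is trivial) with six unknowns but only three distinct eigenvalues; CG converges in three iterations rather than six. The matrix and right-hand side are illustrative assumptions:

```python
def dot(u, v):
    """Inner product of two vectors stored as plain lists."""
    return sum(a * b for a, b in zip(u, v))

def cg_diagonal(diag, b, tol=1e-8):
    """Linear CG applied to the diagonal system diag(d) x = b.
    Returns the solution and the iteration count."""
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - A x, with x = 0
    p = list(r)          # first search direction: steepest descent
    rs_old = dot(r, r)
    iters = 0
    while rs_old ** 0.5 > tol and iters < n:
        Ap = [d * pi for d, pi in zip(diag, p)]
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
        iters += 1
    return x, iters

# Six unknowns but only three distinct eigenvalues (1, 10, 100):
# CG terminates in three iterations, not six.
x, iters = cg_diagonal([1.0, 1.0, 1.0, 10.0, 10.0, 100.0], [1.0] * 6)
```

Repeating the experiment with six distinct eigenvalues requires six iterations, even though the problem size is unchanged.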

.. _predictors:

Predictors
==========

One of the most beneficial capabilities added to the nonlinear preconditioned CG iterative solver (nlPCG) is the ability to generate a good starting vector.
Algorithmically, a good starting vector is simply

.. math::
   :label: nonlinear_equation_solving:vector

   \mathbf{d}_0^{\mathrm{pred}} =  \mathbf{d}_0 + \Delta \mathbf{d}^{\mathrm{pred}} ,


where :math:`\Delta \mathbf{d}^{\mathrm{pred}}` is called the predictor.

This can dramatically improve the convergence rate.
A perfect predictor would give a configuration that has no inherent error, and
thus no iterations would be required to improve the solution.

Any other predicted configuration, of course, has error associated with it.
This error can be expressed as a linear combination of eigenvectors.
In theory, CG requires at most as many iterations as there are distinct eigenvalues represented in that error.
The goal is to generate a predictor with less computational work than that required to iterate to the same configuration.

Computing the incremental solution over the previous load step, and using this increment to extrapolate a guess for the next step, is a cost-effective predictor.
Not only is it trivially computed, but it also contains the mode shapes that are actively participating in the solution. That is,

.. math::
   :label: nonlinear_equation_solving:eq:41

   \Delta \mathbf{d}^{\mathrm{pred}} = \mathbf{d}_{j^*}^{n-1} - \mathbf{d}_{0}^{n-1} 


and therefore,

.. math::
   :label: nonlinear_equation_solving:eq:42

   \mathbf{d}_0^{n,\mathrm{pred}} =\mathbf{d}_0^n + \Delta \mathbf{d}^{\mathrm{pred}} .


In :eq:`nonlinear_equation_solving:eq:41`, :math:`(n-1)` refers to the previous load step, and in :eq:`nonlinear_equation_solving:eq:42` we explicitly write the **predicted configuration** :math:`\mathbf{d}_0^{n,\mathrm{pred}}` for load step :math:`n`.

When the solution path is smooth and gradually varying, this predictor is extremely effective.
A slight improvement can be made by performing a line search along the predictor in which case it is more appropriately named a starting search direction.
The effect of a simple linear predictor on our simple beam model problem is depicted in :numref:`nonlinear_equation_solving-fig-beams_predictor`.

.. _nonlinear_equation_solving-fig-beams_predictor:

.. figure:: ../_static/figures/nonlinear_solve-fig11.png
   :align: center
   :scale: 75 %

   A linear predictor applied to the beam problem can produce a good starting point.
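
The extrapolation predictor of :eq:`nonlinear_equation_solving:eq:41` and :eq:`nonlinear_equation_solving:eq:42` amounts to a few vector operations. A minimal sketch follows; the function name and the sample displacement values are illustrative assumptions:

```python
def linear_predictor(d_start_prev, d_conv_prev, d_start_curr):
    """Predicted starting configuration for load step n, per
    Eqs. (41)-(42): the converged increment of step n-1 is reused as
    the initial guess for step n.  Arguments are displacement vectors
    stored as plain lists."""
    return [d0 + (dc - ds)
            for ds, dc, d0 in zip(d_start_prev, d_conv_prev, d_start_curr)]

# A smoothly varying solution path: step n-1 moved the beam tip from
# (0, 0) to (1, 2), so the predictor guesses (2, 4) for step n.
guess = linear_predictor([0.0, 0.0], [1.0, 2.0], [1.0, 2.0])
```

When the solution path is smooth this guess lies close to the converged configuration of step :math:`n`; an optional line search along the increment refines it further.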

Preconditioned CG
=================

We have mentioned the preconditioner :math:`\mathbf{M}` without any specifics on how it is formed.
Preconditioning is essential for good performance of the CG solver. 
Sierra/SolidMechanics offers two forms of preconditioning, the **nodal** preconditioner and the **full tangent** preconditioner.
The nodal preconditioner is constructed by simply computing and assembling the :math:`3 \times 3` block-diagonal entries of the gradient of the residual, :eq:`nonlinear_equation_solving:eq:12`.
In the most general case, the preconditioner will contain contributions from both the internal and external forces. Sierra/SolidMechanics at this point only includes the contribution to the nodal preconditioner from the internal force:

.. math::
   :label: nonlinear_equation_solving:eq:50

   \left[ \mathbf{M}_I^{nPC} \right] = \left[  \int_{\varphi^h_t(\Omega)} \left[  N_{I,i} \left( \varphi^{-1}_t (\mathbf{x}) \right) \mathbf{C} N_{I,j} \left( \varphi^{-1}_t (\mathbf{x}) \right) \right] \mathrm{d}v   \right] ,


where the term :math:`\mathbf{C}` in :eq:`nonlinear_equation_solving:eq:50` is the instantaneous material tangent modulus.
For the many nonlinear material models supported by the Solid Mechanics module, computing exact material tangents would be onerous.
A simple but effective alternative is to assume an equivalent hypo-elastic material response *for every material model*, with the hypo-elastic bulk and shear moduli conservatively set to the largest values the material model may attain.
The formation of the nodal preconditioner is therefore simple, and need only be performed once per load step.
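
The action of the nodal preconditioner within the CG iteration can be sketched as follows, with a per-degree-of-freedom scalar diagonal standing in for the actual :math:`3 \times 3` nodal blocks for brevity; the stiffness values below are illustrative assumptions:

```python
def apply_nodal_preconditioner(diag_stiffness, residual):
    """Apply z = M^{-1} r for a simplified, per-DOF diagonal
    preconditioner.  In practice the nodal preconditioner is a 3x3
    block per node, assembled once per load step from conservative
    hypo-elastic moduli; a scalar diagonal per degree of freedom is
    used here to keep the sketch short."""
    return [r / k for k, r in zip(diag_stiffness, residual)]

# Preconditioning rescales a residual dominated by stiff DOFs so that
# all modes contribute comparably to the next search direction.
z = apply_nodal_preconditioner([1.0e6, 1.0], [2.0e6, 3.0])
```

Because the diagonal is fixed over a load step, each application costs only one division per degree of freedom.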

The full tangent preconditioner is constructed by computing the tangent stiffness matrix. As mentioned in :numref:`nonlinear_solve-newtons_method`, the tangent stiffness is obtained by probing the nonlinear system of equations, :eq:`nonlinear_equation_solving:tangent`.

.. _nonlinear_equation_solving-feti:

********************************
Parallel Linear Equation Solving
********************************

FETI is now a well-established approach for solving a linear system of equations on parallel
MPI-based computer architectures.  Its inception and early development are described in
[:footcite:`towi_book:05`].  The literature typically describes the FETI algorithm as the
foundation for a parallel implementation of Newton's method, with its requirement for a
direct equation solving capability.  The dual-primal unified FETI method, which forms the basis
of Sierra/SM's FETI solver, was introduced in [:footcite:`impsol:ref:farhat1,impsol:ref:farhat2`].

The Sierra/SM module generalizes the use of FETI, employing it not only within Newton's
method but also as a preconditioner for nonlinear preconditioned conjugate gradient.  The
basic notion of FETI is embodied in its name, Finite Element Tearing and Interconnecting:
the linear system of equations is separated into sub-problems, one for each
processor.

.. _nonlinear_equation_solving-constraints:

************************************
Enforcing Constraints within Solvers
************************************

Theoretically, constraint enforcement is reasonably straightforward. In practice, however, performance and robustness difficulties reveal themselves when a solver must handle many constraints and/or a changing active constraint set.
Mathematically,
there are two broad categories of constraints: **equality constraints** and **inequality constraints**.
Again, with the aid of the simple beam example we have used throughout this chapter, :numref:`nonlinear_equation_solving-fig-constraint_types` shows where one would encounter such constraints in practice.

.. _nonlinear_equation_solving-fig-constraint_types:

.. figure:: ../_static/figures/nonlinear_solve-fig12.png
   :align: center
   :scale: 75 %

   Simple beam example with constraints.

At the fixed end of the cantilever beam, where the displacements are required to be zero, we pose an equality constraint,

.. math::
   :label: nonlinear_equation_solving:eq:60

   \mathbf{h}(\mathbf{d}) = 0. 


:eq:`nonlinear_equation_solving:eq:60` is written in matrix notation and can alternatively be written in index notation as 

.. math::
   :label: nonlinear_equation_solving:eq:61

   h_{L}(d_i) = 0 \text{  , } L=1,n_{\text{con}} \text{  , }
   i=1,n_{\text{dofpn}},


where :math:`\mathbf{h}` is the **constraint operator**.
The constraint operator is simply the collection by row of all the equality constraints (in this case, :math:`n_{\text{con}} =2`).
Notice that for the fixed end of the beam, the constraint operator is very simple. All of the constraints are linear with respect to the displacements :math:`d_1^{I=1}` and :math:`d_2^{I=1}`.
The form of the equality constraint operator may be linear, :math:`\alpha_i d_i = 0`, or nonlinear. However, the essential feature is that the unknowns can be written on the left-hand side of the equation.

Returning to our simple beam example, the ellipse presents itself as an obstacle to the motion of the tip of the beam. It constrains node 2 to be *outside* the ellipse that has major axis :math:`a`, minor axis :math:`b`, is centered at :math:`(0,c)` and is rotated by angle :math:`\alpha` with respect to the horizontal axis.
Given these specifications for the location and orientation of the obstacle, we write the following inequality constraint

.. math::
   :label: nonlinear_equation_solving:eq:63

   \mathbf{g}(\mathbf{d}) \geq 0,


in matrix notation, and alternatively in index notation as

.. math::
   :label: nonlinear_equation_solving:eq:64

   g_{L}(d_i) \geq 0 \text{  , } L=1,n_{\text{con}} \text{  , } i=1,n_{\text{dofpn}}.


Panels (a) and (b) of :numref:`nonlinear_equation_solving-fig-constraint_energy` graphically depict the energy error contours as they are modified when using a Lagrange multiplier method and a penalty method, respectively.

.. _nonlinear_equation_solving-fig-constraint_energy:

.. figure:: ../_static/figures/nonlinear_solve-fig13.png
   :align: center
   :scale: 75 %

   Energy error contours for simple beam example with constraints.

:numref:`nonlinear_equation_solving-fig-constraint_penalty` is a graphical depiction of the energy error contours as they are modified when using an augmented Lagrangian (mixed Lagrange multiplier/penalty) method. As the tip of the beam penetrates the ellipse (violating the kinematic constraint), a penalty force is generated according to 

.. math::
   :label: nonlinear_equation_solving:eq:65

   f\left( d_{k+j/j^*}\right) = \frac{1}{2}\left( d_{k+j/j^*}-d^*\right)^T r\left( d_{k+j/j^*} \right) + \lambda^T_k h\left( d_{k+j/j^*} \right) + \frac{1}{2} \varepsilon_g g^T \left( d_{k+j/j^*} \right) g \left( d_{k+j/j^*} \right) ,


in which it is apparent that an augmented Lagrange method is a combination of a Lagrange multiplier method and a penalty
method. The advantage of this approach is that the penalty :math:`\varepsilon_g` can be soft, thus avoiding the
ill-conditioning associated with penalty methods that must rely on overly stiff penalty parameters for acceptable
constraint enforcement.

.. _nonlinear_equation_solving-fig-constraint_penalty:

.. figure:: ../_static/figures/nonlinear_solve-fig14.png
   :align: center
   :scale: 75 %

   Energy error contours for the simple beam example with augmented Lagrangian constraint enforcement.

The soft penalty parameter is indicated by the energy error contours increasing only moderately. The iteration counter :math:`j` refers to the nonlinear CG iteration. It proceeds from :math:`j=1,2,\dots` to :math:`j^*`, where the well-conditioned model problem is converged. 
However, because of the soft penalty parameter, there is a significant constraint violation.
Introducing an outer loop and the concept of **nested iterations**, the well-conditioned problem is solved repeatedly while the multiplier, :math:`\lambda_k`, is updated in each outer iteration, :math:`k=1,2,\dots`.
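
The nested iteration can be illustrated on a one-degree-of-freedom contact problem: a spring whose tip may not pass an obstacle. All numerical values are illustrative assumptions, and the inner model problem is solved in closed form here rather than by nonlinear CG:

```python
def augmented_lagrangian_contact(k=1.0, F=2.0, d_obs=1.0,
                                 penalty=1.0, n_outer=40):
    """Nested augmented-Lagrangian iteration for a one-DOF model:
    a spring (stiffness k, load F) whose tip may not pass an obstacle
    at d = d_obs.  The inner 'model problem' is quadratic, so it is
    solved in closed form here; in the solver it would be handled by
    nonlinear CG.  All numbers are illustrative assumptions."""
    lam = 0.0
    d = F / k  # unconstrained solution (penetrates the obstacle)
    for _ in range(n_outer):
        # inner solve (contact active):
        #   min  1/2 k d^2 - F d + lam (d - d_obs)
        #        + 1/2 penalty (d - d_obs)^2
        d = (F - lam + penalty * d_obs) / (k + penalty)
        if d <= d_obs and lam == 0.0:
            break  # contact inactive and constraint satisfied
        # outer update: grow the multiplier from remaining penetration
        lam = max(0.0, lam + penalty * (d - d_obs))
    return d, lam

d, lam = augmented_lagrangian_contact()
# With a soft penalty each inner solve leaves some penetration, but
# the multiplier updates drive d toward the obstacle and lam toward
# the contact force, without ever using a stiff penalty.
```

Each inner solve stays well conditioned because the penalty remains soft; the outer multiplier updates, not the penalty, are what eliminate the constraint violation.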

:numref:`nonlinear_equation_solving-fig-constraint_penalty_updates` shows a graphical depiction of the updates of the Lagrange multiplier. 
The iteration counter :math:`k` refers to the outer Lagrange multiplier update.
Although not immediately obvious, once the multiplier is updated, disequilibrium is introduced (especially by the early updates) and a new model problem must be solved. Eventually, as the multiplier converges, the constraint error tends to zero, as does the corresponding disequilibrium.

.. _nonlinear_equation_solving-fig-constraint_penalty_updates:

.. figure:: ../_static/figures/nonlinear_solve-fig15.png
   :align: center
   :scale: 75 %

   Energy error contours for the simple beam example, showing the Lagrange multiplier updates.

****************************
Multi-Level Iterative Solver
****************************

The multi-level solver concept is based on a strategy where an attribute and/or nonlinearity is controlled within the nonlinear solver.
It is important to recognize that complete linearization (as in a Newton-Raphson approach) is not necessary and in many cases not optimal.
Furthermore, there are several cases where nonlinearities are not even the source of the poor convergence behavior.
The essential concept of the strategy is to identify the feature that makes convergence difficult to achieve and to control it in a manner that encourages the nonlinear core solver to converge to the greatest extent possible.

The control is accomplished by holding fixed a variable that would ordinarily be free to change during the iteration, by reducing the stiffness of dilatational modes of deformation, or by restricting the search directions to span only a selected sub-space.
The core CG solver is used to solve a **model problem** - a problem where the control is active.
When the core CG solver is converged, an update on the controlled variable is performed, the residual is recalculated, and a new model problem is solved.
The approach has similarities to a Newton-Raphson algorithm, as shown in :numref:`nonlinear_equation_solving-fig-onelevel_mlsolver`.

.. _nonlinear_equation_solving-fig-onelevel_mlsolver:

.. figure:: ../_static/figures/nonlinear_solve-fig16.png
   :align: center
   :scale: 75 %

   A schematic of a single-level multi-level solver.

The generality of the multi-level solver is apparent in the case where multiple controls are active.
Multiple controls can occur at a single level or be nested at different levels,
hence the name multi-level solver. 
:numref:`nonlinear_equation_solving-fig-twolevel_mlsolver` depicts a 2-level multi-level solver.

.. _nonlinear_equation_solving-fig-twolevel_mlsolver:

.. figure:: ../_static/figures/nonlinear_solve-fig17.png
   :align: center
   :scale: 75 %

   A schematic of a two-level multi-level solver.

As depicted in :numref:`nonlinear_equation_solving-fig-onelevel_mlsolver` and :numref:`nonlinear_equation_solving-fig-twolevel_mlsolver`, the iterative solver by its nature solves the model problem and/or the nested problem to within some specified tolerance (as opposed to the nearly exact solutions obtained by a direct solver).
The inexactness of these solves is most often not an issue; however, there are some cases where a certain amount of precision is required.


.. raw:: html

   <hr>

.. footbibliography::
