In [R. J. Baraldi and D. P. Kouri, Mathematical Programming, (2022), pp. 1–40], we introduced an inexact trust-region algorithm for minimizing the sum of a smooth nonconvex function and a nonsmooth convex function. The principal expense of this method is computing a trial iterate that satisfies the so-called fraction of Cauchy decrease condition—a bound that ensures the trial iterate produces sufficient decrease of the subproblem model. In this paper, we expound on various proximal trust-region subproblem solvers that generalize traditional trust-region methods for smooth unconstrained and convex-constrained problems. We introduce a simplified spectral proximal gradient solver, a truncated nonlinear conjugate gradient solver, and a dogleg method. We compare algorithm performance on examples from data science and PDE-constrained optimization.
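To fix ideas, the following minimal sketch shows a spectral proximal gradient iteration of the kind described above, specialized to $\phi = \|\cdot\|_1$ so the proximal map is soft thresholding. The names (`spectral_prox_grad`, `Hv`) and the crude rescaling of the trial step back into the trust region are our illustrative simplifications, not the paper's implementation.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def spectral_prox_grad(x, g, Hv, Delta, beta=1.0, maxit=50, tol=1e-8):
    """Approximately solve min_s g's + 0.5 s'Hs + ||x + s||_1 s.t. ||s|| <= Delta."""
    s = np.zeros_like(x)
    grad = g.copy()                       # gradient of the quadratic model at s = 0
    for _ in range(maxit):
        # proximal gradient step on the model, with phi = ||.||_1
        y = soft_threshold(x + s - beta * grad, beta) - x
        ny = np.linalg.norm(y)
        if ny > Delta:
            y *= Delta / ny               # crude safeguard: pull back into the ball
        ds = y - s
        if np.linalg.norm(ds) < tol:
            break
        Hds = Hv(ds)
        # Barzilai-Borwein (spectral) step length for the next iteration
        curv = ds @ Hds
        beta = (ds @ ds) / curv if curv > 0 else 1.0
        s, grad = y, grad + Hds           # model gradient g + H s, updated cheaply
    return s
```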
In [R. J. Baraldi and D. P. Kouri, Mathematical Programming, (2022), pp. 1–40], we introduced an inexact trust-region algorithm for minimizing the sum of a smooth nonconvex function and a nonsmooth convex function in Hilbert space—a class of problems that is ubiquitous in data science, learning, optimal control, and inverse problems. This algorithm has demonstrated excellent performance and scalability with problem size. In this paper, we enrich the convergence analysis for this algorithm, proving strong convergence of the iterates with guaranteed rates. In particular, we demonstrate that the trust-region algorithm recovers superlinear, even quadratic, convergence rates when using a second-order Taylor approximation of the smooth objective function term.
In many applications, one can only access inexact gradients and inexact Hessian-vector products. It is therefore essential to consider algorithms that can handle such inexact quantities while guaranteeing convergence to a solution. We consider an inexact, adaptive, and provably convergent semismooth Newton method for constrained optimization problems. The focus is on dynamic optimization problems, which are known to be computationally expensive. A memory-efficient semismooth Newton algorithm is introduced for these problems. The source of both the efficiency and the inexactness is randomized matrix sketching. Applications to optimization problems constrained by partial differential equations are also considered.
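For orientation, a bare-bones (exact) semismooth Newton iteration for a nonsmooth root-finding problem F(x) = 0 looks as follows; `gen_jacobian` is a hypothetical callback returning an element of the generalized (Clarke) Jacobian, and in the inexact variant described above both F and the linear solve would only be approximated, e.g., via sketching.

```python
import numpy as np

def semismooth_newton(x, F, gen_jacobian, maxit=50, tol=1e-10):
    for _ in range(maxit):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        J = gen_jacobian(x)               # any element of the Clarke Jacobian at x
        x = x - np.linalg.solve(J, Fx)    # inexact variants solve this system
    return x                              # only approximately
```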
A key challenge in inverse problems is the selection of sensors to gather the most effective data. In this paper, we consider the problem of inferring the initial condition of a linear dynamical system and develop an efficient control-theoretical approach for greedily selecting sensors. Our method employs a Galerkin projection to reduce the size of the inverse problem, resulting in a computationally efficient algorithm for sensor selection. As a byproduct of our algorithm, we obtain a preconditioner for the inverse problem that enables the rapid recovery of the initial condition. We analyze the theoretical performance of our greedy sensor selection algorithm as well as the performance of the associated preconditioner. Finally, we verify our theoretical results on various inverse problems involving partial differential equations.
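As a generic illustration of greedy sensor selection (not the control-theoretic criterion developed in the paper), the sketch below greedily adds the candidate sensor row that maximizes a D-optimality gain, computed via the rank-one determinant identity; all names are ours.

```python
import numpy as np

def greedy_sensors(C, k, reg=1e-8):
    """Pick k rows of the candidate observation matrix C by greedy D-optimality."""
    n = C.shape[1]
    chosen = []
    F = reg * np.eye(n)                   # running (regularized) information matrix
    for _ in range(k):
        best_i, best_gain = None, -np.inf
        for i in range(C.shape[0]):
            if i in chosen:
                continue
            ci = C[i]
            # matrix determinant lemma: log-det gain of adding sensor i
            gain = np.log1p(ci @ np.linalg.solve(F, ci))
            if gain > best_gain:
                best_i, best_gain = i, gain
        chosen.append(best_i)
        F += np.outer(C[best_i], C[best_i])
    return chosen
```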
We present a new algorithm for infinite-dimensional optimization with general constraints, called ALESQP. In short, ALESQP is an augmented Lagrangian method that penalizes inequality constraints and solves equality-constrained nonlinear optimization subproblems at every iteration. The subproblems are solved using a matrix-free trust-region sequential quadratic programming (SQP) method that takes advantage of iterative, i.e., inexact, linear solvers and is suitable for large-scale applications. A key feature of ALESQP is a constraint decomposition strategy that allows it to exploit problem-specific variable scalings and inner products. We analyze convergence of ALESQP under different assumptions. We show that strong accumulation points are stationary. Consequently, in finite dimensions ALESQP converges to a stationary point. In infinite dimensions we establish that weak accumulation points are feasible in many practical situations. Under additional assumptions we show that weak accumulation points are stationary. We present several infinite-dimensional examples where ALESQP shows remarkable discretization-independent performance in all of its iterative components, requiring a modest number of iterations to meet constraint tolerances at the level of machine precision. Also, we demonstrate a fully matrix-free solution of an infinite-dimensional problem with nonlinear inequality constraints.
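The following caricature illustrates the outer augmented Lagrangian loop: multipliers are updated when feasibility improves sufficiently, and the penalty is increased otherwise. The invented `solve_subproblem` placeholder stands in for the matrix-free trust-region SQP solver; this sketch omits the inequality handling and constraint decomposition that distinguish ALESQP.

```python
import numpy as np

def augmented_lagrangian(x, c, solve_subproblem, mu=10.0, outer_iters=30, tol=1e-10):
    lam = np.zeros_like(np.atleast_1d(c(x)))
    feas_prev = np.inf
    for _ in range(outer_iters):
        # inexactly minimize f(x) + lam'c(x) + (mu/2)||c(x)||^2 over x
        x = solve_subproblem(x, lam, mu)
        cx = np.atleast_1d(c(x))
        feas = np.linalg.norm(cx)
        if feas < tol:
            break
        if feas <= 0.25 * feas_prev:
            lam = lam + mu * cx           # good progress: first-order multiplier update
        else:
            mu *= 10.0                    # feasibility stalled: tighten the penalty
        feas_prev = feas
    return x, lam
```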
Many applications require minimizing the sum of smooth and nonsmooth functions. For example, basis pursuit denoising problems in data science require minimizing a measure of data misfit plus an $\ell^1$-regularizer. Similar problems arise in the optimal control of partial differential equations (PDEs) when sparsity of the control is desired. Here, we develop a novel trust-region method to minimize the sum of a smooth nonconvex function and a nonsmooth convex function. Our method is unique in that it permits and systematically controls the use of inexact objective function and derivative evaluations. When using a quadratic Taylor model for the trust-region subproblem, our algorithm is an inexact, matrix-free proximal Newton-type method that permits indefinite Hessians. We prove global convergence of our method in Hilbert space and demonstrate its efficacy on three examples from data science and PDE-constrained optimization.
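A schematic of the outer trust-region loop for minimizing f(x) + phi(x) is sketched below; the key point is that the acceptance ratio compares actual to predicted decrease of the full nonsmooth objective. `model_solver` is our placeholder for any subproblem solver returning a step with positive predicted reduction (e.g., satisfying a fraction of Cauchy decrease), and the update constants are illustrative defaults.

```python
def trust_region(x, f, phi, grad, model_solver, Delta=1.0, eta=1e-4,
                 maxit=100, tol=1e-10):
    """Minimize f(x) + phi(x), with f smooth and phi nonsmooth convex."""
    for _ in range(maxit):
        # model_solver returns a step s and pred = m(0) - m(s) > 0
        s, pred = model_solver(x, grad(x), Delta)
        if pred < tol:                    # model cannot decrease: near-stationary
            break
        ared = (f(x) + phi(x)) - (f(x + s) + phi(x + s))
        rho = ared / pred                 # actual-over-predicted decrease
        if rho >= eta:
            x = x + s                     # accept the trial iterate
            if rho > 0.75:
                Delta *= 2.0              # model is trustworthy: expand region
        else:
            Delta *= 0.25                 # reject the step and shrink region
    return x
```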
We present a surrogate modeling framework for conservatively estimating measures of risk from limited realizations of an expensive physical experiment or computational simulation. Risk measures combine objective probabilities with the subjective values of a decision maker to quantify anticipated outcomes. Given a set of samples, we construct a surrogate model that produces estimates of risk measures that are always greater than their empirical approximations obtained from the training data. These surrogate models limit over-confidence in reliability and safety assessments and produce estimates of risk measures that converge much faster to the true value than purely sample-based estimates. We first detail the construction of conservative surrogate models that can be tailored to a stakeholder's risk preferences and then present an approach, based on stochastic orders, for constructing surrogate models that are conservative with respect to families of risk measures. Our surrogate models include biases that permit them to conservatively estimate the target risk measures. We provide theoretical results that show that these biases decay at the same rate as the L2 error in the surrogate model. Numerical demonstrations confirm that risk-adapted surrogate models do indeed overestimate the target risk measures while converging at the expected rate.
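To illustrate the simplest instance of this idea (a constant bias for a single risk measure, rather than the paper's more general construction), the sketch below shifts a surrogate's outputs so that its empirical CVaR dominates the sample-based CVaR of the training data; translation equivariance of CVaR makes the shift exact. All function names are ours.

```python
import numpy as np

def cvar(x, alpha=0.9):
    """Conditional value-at-risk: expected loss in the worst (1 - alpha) tail."""
    q = np.quantile(x, alpha)
    return q + np.mean(np.maximum(x - q, 0.0)) / (1.0 - alpha)

def conservative_shift(y_data, y_surrogate, alpha=0.9):
    # CVaR(y + b) = CVaR(y) + b, so this is the smallest constant bias making
    # the surrogate's empirical CVaR dominate the sample-based estimate
    return max(cvar(y_data, alpha) - cvar(y_surrogate, alpha), 0.0)
```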
In this paper, we develop an algorithm to efficiently solve risk-averse optimization problems posed in reflexive Banach space. Such problems arise in many practical applications, e.g., optimization problems constrained by partial differential equations with uncertain inputs. Unfortunately, for many popular risk models, including the coherent risk measures, the resulting risk-averse objective function is nonsmooth. This lack of differentiability complicates the numerical approximation of the objective function as well as the numerical solution of the optimization problem. To address these challenges, we propose a primal–dual algorithm for solving large-scale nonsmooth risk-averse optimization problems. This algorithm is motivated by the classical method of multipliers and by epigraphical regularization of risk measures. As a result, the algorithm solves a sequence of smooth optimization problems using derivative-based methods. We prove convergence of the algorithm even when the subproblems are solved inexactly and conclude with numerical examples demonstrating the efficiency of our method.
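The sketch below illustrates the smoothing mechanism on conditional value-at-risk (CVaR): its Rockafellar–Uryasev epigraphical form introduces an auxiliary variable t, and replacing the positive-part function with a C^1 approximation yields a smooth subproblem. The quadratic smoothing used here is one common choice, offered only as an illustration, not as the paper's specific regularization.

```python
import numpy as np

def smooth_plus(z, eps):
    # C^1 approximation of max(z, 0): quadratic on [0, eps], linear beyond
    return np.where(z <= 0.0, 0.0,
           np.where(z >= eps, z - 0.5 * eps, 0.5 * z**2 / eps))

def smoothed_cvar(samples, t, alpha=0.9, eps=1e-2):
    # Rockafellar-Uryasev form t + E[(X - t)_+]/(1 - alpha) with the plus
    # function smoothed; minimizing over t recovers CVaR as eps -> 0
    return t + np.mean(smooth_plus(samples - t, eps)) / (1.0 - alpha)
```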
A major challenge in shape optimization is the coupling of finite element method (FEM) codes in a way that facilitates efficient computation of shape derivatives. This is particularly difficult with multiphysics problems involving legacy codes, where the costs of implementing and maintaining shape derivative capabilities are prohibitive. The volume and boundary methods are two approaches to computing shape derivatives. Each has a major drawback: the boundary method is less accurate, while the volume method is more invasive to the FEM code. We introduce the strip method, which computes shape derivatives on a strip adjacent to the boundary. The strip method makes code coupling simple. Like the boundary method, it queries the state and adjoint solutions at quadrature nodes, but requires no knowledge of the FEM code implementations. At the same time, it exhibits the higher accuracy of the volume method. As an added benefit, its computational complexity is comparable to that of the boundary method, that is, it is faster than the volume method. We illustrate the benefits of the strip method with numerical examples.
Constructing accurate statistical models of critical system responses typically requires an enormous amount of data from physical experiments or numerical simulations. Unfortunately, data generation is often expensive and time consuming. To streamline the data generation process, optimal experimental design determines the 'best' allocation of experiments with respect to a criterion that measures the ability to estimate some important aspect of an assumed statistical model. While optimal design has a vast literature, few researchers have developed design paradigms targeting tail statistics, such as quantiles. In this project, we tailored and extended traditional design paradigms to target distribution tails. Our approach included (i) the development of new optimality criteria to shape the distribution of prediction variances, (ii) the development of novel risk-adapted surrogate models that provably overestimate certain statistics including the probability of exceeding a threshold, and (iii) the asymptotic analysis of regression approaches that target tail statistics such as superquantile regression. To accompany our theoretical contributions, we released implementations of our methods for surrogate modeling and design of experiments in two complementary open source software packages, the ROL/OED Toolkit and PyApprox.
ROL-PEBBL is a C++, MPI-based parallel code for mixed-integer PDE-constrained optimization (MIPDECO). In these problems, we wish to optimize (control, design, etc.) physical systems that must obey the laws of physics, when some of the decision variables are required to take integer values. ROL-PEBBL combines a code to efficiently search over integer choices (PEBBL = Parallel Enumeration Branch-and-Bound Library) with a code for efficient nonlinear optimization, including PDE-constrained optimization (ROL = Rapid Optimization Library). In this report, we summarize the design of ROL-PEBBL and initial applications/results. For an artificial source-inversion problem, finding sources of pollution on a grid from sparse samples, ROL-PEBBL's solution for the finest grid gave the best optimization guarantee of any general solver that provides both a solution and a quality guarantee.
In this paper, we introduce and analyze a new class of optimal control problems constrained by elliptic equations with uncertain fractional exponents. We utilize risk measures to formulate the resulting optimization problem. We develop a functional analytic framework, study the existence of solutions, and rigorously derive the first-order optimality conditions. Additionally, we employ a sample-based approximation for the uncertain exponent and the finite element method to discretize in space. We prove convergence rates for the optimal risk-neutral controls when using a quadrature approximation for the uncertain exponent and conclude with illustrative examples.
This paper develops a novel limited-memory method to solve dynamic optimization problems. The memory requirements for such problems often present a major obstacle, particularly for problems with PDE constraints such as optimal flow control, full waveform inversion, and optical tomography. In these problems, PDE constraints uniquely determine the state of a physical system for a given control; the goal is to find the value of the control that minimizes an objective. While the control is often low dimensional, the state is typically more expensive to store. This paper suggests using randomized matrix approximation to compress the state as it is generated and shows how to use the compressed state to reliably solve the original dynamic optimization problem. Concretely, the compressed state is used to compute approximate gradients and to apply the Hessian to vectors. The approximation error in these quantities is controlled by the target rank of the sketch. This approximate first- and second-order information can readily be used in any optimization algorithm. As an example, we develop a sketched trust-region method that adaptively chooses the target rank using a posteriori error information and provably converges to a stationary point of the original problem. Numerical experiments with the sketched trust-region method show promising performance on challenging problems such as the optimal control of an advection-reaction-diffusion equation and the optimal control of fluid flow past a cylinder.
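In our own illustrative notation, the central compression idea can be schematized as follows: two random sketches of the state trajectory are accumulated as the time stepper produces states, and a low-rank approximation is recovered afterward for use in adjoint-based gradient and Hessian-vector computations, with error controlled by the target rank. The random state generator below merely stands in for a PDE time step.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, r = 5000, 100, 15                   # state dimension, time steps, target rank
Omega = rng.standard_normal((T, 2 * r))   # range test matrix
Psi = rng.standard_normal((4 * r, n))     # co-range test matrix

Y = np.zeros((n, 2 * r))                  # accumulates Y = U @ Omega
W = np.zeros((4 * r, T))                  # accumulates W = Psi @ U
for k in range(T):
    u_k = rng.standard_normal(n)          # placeholder for the PDE time step
    Y += np.outer(u_k, Omega[k])          # streaming update: u_k need not be stored
    W[:, k] = Psi @ u_k

Q, _ = np.linalg.qr(Y)                    # orthonormal basis for the state range
X = np.linalg.lstsq(Psi @ Q, W, rcond=None)[0]
U_approx = Q @ X                          # low-rank stand-in for the trajectory
```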
This report summarizes the work performed under the project "Linear Programming in Strongly Polynomial Time." Linear programming (LP) is a classic combinatorial optimization problem heavily used directly and as an enabling subroutine in integer programming (IP). Specifically, IP is the same as LP except that some solution variables must take integer values (e.g., to represent yes/no decisions). Together, LP and IP have many applications in resource allocation, including general logistics, infrastructure design, and vulnerability analysis. The project was motivated by the PI's recent success developing methods to efficiently sample Voronoi vertices (essentially finding nearest neighbors in high-dimensional point sets) in arbitrary dimension. This method seems applicable to exploring the high-dimensional convex feasible space of an LP problem. Although the project did not provably find a strongly polynomial algorithm, it explored multiple algorithm classes. The new medial simplex algorithms may still lead to solvers with improved provable complexity. We describe the medial simplex algorithms and some relevant structural/complexity results. We also designed a novel parallel LP algorithm based on our geometric insights and implemented it in the Spoke-LP code. A major part of each computational step consists of many independent vector dot products, and our parallel algorithm distributes the problem constraints across processors. Current commercial and high-quality free LP solvers require all problem details to fit on a single processor or multicore node. Our new algorithm might enable the solution of problems too large for any current LP solver. We describe our new algorithm, give preliminary proof-of-concept experiments, and describe a new generator for arbitrarily large LP instances.