Beyeler, Walter E.; Frazier, Christopher R.; Swiler, Laura P.; Portone, Teresa P.; Krofcheck, Daniel J.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Valicka, Christopher G.; Staid, Andrea S.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Epi-regularization of risk measures

Mathematics of Operations Research

Kouri, Drew P.; Surowiec, Thomas M.

Uncertainty pervades virtually every branch of science and engineering, and in many disciplines, the underlying phenomena can be modeled by partial differential equations (PDEs) with uncertain or random inputs. This work is motivated by risk-averse stochastic programming problems constrained by PDEs. These problems are posed in infinite dimensions, which leads to a significant increase in the scale of the (discretized) problem. In order to handle the inherent nonsmoothness of, for example, coherent risk measures and to exploit existing solution techniques for smooth, PDE-constrained optimization problems, we propose a variational smoothing technique called epigraphical (epi-)regularization. We investigate the effects of epi-regularization on the axioms of coherency and prove differentiability of the smoothed risk measures. In addition, we demonstrate variational convergence of the epi-regularized risk measures and prove the consistency of minimizers and first-order stationary points for the approximate risk-averse optimization problem. We conclude with numerical experiments confirming our theoretical results.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus

SPHYNX: Spectral partitioning for HYbrid and aXelerator-enabled systems

Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020

Acer, Seher A.; Boman, Erik G.; Rajamanickam, Sivasankaran R.

Graph partitioning has been an important tool to partition the work among several processors to minimize the communication cost and balance the workload. While accelerator-based supercomputers are emerging to be the standard, the use of graph partitioning becomes even more important as applications are rapidly moving to these architectures. However, there is no scalable, distributed-memory, multi-GPU graph partitioner available for applications. We developed a spectral graph partitioner, Sphynx, using the portable, accelerator-friendly stack of the Trilinos framework. We use Sphnyx to systematically evaluate the various algorithmic choices in spectral partitioning with a focus on GPU performance. We perform those evaluations on irregular graphs, because state-of-the-art partitioners have the most difficulty on them. We demonstrate that Sphynx is up to 17x faster on GPUs compared to the case on CPUs, and up to 580x faster compared to a state-of-the-art multilevel partitioner. Sphynx provides a robust alternative for applications looking for a GPU-based partitioner.

More Details

TYPE Conference Poster YEAR 2020

OSTI Scopus

Exploring chapel productivity using some graph algorithms

Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020

Arxiv

Behzadinasab, Masoud; Bazilevs, Yuri; Trask, Nathaniel A.

Abstract not provided.

More Details

TYPE Journal Article YEAR 2020

OSTI

Verification Validation and Uncertainty Quantification (V&V/UQ) of Wind Plant Models Project Overview of FY20 Q2 Milestone Completion: Wind Uncertainty Quantification Session and Publications

Maniaci, David C.; Laros, James H.; Geraci, Gianluca G.; Seidl, Daniel T.; Herges, Thomas G.; Eldred, Michael S.; Blaylo, Myra L.; Houchens, Brent C.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Siirola, John D.; Burgard, Anthony

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Dynamic Modeling and Optimization of Advanced Energy Systems

Nicholson, Bethany L.; Thierry, David; Parker, Robert B.; Eslick, John; Ma, Jinliang; Le, Quang M.; Bhattacharyya, Debangsu; Biegler, Larry

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Low-synchronization orthogonalization schemes for s-step and pipelined Krylov solvers in Trilinos

Yamazaki, Ichitaro Y.; Hoemmen, Mark F.; Boman, Erik G.; Elliott, James E.; Thomas, Stephen; Swirydowicz, Katarzyna

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

DOI OSTI

What is the fractional Laplacian? A comparative review with new results

Journal of Computational Physics

arXiv preprint

Layer-Parallel Training of Deep Residual Neural Networks

SIAM Journal on Mathematics of Data Science

Gunther, Stefanie; Ruthotto, Lars; Schroder, Jacob B.; Cyr, Eric C.; Gauger, Nicolas R.

Residual neural networks (ResNets) are a promising class of deep neural networks that have shown excellent performance for a number of learning tasks, e.g., image classification and recognition. Mathematically, ResNet architectures can be interpreted as forward Euler discretizations of a nonlinear initial value problem whose time-dependent control variables represent the weights of the neural network. Hence, training a ResNet can be cast as an optimal control problem of the associated dynamical system. For similar time-dependent optimal control problems arising in engineering applications, parallel-in-time methods have shown notable improvements in scalability. This paper demonstrates the use of those techniques for efficient and effective training of ResNets. The proposed algorithms replace the classical (sequential) forward and backward propagation through the network layers with a parallel nonlinear multigrid iteration applied to the layer domain. This adds a new dimension of parallelism across layers that is attractive when training very deep networks. From this basic idea, we derive multiple layer-parallel methods. The most efficient version employs a simultaneous optimization approach where updates to the network parameters are based on inexact gradient information in order to speed up the training process. Finally, using numerical examples from supervised classification, we demonstrate that the new approach achieves a training performance similar to that of traditional methods, but enables layer-parallelism and thus provides speedup over layer-serial methods through greater concurrency.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI

Scalable preconditioning of block-structured linear algebra systems using ADMM

Computers and Chemical Engineering

Materials Science and Engineering: A

Heckman, Nathan H.; Ivanoff, Thomas I.; Roach, Ashley M.; Jared, Bradley H.; Tung, Daniel J.; Huber, Todd H.; Saiz, David J.; Koepke, Joshua R.; Rodelas, Jeffrey R.; Madison, Jonathan D.; Salzbrenner, Bradley S.; Swiler, Laura P.; Jones, Reese E.; Boyce, Brad B.

The mechanical properties of additively manufactured metals tend to show high variability, due largely to the stochastic nature of defect formation during the printing process. This study seeks to understand how automated high throughput testing can be utilized to understand the variable nature of additively manufactured metals at different print conditions, and to allow for statistically meaningful analysis. This is demonstrated by analyzing how different processing parameters, including laser power, scan velocity, and scan pattern, influence the tensile behavior of additively manufactured stainless steel 316L utilizing a newly developed automated test methodology. Microstructural characterization through computed tomography and electron backscatter diffraction is used to understand some of the observed trends in mechanical behavior. Specifically, grain size and morphology are shown to depend on processing parameters and influence the observed mechanical behavior. In the current study, laser-powder bed fusion, also known as selective laser melting or direct metal laser sintering, is shown to produce 316L over a wide processing range without substantial detrimental effect on the tensile properties. Ultimate tensile strengths above 600 MPa, which are greater than that for typical wrought annealed 316L with similar grain sizes, and elongations to failure greater than 40% were observed. It is demonstrated that this process has little sensitivity to minor intentional or unintentional variations in laser velocity and power.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus

A Performance and Cost Assessment of Machine Learning Interatomic Potentials

Journal of Physical Chemistry. A, Molecules, Spectroscopy, Kinetics, Environment, and General Theory

Zuo, Yunxing; Chen, Chi; Li, Xiangguo; Deng, Zhi; Chen, Yiming; Behler, Jorg; Csanyi, Gabor; Shapeev, Alexander V.; Thompson, Aidan P.; Wood, Mitchell A.; Ong, Shyue P.

Machine learning of the quantitative relationship between local environment descriptors and the potential energy surface of a system of atoms has emerged as a new frontier in the development of interatomic potentials (IAPs). Here, we present a comprehensive evaluation of ML-IAPs based on four local environment descriptors --- Behler-Parrinello symmetry functions, smooth overlap of atomic positions (SOAP), the Spectral Neighbor Analysis Potential (SNAP) bispectrum components, and moment tensors --- using a diverse data set generated using high-throughput density functional theory (DFT) calculations. The data set comprising bcc (Li, Mo) and fcc (Cu, Ni) metals and diamond group IV semiconductors (Si, Ge) is chosen to span a range of crystal structures and bonding. All descriptors studied show excellent performance in predicting energies and forces far surpassing that of classical IAPs, as well as predicting properties such as elastic constants and phonon dispersion curves. We observe a general trade-off between accuracy and the degrees of freedom of each model, and consequently computational cost. We will discuss these trade-offs in the context of model selection for molecular dynamics and other applications.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI

Quantifying the relative importance of complimentary parameters in PDE-based inverse problems

Sunseri, Isaac P.; Hart, Joseph L.; Alexandarian, Alen; van Bloemen Waanders, Bart G.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Using LDMS for Performance and Proxy Representativeness Characterization

Cook, Jeanine C.; Kuehn, Jeffery; Aaziz, Omar R.; Vaughan, Courtenay T.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Fourier analyses of high-order continuous and discontinuous Galerkin methods

SIAM Journal on Numerical Analysis

Le Roux, Daniel Y.; Eldred, Christopher; Taylor, Mark A.

We present a Fourier analysis of wave propagation problems subject to a class of continuous and discontinuous discretizations using high-degree Lagrange polynomials. This allows us to obtain explicit analytical formulas for the dispersion relation and group velocity and, for the first time to our knowledge, characterize analytically the emergence of gaps in the dispersion relation at specific wavenumbers, when they exist, and compute their specific locations. Wave packets with energy at these wavenumbers will fail to propagate correctly, leading to significant numerical dispersion. We also show that the Fourier analysis generates mathematical artifacts, and we explain how to remove them through a branch selection procedure conducted by analysis of eigenvectors and associated reconstructed solutions. The higher frequency eigenmodes, named erratic in this study, are also investigated analytically and numerically.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus

Kalashnikova, Irina; Watkins, Jerry E.; Tuminaro, Raymond S.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Multilevel uncertainty quantification using cfd and openfast simulations of the swift facility

AIAA Scitech 2020 Forum

Laros, James H.; Maniaci, David C.; Herges, Thomas H.; Geraci, Gianluca G.; Seidl, Daniel T.; Eldred, Michael S.; Blaylock, Myra L.; Houchens, Brent C.

Uncertainty is present in all wind energy problems of interest, but quantifying its impact for wind energy research, design and analysis applications often requires the collection of large ensembles of numerical simulations. These predictions require a range of model fidelity as predictive models, that include the interaction of atmospheric and wind turbine wake physics, can require weeks or months to solve on institutional high-performance computing systems. The need for these extremely expensive numerical simulations extends the computational resource requirements usually associated with uncertainty quantification analysis. To alleviate the computational burden, we propose here to adopt several Multilevel-Multifidelity sampling strategies that we compare for a realistic test case. A demonstration study was completed using simulations of a V27 turbine at Sandia National Laboratories’ SWiFT facility in a neutral atmospheric boundary layer. The flow was simulated with three models of disparate fidelity. OpenFAST with TurbSim was used stand-alone as the most computationally-efficient, lower-fidelity model. The computational fluid dynamics code Nalu-Wind was used for large eddy simulations with both medium-fidelity actuator disk and high-fidelity actuator line models, with various mesh resolutions. In an uncertainty quantification study, we considered five different turbine properties as random parameters: yaw offset, generator torque constant, collective blade pitch, gearbox efficiency and blade mass. For all quantities of interest, the Multilevel-Multifidelity estimators demonstrated greater efficiency compared to standard and multilevel Monte Carlo estimators.

More Details

TYPE Conference Poster YEAR 2020

DOI OSTI Scopus

Solving IPDEs on Spiking Neuromorphic Hardware

Smith, John D.; Aimone, James B.; Franke, Brian C.; Hill, Aaron J.; Lehoucq, Richard B.; Parekh, Ojas D.; Reeder, Leah E.; Severa, William M.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

A Portable SIMD Primitive Using Kokkos for Heterogeneous Architectures

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Sahasrabudhe, Damodar; Phipps, Eric T.; Rajamanickam, Sivasankaran R.; Berzins, Martin

As computer architectures are rapidly evolving (e.g. those designed for exascale), multiple portability frameworks have been developed to avoid new architecture-specific development and tuning. However, portability frameworks depend on compilers for auto-vectorization and may lack support for explicit vectorization on heterogeneous platforms. Alternatively, programmers can use intrinsics-based primitives to achieve more efficient vectorization, but the lack of a gpu back-end for these primitives makes such code non-portable. A unified, portable, Single Instruction Multiple Data (simd) primitive proposed in this work, allows intrinsics-based vectorization on cpus and many-core architectures such as Intel Knights Landing (knl), and also facilitates Single Instruction Multiple Threads (simt) based execution on gpus. This unified primitive, coupled with the Kokkos portability ecosystem, makes it possible to develop explicitly vectorized code, which is portable across heterogeneous platforms. The new simd primitive is used on different architectures to test the performance boost against hard-to-auto-vectorize baseline, to measure the overhead against efficiently vectroized baseline, and to evaluate the new feature called the “logical vector length” (lvl). The simd primitive provides portability across cpus and gpus without any performance degradation being observed experimentally.

More Details

TYPE Conference Poster YEAR 2020

OSTI Scopus

SIAM Journal on Matrix Analysis and Applications

Cambier, Leopold; Boman, Erik G.; Rajamanickam, Sivasankaran R.; Tuminaro, Raymond S.; Darve, Eric

We propose a new algorithm for the fast solution of large, sparse, symmetric positive-definite linear systems, spaND (sparsified Nested Dissection). It is based on nested dissection, sparsification, and low-rank compression. After eliminating all interiors at a given level of the elimination tree, the algorithm sparsifies all separators corresponding to the interiors. This operation reduces the size of the separators by eliminating some degrees of freedom but without introducing any fill-in. This is done at the expense of a small and controllable approximation error. The result is an approximate factorization that can be used as an efficient preconditioner. We then perform several numerical experiments to evaluate this algorithm. We demonstrate that a version using orthogonal factorization and block-diagonal scaling takes fewer CG iterations to converge than previous similar algorithms on various kinds of problems. Furthermore, this algorithm is provably guaranteed to never break down and the matrix stays symmetric positive-definite throughout the process. We evaluate the algorithm on some large problems show it exhibits near-linear scaling. The factorization time is roughly \scrO (N), and the number of iterations grows slowly with N.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus

Publications

Search results