Publications Search

Spoke-LP: A parallel code for linear programming

Ebeida, Mohamed S.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Towards Generic Parallel Programming in Computer Science Education with Kokkos

Ciesko, Jan

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Analog architectures for neural network acceleration based on non-volatile memory

Applied Physics Reviews

Xiao, Tianyao X.; Bennett, Christopher H.; Feinberg, Benjamin F.; Agarwal, Sapan A.; Marinella, Matthew J.

Analog hardware accelerators, which perform computation within a dense memory array, have the potential to overcome the major bottlenecks faced by digital hardware for data-heavy workloads such as deep learning. Exploiting the intrinsic computational advantages of memory arrays, however, has proven to be challenging principally due to the overhead imposed by the peripheral circuitry and due to the non-ideal properties of memory devices that play the role of the synapse. We review the existing implementations of these accelerators for deep supervised learning, organizing our discussion around the different levels of the accelerator design hierarchy, with an emphasis on circuits and architecture. We explore and consolidate the various approaches that have been proposed to address the critical challenges faced by analog accelerators, for both neural network inference and training, and highlight the key design trade-offs underlying these techniques.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus

Regression Based Approach for Robust Finite Element Analysis on Arbitrary Grids (REBAR)

Kuberry, Paul A.; Trask, Nathaniel A.; Koester, Jacob K.; Bochev, Pavel B.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

LANL/SANDIA/VOROCRUST EOFY 2020 meeting

LaForce, Tara; Jordan, Spencer H.; Ebeida, Mohamed S.; McLendon, William C.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Quantum Scientific Computing Open User Testbed (QSCOUT)

Clark, Susan M.; Landahl, Andrew J.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Transpiling Your Code in Other Languages to Run on QSCOUT

Morrison, Benjamin M.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Elastoplastic Constitutive Model Calibration with Automatic Differentiation-based Sensitivities

Seidl, Daniel T.; Granzow, Brian N.

Abstract not provided.

More Details

TYPE Conference Presenation YEAR 2020

DOI OSTI

Fundamental Physics of Reversible Computing--An Introduction

Frank, Michael P.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Performance-Portable and Scalable Algorithms on Sparse/Dense Matrices and Graphs

Acer, Seher A.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Evaluating the Efficiency of OpenMP Tasking for Unbalanced Computationon Diverse CPU Architectures

Olivier, Stephen L.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Assessing atomically thin delta-doping of silicon using mid-infrared ellipsometry

Journal of Materials Research

Katzenmeyer, Aaron M.; Luk, Ting S.; Bussmann, Ezra B.; Young, Steve M.; Anderson, Evan M.; Marshall, Michael T.; Ohlhausen, J.A.; Kotula, Paul G.; Lu, Ping L.; Campbell, DeAnna M.; Lu, Tzu-Ming L.; Liu, Peter Q.; Ward, Daniel R.; Misra, Shashank M.

Hydrogen lithography has been used to template phosphine-based surface chemistry to fabricate atomic-scale devices, a process we abbreviate as atomic precision advanced manufacturing (APAM). Here, we use mid-infrared variable angle spectroscopic ellipsometry (IR-VASE) to characterize single-nanometer thickness phosphorus dopant layers (δ-layers) in silicon made using APAM compatible processes. A large Drude response is directly attributable to the δ-layer and can be used for nondestructive monitoring of the condition of the APAM layer when integrating additional processing steps. The carrier density and mobility extracted from our room temperature IR-VASE measurements are consistent with cryogenic magneto-transport measurements, showing that APAM δ-layers function at room temperature. Finally, the permittivity extracted from these measurements shows that the doping in the APAM δ-layers is so large that their low-frequency in-plane response is reminiscent of a silicide. However, there is no indication of a plasma resonance, likely due to reduced dimensionality and/or low scattering lifetime.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus

ALO-NMF: Accelerated Locality-Optimized Non-negative Matrix Factorization

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Moon, Gordon E.; Ellis, John E.; Sukumaran-Rajam, Aravind; Parthasarathy, Srinivasan; Sadayappan, P.

Non-negative Matrix Factorization (NMF) is a key kernel for unsupervised dimension reduction used in a wide range of applications, including graph mining, recommender systems and natural language processing. Due to the compute-intensive nature of applications that must perform repeated NMF, several parallel implementations have been developed. However, existing parallel NMF algorithms have not addressed data locality optimizations, which are critical for high performance since data movement costs greatly exceed the cost of arithmetic/logic operations on current computer systems. In this paper, we present a novel optimization method for parallel NMF algorithm based on the HALS (Hierarchical Alternating Least Squares) scheme that incorporates algorithmic transformations to enhance data locality. Efficient realizations of the algorithm on multi-core CPUs and GPUs are developed, demonstrating a new Accelerated Locality-Optimized NMF (ALO-NMF) that obtains up to 2.29x lower data movement cost and up to 4.45x speedup over existing state-of-the-art parallel NMF algorithms.

More Details

TYPE Conference Poster YEAR 2020

OSTI Scopus

ALO-NMF: Accelerated Locality-Optimized Non-negative Matrix Factorization

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Moon, Gordon E.; Ellis, John E.; Sukumaran-Rajam, Aravind; Parthasarathy, Srinivasan; Sadayappan, P.

Non-negative Matrix Factorization (NMF) is a key kernel for unsupervised dimension reduction used in a wide range of applications, including graph mining, recommender systems and natural language processing. Due to the compute-intensive nature of applications that must perform repeated NMF, several parallel implementations have been developed. However, existing parallel NMF algorithms have not addressed data locality optimizations, which are critical for high performance since data movement costs greatly exceed the cost of arithmetic/logic operations on current computer systems. In this paper, we present a novel optimization method for parallel NMF algorithm based on the HALS (Hierarchical Alternating Least Squares) scheme that incorporates algorithmic transformations to enhance data locality. Efficient realizations of the algorithm on multi-core CPUs and GPUs are developed, demonstrating a new Accelerated Locality-Optimized NMF (ALO-NMF) that obtains up to 2.29x lower data movement cost and up to 4.45x speedup over existing state-of-the-art parallel NMF algorithms.

More Details

TYPE Conference Poster YEAR 2020

OSTI Scopus

Multi-fidelity machine-learning with uncertainty quantification and Bayesian optimization for materials design: Application to ternary random alloys

Journal of Chemical Physics

Laros, James H.; Wildey, Timothy M.; Tranchida, Julien G.; Thompson, Aidan P.

We present a scale-bridging approach based on a multi-fidelity (MF) machine-learning (ML) framework leveraging Gaussian processes (GP) to fuse atomistic computational model predictions across multiple levels of fidelity. Through the posterior variance of the MFGP, our framework naturally enables uncertainty quantification, providing estimates of confidence in the predictions. We used density functional theory as high-fidelity prediction, while a ML interatomic potential is used as low-fidelity prediction. Practical materials' design efficiency is demonstrated by reproducing the ternary composition dependence of a quantity of interest (bulk modulus) across the full aluminum-niobium-titanium ternary random alloy composition space. The MFGP is then coupled to a Bayesian optimization procedure, and the computational efficiency of this approach is demonstrated by performing an on-the-fly search for the global optimum of bulk modulus in the ternary composition space. The framework presented in this manuscript is the first application of MFGP to atomistic materials simulations fusing predictions between density functional theory and classical interatomic potential calculations.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus

Performance Portable Supernode-based Sparse Triangular Solver for Manycore Architectures

ACM International Conference Proceeding Series

Yamazaki, Ichitaro Y.; Rajamanickam, Sivasankaran R.; Ellingwood, Nathan D.

Sparse triangular solver is an important kernel in many computational applications. However, a fast, parallel, sparse triangular solver on a manycore architecture such as GPU has been an open issue in the field for several years. In this paper, we develop a sparse triangular solver that takes advantage of the supernodal structures of the triangular matrices that come from the direct factorization of a sparse matrix. We implemented our solver using Kokkos and Kokkos Kernels such that our solver is portable to different manycore architectures. This has the additional benefit of allowing our triangular solver to use the team-level kernels and take advantage of the hierarchical parallelism available on the GPU. We compare the effects of different scheduling schemes on the performance and also investigate an algorithmic variant called the partitioned inverse. Our performance results on an NVIDIA V100 or P100 GPU demonstrate that our implementation can be 12.4 × or 19.5 × faster than the vendor optimized implementation in NVIDIA's CuSPARSE library.

More Details

TYPE Conference Poster YEAR 2020

OSTI Scopus

Efficient optimization method for finding minimum energy paths of magnetic transitions

Journal of Physics Condensed Matter

Tranchida, Julien G.; Ivanov, A.V.; Dagbartsson, D.; Uzdin, V.M.; Jonsson, H.

Efficient algorithms for the calculation of minimum energy paths of magnetic transitions are implemented within the geodesic nudged elastic band (GNEB) approach. While an objective function is not available for GNEB and a traditional line search can, therefore, not be performed, the use of limited memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) and conjugate gradient algorithms in conjunction with orthogonal spin optimization (OSO) approach is shown to greatly outperform the previously used velocity projection and dissipative Landau-Lifschitz dynamics optimization methods. The implementation makes use of energy weighted springs for the distribution of the discretization points along the path and this is found to improve performance significantly. The various methods are applied to several test problems using a Heisenberg-type Hamiltonian, extended in some cases to include Dzyaloshinskii-Moriya and exchange interactions beyond nearest neighbours. Minimum energy paths are found for magnetization reversals in a nano-island, collapse of skyrmions in two-dimensional layers and annihilation of a chiral bobber near the surface of a three-dimensional magnet. The LBFGS-OSO method is found to outperform the dynamics based approaches by up to a factor of 8 in some cases.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus

Parallel algorithms for hyperdynamics and local hyperdynamics

Journal of Chemical Physics

Plimpton, Steven J.; Perez, Danny; Voter, Arthur F.

Hyperdynamics (HD) is a method for accelerating the timescale of standard molecular dynamics (MD). It can be used for simulations of systems with an energy potential landscape that is a collection of basins, separated by barriers, where transitions between basins are infrequent. HD enables the system to escape from a basin more quickly while enabling a statistically accurate renormalization of the simulation time, thus effectively boosting the timescale of the simulation. In the work of Kim et al. [J. Chem. Phys. 139, 144110 (2013)], a local version of HD was formulated, which exploits the intrinsic locality characteristic typical of most systems to mitigate the poor scaling properties of standard HD as the system size is increased. Here, we discuss how both HD and local HD can be formulated to run efficiently in parallel. We have implemented these ideas in the LAMMPS MD code, which means HD can be used with any interatomic potential LAMMPS supports. Together, these parallel methods allow simulations of any size to achieve the time acceleration offered by HD (which can be orders of magnitude), at a cost of 2-4× that of standard MD. As examples, we performed two simulations of a million-atom system to model the diffusion and clustering of Pt adatoms on a large patch of the Pt(100) surface for 80 μs and 160 μs.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus

Improved reference system for the corrected rigid spheres equation of state model

Journal of Applied Physics

Cowen, Benjamin J.; Carpenter, John H.

The Corrected Rigid Spheres (CRIS) equation of state (EOS) model [Kerley, J. Chem. Phys. 73, 469 (1980); 73, 478 (1980); 73, 487 (1980)], developed from fluid perturbation theory using a hard sphere reference system, has been successfully used to calculate the EOS of many materials, including gases and metals. The radial distribution function (RDF) plays a pivotal role in choosing the sphere diameter, through a variational principle, as well as the thermodynamic response. Despite its success, the CRIS model has some shortcomings in that it predicts too large a temperature for liquid-vapor critical points, can break down at large compression, and is computationally expensive. We first demonstrate that an improved analytic representation of the hard sphere RDF does not alleviate these issues. Relaxing the strict adherence of the RDF to hard spheres allows an accurate fit to the isotherms and vapor dome of the Lennard-Jones fluid using an arbitrary reference system. The second order correction is eliminated, limiting the breakdown at large compression and significantly reducing the computation cost. The transferability of the new model to real systems is demonstrated on argon, with an improved vapor dome compared to the original CRIS model.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus

Models and analysis of fuel switching generation impacts on power system resilience

IEEE Power and Energy Society General Meeting

Wilches-Bernal, Felipe; Knueven, Ben; Staid, Andrea S.; Watson, Jean-Paul W.

This paper presents model formulations for generators that have the ability to use multiple fuels and to switch between them if necessary. These models are used to generate different scenarios of fuel switching penetration from a test power system. With these scenarios, for a severe disruption in the fuel supply to multiple generators, the paper analyzes the effect that fuel switching has on the resilience of the power system. Load not served is used as the proxy metric to evaluate power system resilience. The paper shows that the presence of generators with fuel switching capabilities considerably reduces the amount and duration of the load shed by the system facing the fuel disruption.

More Details

TYPE Conference Poster YEAR 2020

OSTI Scopus

Neuromorphic Computing: Towards Extremely Heterogeneous HPC

Cardwell, Suma G.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Code-verification techniques for hypersonic reacting flows in thermochemical nonequilibrium

Journal of Computational Physics

Freno, Brian A.; Carnes, Brian C.; Weirs, Vincent G.

The study of hypersonic flows and their underlying aerothermochemical reactions is particularly important in the design and analysis of vehicles exiting and reentering Earth's atmosphere. Computational physics codes can be employed to simulate these phenomena; however, code verification of these codes is necessary to certify their credibility. To date, few approaches have been presented for verifying codes that simulate hypersonic flows, especially flows reacting in thermochemical nonequilibrium. In this work, we present our code-verification techniques for verifying the spatial accuracy and thermochemical source term in hypersonic reacting flows in thermochemical nonequilibrium. Additionally, we demonstrate the effectiveness of these techniques on the Sandia Parallel Aerodynamics and Reentry Code (SPARC).

More Details

TYPE Journal Article YEAR 2020

DOI OSTI

Git Workflows

Willenbring, James M.; O'Neal, Jared

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

On supervised and unsupervised deep learning applications for materials informatics

Laros, James H.; Wildey, Timothy M.; Rodgers, Theron R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Agile Methodologies Redux

Willenbring, James M.; Heroux, Michael A.; Bernholdt, David

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Learning Compact Physics-Aware Photocurrent Models Using Dynamic Mode Decomposition

Hanson, Joshua M.; Bochev, Pavel B.; Paskaleva, Biliana S.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

DOI OSTI

E3SM: Performance of spectral element dycore on Summit at scale

Taylor, Mark A.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Validating Communication Similarity between Proxy and Parent Applications through Network Performance Characterization

Aaziz, Omar R.; Cook, Jeanine C.; Vaughan, Courtenay T.; Jeffery, Kuehn

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

New inflow boundary conditions for relativistic and Newtonian fluids

Roberds, Nicholas R.; Beckwith, Kristian B.; Bettencourt, Matthew T.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Performance Portable Supernode-based Sparse Triangular Solver for Manycore Architecture

Yamazaki, Ichitaro Y.; Rajamanickam, Sivasankaran R.; Ellingwood, Nathan D.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Containerized Environment for Reproducibility and Traceability of Scientific Workflows

Olaya, Paula; Lofstead, Gerald F.; Taufer, Michela

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Solving a steady-state pde using spiking networks and neuromorphic hardware

Smith, John D.; Severa, William M.; Hill, Aaron J.; Reeder, Leah; Parekh, Ojas D.; Franke, Brian C.; Lehoucq, Richard B.; Aimone, James B.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

DOI OSTI

A Survey of Constrained Gaussian Process Regression: Approaches and Implementation Challenges

Gulian, Mamikon G.; Swiler, Laura P.; Frankel, Ari L.; Safta, Cosmin S.; Jakeman, John D.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Brief Announcement: Provable neuromorphic advantages for computing constrained shortest paths

Aimone, James B.; Ho, Yang H.; Parekh, Ojas D.; Phillips, Cynthia A.; Pinar, Ali P.; Severa, William M.; Wang, Yipu W.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

The Kokkos Lectures - Module 6: Fortran/Python interoperability MPI and PGAS

Trott, Christian R.

Abstract not provided.

More Details

TYPE Presentation YEAR 2020

OSTI

Automatic Differentiation of C++ Codes on Emerging Manycore Architectures with Sacado

Phipps, Eric T.; Pawlowski, Roger P.; Trott, Christian R.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Exploring the Ultimate Limits of Adiabatic Circuits

Frank, Michael P.; Brocato, Robert W.; Conte, Thomas; Hsia, Alexander; Jain, Anirudh; Missert, Nancy A.; Shukla, Karpur; Tierney, Brian D.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

OSTI

Modeling Assisted Room Temperature Operation of Atomic Precision Advanced Manufacturing Devices

Gao, Xujiao G.; Tracy, Lisa A.; Anderson, Evan M.; Campbell, DeAnna M.; Ivie, Jeffrey A.; Lu, Tzu-Ming L.; Mamaluy, Denis M.; Schmucker, Scott W.; Misra, Shashank M.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2020

DOI OSTI

An active learning high-throughput microstructure calibration framework for solving inverse structure–process problems in materials informatics

Acta Materialia

Laros, James H.; Mitchell, John A.; Swiler, Laura P.; Wildey, Timothy M.

Determining a process–structure–property relationship is the holy grail of materials science, where both computational prediction in the forward direction and materials design in the inverse direction are essential. Problems in materials design are often considered in the context of process–property linkage by bypassing the materials structure, or in the context of structure–property linkage as in microstructure-sensitive design problems. However, there is a lack of research effort in studying materials design problems in the context of process–structure linkage, which has a great implication in reverse engineering. In this work, given a target microstructure, we propose an active learning high-throughput microstructure calibration framework to derive a set of processing parameters, which can produce an optimal microstructure that is statistically equivalent to the target microstructure. The proposed framework is formulated as a noisy multi-objective optimization problem, where each objective function measures a deterministic or statistical difference of the same microstructure descriptor between a candidate microstructure and a target microstructure. Furthermore, to significantly reduce the physical waiting wall-time, we enable the high-throughput feature of the microstructure calibration framework by adopting an asynchronously parallel Bayesian optimization by exploiting high-performance computing resources. Case studies in additive manufacturing and grain growth are used to demonstrate the applicability of the proposed framework, where kinetic Monte Carlo (kMC) simulation is used as a forward predictive model, such that for a given target microstructure, the target processing parameters that produced this microstructure are successfully recovered.

More Details

TYPE Journal Article YEAR 2020

DOI OSTI Scopus