Publications

Results 5001–5050 of 9,998

Search results

Jump to search filters

Short Introduction to Relations Between Thermodynamic Quantities

Wills, Ann E.

Thermodynamic quantities, such as pressure and internal energy, and their derivatives, are used in many applications. Depending on application, a natural set of quantities related to one of four thermodynamic potentials are typically used. For example, hydro-codes use internal energy derived quantities and Equation of State work often uses Helmholtz free energy quantities. When performing work spanning over several fields, transformations between one set of quantities and another set of quantities are often needed. A short, but comprehensive, review of such transformations are given in this report.

More Details

High-performance conjugate-gradient benchmark: A new metric for ranking high-performance computing systems

International Journal of High Performance Computing Applications

Heroux, Michael A.; Dongarra, Jack; Luszczek, Piotr

We describe a new high-performance conjugate-gradient (HPCG) benchmark. HPCG is composed of computations and data-access patterns commonly found in scientific applications. HPCG strives for a better correlation to existing codes from the computational science domain and to be representative of their performance. HPCG is meant to help drive the computer system design and implementation in directions that will better impact future performance improvement.

More Details

Verification and Validation of a Coordinate Transformation Method in Axisymmetric Transient Magnetics

Robinson, Allen C.; Niederhaus, John H.; Ashcraft, C.C.

We present a verification and validation analysis of a coordinate-transformation-based numerical solution method for the two-dimensional axisymmetric magnetic diffusion equation, implemented in the finite-element simulation code ALEGRA. The transformation, suggested by Melissen and Simkin, yields an equation set perfectly suited for linear finite elements and for problems with large jumps in material conductivity near the axis. The verification analysis examines transient magnetic diffusion in a rod or wire in a very low conductivity background by first deriving an approximate analytic solution using perturbation theory. This approach for generating a reference solution is shown to be not fully satisfactory. A specialized approach for manufacturing an exact solution is then used to demonstrate second-order convergence under spatial refinement and tem- poral refinement. For this new implementation, a significant improvement relative to previously available formulations is observed. Benefits in accuracy for computed current density and Joule heating are also demonstrated. The validation analysis examines the circuit-driven explosion of a copper wire using resistive magnetohydrodynamics modeling, in comparison to experimental tests. The new implementation matches the accuracy of the existing formulation, with both formulations capturing the experimental burst time and action to within approximately 2%.

More Details

Probing off-Hugoniot states in Ta, Cu, and Al to 1000 GPa compression with magnetically driven liner implosions

Journal of Applied Physics

Lemke, Raymond W.; Laros, James H.; Dalton, Devon D.; Brown, Justin L.; Tomlinson, K.; Robertson, G.R.; Knudson, Marcus D.; Harding, Eric H.; Wills, Ann E.; Carpenter, John H.; Drake, Richard R.; Cochrane, Kyle C.; Blue, B.E.; Robinson, Allen C.; Mattsson, Thomas M.

We report on a new technique for obtaining off-Hugoniot pressure vs. density data for solid metals compressed to extreme pressure by a magnetically driven liner implosion on the Z-machine (Z) at Sandia National Laboratories. In our experiments, the liner comprises inner and outer metal tubes. The inner tube is composed of a sample material (e.g., Ta and Cu) whose compressed state is to be inferred. The outer tube is composed of Al and serves as the current carrying cathode. Another aluminum liner at much larger radius serves as the anode. A shaped current pulse quasi-isentropically compresses the sample as it implodes. The iterative method used to infer pressure vs. density requires two velocity measurements. Photonic Doppler velocimetry probes measure the implosion velocity of the free (inner) surface of the sample material and the explosion velocity of the anode free (outer) surface. These two velocities are used in conjunction with magnetohydrodynamic simulation and mathematical optimization to obtain the current driving the liner implosion, and to infer pressure and density in the sample through maximum compression. This new equation of state calibration technique is illustrated using a simulated experiment with a Cu sample. Monte Carlo uncertainty quantification of synthetic data establishes convergence criteria for experiments. Results are presented from experiments with Al/Ta, Al/Cu, and Al liners. Symmetric liner implosion with quasi-isentropic compression to peak pressure ∼1000 GPa is achieved in all cases. These experiments exhibit unexpectedly softer behavior above 200 GPa, which we conjecture is related to differences in the actual and modeled properties of aluminum.

More Details

Rebooting Computing and Low-Power Image Recognition Challenge

2015 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2015

DeBenedictis, Erik; Lu, Yung H.; Kadin, Alan M.; Berg, Alexander C.; Conte, Thomas M.; Garg, Rachit; Gingade, Ganesh; Hoang, Bichlien; Huang, Yongzhen; Li, Boxun; Liu, Jingyu; Liu, Wei; Mao, Huizi; Peng, Junran; Tang, Tianqi; Track, Elie K.; Wang, Jingqiu; Wang, Tao; Wang, Yu; Yao, Jun

Rebooting Computing (RC) is an effort in the IEEE to rethink future computers. RC started in 2012 by the co-chairs, Elie Track (IEEE Council on Superconductivity) and Tom Conte (Computer Society). RC takes a holistic approach, considering revolutionary as well as evolutionary solutions needed to advance computer technologies. Three summits have been held in 2013 and 2014, discussing different technologies, from emerging devices to user interface, from security to energy efficiency, from neuromorphic to reversible computing. The first part of this paper introduces RC to the design automation community and solicits revolutionary ideas from the community for the directions of future computer research. Energy efficiency is identified as one of the most important challenges in future computer technologies. The importance of energy efficiency spans from miniature embedded sensors to wearable computers, from individual desktops to data centers. To gauge the state of the art, the RC Committee organized the first Low Power Image Recognition Challenge (LPIRC). Each image contains one or multiple objects, among 200 categories. A contestant has to provide a working system that can recognize the objects and report the bounding boxes of the objects. The second part of this paper explains LPIRC and the solutions from the top two winners.

More Details

Introduction to Peridynamics

Handbook of Peridynamic Modeling

Silling, Stewart A.

As discussed in the previous chapter, the purpose of peridynamics is to unify the mechanics of continuous media, continuous media with evolving discontinuities, and discrete particles. To accomplish this, peridynamics avoids the use of partial derivatives of the deformation with respect to spatial coordinates. Instead, it uses integral equations that remain valid on discontinuities. Discrete particles, as will be discussed later in this chapter, are treated using Dirac delta functions.

More Details

Analysis of an optimization-based atomistic-to-continuum coupling method for point defects

ESAIM: Mathematical Modelling and Numerical Analysis

Olson, Derek; Luskin, Mitchell; Shapeev, Alexander V.; Bochev, Pavel B.

We formulate and analyze an optimization-based Atomistic-to-Continuum (AtC) coupling method for problems with point defects. Application of a potential-based atomistic model near the defect core enables accurate simulation of the defect. Away from the core, where site energies become nearly independent of the lattice position, the method switches to a more efficient continuum model. The two models are merged by minimizing the mismatch of their states on an overlap region, subject to the atomistic and continuum force balance equations acting independently in their domains. We prove that the optimization problem is well-posed and establish error estimates.

More Details

An empirical comparison of graph laplacian solvers

Proceedings of the Workshop on Algorithm Engineering and Experiments

Boman, Erik G.; Deweese, Kevin; Gilbert, John R.

Solving Laplacian linear systems is an important task in a variety of practical and theoretical applications. This problem is known to have solutions that perform in linear times polylogarithmic work in theory, but these algorithms are difficult to implement in practice. We examine existing solution techniques in order to determine the best methods currently available and for which types of problems are they useful. We perform timing experiments using a variety of solvers on a variety of problems and present our results. We discover differing solver behavior between web graphs and a class of synthetic graphs designed to model them.

More Details

A nonlocal strain measure for DIC

Conference Proceedings of the Society for Experimental Mechanics Series

Turner, Daniel Z.; Lehoucq, Richard B.; Reu, Phillip L.

It is well known that the derivative-based classical approach to strain is problematic when the displacement field is irregular, noisy, or discontinuous. Difficulties arise wherever the displacements are not differentiable. We present an alternative, nonlocal approach to calculating strain from digital image correlation (DIC) data that is well-defined and robust, even for the pathological cases that undermine the classical strain measure. This integral formulation for strain has no spatial derivatives and when the displacement field is smooth, the nonlocal strain and the classical strain are identical. We submit that this approach to computing strains from displacements will greatly improve the fidelity and efficacy of DIC for new application spaces previously untenable in the classical framework.

More Details

Local search to improve coordinate-based task mapping

Parallel Computing

Balzuweit, Evan; Bunde, David P.; Leung, Vitus J.; Finley, Austin; Lee, Alan C.S.

We present a local search strategy to improve the coordinate-based mapping of a parallel job's tasks to the MPI ranks of its parallel allocation in order to reduce network congestion and the job's communication time. The goal is to reduce the number of network hops between communicating pairs of ranks. Our target is applications with a nearest-neighbor stencil communication pattern running on mesh systems with non-contiguous processor allocation, such as Cray XE and XK Systems. Using the miniGhost mini-app, which models the shock physics application CTH, we demonstrate that our strategy reduces application running time while also reducing the runtime variability. We further show that mapping quality can vary based on the selected allocation algorithm, even between allocation algorithms of similar apparent quality.

More Details

Monolithic multigrid methods for two-dimensional resistive magnetohydrodynamics

SIAM Journal on Scientific Computing

Adler, James H.; Benson, Thomas R.; Cyr, Eric C.; Maclachlan, Scott P.; Tuminaro, Raymond S.

Magnetohydrodynamic (MHD) representations are used to model a wide range of plasma physics applications and are characterized by a nonlinear system of partial differential equations that strongly couples a charged fluid with the evolution of electromagnetic fields. The resulting linear systems that arise from discretization and linearization of the nonlinear problem are generally difficult to solve. In this paper, we investigate multigrid preconditioners for this system. We consider two well-known multigrid relaxation methods for incompressible fluid dynamics: Braess-Sarazin relaxation and Vanka relaxation. We first extend these to the context of steady-state one-fluid viscoresistive MHD. Then we compare the two relaxation procedures within a multigrid-preconditioned GMRES method employed within Newton's method. To isolate the effects of the different relaxation methods, we use structured grids, inf-sup stable finite elements, and geometric interpolation. We present convergence and timing results for a two-dimensional, steady-state test problem.

More Details

Energy scaling advantages of resistive memory crossbar based computation and its application to sparse coding

Frontiers in Neuroscience

Agarwal, Sapan A.; Quach, Tu-Thach Q.; Parekh, Ojas D.; DeBenedictis, Erik; James, Conrad D.; Marinella, Matthew J.; Aimone, James B.

The exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.

More Details

Teko: A block preconditioning capability with concrete example applications in Navier-Stokes and MHD

SIAM Journal on Scientific Computing

Cyr, Eric C.; Shadid, John N.; Tuminaro, Raymond S.

This paper describes the design of Teko, an object-oriented C++ library for implementing advanced block preconditioners. Mathematical design criteria that elucidate the needs of block preconditioning libraries and techniques are explained and shown to motivate the structure of Teko. For instance, a principal design choice was for Teko to strongly reflect the mathematical statement of the preconditioners to reduce development burden and permit focus on the numerics. Additional mechanisms are explained that provide a pathway to developing an optimized production capable block preconditioning capability with Teko. Finally, Teko is demonstrated on fluid flow and magnetohydrodynamics applications. In addition to highlighting the features of the Teko library, these new results illustrate the effectiveness of recent preconditioning developments applied to advanced discretization approaches.

More Details

Quantum Graph Analysis

Maunz, Peter L.; Sterk, Jonathan D.; Lobser, Daniel L.; Parekh, Ojas D.; Ryan-Anderson, Ciaran R.

In recent years, advanced network analytics have become increasingly important to na- tional security with applications ranging from cyber security to detection and disruption of ter- rorist networks. While classical computing solutions have received considerable investment, the development of quantum algorithms to address problems, such as data mining of attributed relational graphs, is a largely unexplored space. Recent theoretical work has shown that quan- tum algorithms for graph analysis can be more efficient than their classical counterparts. Here, we have implemented a trapped-ion-based two-qubit quantum information proces- sor to address these goals. Building on Sandia's microfabricated silicon surface ion traps, we have designed, realized and characterized a quantum information processor using the hyperfine qubits encoded in two 171 Yb + ions. We have implemented single qubit gates using resonant microwave radiation and have employed Gate set tomography (GST) to characterize the quan- tum process. For the first time, we were able to prove that the quantum process surpasses the fault tolerance thresholds of some quantum codes by demonstrating a diamond norm distance of less than 1 . 9 x 10 [?] 4 . We used Raman transitions in order to manipulate the trapped ions' motion and realize two-qubit gates. We characterized the implemented motion sensitive and insensitive single qubit processes and achieved a maximal process infidelity of 6 . 5 x 10 [?] 5 . We implemented the two-qubit gate proposed by Molmer and Sorensen and achieved a fidelity of more than 97 . 7%.

More Details

Micro-fabricated ion traps for Quantum Information Processing; Highlights and lessons learned

Maunz, Peter L.; Blume-Kohout, Robin J.; Blain, Matthew G.; Benito, Francisco; Berry, Christopher; Clark, Craig R.; Clark, Susan M.; Colombo, Anthony P.; Dagel, Amber L.; Fortier, Kevin M.; Haltli, Raymond A.; Heller, Edwin J.; Lobser, Daniel L.; Mizrahi, Jonathan; Nielsen, Erik N.; Resnick, Paul J.; Rembetski, John F.; Rudinger, Kenneth M.; Scrymgeour, David S.; Sterk, Jonathan D.; Tabakov, Boyan; Tigges, Chris P.; Van Der Wall, Jay W.; Stick, Daniel L.

Abstract not provided.

Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout

Kim, Kyungjoo K.; Rajamanickam, Sivasankaran R.; Stelle, George; Edwards, Harold C.; Olivier, Stephen L.

We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix. Our factorization algorithm follows the idea of algorithms-by-blocks by using the block layout. The algorithm-byblocks approach induces a task graph for the factorization. These tasks are inter-related to each other through their data dependences in the factorization algorithm. To process the tasks on various manycore architectures in a portable manner, we also present a portable tasking API that incorporates different tasking backends and device-specific features using an open-source framework for manycore platforms i.e., Kokkos. A performance evaluation is presented on both Intel Sandybridge and Xeon Phi platforms for matrices from the University of Florida sparse matrix collection to illustrate merits of the proposed task-based factorization. Experimental results demonstrate that our task-parallel implementation delivers about 26.6x speedup (geometric mean) over single-threaded incomplete Choleskyby- blocks and 19.2x speedup over serial Cholesky performance which does not carry tasking overhead using 56 threads on the Intel Xeon Phi processor for sparse matrices arising from various application problems.

More Details
Results 5001–5050 of 9,998
Results 5001–5050 of 9,998