Publications

Results 5701–5750 of 9,998

Search results


KAYENTA: Theory and User's Guide

Brannon, Rebecca M.; Fuller, Timothy J.; Strack, Otto E.; Fossum, Arlo F.; Sanchez, Jason J.

The physical foundations and domain of applicability of the Kayenta constitutive model are presented along with descriptions of the source code and user instructions. Kayenta, which is an outgrowth of the Sandia GeoModel, includes features and fitting functions appropriate to a broad class of materials including rocks, rock-like engineered materials (such as concretes and ceramics), and metals. Fundamentally, Kayenta is a computational framework for generalized plasticity models. As such, it includes a yield surface, but the term "yield" is generalized to include any form of inelastic material response (including microcrack growth and pore collapse) that can result in non-recovered strain upon removal of loads on a material element. Kayenta supports optional anisotropic elasticity associated with joint sets, as well as optional deformation-induced anisotropy through kinematic hardening (in which the initially isotropic yield surface is permitted to translate in deviatoric stress space to model Bauschinger effects). The governing equations are otherwise isotropic. Because Kayenta is a unification and generalization of simpler models, it can be run using as few as 2 parameters (for linear elasticity) to as many as 40 material and control parameters in the exceptionally rare case when all features are used. For high-strain-rate applications, Kayenta supports rate dependence through an overstress model. Isotropic damage is modeled through loss of stiffness and strength.


Evaluation of various interpolants available in DICe

Turner, Daniel Z.; Reu, Phillip L.; Crozier, Paul C.

This report evaluates several interpolants implemented in the Digital Image Correlation Engine (DICe), an image correlation software package developed by Sandia. By interpolants we refer to the basis functions used to represent discrete pixel intensity data as a continuous signal. Interpolation is used to determine intensity values in an image at non-pixel locations. It is also used, in some cases, to evaluate the x and y gradients of the image intensities. Intensity gradients subsequently guide the optimization process. The goal of this report is to inform analysts as to the characteristics of each interpolant and provide guidance towards the best interpolant for a given dataset. This work also serves as an initial verification of each of the interpolants implemented.
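As a rough illustration of what evaluating an image at a non-pixel location means, the sketch below uses bilinear interpolation on a small intensity grid. Bilinear interpolation is chosen here only because it is the simplest common interpolant; the abstract does not say which interpolants the report covers, and the function name `bilinear` and the `image[row][col]` layout are assumptions for this example, not part of DICe.

```python
import math


def bilinear(image, x, y):
    """Estimate the intensity at a non-pixel location (x, y).

    `image` is a list of rows (image[row][col]) with at least 2x2 pixels;
    x is the column coordinate and y is the row coordinate.
    Hypothetical helper for illustration, not a DICe function.
    """
    height, width = len(image), len(image[0])
    # Pick the top-left pixel of the enclosing cell, clamped to the image.
    x0 = min(max(int(math.floor(x)), 0), width - 2)
    y0 = min(max(int(math.floor(y)), 0), height - 2)
    tx, ty = x - x0, y - y0
    # Interpolate along x on the two bracketing rows, then along y.
    top = (1 - tx) * image[y0][x0] + tx * image[y0][x0 + 1]
    bottom = (1 - tx) * image[y0 + 1][x0] + tx * image[y0 + 1][x0 + 1]
    return (1 - ty) * top + ty * bottom


if __name__ == "__main__":
    pixels = [[0.0, 10.0], [20.0, 30.0]]
    print(bilinear(pixels, 0.5, 0.5))
```

Higher-order interpolants follow the same pattern but draw on more neighboring pixels, which generally gives smoother intensity gradients at higher cost.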


Equation of State Model Quality Study for Ti and Ti64

Wills, Ann E.; Sanchez, Jason J.

Titanium and the titanium alloy Ti64 (6% aluminum, 4% vanadium and the balance titanium) are materials used in many technologically important applications. To be able to computationally investigate and design these applications, accurate Equations of State (EOS) are needed, and in many cases additional constitutive relations as well. This report describes what data is available for constructing EOS for these two materials, and also describes some references giving data for stress-strain constitutive models. We also give some suggestions for projects to achieve improved EOS and constitutive models. In an appendix, we present a study of the 'cloud formation' issue observed in the ALEGRA code. This issue was one of the motivating factors for this literature search of available data for constructing improved EOS for Ti and Ti64. However, the study shows that the cloud formation issue is only marginally connected to the quality of the EOS, and, in fact, is a physical behavior of the system in question. We give some suggestions for settings in, and improvements of, the ALEGRA code to address this computational difficulty.


Scaling to Nanotechnology Limits with the PIMS Computer Architecture and a new Scaling Rule

DeBenedictis, Erik

We describe a new approach to computing that moves towards the limits of nanotechnology using a newly formulated scaling rule. This is in contrast to the current computer industry scaling away from von Neumann's original computer at the rate of Moore's Law. We extend Moore's Law to 3D, which leads generally to architectures that integrate logic and memory. To keep power dissipation constant through a 2D surface of the 3D structure requires using adiabatic principles. We call our newly proposed architecture Processor In Memory and Storage (PIMS). We propose a new computational model that integrates processing and memory into "tiles" that comprise logic, memory/storage, and communications functions. Since the programming model will be relatively stable as a system scales, programs represented by tiles could be executed in a PIMS system built with today's technology or could become the "schematic diagram" for implementation in an ultimate 3D nanotechnology of the future. We build a systems software approach that offers advantages over and above the technological and architectural advantages. First, the algorithms may be more efficient in the conventional sense of having fewer steps. Second, the algorithms may run with higher power efficiency per operation by being a better match for the adiabatic scaling rule. The performance analysis based on demonstrated ideas in physical science suggests 80,000x improvement in cost per operation for the (arguably) general purpose function of emulating neurons in Deep Learning.


Exploiting data representation for fault tolerance

Journal of Computational Science

Laros, James H.; Hoemmen, Mark F.; Mueller, F.

Incorrect computer hardware behavior may corrupt intermediate computations in numerical algorithms, possibly resulting in incorrect answers. Prior work models misbehaving hardware by randomly flipping bits in memory. We start by accepting this premise, and present an analytic model for the error introduced by a bit flip in an IEEE 754 floating-point number. We then relate this finding to the linear algebra concepts of normalization and matrix equilibration. In particular, we present a case study illustrating that normalizing both vector inputs of a dot product minimizes the probability of a single bit flip causing a large error in the dot product's result. Moreover, the absolute error is either less than one or very large, which allows detection of large errors. Then, we apply this to the GMRES iterative solver. We count all possible errors that can be introduced through faults in arithmetic in the computationally intensive orthogonalization phase of GMRES, and show that when the matrix is equilibrated, the absolute error is bounded above by one.
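The bit-flip fault model described in this abstract can be tried directly on an IEEE 754 double. The sketch below is only an illustration under assumptions: the helper names `flip_bit` and `bit_flip_errors` are made up for this example, and the code does not reproduce the paper's analytic model or its GMRES analysis. It shows how the absolute error from a single flip depends on which bit is hit.

```python
import struct


def flip_bit(x: float, k: int) -> float:
    """Return x with bit k of its 64-bit IEEE 754 encoding flipped.

    Bits 0-51 are the fraction, bits 52-62 the exponent, bit 63 the sign.
    Hypothetical helper for illustration only.
    """
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << k)))
    return flipped


def bit_flip_errors(x: float) -> dict:
    """Map each bit position to the absolute error caused by flipping it."""
    return {k: abs(flip_bit(x, k) - x) for k in range(64)}


if __name__ == "__main__":
    errors = bit_flip_errors(0.5)
    for k in (0, 51, 52, 62, 63):
        print(f"bit {k:2d}: absolute error {errors[k]:.3e}")
```

For an input of 0.5, flipping a fraction bit or the lowest exponent bit changes the value by at most 0.5, while flipping the highest exponent bit produces a value near 2^1023. That gap between small and very large errors is the kind of separation the abstract says makes large errors detectable.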


Why do simple algorithms for triangle enumeration work in the real world?

Internet Mathematics

Berry, Jonathan W.; Fostvedt, Luke A.; Nordman, Daniel J.; Phillips, Cynthia A.; Comandur, Seshadhri C.; Wilson, Alyson G.

Listing all triangles is a fundamental graph operation. Triangles can have important interpretations in real-world graphs, especially social and other interaction networks. Despite the lack of provably efficient (linear, or slightly superlinear) worst-case algorithms for this problem, practitioners run simple, efficient heuristics to find all triangles in graphs with millions of vertices. How are these heuristics exploiting the structure of these special graphs to provide major speedups in running time? We study one of the most prevalent algorithms used by practitioners. A trivial algorithm enumerates all paths of length 2, and checks if each such path is incident to a triangle. A good heuristic is to enumerate only those paths of length 2 in which the middle vertex has the lowest degree. It is easily implemented and is empirically known to give remarkable speedups over the trivial algorithm. We study the behavior of this algorithm over graphs with heavy-tailed degree distributions, a defining feature of real-world graphs. The erased configuration model (ECM) efficiently generates a graph with asymptotically (almost) any desired degree sequence. We show that the expected running time of this algorithm over the distribution of graphs created by the ECM is controlled by the ℓ_{4/3}-norm of the degree sequence. Norms of the degree sequence are a measure of the heaviness of the tail, and it is precisely this feature that allows nontrivial speedups of simple triangle enumeration algorithms. As a corollary of our main theorem, we prove expected linear-time performance for degree sequences following a power law with exponent α ≥ 7/3, and nontrivial speedup whenever α ∈ (2, 3).
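The heuristic in this abstract, enumerating only those length-2 paths whose middle vertex has the lowest degree, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. The function name `list_triangles`, the edge-list input, and breaking degree ties by vertex id are assumptions made for this example.

```python
from itertools import combinations


def list_triangles(edges):
    """List each triangle of an undirected simple graph exactly once.

    Only paths u - v - w whose middle vertex v ranks lowest by
    (degree, vertex id) are examined; the id tie-break is an assumption.
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    def rank(vertex):
        return (len(adjacency[vertex]), vertex)

    triangles = []
    for v in adjacency:
        # Keep only the neighbors that rank above v, so v is the lowest-ranked
        # vertex of every path (and triangle) examined from it.
        higher = [u for u in adjacency[v] if rank(u) > rank(v)]
        for u, w in combinations(higher, 2):
            if w in adjacency[u]:  # the path u - v - w closes into a triangle
                triangles.append(tuple(sorted((v, u, w))))
    return triangles


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(sorted(list_triangles(k4)))
```

Because high-degree vertices are never used as the middle of a path, the number of paths examined can be far smaller than in the trivial algorithm on graphs with heavy-tailed degree distributions.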


Assessing the role of mini-applications in predicting key performance characteristics of scientific and engineering applications

Journal of Parallel and Distributed Computing

Barrett, R.F.; Crozier, Paul C.; Doerfler, Douglas W.; Heroux, Michael A.; Lin, Paul L.; Thornquist, Heidi K.; Trucano, Timothy G.; Vaughan, Courtenay T.

Computational science and engineering application programs are typically large, complex, and dynamic, and are often constrained by distribution limitations. As a means of making tractable rapid explorations of scientific and engineering application programs in the context of new, emerging, and future computing architectures, a suite of "miniapps" has been created to serve as proxies for full scale applications. Each miniapp is designed to represent a key performance characteristic that does or is expected to significantly impact the runtime performance of an application program. In this paper we introduce a methodology for assessing the ability of these miniapps to effectively represent these performance issues. We applied this methodology to three miniapps, examining the linkage between them and an application they are intended to represent. Herein we evaluate the fidelity of that linkage. This work represents the initial steps required to begin to answer the question, "Under what conditions does a miniapp represent a key performance characteristic in a full app?"


A hybrid approach for parallel transistor-level full-chip circuit simulation

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Thornquist, Heidi K.; Rajamanickam, Sivasankaran R.

The computer-aided design (CAD) applications that are fundamental to the electronic design automation industry need to harness the available hardware resources to be able to perform full-chip simulation for modern technology nodes (45nm and below). We will present a hybrid (MPI+threads) approach for parallel transistor-level transient circuit simulation that achieves scalable performance for some challenging large-scale integrated circuits. This approach focuses on the computationally expensive part of the simulator: the linear system solve. Hybrid versions of two iterative linear solver strategies are presented: one takes advantage of block triangular form structure, while the other uses a Schur complement technique. Results indicate up to a 27x improvement in total simulation time on 256 cores.


Preserving Lagrangian structure in nonlinear model reduction with application to structural dynamics

SIAM Journal on Scientific Computing

Carlberg, Kevin; Tuminaro, Raymond S.; Boggs, Paul

This work proposes a model-reduction methodology that preserves Lagrangian structure and achieves computational efficiency in the presence of high-order nonlinearities and arbitrary parameter dependence. As such, the resulting reduced-order model retains key properties such as energy conservation and symplectic time-evolution maps. We focus on parameterized simple mechanical systems subjected to Rayleigh damping and external forces, and consider an application to nonlinear structural dynamics. To preserve structure, the method first approximates the system's "Lagrangian ingredients" (the Riemannian metric, the potential-energy function, the dissipation function, and the external force) and subsequently derives reduced-order equations of motion by applying the (forced) Euler-Lagrange equation with these quantities. From the algebraic perspective, key contributions include two efficient techniques for approximating parameterized reduced matrices while preserving symmetry and positive definiteness: matrix gappy proper orthogonal decomposition and reduced-basis sparsification. Results for a parameterized truss-structure problem demonstrate the practical importance of preserving Lagrangian structure and illustrate the proposed method's merits: it reduces computation time while maintaining high accuracy and stability, in contrast to existing nonlinear model-reduction techniques that do not preserve structure.
