Publications

Results 8001–8100 of 9,998

Search results

Jump to search filters

Posterior predictive modeling using multi-scale stochastic inverse parameter estimates

Mckenna, Sean A.; Ray, Jaideep R.; van Bloemen Waanders, Bart G.

Multi-scale binary permeability field estimation from static and dynamic data is completed using Markov Chain Monte Carlo (MCMC) sampling. The binary permeability field is defined as high permeability inclusions within a lower permeability matrix. Static data are obtained as measurements of permeability with support consistent to the coarse scale discretization. Dynamic data are advective travel times along streamlines calculated through a fine-scale field and averaged for each observation point at the coarse scale. Parameters estimated at the coarse scale (30 x 20 grid) are the spatially varying proportion of the high permeability phase and the inclusion length and aspect ratio of the high permeability inclusions. From the non-parametric, posterior distributions estimated for these parameters, a recently developed sub-grid algorithm is employed to create an ensemble of realizations representing the fine-scale (3000 x 2000), binary permeability field. Each fine-scale ensemble member is instantiated by convolution of an uncorrelated multiGaussian random field with a Gaussian kernel defined by the estimated inclusion length and aspect ratio. Since the multiGaussian random field is itself a realization of a stochastic process, the procedure for generating fine-scale binary permeability field realizations is also stochastic. Two different methods are hypothesized to perform posterior predictive tests. Different mechanisms for combining multi Gaussian random fields with kernels defined from the MCMC sampling are examined. Posterior predictive accuracy of the estimated parameters is assessed against a simulated ground truth for predictions at both the coarse scale (effective permeabilities) and at the fine scale (advective travel time distributions). The two techniques for conducting posterior predictive tests are compared by their ability to recover the static and dynamic data. The skill of the inference and the method for generating fine-scale binary permeability fields are evaluated through flow calculations on the resulting fields using fine-scale realizations and comparing them against results obtained with the ground truth fine-scale and coarse-scale permeability fields.

More Details

Solution methods for very highly integrated circuits

Thornquist, Heidi K.; Mei, Ting M.; Tuminaro, Raymond S.

While advances in manufacturing enable the fabrication of integrated circuits containing tens-to-hundreds of millions of devices, the time-sensitive modeling and simulation necessary to design these circuits poses a significant computational challenge. This is especially true for mixed-signal integrated circuits where detailed performance analyses are necessary for the individual analog/digital circuit components as well as the full system. When the integrated circuit has millions of devices, performing a full system simulation is practically infeasible using currently available Electrical Design Automation (EDA) tools. The principal reason for this is the time required for the nonlinear solver to compute the solutions of large linearized systems during the simulation of these circuits. The research presented in this report aims to address the computational difficulties introduced by these large linearized systems by using Model Order Reduction (MOR) to (i) generate specialized preconditioners that accelerate the computation of the linear system solution and (ii) reduce the overall dynamical system size. MOR techniques attempt to produce macromodels that capture the desired input-output behavior of larger dynamical systems and enable substantial speedups in simulation time. Several MOR techniques that have been developed under the LDRD on 'Solution Methods for Very Highly Integrated Circuits' will be presented in this report. Among those presented are techniques for linear time-invariant dynamical systems that either extend current approaches or improve the time-domain performance of the reduced model using novel error bounds and a new approach for linear time-varying dynamical systems that guarantees dimension reduction, which has not been proven before. Progress on preconditioning power grid systems using multi-grid techniques will be presented as well as a framework for delivering MOR techniques to the user community using Trilinos and the Xyce circuit simulator, both prominent world-class software tools.

More Details

Investigating the impact of the cielo cray XE6 architecture on scientific application codes

Vaughan, Courtenay T.; Rajan, Mahesh R.; Barrett, Richard F.; Doerfler, Douglas W.; Pedretti, Kevin P.

Cielo, a Cray XE6, is the Department of Energy NNSA Advanced Simulation and Computing (ASC) campaign's newest capability machine. Rated at 1.37 PFLOPS, it consists of 8,944 dual-socket oct-core AMD Magny-Cours compute nodes, linked using Cray's Gemini interconnect. Its primary mission objective is to enable a suite of the ASC applications implemented using MPI to scale to tens of thousands of cores. Cielo is an evolutionary improvement to a successful architecture previously available to many of our codes, thus enabling a basis for understanding the capabilities of this new architecture. Using three codes strategically important to the ASC campaign, and supplemented with some micro-benchmarks that expose the fundamental capabilities of the XE6, we report on the performance characteristics and capabilities of Cielo.

More Details

Using triggered operations to offload collective communication operations

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Hemmert, K.S.; Barrett, Brian B.; Underwood, Keith D.

Efficient collective operations are a major component of application scalability. Offload of collective operations onto the network interface reduces many of the latencies that are inherent in network communications and, consequently, reduces the time to perform the collective operation. To support offload, it is desirable to expose semantic building blocks that are simple to offload and yet powerful enough to implement a variety of collective algorithms. This paper presents the implementation of barrier and broadcast leveraging triggered operations - a semantic building block for collective offload. Triggered operations are shown to be both semantically powerful and capable of improving performance. © 2010 Springer-Verlag.

More Details

Alternative perturbation theories for triple excitations in coupled-cluster theory

Molecular Physics

Taube, Andrew G.

The dominant method of small molecule quantum chemistry over the last twenty years is CCSD(T). Despite this success, RHF-based CCSD(T) fails for systems away from equilibrium. Work over the last ten years has lead to modifications of CCSD(T) that improve the description of bond breaking. These new methods include CCSD(T), CCSD(2)T, CCSD(2) and CR-CC(2,3), which are new perturbative corrections to single-reference CCSD. We present a unified derivation of these methods and compare them at the level of formal theory and computational accuracy. None of the methods is clearly superior, although formal considerations favour CCSD(T) and computational accuracy for the systems considered favours CR-CC(2,3). © 2010 Taylor & Francis.

More Details

A revolution in micropower : the catalytic nanodiode

Creighton, J.R.; Baucom, Kevin C.; Coltrin, Michael E.; Figiel, J.J.; Cross, Karen C.; Koleske, Daniel K.; Pawlowski, Roger P.; Heller, Edwin J.; Bogart, Katherine B.; Coker, Eric N.

Our ability to field useful, nano-enabled microsystems that capitalize on recent advances in sensor technology is severely limited by the energy density of available power sources. The catalytic nanodiode (reported by Somorjai's group at Berkeley in 2005) was potentially an alternative revolutionary source of micropower. Their first reports claimed that a sizable fraction of the chemical energy may be harvested via hot electrons (a 'chemicurrent') that are created by the catalytic chemical reaction. We fabricated and tested Pt/GaN nanodiodes, which eventually produced currents up to several microamps. Our best reaction yields (electrons/CO{sub 2}) were on the order of 10{sup -3}; well below the 75% values first reported by Somorjai (we note they have also been unable to reproduce their early results). Over the course of this Project we have determined that the whole concept of 'chemicurrent', in fact, may be an illusion. Our results conclusively demonstrate that the current measured from our nanodiodes is derived from a thermoelectric voltage; we have found no credible evidence for true chemicurrent. Unfortunately this means that the catalytic nanodiode has no future as a micropower source.

More Details

Installing python software packages : the good, the bad and the ugly

Hart, William E.

These slides describe different strategies for installing Python software. Although I am a big fan of Python software development, robust strategies for software installation remains a challenge. This talk describes several different installation scenarios. The Good: the user has administrative privileges - Installing on Windows with an installer executable, Installing with Linux application utility, Installing a Python package from the PyPI repository, and Installing a Python package from source. The Bad: the user does not have administrative privileges - Using a virtual environment to isolate package installations, and Using an installer executable on Windows with a virtual environment. The Ugly: the user needs to install an extension package from source - Installing a Python extension package from source, and PyCoinInstall - Managing builds for Python extension packages. The last item referring to PyCoinInstall describes a utility being developed for the COIN-OR software, which is used within the operations research community. COIN-OR includes a variety of Python and C++ software packages, and this script uses a simple plug-in system to support the management of package builds and installation.

More Details

On Raviart-Thomas and VMS formulations for flow in heterogeneous materials

Turner, Daniel Z.

It is well known that the continuous Galerkin method (in its standard form) is not locally conservative, yet many stabilized methods are constructed by augmenting the standard Galerkin weak form. In particular, the Variational Multiscale (VMS) method has achieved popularity for combating numerical instabilities that arise for mixed formulations that do not otherwise satisfy the LBB condition. Among alternative methods that satisfy local and global conservation, many employ Raviart-Thomas function spaces. The lowest order Raviart-Thomas finite element formulation (RT0) consists of evaluating fluxes over the midpoint of element edges and constant pressures within the element. Although the RT0 element poses many advantages, it has only been shown viable for triangular or tetrahedral elements (quadrilateral variants of this method do not pass the patch test). In the context of heterogenous materials, both of these methods have been used to model the mixed form of the Darcy equation. This work aims, in a comparative fashion, to evaluate the strengths and weaknesses of either approach for modeling Darcy flow for problems with highly varying material permeabilities and predominantly open flow boundary conditions. Such problems include carbon sequestration and enhanced oil recovery simulations for which the far-field boundary is typically described with some type of pressure boundary condition. We intend to show the degree to which the VMS formulation violates local mass conservation for these types of problems and compare the performance of the VMS and RT0 methods at boundaries between disparate permeabilities.

More Details

Managing variability in the IO performance of petascale storage systems

Oldfield, Ron A.

Significant challenges exist for achieving peak or even consistent levels of performance when using IO systems at scale. They stem from sharing IO system resources across the processes of single large-scale applications and/or multiple simultaneous programs causing internal and external interference, which in turn, causes substantial reductions in IO performance. This paper presents interference effects measurements for two different file systems at multiple supercomputing sites. These measurements motivate developing a 'managed' IO approach using adaptive algorithms varying the IO system workload based on current levels and use areas. An implementation of these methods deployed for the shared, general scratch storage system on Oak Ridge National Laboratory machines achieves higher overall performance and less variability in both a typical usage environment and with artificially introduced levels of 'noise'. The latter serving to clearly delineate and illustrate potential problems arising from shared system usage and the advantages derived from actively managing it.

More Details

Coarse-graining in peridynamics

Silling, Stewart A.

The peridynamic theory is an extension of traditional solid mechanics that treats discontinuous media, including the evolution of discontinuities due to fracture, on the same mathematical basis as classically smooth media. A recent advance in the linearized peridynamic theory permits the reduction of the number of degrees of freedom modeled within a body. Under equilibrium conditions, this coarse graining method exactly reproduces the internal forces on the coarsened degrees of freedom, including the effect of the omitted material that is no longer explicitly modeled. The method applies to heterogeneous as well as homogeneous media and accounts for defects in the material. The coarse graining procedure can be repeated over and over, resulting in a hierarchically coarsened description that, at each stage, continues to reproduce the exact internal forces present in the original, detailed model. Each coarsening step results in reduced computational cost. This talk will describe the new peridynamic coarsening method and show computational examples.

More Details

Iterative packing for demand matching and sparse packing

Parekh, Ojas D.

The main result we will present is a 2k-approximation algorithm for the following 'k-hypergraph demand matching' problem: given a set system with sets of size <=k, where sets have profits & demands and vertices have capacities, find a max-profit subsystem whose demands do not exceed the capacities. The main tool is an iterative way to explicitly build a decomposition of the fractional optimum as 2k times a convex combination of integral solutions. If time permits we'll also show how the approach can be extended to a 3-approximation for 2-column sparse packing. The second result is tight w.r.t the integrality gap, and the first is near-tight as a gap lower bound of 2(k-1+1/k) is known.

More Details

Pyomo : Python Optimization Modeling Objects

Siirola, John D.; Watson, Jean-Paul W.; Hart, William E.

The Python Optimization Modeling Objects (Pyomo) package [1] is an open source tool for modeling optimization applications within Python. Pyomo provides an objected-oriented approach to optimization modeling, and it can be used to define symbolic problems, create concrete problem instances, and solve these instances with standard solvers. While Pyomo provides a capability that is commonly associated with algebraic modeling languages such as AMPL, AIMMS, and GAMS, Pyomo's modeling objects are embedded within a full-featured high-level programming language with a rich set of supporting libraries. Pyomo leverages the capabilities of the Coopr software library [2], which integrates Python packages (including Pyomo) for defining optimizers, modeling optimization applications, and managing computational experiments. A central design principle within Pyomo is extensibility. Pyomo is built upon a flexible component architecture [3] that allows users and developers to readily extend the core Pyomo functionality. Through these interface points, extensions and applications can have direct access to an optimization model's expression objects. This facilitates the rapid development and implementation of new modeling constructs and as well as high-level solution strategies (e.g. using decomposition- and reformulation-based techniques). In this presentation, we will give an overview of the Pyomo modeling environment and model syntax, and present several extensions to the core Pyomo environment, including support for Generalized Disjunctive Programming (Coopr GDP), Stochastic Programming (PySP), a generic Progressive Hedging solver [4], and a tailored implementation of Bender's Decomposition.

More Details

Opportunities for leveraging OS virtualization in high-end supercomputing

Pedretti, Kevin P.; Bridges, Patrick G.

This paper examines potential motivations for incorporating virtualization support in the system software stacks of high-end capability supercomputers. We advocate that this will increase the flexibility of these platforms significantly and enable new capabilities that are not possible with current fixed software stacks. Our results indicate that compute, virtual memory, and I/O virtualization overheads are low and can be further mitigated by utilizing well-known techniques such as large paging and VMM bypass. Furthermore, since the addition of virtualization support does not affect the performance of applications using the traditional native environment, there is essentially no disadvantage to its addition.

More Details

Comparison of open source visual analytics toolkits

Harger, John R.; Crossno, Patricia J.

We present the results of the first stage of a two-stage evaluation of open source visual analytics packages. This stage is a broad feature comparison over a range of open source toolkits. Although we had originally intended to restrict ourselves to comparing visual analytics toolkits, we quickly found that very few were available. So we expanded our study to include information visualization, graph analysis, and statistical packages. We examine three aspects of each toolkit: visualization functions, analysis capabilities, and development environments. With respect to development environments, we look at platforms, language bindings, multi-threading/parallelism, user interface frameworks, ease of installation, documentation, and whether the package is still being actively developed.

More Details

Surrogate modeling with surfpack

Adams, Brian M.; Dalbey, Keith D.; Swiler, Laura P.

Surfpack is a library of multidimensional function approximation methods useful for efficient surrogate-based sensitivity/uncertainty analysis or calibration/optimization. I will survey current Surfpack meta-modeling capabilities for continuous variables and describe recent progress generalizing to both continuous and categorical factors, including relevant test problems and analysis comparisons.

More Details

Synthesis of an ionic liquid with an iron coordination cation

Dalton Transactions

Anderson, Travis M.; Ingersoll, David I.; Hensley, Alyssa H.; Staiger, Chad S.; Leonard, Jonathan C.

An iron-based ionic liquid, Fe((OHCH2CH2) 2NH)6(CF3SO3)3, is synthesized in a single-step complexation reaction. Infrared and Raman data suggest NH(CH2CH2OH)2 primarily coordinates to Fe(iii) through alcohol groups. The compound has Tg and Td values of -64°C and 260°C, respectively. Cyclic voltammetry reveals quasi-reversible Fe(iii)/Fe(ii) reduction waves. © 2010 The Royal Society of Chemistry.

More Details

Shifted power method for computing tensor eigenpairs

Kolda, Tamara G.; Dunlavy, Daniel D.

Recent work on eigenvalues and eigenvectors for tensors of order m {>=} 3 has been motivated by applications in blind source separation, magnetic resonance imaging, molecular conformation, and more. In this paper, we consider methods for computing real symmetric-tensor eigenpairs of the form Ax{sup m-1} = {lambda}x subject to {parallel}x{parallel} = 1, which is closely related to optimal rank-1 approximation of a symmetric tensor. Our contribution is a novel shifted symmetric higher-order power method (SS-HOPM), which we showis guaranteed to converge to a tensor eigenpair. SS-HOPM can be viewed as a generalization of the power iteration method for matrices or of the symmetric higher-order power method. Additionally, using fixed point analysis, we can characterize exactly which eigenpairs can and cannot be found by the method. Numerical examples are presented, including examples from an extension of the method to fnding complex eigenpairs.

More Details

Biosafety Risk Assessment Methodology

Caskey, Susan A.; Gaudioso, Jennifer M.; Salerno, Reynolds M.

Laboratories that work with biological agents need to manage their safety risks to persons working the laboratories and the human and animal community in the surrounding areas. Biosafety guidance defines a wide variety of biosafety risk mitigation measures, which include measures which fall under the following categories: engineering controls, procedural and administrative controls, and the use of personal protective equipment; the determination of which mitigation measures should be used to address the specific laboratory risks are dependent upon a risk assessment. Ideally, a risk assessment should be conducted in a manner which is standardized and systematic which allows it to be repeatable and comparable. A risk assessment should clearly define the risk being assessed and avoid over complication.

More Details

Parallel mesh management using interoperable tools

Devine, Karen D.

This presentation included a discussion of challenges arising in parallel mesh management, as well as demonstrated solutions. They also described the broad range of software for mesh management and modification developed by the Interoperable Technologies for Advanced Petascale Simulations (ITAPS) team, and highlighted applications successfully using the ITAPS tool suite.

More Details

Expanding the Trilinos developer community

Heroux, Michael A.

The Trilinos Project started approximately nine years ago as a small effort to enable research, development and ongoing support of small, related solver software efforts. The 'Tri' in Trilinos was intended to indicate the eventual three packages we planned to develop. In 2007 the project expanded its scope to include any package that was an enabling technology for technical computing. Presently the Trilinos repository contains over 55 packages covering a broad spectrum of reusable tools for constructing full-featured scalable scientific and engineering applications. Trilinos usage is now worldwide, and many applications have an explicit dependence on Trilinos for essential capabilities. Users come from other US laboratories, universities, industry and international research groups. Awareness and use of Trilinos is growing rapidly outside of Sandia. Members of the external research community are becoming more familiar with Trilinos, its design and collaborative nature. As a result, the Trilinos project is receiving an increasing number of requests from external community members who want to contribute to Trilinos as developers. To-date we have worked with external developers in an ad hoc fashion. Going forward, we want to develop a set of policies, procedures, tools and infrastructure to simplify interactions with external developers. As we go forward with multi-laboratory efforts such as CASL and X-Stack, and international projects such as IESP, we will need a more streamlined and explicit process for making external developers 'first-class citizens' in the Trilinos development community. This document is intended to frame the discussion for expanding the Trilinos community to all strategically important external members, while at the same time preserving Sandia's primary leadership role in the project.

More Details

Toward robust scalable algebraic multigrid solvers

Tuminaro, Raymond S.; Siefert, Christopher S.; Hu, Jonathan J.; Gaidamour, Jeremie G.

This talk highlights some multigrid challenges that arise from several application areas including structural dynamics, fluid flow, and electromagnetics. A general framework is presented to help introduce and understand algebraic multigrid methods based on energy minimization concepts. Connections between algebraic multigrid prolongators and finite element basis functions are made to explored. It is shown how the general algebraic multigrid framework allows one to adapt multigrid ideas to a number of different situations. Examples are given corresponding to linear elasticity and specifically in the solution of linear systems associated with extended finite elements for fracture problems.

More Details

Drying/self-assembly of nanoparticle suspensions

Grest, Gary S.; Cheng, Shengfeng C.; Lechman, Jeremy B.; Plimpton, Steven J.

The most feasible way to disperse particles in a bulk material or control their packing at a substrate is through fluidization in a carrier that can be processed with well-known techniques such as spin, drip and spray coating, fiber drawing, and casting. The next stage in the processing is often solidification involving drying by solvent evaporation. While there has been significant progress in the past few years in developing discrete element numerical methods to model dense nanoparticle dispersion/suspension rheology which properly treat the hydrodynamic interactions of the solvent, these methods cannot at present account for the volume reduction of the suspension due to solvent evaporation. As part of LDRD project FY-101285 we have developed and implemented methods in the current suite of discrete element methods to remove solvent particles and volume, and hence solvent mass from the liquid/vapor interface of a suspension to account for volume reduction (solvent drying) effects. To validate the methods large scale molecular dynamics simulations have been carried out to follow the evaporation process at the microscopic scale.

More Details

Predictive Capability Maturity Model (PCMM)

Swiler, Laura P.; Knupp, Patrick K.

Predictive Capability Maturity Model (PCMM) is a communication tool that must include a dicussion of the supporting evidence. PCMM is a tool for managing risk in the use of modeling and simulation. PCMM is in the service of organizing evidence to help tell the modeling and simulation (M&S) story. PCMM table describes what activities within each element are undertaken at each of the levels of maturity. Target levels of maturity can be established based on the intended application. The assessment is to inform what level has been achieved compared to the desired level, to help prioritize the VU activities & to allocate resources.

More Details

See applications run and throughput jump: The case for redundant computing in HPC

Proceedings of the International Conference on Dependable Systems and Networks

Riesen, Rolf; Ferreira, Kurt; Stearley, Jon S.

For future parallel-computing systems with as few as twenty-thousand nodes we propose redundant computing to reduce the number of application interrupts. The frequency of faults in exascale systems will be so high that traditional checkpoint/restart methods will break down. Applications will experience interruptions so often that they will spend more time restarting and recovering lost work, than computing the solution. We show that redundant computation at large scale can be cost effective and allows applications to complete their work in significantly less wall-clock time. On truly large systems, redundant computing can increase system throughput by an order of magnitude. © 2010 IEEE.

More Details

Subsystem functionals and the missing ingredient of confinement physics in density functionals

Physical Review B - Condensed Matter and Materials Physics

Hao, Feng H.; Armiento, Rickard; Mattsson, Ann E.

The subsystem functional scheme is a promising approach recently proposed for constructing exchange-correlation density functionals. In this scheme, the physics in each part of real materials is described by mapping to a characteristic model system. The "confinement physics," an essential physical ingredient that has been left out in present functionals, is studied by employing the harmonic-oscillator (HO) gas model. By performing the potential→density and the density→exchange energy per particle mappings based on two model systems characterizing the physics in the interior (uniform electron-gas model) and surface regions (Airy gas model) of materials for the HO gases, we show that the confinement physics emerges when only the lowest subband of the HO gas is occupied by electrons. We examine the approximations of the exchange energy by several state-of-the-art functionals for the HO gas, and none of them produces adequate accuracy in the confinement dominated cases. A generic functional that incorporates the description of the confinement physics is needed. © 2010 The American Physical Society.

More Details

Adversary phase change detection using S.O.M. and text data

Speed, Ann S.; Warrender, Christina E.

In this work, we developed a self-organizing map (SOM) technique for using web-based text analysis to forecast when a group is undergoing a phase change. By 'phase change', we mean that an organization has fundamentally shifted attitudes or behaviors. For instance, when ice melts into water, the characteristics of the substance change. A formerly peaceful group may suddenly adopt violence, or a violent organization may unexpectedly agree to a ceasefire. SOM techniques were used to analyze text obtained from organization postings on the world-wide web. Results suggest it may be possible to forecast phase changes, and determine if an example of writing can be attributed to a group of interest.

More Details

Mesoscale to plant-scale models of nuclear waste reprocessing

Rao, Rekha R.; Pawlowski, Roger P.; Brotherton, Christopher M.; Cipiti, Benjamin B.; Domino, Stefan P.; Jove Colon, Carlos F.; Moffat, Harry K.; Nemer, Martin N.; Noble, David R.; O'Hern, Timothy J.

Imported oil exacerabates our trade deficit and funds anti-American regimes. Nuclear Energy (NE) is a demonstrated technology with high efficiency. NE's two biggest political detriments are possible accidents and nuclear waste disposal. For NE policy, proliferation is the biggest obstacle. Nuclear waste can be reduced through reprocessing, where fuel rods are separated into various streams, some of which can be reused in reactors. Current process developed in the 1950s is dirty and expensive, U/Pu separation is the most critical. Fuel rods are sheared and dissolved in acid to extract fissile material in a centrifugal contactor. Plants have many contacts in series with other separations. We have taken a science and simulation-based approach to develop a modern reprocessing plant. Models of reprocessing plants are needed to support nuclear materials accountancy, nonproliferation, plant design, and plant scale-up.

More Details

Visualization on supercomputing platform level II ASC milestone (3537-1B) results from Sandia

Moreland, Kenneth D.; Fabian, Nathan D.

This report provides documentation for the completion of the Sandia portion of the ASC Level II Visualization on the platform milestone. This ASC Level II milestone is a joint milestone between Sandia National Laboratories and Los Alamos National Laboratories. This milestone contains functionality required for performing visualization directly on a supercomputing platform, which is necessary for peta-scale visualization. Sandia's contribution concerns in-situ visualization, running a visualization in tandem with a solver. Visualization and analysis of petascale data is limited by several factors which must be addressed as ACES delivers the Cielo platform. Two primary difficulties are: (1) Performance of interactive rendering, which is most computationally intensive portion of the visualization process. For terascale platforms, commodity clusters with graphics processors(GPUs) have been used for interactive rendering. For petascale platforms, visualization and rendering may be able to run efficiently on the supercomputer platform itself. (2) I/O bandwidth, which limits how much information can be written to disk. If we simply analyze the sparse information that is saved to disk we miss the opportunity to analyze the rich information produced every timestep by the simulation. For the first issue, we are pursuing in-situ analysis, in which simulations are coupled directly with analysis libraries at runtime. This milestone will evaluate the visualization and rendering performance of current and next generation supercomputers in contrast to GPU-based visualization clusters, and evaluate the performance of common analysis libraries coupled with the simulation that analyze and write data to disk during a running simulation. This milestone will explore, evaluate and advance the maturity level of these technologies and their applicability to problems of interest to the ASC program. Scientific simulation on parallel supercomputers is traditionally performed in four sequential steps: meshing, partitioning, solver, and visualization. Not all of these components are necessarily run on the supercomputer. In particular, the meshing and visualization typically happen on smaller but more interactive computing resources. However, the previous decade has seen a growth in both the need and ability to perform scalable parallel analysis, and this gives motivation for coupling the solver and visualization.

More Details

Peridynamic modeling of fracture in elastomers and composites

Silling, Stewart A.

The peridynamic model of solid mechanics is a mathematical theory designed to provide consistent mathematical treatment of deformations involving discontinuities, especially cracks. Unlike the partial differential equations (PDEs) of the standard theory, the fundamental equations of the peridynamic theory remain applicable on singularities such as crack surfaces and tips. These basic relations are integro-differential equations that do not require the existence of spatial derivatives of the deformation, or even continuity of the deformation. In the peridynamic theory, material points in a continuous body separated from each other by finite distances can interact directly through force densities. The interaction between each pair of points is called a bond. The dependence of the force density in a bond on the deformation provides the constitutive model for a material. By allowing the force density in a bond to depend on the deformation of other nearby bonds, as well as its own deformation, a wide spectrum of material response can be modelled. Damage is included in the constitutive model through the irreversible breakage of bonds according to some criterion. This criterion determines the critical energy release rate for a peridynamic material. In this talk, we present a general discussion of the peridynamic method and recent progress in its application to penetration and fracture in nonlinearly elastic solids. Constitutive models are presented for rubbery materials, including damage evolution laws. The deformation near a crack tip is discussed and compared with results from the standard theory. Examples demonstrating the spontaneous nucleation and growth of cracks are presented. It is also shown how the method can be applied to anisotropic media, including fiber reinforced composites. Examples show prediction of impact damage in composites and comparison against experimental measurements of damage and delamination.

More Details

Scheduling error correction operations for a quantum computer

Phillips, Cynthia A.; Carr, Robert D.; Ganti, Anand G.; Landahl, Andrew J.

In a (future) quantum computer a single logical quantum bit (qubit) will be made of multiple physical qubits. These extra physical qubits implement mandatory extensive error checking. The efficiency of error correction will fundamentally influence the performance of a future quantum computer, both in latency/speed and in error threshold (the worst error tolerated for an individual gate). Executing this quantum error correction requires scheduling the individual operations subject to architectural constraints. Since our last talk on this subject, a team of researchers at Sandia National Labortories has designed a logical qubit architecture that considers all relevant architectural issues including layout, the effects of supporting classical electronics, and the types of gates that the underlying physical qubit implementation supports most naturally. This is a two-dimensional system where 2-qubit operations occur locally, so there is no need to calculate more complex qubit/information transportation. Using integer programming, we found a schedule of qubit operations that obeys the hardware constraints, implements the local-check code in the native gate set, and minimizes qubit idle periods. Even with an optimal schedule, however, parallel Monte Carlo simulation shows that there is no finite error probability for the native gates such that the error-correction system would be benecial. However, by adding dynamic decoupling, a series of timed pulses that can reverse some errors, we found that there may be a threshold. Thus finding optimal schedules for increasingly-refined scheduling problems has proven critical for the overall design of the logical qubit system. We describe the evolving scheduling problems and the ideas behind the integer programming-based solution methods. This talk assumes no prior knowledge of quantum computing.

More Details

Development, sensitivity analysis, and uncertainty quantification of high-fidelity arctic sea ice models

Bochev, Pavel B.; Paskaleva, Biliana S.

Arctic sea ice is an important component of the global climate system and due to feedback effects the Arctic ice cover is changing rapidly. Predictive mathematical models are of paramount importance for accurate estimates of the future ice trajectory. However, the sea ice components of Global Climate Models (GCMs) vary significantly in their prediction of the future state of Arctic sea ice and have generally underestimated the rate of decline in minimum sea ice extent seen over the past thirty years. One of the contributing factors to this variability is the sensitivity of the sea ice to model physical parameters. A new sea ice model that has the potential to improve sea ice predictions incorporates an anisotropic elastic-decohesive rheology and dynamics solved using the material-point method (MPM), which combines Lagrangian particles for advection with a background grid for gradient computations. We evaluate the variability of the Los Alamos National Laboratory CICE code and the MPM sea ice code for a single year simulation of the Arctic basin using consistent ocean and atmospheric forcing. Sensitivities of ice volume, ice area, ice extent, root mean square (RMS) ice speed, central Arctic ice thickness, and central Arctic ice speed with respect to ten different dynamic and thermodynamic parameters are evaluated both individually and in combination using the Design Analysis Kit for Optimization and Terascale Applications (DAKOTA). We find similar responses for the two codes and some interesting seasonal variability in the strength of the parameters on the solution.

More Details

LDRD final report : a lightweight operating system for multi-core capability class supercomputers

Pedretti, Kevin T.T.; Levenhagen, Michael J.; Ferreira, Kurt; Brightwell, Ronald B.; Kelly, Suzanne M.; Bridges, Patrick G.

The two primary objectives of this LDRD project were to create a lightweight kernel (LWK) operating system(OS) designed to take maximum advantage of multi-core processors, and to leverage the virtualization capabilities in modern multi-core processors to create a more flexible and adaptable LWK environment. The most significant technical accomplishments of this project were the development of the Kitten lightweight kernel, the co-development of the SMARTMAP intra-node memory mapping technique, and the development and demonstration of a scalable virtualization environment for HPC. Each of these topics is presented in this report by the inclusion of a published or submitted research paper. The results of this project are being leveraged by several ongoing and new research projects.

More Details

LDRD final report : managing shared memory data distribution in hybrid HPC applications

Pedretti, Kevin T.T.

MPI is the dominant programming model for distributed memory parallel computers, and is often used as the intra-node programming model on multi-core compute nodes. However, application developers are increasingly turning to hybrid models that use threading within a node and MPI between nodes. In contrast to MPI, most current threaded models do not require application developers to deal explicitly with data locality. With increasing core counts and deeper NUMA hierarchies seen in the upcoming LANL/SNL 'Cielo' capability supercomputer, data distribution poses an upper boundary on intra-node scalability within threaded applications. Data locality therefore has to be identified at runtime using static memory allocation policies such as first-touch or next-touch, or specified by the application user at launch time. We evaluate several existing techniques for managing data distribution using micro-benchmarks on an AMD 'Magny-Cours' system with 24 cores among 4 NUMA domains and argue for the adoption of a dynamic runtime system implemented at the kernel level, employing a novel page table replication scheme to gather per-NUMA domain memory access traces.

More Details

ParaText : scalable solutions for processing and searching very large document collections : final LDRD report

Dunlavy, Daniel D.; Crossno, Patricia J.

This report is a summary of the accomplishments of the 'Scalable Solutions for Processing and Searching Very Large Document Collections' LDRD, which ran from FY08 through FY10. Our goal was to investigate scalable text analysis; specifically, methods for information retrieval and visualization that could scale to extremely large document collections. Towards that end, we designed, implemented, and demonstrated a scalable framework for text analysis - ParaText - as a major project deliverable. Further, we demonstrated the benefits of using visual analysis in text analysis algorithm development, improved performance of heterogeneous ensemble models in data classification problems, and the advantages of information theoretic methods in user analysis and interpretation in cross language information retrieval. The project involved 5 members of the technical staff and 3 summer interns (including one who worked two summers). It resulted in a total of 14 publications, 3 new software libraries (2 open source and 1 internal to Sandia), several new end-user software applications, and over 20 presentations. Several follow-on projects have already begun or will start in FY11, with additional projects currently in proposal.

More Details

Geometric comparison of popular mixture-model distances

Mitchell, Scott A.

More Details

Toward exascale computing through neuromorphic approaches

Forsythe, James C.; Branch, Darren W.; McKenzie, Amber T.

While individual neurons function at relatively low firing rates, naturally-occurring nervous systems not only surpass manmade systems in computing power, but accomplish this feat using relatively little energy. It is asserted that the next major breakthrough in computing power will be achieved through application of neuromorphic approaches that mimic the mechanisms by which neural systems integrate and store massive quantities of data for real-time decision making. The proposed LDRD provides a conceptual foundation for SNL to make unique advances toward exascale computing. First, a team consisting of experts from the HPC, MESA, cognitive and biological sciences and nanotechnology domains will be coordinated to conduct an exercise with the outcome being a concept for applying neuromorphic computing to achieve exascale computing. It is anticipated that this concept will involve innovative extension and integration of SNL capabilities in MicroFab, material sciences, high-performance computing, and modeling and simulation of neural processes/systems.

More Details

Peridynamics as a rigorous coarse-graining of atomistics for multiscale materials design

Aidun, John B.; Kamm, James R.; Lehoucq, Richard B.; Parks, Michael L.; Sears, Mark P.; Silling, Stewart A.

This report summarizes activities undertaken during FY08-FY10 for the LDRD Peridynamics as a Rigorous Coarse-Graining of Atomistics for Multiscale Materials Design. The goal of our project was to develop a coarse-graining of finite temperature molecular dynamics (MD) that successfully transitions from statistical mechanics to continuum mechanics. The goal of our project is to develop a coarse-graining of finite temperature molecular dynamics (MD) that successfully transitions from statistical mechanics to continuum mechanics. Our coarse-graining overcomes the intrinsic limitation of coupling atomistics with classical continuum mechanics via the FEM (finite element method), SPH (smoothed particle hydrodynamics), or MPM (material point method); namely, that classical continuum mechanics assumes a local force interaction that is incompatible with the nonlocal force model of atomistic methods. Therefore FEM, SPH, and MPM inherit this limitation. This seemingly innocuous dichotomy has far reaching consequences; for example, classical continuum mechanics cannot resolve the short wavelength behavior associated with atomistics. Other consequences include spurious forces, invalid phonon dispersion relationships, and irreconcilable descriptions/treatments of temperature. We propose a statistically based coarse-graining of atomistics via peridynamics and so develop a first of a kind mesoscopic capability to enable consistent, thermodynamically sound, atomistic-to-continuum (AtC) multiscale material simulation. Peridynamics (PD) is a microcontinuum theory that assumes nonlocal forces for describing long-range material interaction. The force interactions occurring at finite distances are naturally accounted for in PD. Moreover, PDs nonlocal force model is entirely consistent with those used by atomistics methods, in stark contrast to classical continuum mechanics. Hence, PD can be employed for mesoscopic phenomena that are beyond the realms of classical continuum mechanics and atomistic simulations, e.g., molecular dynamics and density functional theory (DFT). The latter two atomistic techniques are handicapped by the onerous length and time scales associated with simulating mesoscopic materials. Simulating such mesoscopic materials is likely to require, and greatly benefit from multiscale simulations coupling DFT, MD, PD, and explicit transient dynamic finite element methods FEM (e.g., Presto). The proposed work fills the gap needed to enable multiscale materials simulations.

More Details

LDRD final report : leveraging multi-way linkages on heterogeneous data

Dunlavy, Daniel D.; Kolda, Tamara G.

This report is a summary of the accomplishments of the 'Leveraging Multi-way Linkages on Heterogeneous Data' which ran from FY08 through FY10. The goal was to investigate scalable and robust methods for multi-way data analysis. We developed a new optimization-based method called CPOPT for fitting a particular type of tensor factorization to data; CPOPT was compared against existing methods and found to be more accurate than any faster method and faster than any equally accurate method. We extended this method to computing tensor factorizations for problems with incomplete data; our results show that you can recover scientifically meaningfully factorizations with large amounts of missing data (50% or more). The project has involved 5 members of the technical staff, 2 postdocs, and 1 summer intern. It has resulted in a total of 13 publications, 2 software releases, and over 30 presentations. Several follow-on projects have already begun, with more potential projects in development.

More Details
Results 8001–8100 of 9,998
Results 8001–8100 of 9,998