Publications

A combinatorial method for tracing objects using semantics of their shape

Diegert, Carl F.

We present a shape-first approach to finding automobiles and trucks in overhead images and include results from our analysis of an image from the Overhead Imaging Research Dataset [1]. For the OIRDS, our shape-first approach traces candidate vehicle outlines by exploiting knowledge about an overhead image of a vehicle: a vehicle's outline fits into a rectangle, this rectangle is sized to allow vehicles to use local roads, and rectangles from two different vehicles are disjoint. Our shape-first approach can efficiently process high-resolution overhead imaging over wide areas to provide tips and cues for human analysts, or for subsequent automatic processing using machine learning or other analysis based on color, tone, pattern, texture, size, and/or location (shape first). In fact, computationally-intensive complex structural, syntactic, and statistical analysis may be possible when a shape-first work flow sends a list of specific tips and cues down a processing pipeline rather than sending the whole of wide area imaging information. This data flow may fit well when bandwidth is limited between computers delivering ad hoc image exploitation and an imaging sensor. As expected, our early computational experiments find that the shape-first processing stage appears to reliably detect rectangular shapes from vehicles. More intriguing is that our computational experiments with six-inch GSD OIRDS benchmark images show that the shape-first stage can be efficient, and that candidate vehicle locations corresponding to features that do not include vehicles are unlikely to trigger tips and cues. We found that stopping with just the shape-first list of candidate vehicle locations, and then solving a weighted, maximal independent vertex set problem to resolve conflicts among candidate vehicle locations, often correctly traces the vehicles in an OIRDS scene.
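
As a rough illustration of the conflict-resolution step mentioned in this abstract, the sketch below builds a conflict relation between overlapping candidate rectangles and keeps a high-weight disjoint subset. It is a greedy stand-in, not the authors' exact weighted maximal independent vertex set solver, and the rectangles and scores are hypothetical.

    # Minimal sketch of resolving overlapping vehicle candidates by a greedy
    # weighted maximal independent vertex set heuristic (illustration only;
    # the paper solves the weighted problem rather than using a greedy pass).
    def overlaps(a, b):
        # a, b are axis-aligned rectangles (xmin, ymin, xmax, ymax)
        return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

    def resolve_candidates(rects, weights):
        """Keep a high-weight set of mutually disjoint candidate rectangles."""
        order = sorted(range(len(rects)), key=lambda i: weights[i], reverse=True)
        kept = []
        for i in order:
            if all(not overlaps(rects[i], rects[j]) for j in kept):
                kept.append(i)
        return kept

    candidates = [(0, 0, 4, 2), (3, 0, 7, 2), (10, 5, 14, 7)]   # hypothetical
    scores = [0.9, 0.6, 0.8]
    print(resolve_candidates(candidates, scores))               # -> [0, 2]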

DAKOTA : a multilevel parallel object-oriented framework for design optimization, parameter estimation, uncertainty quantification, and sensitivity analysis. Version 5.0, user's manual

Adams, Brian M.; Dalbey, Keith D.; Eldred, Michael S.; Gay, David M.; Swiler, Laura P.; Bohnhoff, William J.; Eddy, John P.; Haskell, Karen H.; Hough, Patricia D.

The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a user's manual for the DAKOTA software and provides capability overviews and procedures for software execution, as well as a variety of example studies.
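
The coupling pattern the abstract describes - an iterative method driving a black-box simulation through a generic parameters-in, responses-out interface - can be sketched as follows. This is not DAKOTA's API or input syntax; the objective function and the crude random-search iterator are stand-ins for illustration only.

    # Sketch of the loose-coupling pattern described above: an iterative method
    # treats the simulation as a black box mapping parameters to responses.
    # This is NOT DAKOTA's interface; names here are hypothetical.
    import random

    def run_simulation(params):
        # Stand-in for launching a simulation code and parsing its output.
        x, y = params
        return (x - 1.0) ** 2 + (y + 2.0) ** 2

    def random_search(objective, bounds, n_evals=200, seed=0):
        rng = random.Random(seed)
        best_p, best_f = None, float("inf")
        for _ in range(n_evals):
            p = [rng.uniform(lo, hi) for lo, hi in bounds]
            f = objective(p)
            if f < best_f:
                best_p, best_f = p, f
        return best_p, best_f

    print(random_search(run_simulation, bounds=[(-5, 5), (-5, 5)]))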

DAKOTA : a multilevel parallel object-oriented framework for design optimization, parameter estimation, uncertainty quantification, and sensitivity analysis. Version 5.0, user's reference manual

Adams, Brian M.; Dalbey, Keith D.; Eldred, Michael S.; Gay, David M.; Swiler, Laura P.; Bohnhoff, William J.; Eddy, John P.; Haskell, Karen H.; Hough, Patricia D.

The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a reference manual for the commands specification for the DAKOTA software, providing input overviews, option descriptions, and example specifications.

DAKOTA : a multilevel parallel object-oriented framework for design optimization, parameter estimation, uncertainty quantification, and sensitivity analysis. Version 5.0, developers manual

Adams, Brian M.; Dalbey, Keith D.; Eldred, Michael S.; Gay, David M.; Swiler, Laura P.; Bohnhoff, William J.; Eddy, John P.; Haskell, Karen H.; Hough, Patricia D.

The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a developers manual for the DAKOTA software and describes the DAKOTA class hierarchies and their interrelationships. It derives directly from annotation of the actual source code and provides detailed class documentation, including all member functions and attributes.

Teuchos C++ memory management classes, idioms, and related topics, the complete reference : a comprehensive strategy for safe and efficient memory management in C++ for high performance computing

Bartlett, Roscoe B.

Challenges for high-performance networking for exascale computing

Brightwell, Ronald B.; Barrett, Brian B.; Hemmert, Karl S.

Achieving the next three orders of magnitude performance increase to move from petascale to exascale computing will require significant advancements in several fundamental areas. Recent studies have outlined many of the hardware and software challenges that will need to be addressed. In this paper, we examine these challenges with respect to high-performance networking. We describe the repercussions of anticipated changes to computing and networking hardware and discuss the impact that alternative parallel programming models will have on the network software stack. We also present some ideas on possible approaches that address some of these challenges.

A new pressure relaxation closure model for two-material Lagrangian hydrodynamics

Kamm, James R.; Rider, William J.

We present a new model for closing a system of Lagrangian hydrodynamics equations for a two-material cell with a single velocity model. We describe a new approach that is motivated by earlier work of Delov and Sadchikov and of Goncharov and Yanilkin. Using a linearized Riemann problem to initialize volume fraction changes, we require that each material satisfy its own pdV equation, which breaks the overall energy balance in the mixed cell. To enforce this balance, we redistribute the energy discrepancy by assuming that the corresponding pressure change in each material is equal. This multiple-material model is packaged as part of a two-step time integration scheme. We compare results of our approach with other models and with corresponding pure-material calculations, on two-material test problems with ideal-gas or stiffened-gas equations of state.

Resolving local ambiguity using semantics of shape

Diegert, Carl F.

We demonstrate a new semantic method for automatic analysis of wide-area, high-resolution overhead imagery to tip and cue human intelligence analysts to human activity. In the open demonstration, we find and trace cars and rooftops. Our methodology, extended to analysis of voxels, may be applicable to understanding morphology and to automatic tracing of neurons in large-scale, serial-section TEM datasets. We defined an algorithm and software implementation that efficiently finds all combinations of image blobs that satisfy given shape semantics, where image blobs are formed as a general-purpose, first step that 'oversegments' image pixels into blobs of similar pixels. We will demonstrate the remarkable power (ROC) of this combinatorial-based work flow for automatically tracing any automobiles in a scene by applying semantics that require a subset of image blobs to fill out a rectangular shape, with width and height in given intervals. In most applications we find that the new combinatorial-based work flow produces alternative (overlapping) tracings of possible objects (e.g. cars) in a scene. To force an estimation (tracing) of a consistent collection of objects (cars), a quick-and-simple greedy algorithm is often sufficient. We will demonstrate a more powerful resolution method: we produce a weighted graph from the conflicts in all of our enumerated hypotheses, and then solve a maximal independent vertex set problem on this graph to resolve conflicting hypotheses. This graph computation is almost certain to be necessary to adequately resolve multiple, conflicting neuron topologies into a set that is most consistent with a TEM dataset.
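
A minimal sketch of the rectangle semantics described above: given blobs from an oversegmentation, test whether their union fills out a rectangle whose width and height fall in given intervals. The fill-ratio threshold and the toy blobs are assumptions made for illustration.

    # Do these image blobs fill out a rectangle of acceptable width and height?
    # (Illustration only; the fill-ratio threshold is an assumed parameter.)
    def fills_rectangle(blobs, width_range, height_range, min_fill=0.9):
        pixels = set().union(*blobs)            # each blob is a set of (row, col)
        rows = [r for r, _ in pixels]
        cols = [c for _, c in pixels]
        height = max(rows) - min(rows) + 1
        width = max(cols) - min(cols) + 1
        if not (width_range[0] <= width <= width_range[1]):
            return False
        if not (height_range[0] <= height <= height_range[1]):
            return False
        return len(pixels) / (width * height) >= min_fill

    blob_a = {(r, c) for r in range(4) for c in range(5)}        # hypothetical blobs
    blob_b = {(r, c) for r in range(4) for c in range(5, 9)}
    print(fills_rectangle([blob_a, blob_b], width_range=(8, 12), height_range=(3, 6)))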

The alliance for computing at the extreme scale

Ang, James A.; Doerfler, Douglas W.; Dosanjh, Sudip S.; Hemmert, Karl S.

Los Alamos and Sandia National Laboratories have formed a new high performance computing center, the Alliance for Computing at the Extreme Scale (ACES). The two labs will jointly architect, develop, procure and operate capability systems for DOE's Advanced Simulation and Computing Program. This presentation will discuss a petascale production capability system, Cielo, that will be deployed in late 2010, and a new partnership with Cray on advanced interconnect technologies.

Reliability-based design optimization using efficient global reliability analysis

Eldred, Michael S.

Finding the optimal (lightest, least expensive, etc.) design for an engineered component that meets or exceeds a specified level of reliability is a problem of obvious interest across a wide spectrum of engineering fields. Various methods for this reliability-based design optimization problem have been proposed. Unfortunately, this problem is rarely solved in practice because, regardless of the method used, solving the problem is too expensive or the final solution is too inaccurate to ensure that the reliability constraint is actually satisfied. This is especially true for engineering applications involving expensive, implicit, and possibly nonlinear performance functions (such as large finite element models). The Efficient Global Reliability Analysis method was recently introduced to improve both the accuracy and efficiency of reliability analysis for this type of performance function. This paper explores how this new reliability analysis method can be used in a design optimization context to create a method of sufficient accuracy and efficiency to enable the use of reliability-based design optimization as a practical design tool.

Integrating event detection system operation characteristics into sensor placement optimization

Hart, David B.; Hart, William E.; Mckenna, Sean A.; Phillips, Cynthia A.

We consider the problem of placing sensors in a municipal water network when we can choose both the location of sensors and the sensitivity and specificity of the contamination warning system. Sensor stations in a municipal water distribution network continuously send sensor output information to a centralized computing facility, and event detection systems at the control center determine when to signal an anomaly worthy of response. Although most sensor placement research has assumed perfect anomaly detection, signal analysis software has parameters that control the tradeoff between false alarms and false negatives. We describe a nonlinear sensor placement formulation, which we heuristically optimize with a linear approximation that can be solved as a mixed-integer linear program. We report the results of initial experiments on a real network and discuss tradeoffs between early detection of contamination incidents, and control of false alarms.
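
To make the placement objective concrete, the sketch below chooses sensor locations greedily to minimize expected impact over contamination scenarios. The paper instead solves a mixed-integer linear program; the scenario data, detection impacts, and greedy heuristic here are illustrative assumptions.

    # Greedy illustration of the placement objective: choose p sensor locations
    # to minimize expected impact over contamination scenarios.
    def expected_impact(placed, scenarios):
        total = 0.0
        for sc in scenarios:
            # impact if some placed sensor detects the incident, else undetected impact
            best = sc["undetected_impact"]
            for n in placed:
                best = min(best, sc["impact_if_detected_at"].get(n, best))
            total += sc["probability"] * best
        return total

    def greedy_placement(candidate_nodes, scenarios, p):
        placed = []
        for _ in range(p):
            best_node = min(
                (n for n in candidate_nodes if n not in placed),
                key=lambda n: expected_impact(placed + [n], scenarios),
            )
            placed.append(best_node)
        return placed

    scenarios = [   # hypothetical scenario data
        {"probability": 0.5, "undetected_impact": 100.0,
         "impact_if_detected_at": {"A": 10.0, "B": 40.0}},
        {"probability": 0.5, "undetected_impact": 80.0,
         "impact_if_detected_at": {"B": 5.0, "C": 30.0}},
    ]
    print(greedy_placement(["A", "B", "C"], scenarios, p=1))    # -> ['B']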

A coarsening method for linear peridynamics

Silling, Stewart A.

A method is obtained for deriving peridynamic material models for a sequence of increasingly coarsened descriptions of a body. The starting point is a known detailed, small scale linearized state-based description. Each successively coarsened model excludes some of the material present in the previous model, and the length scale increases accordingly. This excluded material, while not present explicitly in the coarsened model, is nevertheless taken into account implicitly through its effect on the forces in the coarsened material. Numerical examples demonstrate that the method accurately reproduces the effective elastic properties of a composite as well as the effect of a small defect in a homogeneous medium.

Xyce parallel electronic simulator

Keiter, Eric R.; Russo, Thomas V.; Schiek, Richard S.; Mei, Ting M.; Thornquist, Heidi K.; Coffey, Todd S.; Santarelli, Keith R.; Pawlowski, Roger P.

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.

Xyce parallel electronic simulator release notes

Keiter, Eric R.; Santarelli, Keith R.; Hoekstra, Robert J.; Russo, Thomas V.; Schiek, Richard S.; Mei, Ting M.; Thornquist, Heidi K.; Pawlowski, Roger P.; Coffey, Todd S.

The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. Specific requirements include, among others, the ability to solve extremely large circuit problems by supporting large-scale parallel computing platforms, improved numerical performance, and object-oriented code design and implementation. The Xyce release notes describe: hardware and software requirements; new features and enhancements; any defects fixed since the last release; and current known defects and defect workarounds. For up-to-date information not available at the time these notes were produced, please visit the Xyce web page at http://www.cs.sandia.gov/xyce.

High fidelity equation of state for xenon : integrating experiments and first principles simulations in developing a wide-range equation of state model for a fifth-row element

Magyar, Rudolph J.; Root, Seth R.; Carpenter, John H.; Mattsson, Thomas M.

The noble gas xenon is a particularly interesting element. At standard pressure xenon is an fcc solid which melts at 161 K and then boils at 165 K, thus displaying a rather narrow liquid range on the phase diagram. On the other hand, under pressure the melting point is significantly higher: 3000 K at 30 GPa. Under shock compression, electronic excitations become important at 40 GPa. Finally, xenon forms stable molecules with fluorine (XeF2), suggesting that the electronic structure is significantly more complex than expected for a noble gas. With these reasons in mind, we studied the xenon Hugoniot using DFT/QMD and validated the simulations with multi-Mbar shock compression experiments. The results show that existing equation of state models lack fidelity, and so we developed a wide-range free-energy based equation of state using experimental data and results from first-principles simulations.

Steps toward fault-tolerant quantum chemistry

Taube, Andrew G.

Developing quantum chemistry programs on the coming generation of exascale computers will be a difficult task. The programs will need to be fault-tolerant and minimize the use of global operations. This work explores the use of a task-based model that uses a data-centric approach to allocate work to different processes as it applies to quantum chemistry. After introducing the key problems that appear when trying to parallelize a complicated quantum chemistry method such as coupled-cluster theory, we discuss the implications of that model as it pertains to the computational kernel of a coupled-cluster program - matrix multiplication. Also, we discuss the extensions that would be required to build a full coupled-cluster program using the task-based model. Current programming models for high-performance computing are fault-intolerant and use global operations. Those properties are unsustainable as computers scale to millions of CPUs; instead one must recognize that these systems will be hierarchical in structure and prone to constant faults, and that global operations will be infeasible. The FAST-OS HARE project is introducing a scale-free computing model to address these issues. This model is hierarchical and fault-tolerant by design, allows for the clean overlap of computation and communication (reducing the network load), does not require checkpointing, and avoids the complexity of many HPC runtimes. Development of an algorithm within this model requires a change in focus from imperative programming to a data-centric approach. Quantum chemistry (QC) algorithms, in particular electronic structure methods, are an ideal test bed for this computing model. These methods describe the distribution of electrons in a molecule, which determines the properties of the molecule. The computational cost of these methods is high, scaling quartically or higher in the size of the molecule, which is why QC applications are major users of HPC resources. The complexity of these algorithms means that MPI alone is insufficient to achieve parallel scaling; QC developers have been forced to use alternative approaches to achieve scalability and would be receptive to radical shifts in the programming paradigm. Initial work in adapting the simplest QC method, Hartree-Fock, to this new programming model indicates that the approach is beneficial for QC applications. However, the advantages of being able to scale to exascale computers are greatest for the computationally most expensive algorithms; within QC these are the high-accuracy coupled-cluster (CC) methods. Parallel coupled-cluster programs are available; however, they are based on the conventional MPI paradigm. Much of the effort is spent handling the complicated data dependencies between the various processors, especially as the size of the problem becomes large. The current paradigm will not survive the move to exascale computers. Here we discuss the initial steps toward designing and implementing a CC method within this model. First, we introduce the general concepts behind a CC method, focusing on the aspects that make these methods difficult to parallelize with conventional techniques. Then we outline the computational core of the CC method - a matrix multiply - within the task-based approach that the FAST-OS project is designed to take advantage of. Finally, we outline the general setup to implement the simplest CC method in this model, linearized CC doubles (LinCC).
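
The task-based, data-centric matrix-multiply kernel discussed above can be sketched as one task per output tile, each pulling only the input tiles it needs. This is a thread-pool illustration of the idea, not the FAST-OS HARE runtime or its API.

    # Data-centric, task-per-output-tile matrix multiply: each task owns one
    # output block and gathers the input tiles it needs, instead of an
    # imperative loop over a distributed global array. (Illustration only.)
    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def tile_task(A, B, i, j, tile):
        """Compute output tile (i, j) of C = A @ B from the needed input tiles."""
        acc = np.zeros((tile, tile))
        for k in range(0, A.shape[1], tile):
            acc += A[i:i + tile, k:k + tile] @ B[k:k + tile, j:j + tile]
        return (i, j, acc)

    def tiled_matmul(A, B, tile=64):
        C = np.zeros((A.shape[0], B.shape[1]))
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(tile_task, A, B, i, j, tile)
                       for i in range(0, A.shape[0], tile)
                       for j in range(0, B.shape[1], tile)]
            for f in futures:
                i, j, block = f.result()
                C[i:i + tile, j:j + tile] = block
        return C

    A, B = np.random.rand(256, 256), np.random.rand(256, 256)
    print(np.allclose(tiled_matmul(A, B), A @ B))    # -> True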

Adversary phase change detection using SOMs and text data

Doser, Adele D.; Speed, Ann S.; Warrender, Christina E.

In this work, we developed a self-organizing map (SOM) technique for using web-based text analysis to forecast when a group is undergoing a phase change. By 'phase change', we mean that an organization has fundamentally shifted attitudes or behaviors. For instance, when ice melts into water, the characteristics of the substance change. A formerly peaceful group may suddenly adopt violence, or a violent organization may unexpectedly agree to a ceasefire. SOM techniques were used to analyze text obtained from organization postings on the world-wide web. Results suggest it may be possible to forecast phase changes, and determine if an example of writing can be attributed to a group of interest.
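
For orientation, a minimal SOM training loop of the kind referred to above is sketched here; the grid size, learning-rate schedule, neighborhood width, and document feature vectors are illustrative choices, not the authors' configuration.

    # Minimal self-organizing map (SOM) training loop; documents map to their
    # best-matching unit (BMU) on the grid after training.
    import numpy as np

    def train_som(data, grid=(8, 8), epochs=50, lr0=0.5, sigma0=3.0, seed=0):
        rng = np.random.default_rng(seed)
        weights = rng.random((grid[0], grid[1], data.shape[1]))
        coords = np.stack(np.meshgrid(np.arange(grid[0]), np.arange(grid[1]),
                                      indexing="ij"), axis=-1)
        for epoch in range(epochs):
            lr = lr0 * (1.0 - epoch / epochs)              # shrinking learning rate
            sigma = sigma0 * (1.0 - epoch / epochs) + 0.5  # shrinking neighborhood
            for x in rng.permutation(data):
                d = np.linalg.norm(weights - x, axis=2)    # distance to every unit
                bmu = np.unravel_index(np.argmin(d), d.shape)
                g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=2)
                           / (2.0 * sigma ** 2))           # Gaussian neighborhood
                weights += lr * g[..., None] * (x - weights)
        return weights

    docs = np.random.rand(100, 20)   # hypothetical document feature vectors
    print(train_som(docs).shape)     # -> (8, 8, 20)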

Failing in place for low-serviceability storage infrastructure using high-parity GPU-based RAID

Ward, Harry L.

In order to provide large quantities of high-reliability disk-based storage, it has become necessary to aggregate disks into fault-tolerant groups based on the RAID methodology. Most RAID levels do provide some fault tolerance, but there are certain classes of applications that require increased levels of fault tolerance within an array. Some of these applications include embedded systems in harsh environments that have a low level of serviceability, or uninhabited data centers servicing cloud computing. When describing RAID reliability, the Mean Time To Data Loss (MTTDL) calculations will often assume that the time to replace a failed disk is relatively low, or even negligible compared to rebuild time. For platforms that are in remote areas collecting and processing data, it may be impossible to access the system to perform system maintenance for long periods. A disk may fail early in a platform's life, but not be replaceable for much longer than typical for RAID arrays. Service periods may be scheduled at intervals on the order of months, or the platform may not be serviced until the end of a mission in progress. Further, this platform may be subject to extreme conditions that can accelerate wear and tear on a disk, requiring even more protection from failures. We have created a high parity RAID implementation that uses a Graphics Processing Unit (GPU) to compute more than two blocks of parity information per stripe, allowing extra parity to eliminate or reduce the requirement for rebuilding data between service periods. While this type of controller is highly effective for RAID 6 systems, an important benefit is the ability to incorporate more parity into a RAID storage system. Such RAID levels, as yet unnamed, can tolerate the failure of three or more disks (depending on configuration) without data loss. While this RAID system certainly has applications in embedded systems running applications in the field, similar benefits can be obtained for servers that are engineered for storage density, with less regard for serviceability or maintainability. A storage brick can be designed to have a MTTDL that extends well beyond the useful lifetime of the hardware used, allowing the disk subsystem to require less service throughout the lifetime of a compute resource. This approach is similar to the Xiotech ISE. Such a design can be deliberately placed remotely (without frequent support) in order to provide colocation, or meet cost goals. For workloads where reliability is key, but conditions are sub-optimal for routine serviceability, a high-parity RAID can provide extra reliability in extraordinary situations. For example, for installations requiring very high Mean Time To Repair, the extra parity can eliminate certain problems with maintaining hot spares, increasing overall reliability. Furthermore, in situations where disk reliability is reduced because of harsh conditions, extra parity can guard against early data loss due to lowered Mean Time To Failure. If used through an iSCSI interface with a streaming workload, it is possible to gain all of these benefits without impacting performance.
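
The serviceability argument can be made concrete with the standard rough MTTDL approximation for a group that tolerates m disk failures (assuming independent, exponentially distributed failures). The disk count, MTTF, and quarterly repair interval below are hypothetical and are not results from this work.

    # Rough MTTDL for a RAID group that tolerates m disk failures, using the
    # standard independent-failure approximation
    #     MTTDL ~ MTTF**(m+1) / (n*(n-1)*...*(n-m) * MTTR**m).
    # All numbers are hypothetical; the point is the sensitivity to long MTTR.
    from math import prod

    def mttdl_hours(n_disks, m_parity, mttf_hours, mttr_hours):
        denom = prod(n_disks - k for k in range(m_parity + 1)) * mttr_hours ** m_parity
        return mttf_hours ** (m_parity + 1) / denom

    mttf = 1.0e6                 # per-disk mean time to failure, in hours
    mttr = 24.0 * 90             # repair only at roughly quarterly service visits
    for m in (1, 2, 3):          # single, double, and triple parity
        years = mttdl_hours(16, m, mttf, mttr) / (24 * 365)
        print(f"{m} parity block(s): MTTDL ~ {years:,.0f} years")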

Application performance on the Tri-Lab Linux Capacity Cluster (TLCC)

International Journal of Distributed Systems and Technologies

Rajan, Mahesh; Doerfler, Douglas W.; Vaughan, Courtenay T.; Epperson, Marcus E.; Ogden, Jeff

In a recent acquisition by DOE/NNSA several large capacity computing clusters called TLCC have been installed at the DOE labs: SNL, LANL and LLNL. TLCC architecture with ccNUMA, multi-socket, multi-core nodes, and InfiniBand interconnect, is representative of the trend in HPC architectures. This paper examines application performance on TLCC contrasting them with Red Storm/Cray XT4. TLCC and Red Storm share similar AMD processors and memory DIMMs. Red Storm however has single socket nodes and custom interconnect. Micro-benchmarks and performance analysis tools help understand the causes for the observed performance differences. Control of processor and memory affinity on TLCC with the numactl utility is shown to result in significant performance gains and is essential to attenuate the detrimental impact of OS interference and cache-coherency overhead. While previous studies have investigated impact of affinity control mostly in the context of small SMP systems, the focus of this paper is on highly parallel MPI applications.

Description of the Sandia National Laboratories science, technology & engineering metrics process

Jordan, Gretchen B.; Oelschlaeger, Peter O.; Burns, A.R.; Watkins, Randall D.; Trucano, Timothy G.

There has been a concerted effort since 2007 to establish a dashboard of metrics for the Science, Technology, and Engineering (ST&E) work at Sandia National Laboratories. These metrics are to provide a self assessment mechanism for the ST&E Strategic Management Unit (SMU) to complement external expert review and advice and various internal self assessment processes. The data and analysis will help ST&E Managers plan, implement, and track strategies and work in order to support the critical success factors of nurturing core science and enabling laboratory missions. The purpose of this SAND report is to provide a guide for those who want to understand the ST&E SMU metrics process. This report provides an overview of why the ST&E SMU wants a dashboard of metrics, some background on metrics for ST&E programs from existing literature and past Sandia metrics efforts, a summary of work completed to date, specifics on the portfolio of metrics that have been chosen and the implementation process that has been followed, and plans for the coming year to improve the ST&E SMU metrics process.

Unstructured discontinuous Galerkin for seismic inversion

Collis, Samuel S.; Ober, Curtis C.; van Bloemen Waanders, Bart G.

This abstract explores the potential advantages of discontinuous Galerkin (DG) methods for the time-domain inversion of media parameters within the earth's interior. In particular, DG methods enable local polynomial refinement to better capture localized geological features within an area of interest while also allowing the use of unstructured meshes that can accurately capture discontinuous material interfaces. This abstract describes our initial findings when using DG methods combined with Runge-Kutta time integration and adjoint-based optimization algorithms for full-waveform inversion. Our initial results suggest that DG methods allow great flexibility in matching the media characteristics (faults, ocean bottom and salt structures) while also providing higher fidelity representations in target regions. Time-domain inversion using discontinuous Galerkin on unstructured meshes and with local polynomial refinement is shown to better capture localized geological features and accurately capture discontinuous-material interfaces. These approaches provide the ability to surgically refine representations in order to improve predicted models for specific geological features. Our future work will entail automated extensions to directly incorporate local refinement and adaptive unstructured meshes within the inversion process.

Importance sampling : promises and limitations

Swiler, Laura P.

Importance sampling is an unbiased sampling method used to sample random variables from different densities than originally defined. These importance sampling densities are constructed to pick 'important' values of input random variables to improve the estimation of a statistical response of interest, such as a mean or probability of failure. Conceptually, importance sampling is very attractive: for example one wants to generate more samples in a failure region when estimating failure probabilities. In practice, however, importance sampling can be challenging to implement efficiently, especially in a general framework that will allow solutions for many classes of problems. We are interested in the promises and limitations of importance sampling as applied to computationally expensive finite element simulations which are treated as 'black-box' codes. In this paper, we present a customized importance sampler that is meant to be used after an initial set of Latin Hypercube samples has been taken, to help refine a failure probability estimate. The importance sampling densities are constructed based on kernel density estimators. We examine importance sampling with respect to two main questions: is importance sampling efficient and accurate for situations where we can only afford small numbers of samples? And does importance sampling require the use of surrogate methods to generate a sufficient number of samples so that the importance sampling process does increase the accuracy of the failure probability estimate? We present various case studies to address these questions.
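
A minimal version of the workflow described above - an initial sample, a kernel density estimate built around the observed failure points, then importance sampling with likelihood-ratio weights - is sketched below. The limit-state function, sample sizes, and the use of plain Monte Carlo in place of Latin Hypercube sampling are assumptions for illustration.

    # Sketch: estimate a small failure probability with KDE-based importance
    # sampling. g(x) < 0 defines failure; the nominal input density is a
    # standard bivariate normal. All choices here are illustrative.
    import numpy as np
    from scipy import stats

    def g(x):
        return 2.0 - x[0] - x[1]         # hypothetical limit-state function

    rng = np.random.default_rng(0)
    dim, n0, n_is = 2, 200, 2000
    nominal = stats.multivariate_normal(mean=np.zeros(dim))

    x0 = rng.standard_normal((n0, dim))              # initial sample (LHS in the paper)
    fails = x0[np.array([g(x) < 0.0 for x in x0])]   # observed failure points

    kde = stats.gaussian_kde(fails.T)                # importance density q(x)
    xs = kde.resample(n_is).T                        # sample from q
    w = nominal.pdf(xs) / kde(xs.T)                  # likelihood ratio p(x)/q(x)
    indicator = np.array([g(x) < 0.0 for x in xs])
    print("estimated failure probability:", np.mean(indicator * w))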

Transparent redundant computing with MPI

Brightwell, Ronald B.; Ferreira, Kurt

Extreme-scale parallel systems will require alternative methods for applications to maintain current levels of uninterrupted execution. Redundant computation is one approach to consider, if the benefits of increased resiliency outweigh the cost of consuming additional resources. We describe a transparent redundancy approach for MPI applications and detail two different implementations that provide the ability to tolerate a range of failure scenarios, including loss of application processes and connectivity. We compare these two approaches and show performance results from micro-benchmarks that bound worst-case message passing performance degradation. We propose several enhancements that could lower the overhead of providing resiliency through redundancy.

Arctic sea ice modeling with the material-point method

Peterson, Kara J.; Bochev, Pavel B.

Arctic sea ice plays an important role in global climate by reflecting solar radiation and insulating the ocean from the atmosphere. Due to feedback effects, the Arctic sea ice cover is changing rapidly. To accurately model this change, high-resolution calculations must incorporate: (1) the annual cycle of growth and melt due to radiative forcing; (2) mechanical deformation due to surface winds, ocean currents and Coriolis forces; and (3) localized effects of leads and ridges. We have demonstrated a new mathematical algorithm for solving the sea ice governing equations using the material-point method (MPM) with an elastic-decohesive constitutive model. An initial comparison with the LANL CICE code indicates that the ice edge is sharper using MPM, but that many of the overall features are similar.

Scalable tensor factorizations with missing data

Dunlavy, Daniel D.; Kolda, Tamara G.

The problem of missing data is ubiquitous in domains such as biomedical signal processing, network traffic analysis, bibliometrics, social network analysis, chemometrics, computer vision, and communication networks - all domains in which data collection is subject to occasional errors. Moreover, these data sets can be quite large and have more than two axes of variation, e.g., sender, receiver, time. Many applications in those domains aim to capture the underlying latent structure of the data; in other words, they need to factorize data sets with missing entries. If we cannot address the problem of missing data, many important data sets will be discarded or improperly analyzed. Therefore, we need a robust and scalable approach for factorizing multi-way arrays (i.e., tensors) in the presence of missing data. We focus on one of the most well-known tensor factorizations, CANDECOMP/PARAFAC (CP), and formulate the CP model as a weighted least squares problem that models only the known entries. We develop an algorithm called CP-WOPT (CP Weighted OPTimization) using a first-order optimization approach to solve the weighted least squares problem. Based on extensive numerical experiments, our algorithm is shown to successfully factor tensors with noise and up to 70% missing data. Moreover, our approach is significantly faster than the leading alternative and scales to larger problems. To show the real-world usefulness of CP-WOPT, we illustrate its applicability on a novel EEG (electroencephalogram) application where missing data is frequently encountered due to disconnections of electrodes.
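
For a three-way tensor, the weighted least-squares objective described above and its factor-matrix gradients can be written compactly as below. Plain gradient descent stands in for the first-order solver used by CP-WOPT, and the rank, step size, and synthetic data are illustrative.

    # Weighted CP objective for a 3-way tensor: f = 0.5*||W * (X - [[A,B,C]])||^2,
    # where W is 1 on known entries and 0 on missing ones. A single small
    # gradient step stands in for the first-order solver used by CP-WOPT.
    import numpy as np

    def khatri_rao(U, V):
        # columnwise Kronecker product; row ordering matches the unfoldings below
        return np.einsum("ir,jr->ijr", U, V).reshape(-1, U.shape[1])

    def cp_full(A, B, C):
        return np.einsum("ir,jr,kr->ijk", A, B, C)

    def cp_wopt_step(X, W, A, B, C, step=1e-4):
        R = W * (cp_full(A, B, C) - X)                      # masked residual
        gA = R.reshape(X.shape[0], -1) @ khatri_rao(B, C)   # df/dA
        gB = R.transpose(1, 0, 2).reshape(X.shape[1], -1) @ khatri_rao(A, C)
        gC = R.transpose(2, 0, 1).reshape(X.shape[2], -1) @ khatri_rao(A, B)
        return A - step * gA, B - step * gB, C - step * gC

    rng = np.random.default_rng(0)
    I, J, K, rank = 20, 15, 10, 3
    X = cp_full(rng.random((I, rank)), rng.random((J, rank)), rng.random((K, rank)))
    W = (rng.random((I, J, K)) < 0.7).astype(float)         # ~30% of entries missing
    A, B, C = (rng.random((n, rank)) for n in (I, J, K))

    def f(A, B, C):
        return 0.5 * np.sum((W * (cp_full(A, B, C) - X)) ** 2)

    before = f(A, B, C)
    A, B, C = cp_wopt_step(X, W, A, B, C)
    print(before, "->", f(A, B, C))    # one small step should reduce the misfit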

Effects of a conducting sphere moving through a gradient magnetic field

Ames, Thomas L.; Robinson, Allen C.

We examine several conducting spheres moving through a magnetic field gradient. An analytical approximation is derived and an experiment is conducted to verify the analytical solution. The experiment is simulated as well to produce a numerical result. Both the low and high magnetic Reynolds number regimes are studied. Deformation of the sphere is noted in the high Reynolds number case. It is suggested that this deformation effect could be useful for designing or enhancing present protection systems against space debris.

The X-caliber architecture for informatics supercomputers

Murphy, Richard C.

This talk discusses the unique demands that informatics applications, particularly graph-theoretic applications, place on computer systems. These applications tend to pose significant data movement challenges for conventional systems. Worse, underlying technology trends are moving computers to cost-driven optimization points that exacerbate the problem. The X-caliber architecture is an economically viable counter-example to conventional architectures based on the integration of innovative technologies that support the data movement requirements of large-scale informatics applications. This talk will discuss the technology drivers and architectural features of the platform, and present analysis showing the benefits for informatics applications, as well as our traditional science and engineering HPC applications.

On the path to exascale

International Journal of Distributed Systems and Technologies

Alvin, Kenneth F.; Barrett, Brian B.; Brightwell, Ronald B.; Dosanjh, Sudip S.; Geist, Al; Hemmert, Karl S.; Heroux, Michael; Kothe, Doug; Murphy, Richard C.; Nichols, Jeff; Oldfield, Ron A.; Rodrigues, Arun; Vetter, Jeffrey S.

There is considerable interest in achieving a 1000-fold increase in supercomputing power in the next decade, but the challenges are formidable. In this paper, the authors discuss some of the driving science and security applications that require exascale computing (a million trillion operations per second). Key architectural challenges include power, memory, interconnection networks and resilience. The paper summarizes ongoing research aimed at overcoming these hurdles. Topics of interest are architecture-aware and scalable algorithms, system simulation, 3D integration, new approaches to system-directed resilience and new benchmarks. Although significant progress is being made, a broader international program is needed.

Ceci n'est pas une micromachine

Diegert, Carl F.; Yarberry, Victor R.

The image created in reflected light DIC can often be interpreted as a true three-dimensional representation of the surface geometry, provided a clear distinction can be realized between raised and lowered regions in the specimen. It may be helpful if our definition of saliency embraces work on the human visual system (HVS) as well as the more abstract work on saliency, as it is certain that understanding by humans will always stand between recording of a useful signal from all manner of sensors and so-called actionable intelligence. A DARPA/DSO program lays down this requirement in a current program (Kruse 2010): The vision for the Neurotechnology for Intelligence Analysts (NIA) Program is to revolutionize the way that analysts handle intelligence imagery, increasing both the throughput of imagery to the analyst and overall accuracy of the assessments. Current computer-based target detection capabilities cannot process vast volumes of imagery with the speed, flexibility, and precision of the human visual system.

Modeling the fracture of ice sheets on parallel computers

Tuminaro, Raymond S.; Boman, Erik G.

The objective of this project is to investigate the complex fracture of ice and understand its role within larger ice sheet simulations and global climate change. At the present time, ice fracture is not explicitly considered within ice sheet models due in part to large computational costs associated with the accurate modeling of this complex phenomena. However, fracture not only plays an extremely important role in regional behavior but also influences ice dynamics over much larger zones in ways that are currently not well understood. Dramatic illustrations of fracture-induced phenomena most notably include the recent collapse of ice shelves in Antarctica (e.g. partial collapse of the Wilkins shelf in March of 2008 and the diminishing extent of the Larsen B shelf from 1998 to 2002). Other fracture examples include ice calving (fracture of icebergs) which is presently approximated in simplistic ways within ice sheet models, and the draining of supraglacial lakes through a complex network of cracks, a so called ice sheet plumbing system, that is believed to cause accelerated ice sheet flows due essentially to lubrication of the contact surface with the ground. These dramatic changes are emblematic of the ongoing change in the Earth's polar regions and highlight the important role of fracturing ice. To model ice fracture, a simulation capability will be designed centered around extended finite elements and solved by specialized multigrid methods on parallel computers. In addition, appropriate dynamic load balancing techniques will be employed to ensure an approximate equal amount of work for each processor.

Poblano v1.0 : a Matlab toolbox for gradient-based optimization

Dunlavy, Daniel D.; Kolda, Tamara G.

We present Poblano v1.0, a Matlab toolbox for solving gradient-based unconstrained optimization problems. Poblano implements three optimization methods (nonlinear conjugate gradients, limited-memory BFGS, and truncated Newton) that require only first order derivative information. In this paper, we describe the Poblano methods, provide numerous examples on how to use Poblano, and present results of Poblano used in solving problems from a standard test collection of unconstrained optimization problems.
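
Poblano itself is a Matlab toolbox; as a rough Python analogue of the usage pattern it supports (a single user callback returning both the objective value and its gradient, consumed by a first-order method), one might write the following. This uses SciPy's L-BFGS, not Poblano's interface.

    # First-order optimization from a single value-and-gradient callback.
    # (Python/SciPy analogue for illustration; not Poblano's Matlab API.)
    import numpy as np
    from scipy.optimize import minimize

    def rosenbrock(x):
        f = 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2
        g = np.array([-400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
                      200.0 * (x[1] - x[0] ** 2)])
        return f, g          # objective value and gradient from one callback

    result = minimize(rosenbrock, x0=np.array([-1.2, 1.0]),
                      jac=True, method="L-BFGS-B")
    print(result.x)          # -> approximately [1, 1]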

On calculating the equilibrium structure of molecular crystals

Wills, Ann E.; Wixom, Ryan R.; Mattsson, Thomas M.

The difficulty of calculating the ambient properties of molecular crystals, such as the explosive PETN, has long hampered much needed computational investigations of these materials. One reason for the shortcomings is that the exchange-correlation functionals available for Density Functional Theory (DFT) based calculations do not correctly describe the weak intermolecular van der Waals' forces present in molecular crystals. However, this weak interaction also poses other challenges for the computational schemes used. We will discuss these issues in the context of calculations of lattice constants and structure of PETN with a number of different functionals, and also discuss if these limitations can be circumvented for studies at non-ambient conditions.

HPC top 10 InfiniBand Machine : a 3D Torus IB interconnect on Red Sky

Naegle, John H.; Monk, Stephen T.; Schutt, James A.; Doerfler, Douglas W.; Rajan, Mahesh R.

This presentation discusses the following topics: (1) Red Sky Background; (2) 3D Torus Interconnect Concepts; (3) Difficulties of a Torus in IB; (4) New Routing Code for a 3D Torus in IB; (5) Red Sky 3D Torus Implementation; and (6) Managing a Large IB Machine. Computing at Sandia: (1) capability computing - designed for scaling of single large runs, usually proprietary for maximum performance; Red Storm is Sandia's current capability machine. (2) Capacity computing - computing for the masses, with hundreds of jobs and hundreds of users; extreme reliability required; flexibility for a changing workload; Thunderbird will be decommissioned this quarter; Red Sky is our future capacity computing platform; and the Red Mesa machine is for the National Renewable Energy Lab. Red Sky's main themes are: (1) Cheaper - 5X the capacity of Thunderbird at 2/3 the cost, and substantially cheaper per flop than our last large capacity machine purchase; (2) Leaner - lower operational costs, three security environments via a modular fabric, expandable/upgradeable/extensible, and designed for a 6-year life cycle; and (3) Greener - 15% less power (1/6th the power per flop), 40% less water (5 million gallons saved annually), 10X better cooling efficiency, and a 4X denser footprint.

Combining dynamical decoupling with optimal control for improved QIP

Carroll, Malcolm; Witzel, Wayne W.

Constructing high-fidelity control pulses that are robust to control and system/environment fluctuations is a crucial objective for quantum information processing (QIP). We combine dynamical decoupling (DD) with optimal control (OC) to identify control pulses that achieve this objective numerically. Previous DD work has shown that general errors up to (but not including) third order can be removed from {pi}- and {pi}/2-pulses without concatenation. By systematically integrating DD and OC, we are able to increase pulse fidelity beyond this limit. Our hybrid method of quantum control incorporates a newly-developed algorithm for robust OC, providing a nested DD-OC approach to generate robust controls. Motivated by solid-state QIP, we also incorporate relevant experimental constraints into this DD-OC formalism. To demonstrate the advantage of our approach, the resulting quantum controls are compared to previous DD results in open and uncertain model systems.

Solid-liquid phase coexistence of alkali nitrates from molecular dynamics simulations

Jayaraman, Saivenkataraman J.

Alkali nitrate eutectic mixtures are finding application as industrial heat transfer fluids in concentrated solar power generation systems. An important property for such applications is the melting point, or phase coexistence temperature. We have computed melting points for lithium, sodium and potassium nitrate from molecular dynamics simulations using a recently developed method, which uses thermodynamic integration to compute the free energy difference between the solid and liquid phases. The computed melting point for NaNO3 was within 15K of its experimental value, while for LiNO3 and KNO3, the computed melting points were within 100K of the experimental values [4]. We are currently extending the approach to calculate melting temperatures for binary mixtures of lithium and sodium nitrate.

A brief parallel I/O tutorial

Ward, Harry L.

This document provides common best practices for the efficient utilization of parallel file systems for analysts and application developers. A multi-program, parallel supercomputer is able to provide effective compute power by aggregating a host of lower-power processors using a network. The idea, in general, is that one either constructs the application to distribute parts to the different nodes and processors available and then collects the result (a parallel application), or one launches a large number of small jobs, each doing similar work on different subsets (a campaign). The I/O system on these machines is usually implemented as a tightly-coupled, parallel application itself. It is providing the concept of a 'file' to the host applications. The 'file' is an addressable store of bytes and that address space is global in nature. In essence, it is providing a global address space. Beyond the simple reality that the I/O system is normally composed of a small, less capable, collection of hardware, that concept of a global address space will cause problems if not very carefully utilized. How much of a problem and the ways in which those problems manifest will be different, but that it is problem prone has been well established. Worse, the file system is a shared resource on the machine - a system service. What an application does when it uses the file system impacts all users. It is not the case that some portion of the available resource is reserved. Instead, the I/O system responds to requests by scheduling and queuing based on instantaneous demand. Using the system well contributes to the overall throughput on the machine. From a solely self-centered perspective, using it well reduces the time that the application or campaign is subject to impact by others. The developer's goal should be to accomplish I/O in a way that minimizes interaction with the I/O system, maximizes the amount of data moved per call, and provides the I/O system the most information about the I/O transfer per request.
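
The guidance to minimize interaction with the I/O system and maximize data moved per call can be illustrated with a simple aggregation pattern: gather small per-rank buffers to one rank and issue a single large write. The mpi4py sketch below is illustrative; the buffer size and output path are hypothetical.

    # Aggregate many small per-rank buffers into one large write on rank 0.
    # (mpi4py illustration; sizes and the output path are hypothetical.)
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    local_chunk = np.full(1000, rank, dtype=np.float64)   # this rank's data

    chunks = comm.gather(local_chunk, root=0)             # aggregate to rank 0
    if rank == 0:
        data = np.concatenate(chunks)
        with open("results.bin", "wb") as f:              # one large, contiguous request
            f.write(data.tobytes())

Launched under mpiexec with several ranks, this issues a single large request instead of one small request per rank, which is the behavior the tutorial recommends.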

Porting LAMMPS to GPUs

Brown, William M.; Crozier, Paul C.; Plimpton, Steven J.

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. LAMMPS has potentials for soft materials (biomolecules, polymers) and solid-state materials (metals, semiconductors) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale. LAMMPS runs on single processors or in parallel using message-passing techniques and a spatial-decomposition of the simulation domain. The code is designed to be easy to modify or extend with new functionality.

Foundational development of an advanced nuclear reactor integrated safety code

Schmidt, Rodney C.; Hooper, Russell H.; Humphries, Larry; Lorber, Alfred L.; Spotz, William S.

This report describes the activities and results of a Sandia LDRD project whose objective was to develop and demonstrate foundational aspects of a next-generation nuclear reactor safety code that leverages advanced computational technology. The project scope was directed towards the systems-level modeling and simulation of an advanced, sodium cooled fast reactor, but the approach developed has a more general applicability. The major accomplishments of the LDRD are centered around the following two activities. (1) The development and testing of LIME, a Lightweight Integrating Multi-physics Environment for coupling codes that is designed to enable both 'legacy' and 'new' physics codes to be combined and strongly coupled using advanced nonlinear solution methods. (2) The development and initial demonstration of BRISC, a prototype next-generation nuclear reactor integrated safety code. BRISC leverages LIME to tightly couple the physics models in several different codes (written in a variety of languages) into one integrated package for simulating accident scenarios in a liquid sodium cooled 'burner' nuclear reactor. Other activities and accomplishments of the LDRD include (a) further development, application and demonstration of the 'non-linear elimination' strategy to enable physics codes that do not provide residuals to be incorporated into LIME, (b) significant extensions of the RIO CFD code capabilities, (c) complex 3D solid modeling and meshing of major fast reactor components and regions, and (d) an approach for multi-physics coupling across non-conformal mesh interfaces.
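
The kind of code coupling an integrating environment like LIME orchestrates can be sketched, in its simplest form, as a fixed-point (Picard) iteration between two single-physics solves that exchange interface data until self-consistent. The toy thermal/hydraulic relations below are purely illustrative and are not LIME's API or the BRISC physics.

    # Schematic fixed-point (Picard) coupling of two single-physics solvers.
    # The stand-in "physics" (two coupled scalar balances) is illustrative only.
    def thermal_solve(flow_rate):
        # stand-in for a thermal code: temperature depends on coolant flow
        return 900.0 / (1.0 + 0.5 * flow_rate)

    def hydraulic_solve(temperature):
        # stand-in for a hydraulics code: flow depends on temperature
        return 0.002 * temperature

    def picard_couple(tol=1e-8, max_iters=100):
        temperature, flow = 600.0, 1.0
        for it in range(max_iters):
            new_temperature = thermal_solve(flow)
            new_flow = hydraulic_solve(new_temperature)
            if abs(new_temperature - temperature) < tol and abs(new_flow - flow) < tol:
                return new_temperature, new_flow, it
            temperature, flow = new_temperature, new_flow
        raise RuntimeError("coupled iteration did not converge")

    print(picard_couple())    # converges to a self-consistent (T, flow) pair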

A framework for reduced order modeling with mixed moment matching and peak error objectives

SIAM Journal on Scientific Computing

Santarelli, Keith R.

We examine a new method of producing reduced order models for LTI systems which attempts to minimize a bound on the peak error between the original and reduced order models subject to a bound on the peak value of the input. The method, which can be implemented by solving a set of linear programming problems that are parameterized via a single scalar quantity, is able to minimize an error bound subject to a number of moment matching constraints. Moreover, because all optimization is performed in the time domain, the method can also be used to perform model reduction for infinite dimensional systems, rather than being restricted to finite order state space descriptions. We begin by contrasting the method we present here with two classes of standard model reduction algorithms, namely, moment matching algorithms and singular value-based methods. After motivating the class of reduction tools we propose, we describe the algorithm (which minimizes the L1 norm of the difference between the original and reduced order impulse responses) and formulate the corresponding linear programming problem that is solved during each iteration of the algorithm. We then prove that, for a certain class of LTI systems, the method we propose can be used to produce reduced order models of arbitrary accuracy even when the original system is infinite dimensional. We then show how to incorporate moment matching constraints into the basic error bound minimization algorithm, and present three examples which utilize the techniques described herein. We conclude with some comments on extensions to multi-input, multi-output systems, as well as some general comments for future work. © 2010 Society for Industrial and Applied Mathematics.

A switched state feedback law for the stabilization of LTI systems

Proceedings of the 2010 American Control Conference, ACC 2010

Santarelli, Keith R.

Inspired by prior work in the design of switched feedback controllers for second order systems, we develop a switched state feedback control law for the stabilization of LTI systems of arbitrary dimension. The control law operates by switching between two static gain vectors in such a way that the state trajectory is driven onto a stable n - 1 dimensional hyperplane (where n represents the system dimension). We begin by briefly examining relevant geometric properties of the phase portraits in the case of two-dimensional systems and show how these geometric properties can be expressed as algebraic constraints on the switched vector fields that are applicable to LTI systems of arbitrary dimension. We then describe an explicit procedure for designing stabilizing controllers and illustrate the closed-loop transient performance via two examples. © 2010 AACC.

Advanced I/O for large-scale scientific applications

Oldfield, Ron A.

As scientific simulations scale to use petascale machines and beyond, the data volumes generated pose a dual problem. First, with increasing machine sizes, the careful tuning of IO routines becomes more and more important to keep the time spent in IO acceptable. It is not uncommon, for instance, to have 20% of an application's runtime spent performing IO in a 'tuned' system. Careful management of the IO routines can move that to 5% or even less in some cases. Second, the data volumes are so large, on the order of 10s to 100s of TB, that trying to discover the scientifically valid contributions requires assistance at runtime to both organize and annotate the data. Waiting for offline processing is not feasible due both to the impact on the IO system and the time required. To reduce this load and improve the ability of scientists to use the large amounts of data being produced, new techniques for data management are required. First, there is a need for techniques for efficient movement of data from the compute space to storage. These techniques should understand the underlying system infrastructure and adapt to changing system conditions. Technologies include aggregation networks, data staging nodes for a closer parity to the IO subsystem, and autonomic IO routines that can detect system bottlenecks and choose different approaches, such as splitting the output into multiple targets, staggering output processes. Such methods must be end-to-end, meaning that even with properly managed asynchronous techniques, it is still essential to properly manage the later synchronous interaction with the storage system to maintain acceptable performance. Second, for the data being generated, annotations and other metadata must be incorporated to help the scientist understand output data for the simulation run as a whole, to select data and data features without concern for what files or other storage technologies were employed. All of these features should be attained while maintaining a simple deployment for the science code and eliminating the need for allocation of additional computational resources.

Lightweight storage and overlay networks for fault tolerance

Oldfield, Ron A.

The next generation of capability-class, massively parallel processing (MPP) systems is expected to have hundreds of thousands to millions of processors. In such environments, it is critical to have fault-tolerance mechanisms, including checkpoint/restart, that scale with the size of applications and the percentage of the system on which the applications execute. For application-driven, periodic checkpoint operations, the state of the art does not provide a scalable solution. For example, on today's massive-scale systems that execute applications which consume most of the memory of the employed compute nodes, checkpoint operations generate I/O that consumes nearly 80% of the total I/O usage. Motivated by this observation, this project aims to improve I/O performance for application-directed checkpoints through the use of lightweight storage architectures and overlay networks. Lightweight storage provides direct access to underlying storage devices. Overlay networks provide caching and processing capabilities in the compute-node fabric. The combination has the potential to significantly reduce I/O overhead for large-scale applications. This report describes our combined efforts to model and understand overheads for application-directed checkpoints, as well as the implementation and performance analysis of a checkpoint service that uses available compute nodes as a network cache for checkpoint operations.

More Details

Peridynamic theory of solid mechanics

Proposed for publication in Advances in Applied Mechanics.

Silling, Stewart A.; Lehoucq, Richard B.

The peridynamic theory of mechanics attempts to unite the mathematical modeling of continuous media, cracks, and particles within a single framework. It does this by replacing the partial differential equations of the classical theory of solid mechanics with integral or integro-differential equations. These equations are based on a model of internal forces within a body in which material points interact with each other directly over finite distances. The classical theory of solid mechanics is based on the assumption of a continuous distribution of mass within a body. It further assumes that all internal forces are contact forces that act across zero distance. The mathematical description of a solid that follows from these assumptions relies on partial differential equations that additionally assume sufficient smoothness of the deformation for the PDEs to make sense in either their strong or weak forms. The classical theory has been demonstrated to provide a good approximation to the response of real materials down to small length scales, particularly in single crystals, provided these assumptions are met. Nevertheless, technology increasingly involves the design and fabrication of devices at smaller and smaller length scales, even interatomic dimensions. Therefore, it is worthwhile to investigate whether the classical theory can be extended to permit relaxed assumptions of continuity, to include the modeling of discrete particles such as atoms, and to allow the explicit modeling of nonlocal forces that are known to strongly influence the behavior of real materials.
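
For reference, the integral equation that replaces the classical balance of linear momentum in the bond-based peridynamic theory has the standard form below; the notation is the generic one used in the literature rather than anything specific to this article.

```latex
% Bond-based peridynamic equation of motion (standard form in the
% literature; notation is generic, not specific to this article):
\[
  \rho(\mathbf{x})\,\ddot{\mathbf{u}}(\mathbf{x},t)
    = \int_{\mathcal{H}_{\mathbf{x}}}
        \mathbf{f}\!\left(\mathbf{u}(\mathbf{x}',t)-\mathbf{u}(\mathbf{x},t),\,
                          \mathbf{x}'-\mathbf{x}\right) dV_{\mathbf{x}'}
      + \mathbf{b}(\mathbf{x},t),
\]
% where \mathcal{H}_{\mathbf{x}} is the finite neighborhood (horizon) over
% which material points interact and \mathbf{b} is a body force density.
```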

More Details

Elastic wave propagation in variable media using a discontinuous Galerkin method

Society of Exploration Geophysicists International Exposition and 80th Annual Meeting 2010, SEG 2010

Smith, Thomas M.; Collis, Samuel S.; Ober, Curtis C.; Overfelt, James R.; Schwaiger, Hans F.

Motivated by the needs of seismic inversion and building on our prior experience for fluid-dynamics systems, we present a high-order discontinuous Galerkin (DG) Runge-Kutta method applied to isotropic, linearized elasto-dynamics. Unlike other DG methods recently presented in the literature, our method allows for inhomogeneous material variations within each element, which enables representation of realistic earth models, a feature critical for future use in seismic inversion. Likewise, our method supports curved elements and hybrid meshes that include both simplicial and nonsimplicial elements. We demonstrate the capabilities of this method through a series of numerical experiments including hybrid mesh discretizations of the Marmousi2 model as well as a modified Marmousi2 model with an oscillatory ocean bottom that is exactly captured by our discretization.

More Details

A cognitive-consistency based model of population wide attitude change

AAAI Fall Symposium - Technical Report

Lakkaraju, Kiran L.; Speed, Ann S.

Attitudes play a significant role in determining how individuals process information and behave. In this paper we have developed a new computational model of population wide attitude change that captures the social level: how individuals interact and communicate information, and the cognitive level: how attitudes and concepts interact with each other. The model captures the cognitive aspect by representing each individual as a parallel constraint satisfaction network. The dynamics of this model are explored through a simple attitude change experiment where we vary the social network and distribution of attitudes in a population. Copyright © 2010, Association for the Advancement of Artificial Intelligence. All rights reserved.
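
A parallel constraint satisfaction network can be illustrated with the minimal, generic sketch below: nodes carry activations in [-1, 1], weights encode consistency or inconsistency between attitudes and concepts, and activations are relaxed iteratively. The weights and update rule here are hypothetical stand-ins, not the authors' model.

```python
# Minimal, generic sketch of a parallel constraint satisfaction network:
# nodes have activations in [-1, 1]; positive weights encode consistency,
# negative weights inconsistency.  Illustrative only.
import numpy as np

def relax(W, a0, rate=0.1, iters=200):
    """Iteratively settle activations toward a mutually consistent state."""
    a = a0.copy()
    for _ in range(iters):
        net = W @ a                          # support each node receives
        a = np.clip(a + rate * net, -1.0, 1.0)
    return a

# Three-node example: nodes 0 and 1 are mutually consistent,
# node 2 conflicts with both (hypothetical weights).
W = np.array([[ 0.0,  0.5, -0.4],
              [ 0.5,  0.0, -0.4],
              [-0.4, -0.4,  0.0]])
print(relax(W, np.array([0.1, 0.0, 0.3])))
```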

More Details

Simulation of dynamic fracture using peridynamics, finite element modeling, and contact

ASME International Mechanical Engineering Congress and Exposition, Proceedings (IMECE)

Littlewood, David J.

Peridynamics is a nonlocal extension of classical solid mechanics that allows for the modeling of bodies in which discontinuities occur spontaneously. Because the peridynamic expression for the balance of linear momentum does not contain spatial derivatives and is instead based on an integral equation, it is well suited for modeling phenomena involving spatial discontinuities such as crack formation and fracture. In this study, both peridynamics and classical finite element analysis are applied to simulate material response under dynamic blast loading conditions. A combined approach is utilized in which the portion of the simulation modeled with peridynamics interacts with the finite element portion of the model via a contact algorithm. The peridynamic portion of the analysis utilizes an elastic-plastic constitutive model with linear hardening. The peridynamic interface to the constitutive model is based on the calculation of an approximate deformation gradient, requiring the suppression of possible zero-energy modes. The classical finite element portion of the model utilizes a Johnson-Cook constitutive model. Simulation results are validated by direct comparison to expanding tube experiments. The coupled modeling approach successfully captures material response at the surface of the tube and the emerging fracture pattern. Copyright © 2010 by ASME.

More Details

Formulation and optimization of robust sensor placement problems for drinking water contamination warning systems

Journal of Infrastructure Systems

Watson, Jean P.; Murray, Regan; Hart, William E.

The sensor placement problem in contamination warning system design for municipal water distribution networks involves maximizing the protection level afforded by limited numbers of sensors, typically quantified as the expected impact of a contamination event; the issue of how to mitigate against high-consequence events is either handled implicitly or ignored entirely. Consequently, expected-case sensor placements run the risk of failing to protect against high-consequence 9/11-style attacks. In contrast, robust sensor placements address this concern by focusing strictly on high-consequence events and placing sensors to minimize the impact of these events. We introduce several robust variations of the sensor placement problem, distinguished by how they quantify the potential damage due to high-consequence events. We explore the nature of robust versus expected-case sensor placements on three real-world large-scale distribution networks. We find that robust sensor placements can yield large reductions in the number and magnitude of high-consequence events, with only modest increases in expected impact. The ability to trade off between robust and expected-case impacts is a key unexplored dimension in contamination warning system design. © 2009 ASCE.
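
Schematically, the distinction between expected-case and robust placements can be written as below in generic sensor-placement notation; this is an illustrative sketch, not the paper's exact formulation.

```latex
% Schematic objectives only (generic notation, not the paper's exact
% formulation): a indexes contamination events with weight \alpha_a,
% d_{ai} is the impact of event a when first detected at location i, and
% x_{ai} selects that detection location subject to a sensor budget.
\[
  \text{expected-case:}\quad
    \min \sum_{a \in A} \alpha_a \sum_{i \in L_a} d_{ai}\, x_{ai},
  \qquad
  \text{worst-case (robust):}\quad
    \min \; \max_{a \in A} \sum_{i \in L_a} d_{ai}\, x_{ai}.
\]
```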

More Details

Probabilistic methods in model validation

Conference Proceedings of the Society for Experimental Mechanics Series

Paez, Thomas L.; Swiler, Laura P.

Extensive experimentation over the past decade has shown that fabricated physical systems that are intended to be identical, and are nominally identical, in fact, differ from one another, and sometimes substantially. This fact makes it difficult to validate a mathematical model for any system and results in the requirement to characterize physical system behavior using the tools of uncertainty quantification. Further, because of the existence of system, component, and material uncertainty, the mathematical models of these elements sometimes seek to reflect the uncertainty. This presentation introduces some of the methods of probability and statistics, and shows how they can be applied in engineering modeling and data analysis. The ideas of randomness and some basic means for measuring and modeling it are presented. The ideas of random experiment, random variable, mean, variance and standard deviation, and probability distribution are introduced. The ideas are introduced in the framework of a practical, yet simple, example; measured data are included. This presentation is the third in a sequence of tutorial discussions on mathematical model validation. The example introduced here is also used in later presentations. © 2009 Society for Experimental Mechanics Inc.
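
The basic sample statistics the tutorial introduces can be computed as in the short sketch below; the measurement values are hypothetical stand-ins for data on nominally identical units.

```python
# Minimal sketch of the sample statistics introduced in the tutorial,
# applied to hypothetical repeated measurements of nominally identical units.
import numpy as np

measurements = np.array([9.8, 10.1, 10.4, 9.7, 10.2, 9.9])  # hypothetical data
mean = measurements.mean()
var = measurements.var(ddof=1)       # unbiased sample variance
std = measurements.std(ddof=1)       # sample standard deviation
print(f"mean={mean:.3f}, variance={var:.3f}, std={std:.3f}")
```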

More Details

Density functional theory (DFT) simulations of shocked liquid xenon

AIP Conference Proceedings

Mattsson, Thomas M.; Magyar, Rudolph J.

Xenon is not only a technologically important element used in laser technologies and jet propulsion, but it is also one of the most accessible materials in which to study the metal-insulator transition with increasing pressure. Because of its closed shell electronic configuration, xenon is often assumed to be chemically inert, interacting almost entirely through the van der Waals interaction, and at liquid density, is typically modeled well using Lennard-Jones potentials. However, such modeling has a limited range of validity as xenon is known to form compounds under normal conditions and likely exhibits considerably more chemistry at higher densities when hybridization of occupied orbitals becomes significant. We present DFT-MD simulations of shocked liquid xenon with the goal of developing an improved equation of state. The calculated Hugoniot to 2 MPa compares well with available experimental shock data. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000. © 2009 American Institute of Physics.

More Details

Label-invariant mesh quality metrics

Proceedings of the 18th International Meshing Roundtable, IMR 2009

Knupp, Patrick K.

Mappings from a master element to the physical mesh element, in conjunction with local metrics such as those appearing in the Target-matrix paradigm, are used to measure quality at points within an element. The approach is applied to both linear and quadratic triangular elements; this enables, for example, one to measure quality within a quadratic finite element. Quality within an element may also be measured on a set of symmetry points, leading to so-called symmetry metrics. An important issue having to do with the labeling of the element vertices is relevant to mesh quality tools such as Verdict and Mesquite. Certain quality measures like area, volume, and shape should be label-invariant, while others such as aspect ratio and orientation should not. It is shown that local metrics whose Jacobian matrix is non-constant are label-invariant only at the center of the element, while symmetry metrics can be label-invariant anywhere within the element, provided the reference element is properly restricted.

More Details

Analysis of micromixers and biocidal coatings on water-treatment membranes to minimize biofouling

Altman, Susan J.; Clem, Paul G.; Cook, Adam W.; Hart, William E.; Hibbs, Michael R.; Ho, Clifford K.; Jones, Howland D.; Sun, Amy C.; Webb, Stephen W.

Biofouling of water-treatment membranes, the unwanted growth of biofilms on a surface, negatively impacts desalination and water treatment. With biofouling there is a decrease in permeate production, degradation of permeate water quality, and an increase in energy expenditure due to the increased cross-flow pressure needed. To date, a universally successful and cost-effective method for controlling biofouling has not been implemented. The overall goal of the work described in this report was to use high-performance computing to direct polymer, material, and biological research to create the next generation of water-treatment membranes. Both physical (micromixers - UV-curable epoxy traces printed on the surface of a water-treatment membrane that promote chaotic mixing) and chemical (quaternary ammonium groups) modifications of the membranes for the purpose of increasing resistance to biofouling were evaluated. Creation of low-cost, efficient water-treatment membranes helps assure the availability of fresh water for human use, a growing need in both the U. S. and the world.

More Details

Summary of the CSRI Workshop on Combinatorial Algebraic Topology (CAT): Software, Applications, & Algorithms

Mitchell, Scott A.; Bennett, Janine C.; Day, David M.

This report summarizes the Combinatorial Algebraic Topology: software, applications & algorithms workshop (CAT Workshop). The workshop was sponsored by the Computer Science Research Institute of Sandia National Laboratories. It was organized by CSRI staff members Scott Mitchell and Shawn Martin. It was held in Santa Fe, New Mexico, August 29-30. The CAT Workshop website has links to some of the talk slides and other information, http://www.cs.sandia.gov/CSRI/Workshops/2009/CAT/index.html. The purpose of the report is to summarize the discussions and recap the sessions. There is a special emphasis on technical areas that are ripe for further exploration, and the plans for follow-up amongst the workshop participants. The intended audiences are the workshop participants, other researchers in the area, and the workshop sponsors.

More Details

Host suppression and bioinformatics for sequence-based characterization of unknown pathogens

Misra, Milind; Patel, Kamlesh P.; Kaiser, Julia N.; Meagher, Robert M.; Branda, Steven B.; Schoeniger, Joseph S.

Bioweapons and emerging infectious diseases pose formidable and growing threats to our national security. Rapid advances in biotechnology and the increasing efficiency of global transportation networks virtually guarantee that the United States will face potentially devastating infectious disease outbreaks caused by novel ('unknown') pathogens either intentionally or accidentally introduced into the population. Unfortunately, our nation's biodefense and public health infrastructure is primarily designed to handle previously characterized ('known') pathogens. While modern DNA assays can identify known pathogens quickly, identifying unknown pathogens currently depends upon slow, classical microbiological methods of isolation and culture that can take weeks to produce actionable information. In many scenarios that delay would be costly, in terms of casualties and economic damage; indeed, it can mean the difference between a manageable public health incident and a full-blown epidemic. To close this gap in our nation's biodefense capability, we will develop, validate, and optimize a system to extract nucleic acids from unknown pathogens present in clinical samples drawn from infected patients. This system will extract nucleic acids from a clinical sample and amplify pathogen and specific host-response nucleic acid sequences. These sequences will then be suitable for ultra-high-throughput sequencing (UHTS) carried out by a third party. The data generated from UHTS will then be processed through a new data assimilation and bioinformatic analysis pipeline that will allow us to characterize an unknown pathogen in hours to days instead of weeks to months. Our methods will require no a priori knowledge of the pathogen, and no isolation or culturing; they will therefore circumvent many of the major roadblocks confronting a clinical microbiologist or virologist when presented with an unknown or engineered pathogen.

More Details

Xyce parallel electronic simulator : users' guide. Version 5.1

Keiter, Eric R.; Mei, Ting M.; Russo, Thomas V.; Pawlowski, Roger P.; Schiek, Richard S.; Santarelli, Keith R.; Coffey, Todd S.; Thornquist, Heidi K.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible range of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

More Details

Xyce™ Parallel Electronic Simulator: Reference Guide, Version 5.1

Keiter, Eric R.; Mei, Ting M.; Russo, Thomas V.; Pawlowski, Roger P.; Schiek, Richard S.; Santarelli, Keith R.; Coffey, Todd S.; Thornquist, Heidi K.

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users’ Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users’ Guide.

More Details

Modeling aspects of human memory for scientific study

Bernard, Michael L.; Morrow, James D.; Taylor, Shawn E.; Verzi, Stephen J.; Vineyard, Craig M.

Working with leading experts in the field of cognitive neuroscience and computational intelligence, SNL has developed a computational architecture that represents neurocognitive mechanisms associated with how humans remember experiences in their past. The architecture represents how knowledge is organized and updated through information from individual experiences (episodes) via the cortical-hippocampal declarative memory system. We compared the simulated behavioral characteristics with those of humans measured under well-established experimental standards, controlling for unmodeled aspects of human processing, such as perception. We used this knowledge to create robust simulations of human memory behaviors that should help move the scientific community closer to understanding how humans remember information. These behaviors were experimentally validated against actual human subjects, and the results were published. An important outcome of the validation process will be the joining of specific experimental testing procedures from the field of neuroscience with computational representations from the field of cognitive modeling and simulation.

More Details

Crossing the mesoscale no-mans land via parallel kinetic Monte Carlo

Plimpton, Steven J.; Battaile, Corbett C.; Chandross, M.; Holm, Elizabeth A.; Thompson, Aidan P.; Tikare, Veena T.; Wagner, Gregory J.; Webb, Edmund B.; Zhou, Xiaowang Z.

The kinetic Monte Carlo method and its variants are powerful tools for modeling materials at the mesoscale, meaning at length and time scales in between the atomic and continuum. We have completed a 3 year LDRD project with the goal of developing a parallel kinetic Monte Carlo capability and applying it to materials modeling problems of interest to Sandia. In this report we give an overview of the methods and algorithms developed, and describe our new open-source code called SPPARKS, for Stochastic Parallel PARticle Kinetic Simulator. We also highlight the development of several Monte Carlo models in SPPARKS for specific materials modeling applications, including grain growth, bubble formation, diffusion in nanoporous materials, defect formation in erbium hydrides, and surface growth and evolution.
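
The core of a rejection-free kinetic Monte Carlo step (the Gillespie/BKL scheme that underlies codes of this kind) can be sketched as below; the event list and rates are hypothetical placeholders, and this is not SPPARKS code.

```python
# Minimal sketch of one rejection-free kinetic Monte Carlo step
# (Gillespie/BKL scheme).  The events and rates are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def kmc_step(rates, t):
    """Pick one event with probability proportional to its rate and
    advance time by an exponentially distributed increment."""
    total = rates.sum()
    event = np.searchsorted(np.cumsum(rates), rng.uniform(0.0, total))
    dt = -np.log(1.0 - rng.uniform()) / total
    return event, t + dt

rates = np.array([1.0, 0.5, 2.0, 0.1])   # per-event rates (hypothetical)
event, t = kmc_step(rates, t=0.0)
print(f"executed event {event}, time advanced to {t:.4f}")
```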

More Details

Evaluation of the impact chip multiprocessors have on SNL application performance

Doerfler, Douglas W.

This report describes trans-organizational efforts to investigate the impact of chip multiprocessors (CMPs) on the performance of important Sandia application codes. The impact of CMPs on the performance and applicability of Sandia's system software was also investigated. The goal of the investigation was to make algorithmic and architectural recommendations for next generation platform acquisitions.

More Details

Decision support for integrated water-energy planning

Tidwell, Vincent C.; Kobos, Peter H.; Malczynski, Leonard A.; Hart, William E.; Castillo, Cesar R.

Currently, electrical power generation uses about 140 billion gallons of water per day, accounting for over 39% of all freshwater withdrawals and thus competing with irrigated agriculture as the leading user of water. Coupled to this water use is the required pumping, conveyance, treatment, storage, and distribution of the water, which requires on average 3% of all electric power generated. While water and energy use are tightly coupled, planning and management of these fundamental resources are rarely treated in an integrated fashion. Toward this need, a decision support framework has been developed that targets the shared needs of energy and water producers, resource managers, regulators, and decision makers at the federal, state and local levels. The framework integrates analysis and optimization capabilities to identify trade-offs and 'best' alternatives among a broad list of energy/water options and objectives. The decision support framework is formulated in a modular architecture, facilitating tailored analyses over different geographical regions and scales (e.g., national, state, county, watershed, NERC region). An interactive interface allows direct control of the model and access to real-time results displayed as charts, graphs and maps. Ultimately, this open and interactive modeling framework provides a tool for evaluating competing policy and technical options relevant to the energy-water nexus.

More Details

Increasing fault resiliency in a message-passing environment

Ferreira, Kurt; Oldfield, Ron A.; Stearley, Jon S.; Laros, James H.; Pedretti, Kevin T.T.; Brightwell, Ronald B.

Petaflops systems will have tens to hundreds of thousands of compute nodes, which increases the likelihood of faults. Applications use checkpoint/restart to recover from these faults, but even under ideal conditions, applications running on more than 30,000 nodes will likely spend more than half of their total run time saving checkpoints, restarting, and redoing work that was lost. We created a library that performs redundant computations on additional nodes allocated to the application. An active node and its redundant partner form a node bundle which will only fail, and cause an application restart, when both nodes in the bundle fail. The goal of this library is to learn whether this can be done entirely at the user level, what requirements this library places on a Reliability, Availability, and Serviceability (RAS) system, and what its impact on performance and run time is. We find that our redundant MPI layer library imposes a relatively modest performance penalty for applications, but that it greatly reduces the number of application interrupts. This reduction in interrupts leads to huge savings in restart and rework time. For large-scale applications the savings compensate for the performance loss and the additional nodes required for redundant computations.
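
A back-of-the-envelope sketch of why bundles help: assuming independent node failures with probability p over a run, a bundle fails only when both of its nodes fail, so the chance of an application interrupt drops sharply. The numbers below are illustrative and are not taken from the report.

```python
# Illustrative calculation of application-interrupt probability with and
# without node bundles, assuming independent node failures with
# probability p over a run.  Numbers are hypothetical.
def interrupt_probability(n_units, p, redundant):
    per_unit = p * p if redundant else p      # failure prob of one bundle/node
    return 1.0 - (1.0 - per_unit) ** n_units

N, p = 30_000, 1e-4
print("no redundancy:", interrupt_probability(N, p, redundant=False))
print("node bundles: ", interrupt_probability(N, p, redundant=True))
```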

More Details

Graph algorithms in the titan toolkit

McLendon, William C.

Graph algorithms are a key component in a wide variety of intelligence analysis activities. The Graph-Based Informatics for Non-Proliferation and Counter-Terrorism project addresses the critical need of making these graph algorithms accessible to Sandia analysts in a manner that is both intuitive and effective. Specifically we describe the design and implementation of an open source toolkit for doing graph analysis, informatics, and visualization that provides Sandia with novel analysis capability for non-proliferation and counter-terrorism.

More Details

Final report LDRD project 105816 : model reduction of large dynamic systems with localized nonlinearities

Lehoucq, Richard B.; Dohrmann, Clark R.; Segalman, Daniel J.

Advanced computing hardware and software written to exploit massively parallel architectures greatly facilitate the computation of extremely large problems. On the other hand, these tools, though enabling higher fidelity models, have often resulted in much longer run-times and turn-around-times in providing answers to engineering problems. The impediments include smaller elements and consequently smaller time steps, much larger systems of equations to solve, and the inclusion of nonlinearities that had been ignored in days when lower fidelity models were the norm. The research effort reported here focuses on accelerating the analysis process for structural dynamics through combinations of model reduction and mitigation of some factors that lead to over-meshing.

More Details

Performance of a parallel algebraic multilevel preconditioner for stabilized finite element semiconductor device modeling

Journal of Computational Physics

Lin, Paul T.; Shadid, John N.; Sala, Marzio; Tuminaro, Raymond S.; Hennigan, Gary L.; Hoekstra, Robert J.

In this study results are presented for the large-scale parallel performance of an algebraic multilevel preconditioner for solution of the drift-diffusion model for semiconductor devices. The preconditioner is the key numerical procedure determining the robustness, efficiency and scalability of the fully-coupled Newton-Krylov based, nonlinear solution method that is employed for this system of equations. The coupled system is comprised of a source term dominated Poisson equation for the electric potential, and two convection-diffusion-reaction type equations for the electron and hole concentration. The governing PDEs are discretized in space by a stabilized finite element method. Solution of the discrete system is obtained through a fully-implicit time integrator, a fully-coupled Newton-based nonlinear solver, and a restarted GMRES Krylov linear system solver. The algebraic multilevel preconditioner is based on an aggressive coarsening graph partitioning of the nonzero block structure of the Jacobian matrix. Representative performance results are presented for various choices of multigrid V-cycles and W-cycles and parameter variations for smoothers based on incomplete factorizations. Parallel scalability results are presented for solution of up to 10^8 unknowns on 4096 processors of a Cray XT3/4 and an IBM POWER eServer system. © 2009 Elsevier Inc. All rights reserved.
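
The overall solver structure can be sketched as a Newton iteration with a preconditioned, restarted GMRES inner solve, as below; an incomplete-LU factorization stands in for the algebraic multilevel preconditioner, and the residual and Jacobian callbacks are hypothetical.

```python
# Minimal sketch of a Newton-Krylov iteration with a preconditioned,
# restarted GMRES inner solve.  ILU stands in for the paper's algebraic
# multilevel preconditioner; residual/jacobian are user-supplied callbacks.
import numpy as np
import scipy.sparse.linalg as spla

def newton_krylov(residual, jacobian, u0, newton_tol=1e-8, max_newton=20):
    u = u0.copy()
    for _ in range(max_newton):
        r = residual(u)
        if np.linalg.norm(r) < newton_tol:
            break
        J = jacobian(u)                                   # sparse Jacobian
        ilu = spla.spilu(J.tocsc())                       # preconditioner stand-in
        M = spla.LinearOperator(J.shape, matvec=ilu.solve)
        du, _ = spla.gmres(J, -r, M=M, restart=50)        # inner Krylov solve
        u = u + du
    return u
```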

More Details

A comparison of Lagrangian/Eulerian approaches for tracking the kinematics of high deformation solid motion

Ames, Thomas L.; Robinson, Allen C.

The modeling of solids is most naturally placed within a Lagrangian framework because it requires constitutive models which depend on knowledge of the original material orientations and subsequent deformations. Detailed kinematic information is needed to ensure material frame indifference which is captured through the deformation gradient F. Such information can be tracked easily in a Lagrangian code. Unfortunately, not all problems can be easily modeled using Lagrangian concepts due to severe distortions in the underlying motion. Either a Lagrangian/Eulerian or a pure Eulerian modeling framework must be introduced. We discuss and contrast several Lagrangian/Eulerian approaches for keeping track of the details of material kinematics.
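
For reference, the deformation gradient mentioned above is defined, in standard continuum-mechanics notation, as follows.

```latex
% Standard definition of the deformation gradient for a motion
% x = \varphi(X, t) mapping reference coordinates X to current
% coordinates x (generic notation, not specific to this report):
\[
  \mathbf{F}(\mathbf{X},t)
    = \frac{\partial \varphi(\mathbf{X},t)}{\partial \mathbf{X}},
  \qquad J = \det \mathbf{F} > 0 .
\]
```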

More Details

Investigating methods of supporting dynamically linked executables on high performance computing platforms

Laros, James H.; Kelly, Suzanne M.; Levenhagen, Michael J.; Pedretti, Kevin T.T.

Shared libraries have become ubiquitous and are used to achieve great resource efficiencies on many platforms. The same properties that enable efficiencies on time-shared computers and convenience on small clusters prove to be great obstacles to scalability on large clusters and High Performance Computing platforms. In addition, Light Weight operating systems such as Catamount have historically not supported the use of shared libraries specifically because they hinder scalability. In this report we will outline the methods of supporting shared libraries on High Performance Computing platforms using Light Weight kernels that we investigated. The considerations necessary to evaluate utility in this area are many and sometimes conflicting. While our initial path forward has been determined based on this evaluation, we consider this effort ongoing and remain prepared to re-evaluate any technology that might provide a scalable solution. This report is an evaluation of a range of possible methods of supporting dynamically linked executables on capability-class High Performance Computing platforms. Efforts are ongoing and extensive testing at scale is necessary to evaluate performance. While performance is a critical driving factor, supporting whatever method is used in a production environment is an equally important and challenging task.

More Details

Improving performance via mini-applications

Doerfler, Douglas W.; Crozier, Paul C.; Edwards, Harold C.; Williams, Alan B.; Rajan, Mahesh R.; Keiter, Eric R.; Thornquist, Heidi K.

Application performance is determined by a combination of many choices: hardware platform, runtime environment, languages and compilers used, algorithm choice and implementation, and more. In this complicated environment, we find that the use of mini-applications - small self-contained proxies for real applications - is an excellent approach for rapidly exploring the parameter space of all these choices. Furthermore, use of mini-applications enriches the interaction between application, library and computer system developers by providing explicit functioning software and concrete performance results that lead to detailed, focused discussions of design trade-offs, algorithm choices and runtime performance issues. In this paper we discuss a collection of mini-applications and demonstrate how we use them to analyze and improve application performance on new and future computer platforms.

More Details

Efficient algorithms for mixed aleatory-epistemic uncertainty quantification with application to radiation-hardened electronics. Part I, algorithms and benchmark results

Eldred, Michael S.; Swiler, Laura P.

This report documents the results of an FY09 ASC V&V Methods level 2 milestone demonstrating new algorithmic capabilities for mixed aleatory-epistemic uncertainty quantification. Through the combination of stochastic expansions for computing aleatory statistics and interval optimization for computing epistemic bounds, mixed uncertainty analysis studies are shown to be more accurate and efficient than previously achievable. Part I of the report describes the algorithms and presents benchmark performance results. Part II applies these new algorithms to UQ analysis of radiation effects in electronic devices and circuits for the QASPR program.
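
The nested structure of a mixed analysis can be sketched crudely as an outer sweep over epistemic parameters bounded by intervals, with an inner aleatory sampling loop; the paper's algorithms use stochastic expansions and interval optimization instead, and the model below is hypothetical.

```python
# Crude sketch of nested (second-order) UQ: an outer sweep over an
# epistemic interval plus an inner aleatory sampling loop yields interval
# bounds on a statistic.  Plain sampling and a grid sweep stand in for the
# paper's stochastic expansions and interval optimization.
import numpy as np

rng = np.random.default_rng(1)

def model(aleatory, epistemic):
    return epistemic * aleatory ** 2 + 1.0          # hypothetical response

def mean_response(epistemic, n_samples=10_000):
    a = rng.normal(0.0, 1.0, n_samples)             # aleatory variable ~ N(0,1)
    return model(a, epistemic).mean()

epistemic_grid = np.linspace(0.5, 2.0, 16)          # epistemic interval [0.5, 2.0]
stats = [mean_response(e) for e in epistemic_grid]
print(f"mean response lies in [{min(stats):.3f}, {max(stats):.3f}]")
```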

More Details

Quantitative resilience analysis through control design

Vugrin, Eric D.; Camphouse, Russell C.; Sunderland, Daniel S.

Critical infrastructure resilience has become a national priority for the U. S. Department of Homeland Security. System resilience has been studied for several decades in many different disciplines, but no standards or unifying methods exist for critical infrastructure resilience analysis. Few quantitative resilience methods exist, and those existing approaches tend to be rather simplistic and, hence, not capable of sufficiently assessing all aspects of critical infrastructure resilience. This report documents the results of a late-start Laboratory Directed Research and Development (LDRD) project that investigated the development of quantitative resilience through application of control design methods. Specifically, we conducted a survey of infrastructure models to assess what types of control design might be applicable for critical infrastructure resilience assessment. As a result of this survey, we developed a decision process that directs the resilience analyst to the control method that is most likely applicable to the system under consideration. Furthermore, we developed optimal control strategies for two sets of representative infrastructure systems to demonstrate how control methods could be used to assess the resilience of the systems to catastrophic disruptions. We present recommendations for future work to continue the development of quantitative resilience analysis methods.

More Details

Toward improved branch prediction through data mining

Hemmert, Karl S.

Data mining and machine learning techniques can be applied to computer system design to aid in optimizing design decisions, improving system runtime performance. Data mining techniques have been investigated in the context of branch prediction. Specifically, the performance of traditional branch predictors has been compared to that of data mining algorithms. Additionally, whether additional features available within the architectural state might serve to further improve branch prediction has been evaluated. Results show that data mining techniques indicate potential for improved branch prediction, especially when register file contents are included as a feature set.
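
As a toy illustration of the approach (not the report's experiments), the sketch below trains a decision tree to predict a branch outcome from the previous k outcomes of a synthetic trace; a real study would add architectural-state features such as register file contents.

```python
# Illustrative sketch: predict a branch outcome from the last k outcomes
# of a synthetic trace with a decision tree.  The trace behavior and
# feature set are hypothetical.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
k, n = 8, 20_000

# Synthetic trace: branch is taken when the count of recently taken
# branches is even, plus a little noise (purely hypothetical behavior).
trace = [int(rng.random() < 0.5) for _ in range(k)]
for _ in range(n):
    taken = int(sum(trace[-k:]) % 2 == 0)
    if rng.random() < 0.05:
        taken ^= 1
    trace.append(taken)

X = np.array([trace[i:i + k] for i in range(len(trace) - k)])
y = np.array(trace[k:])
split = int(0.8 * len(y))
clf = DecisionTreeClassifier(max_depth=8).fit(X[:split], y[:split])
print("prediction accuracy:", clf.score(X[split:], y[split:]))
```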

More Details

Scalable analysis tools for sensitivity analysis and UQ (3160) results

Ice, Lisa I.; Fabian, Nathan D.; Moreland, Kenneth D.; Bennett, Janine C.; Thompson, David C.; Karelitz, David B.

The 9/30/2009 ASC Level 2 milestone, Scalable Analysis Tools for Sensitivity Analysis and UQ (Milestone 3160), contains feature recognition capabilities required by the user community for certain verification and validation tasks focused on sensitivity analysis and uncertainty quantification (UQ). These feature recognition capabilities include crater detection, characterization, and analysis from CTH simulation data; the ability to call fragment and crater identification code from within a CTH simulation; and the ability to output fragments in a geometric format that includes data values over the fragments. The feature recognition capabilities were tested extensively on sample and actual simulations. In addition, a number of stretch criteria were met, including the ability to visualize CTH tracer particles and the ability to visualize output from within an S3D simulation.

More Details

A fully implicit method for 3D quasi-steady state magnetic advection-diffusion

Siefert, Christopher S.; Robinson, Allen C.

We describe the implementation of a prototype fully implicit method for solving three-dimensional quasi-steady state magnetic advection-diffusion problems. This method allows us to solve the magnetic advection-diffusion equations in an Eulerian frame with a fixed, user-prescribed velocity field. We have verified the correctness of the method and implementation on two standard verification problems, the Solberg-White magnetic shear problem and the Perry-Jones-White rotating cylinder problem.
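
For orientation, the standard magnetic advection-diffusion (resistive induction) equation for a prescribed velocity field has the form below; the report's quasi-steady formulation may differ in detail.

```latex
% Standard magnetic advection-diffusion (resistive induction) equation
% for a prescribed velocity field u and magnetic diffusivity \eta; the
% report's quasi-steady formulation may differ in detail:
\[
  \frac{\partial \mathbf{B}}{\partial t}
    = \nabla \times (\mathbf{u} \times \mathbf{B})
    - \nabla \times (\eta\, \nabla \times \mathbf{B}),
  \qquad \nabla \cdot \mathbf{B} = 0 .
\]
```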

More Details

Highly scalable linear solvers on thousands of processors

Siefert, Christopher S.; Tuminaro, Raymond S.; Domino, Stefan P.; Robinson, Allen C.

In this report we summarize research into new parallel algebraic multigrid (AMG) methods. We first provide an introduction to parallel AMG. We then discuss our research in parallel AMG algorithms for very large scale platforms. We detail significant improvements in the AMG setup phase to a matrix-matrix multiplication kernel. We present a smoothed aggregation AMG algorithm with fewer communication synchronization points, and discuss its links to domain decomposition methods. Finally, we discuss a multigrid smoothing technique that utilizes two message passing layers for use on multicore processors.

More Details

Neural assembly models derived through nano-scale measurements

Fan, Hongyou F.; Forsythe, James C.; Branda, Catherine B.; Warrender, Christina E.; Schiek, Richard S.

This report summarizes the accomplishments of a three-year project focused on developing technical capabilities for measuring and modeling neuronal processes at the nanoscale. It was successfully demonstrated that nanoprobes could be engineered that were biocompatible, could be biofunctionalized, and responded within the range of voltages typically associated with a neuronal action potential. Furthermore, the Xyce parallel circuit simulator was employed and models were incorporated for simulating the ion channel and cable properties of neuronal membranes. The ultimate objective of the project had been to employ nanoprobes in vivo, with the nematode C. elegans, and derive a simulation based on the resulting data. Techniques were developed allowing the nanoprobes to be injected into the nematode and the neuronal response recorded. To the authors' knowledge, this is the first occasion in which nanoparticles have been successfully employed as probes for recording neuronal response in an in vivo animal experimental protocol.

More Details
Results 8201–8400 of 9,998