Publications
Dynamic Load Balancing for Adaptive Scientific Computations via Hypergraph Partitioning
Abstract not provided.
A numerical investigation of the effect of induced porosity on the electromechanical switching of ferroelectric ceramics
Ferroelectrics
Abstract not provided.
Analyzing the Scalability of Graph Algorithms on Eldorado
Abstract not provided.
Investigating Lightweight Storage and Overlay Networks for Fault Tolerance
Abstract not provided.
1424 News note - September 2006
Abstract not provided.
Efficient Large-scale Network-based Simulation of Disease Outbreaks
Abstract not provided.
DAKOTA and its use in Computational Experiments
Abstract not provided.
Massive Multithreading for Unstructured Problems
Abstract not provided.
Managing Petascale Complexity
Abstract not provided.
Geometry Mesh Components for Scientific Computing
Abstract not provided.
Trilinos Overview
Abstract not provided.
Solution Verification and Uncertainty Quantification Coupled
Abstract not provided.
Parallel Volume Rendering in ParaView (VTK BOF)
Abstract not provided.
Programmable Shaders in ParaView (VTK BOF)
Abstract not provided.
AMG and a Discrete Reformulation for Maxwell's Equations
Abstract not provided.
Performance of AMG-type preconditioners for fully-coupled solution of FE Transport/Reaction Simulations
Abstract not provided.
InfoVis in VTK
Abstract not provided.
Using PyTrilinos: A Tutorial
Abstract not provided.
DocUtils: A Documentation Utilities Package
Abstract not provided.
Trilinos 101: Getting Started with Trilinos
Abstract not provided.
An Infrastructure for Characterizing the Sensitivity of Parallel Applications to OS Noise
Abstract not provided.
Domain decomposition methods for advection dominated linear-quadratic elliptic optimal control problems
Computer Methods in Applied Mechanics and Engineering
We present an optimization-level domain decomposition (DD) preconditioner for the solution of advection dominated elliptic linear-quadratic optimal control problems, which arise in many science and engineering applications. The DD preconditioner is based on a decomposition of the optimality conditions for the elliptic linear-quadratic optimal control problem into smaller subdomain optimality conditions with Dirichlet boundary conditions for the states and the adjoints on the subdomain interfaces. These subdomain optimality conditions are coupled through Robin transmission conditions for the states and the adjoints. The parameters in the Robin transmission condition depend on the advection. This decomposition leads to a Schur complement system in which the unknowns are the state and adjoint variables on the subdomain interfaces. The Schur complement operator is the sum of subdomain Schur complement operators, the application of which is shown to correspond to the solution of subdomain optimal control problems, which are essentially smaller copies of the original optimal control problem. We show that, under suitable conditions, the application of the inverse of the subdomain Schur complement operators requires the solution of a subdomain elliptic linear-quadratic optimal control problem with Robin boundary conditions for the state. Numerical tests for problems with distributed and with boundary control show that the dependence of the preconditioners on mesh size and subdomain size is comparable to that of its counterpart applied to a single advection dominated equation. These tests also show that the preconditioners are insensitive to the size of the control regularization parameter.
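In Schur-complement form, the reduced interface system described above reads (a schematic in our notation, not the paper's):

```latex
S \begin{pmatrix} y_\Gamma \\ p_\Gamma \end{pmatrix} = g,
\qquad
S = \sum_{i=1}^{N} S_i ,
```

where $(y_\Gamma, p_\Gamma)$ collect the state and adjoint unknowns on the subdomain interfaces, and applying each $S_i$, or its inverse, amounts to solving a subdomain optimal control problem with the transmission conditions above.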
Ideas underlying quantification of margins and uncertainties (QMU): a white paper
This report describes key ideas underlying the application of Quantification of Margins and Uncertainties (QMU) to nuclear weapons stockpile lifecycle decisions at Sandia National Laboratories. While QMU is a broad process and methodology for generating critical technical information to be used in stockpile management, this paper emphasizes one component, which is information produced by computational modeling and simulation. In particular, we discuss the key principles of developing QMU information in the form of Best Estimate Plus Uncertainty, the need to separate aleatory and epistemic uncertainty in QMU, and the risk-informed decision making that is best suited for decisive application of QMU. The paper is written at a high level, but provides a systematic bibliography of useful papers for the interested reader to deepen their understanding of these ideas.
Design of and comparison with verification and validation benchmarks
Abstract not provided.
Sensor Placement to Satisfy Water Security and Operational Objectives
Abstract not provided.
Memo documenting the technical review of the ASC level II milestone for the chemical aging of organic materials
Abstract not provided.
ASC Level II Milestone: Chemical Aging of Organic Materials
Abstract not provided.
Shock Capturing for High-Speed Two Material Flows on Overset Grids
Abstract not provided.
ChISELS 1.0: theory and user manual: a theoretical modeler of deposition and etch processes in microsystems fabrication
Chemically Induced Surface Evolution with Level-Sets--ChISELS--is a parallel code for modeling 2D and 3D material depositions and etches at feature scales on patterned wafers at low pressures. Designed for efficient use on a variety of computer architectures ranging from single-processor workstations to advanced massively parallel computers running MPI, ChISELS is a platform on which to build and improve upon previous feature-scale modeling tools while taking advantage of the most recent advances in load balancing and scalable solution algorithms. Evolving interfaces are represented using the level-set method and the evolution equations time integrated using a Semi-Lagrangian approach [1]. The computational meshes used are quad-trees (2D) and oct-trees (3D), constructed such that grid refinement is localized to regions near the surface interfaces. As the interface evolves, the mesh is dynamically reconstructed as needed for the grid to remain fine only around the interface. For parallel computation, a domain decomposition scheme with dynamic load balancing is used to distribute the computational work across processors. A ballistic transport model is employed to solve for the fluxes incident on each of the surface elements. Surface chemistry is computed by either coupling to the CHEMKIN software [2] or by providing user defined subroutines. This report describes the theoretical underpinnings, methods, and practical use instruction of the ChISELS 1.0 computer code.
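The surface evolution ChISELS computes is the standard level-set equation; stated generically (this is the textbook form, not an equation quoted from the manual):

```latex
\frac{\partial \phi}{\partial t} + F \,\lvert \nabla \phi \rvert = 0 ,
```

where the evolving interface is the zero set of $\phi$ and $F$ is the local normal speed supplied by the flux and surface-chemistry models; the semi-Lagrangian integrator advances this equation on the adaptively refined quad/oct-tree grids.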
Peridynamics Via Finite Element Analysis
Abstract not provided.
Methodology Status and Needs: Verification Validation and Uncertainty Quantification
Abstract not provided.
Reliability-Based Design Optimization for Shape Design of Compliant Micro-Electro-Mechanical Systems
Abstract not provided.
Accuracy and Stability of Operator Splitting Methods Applied to Diffusion/Reaction and Convection/Diffusion/Reaction Systems with Indefinite Operators
Abstract not provided.
Parallel Job Scheduling Policies to Improve Fairness: A Case Study
Journal of Scheduling
Abstract not provided.
Report on ASC project degradation of organic materials
Using molecular dynamics simulations, a constitutive model for the chemical aging of polymer networks was developed. This model incorporates the effects on the stress from the chemical crosslinks and the physical entanglements. The independent network hypothesis has been modified to account for the stress transfer between networks due to crosslinking and scission in strained states. This model was implemented in the finite element code Adagio and validated through comparison with experiment. Stress relaxation data was used to deduce crosslinking history and the resulting history was used to predict permanent set. The permanent set predictions agree quantitatively with experiment.
An Overview of the Thyra Interoperability Effort for Abstract Numerical Algorithms within Trilinos
Abstract not provided.
Software Strategies for Flexible High-Performance Implicit Numerical Solver Libraries
Abstract not provided.
Multi-Physics Coupling for Robust Simulation
Abstract not provided.
Understanding Abstractions
Abstract not provided.
Sandia National Laboratories Advanced Simulation and Computing (ASC) software quality plan part 2 mappings for the ASC software quality engineering practices, version 2.0
The purpose of the Sandia National Laboratories Advanced Simulation and Computing (ASC) Software Quality Plan is to clearly identify the practices that are the basis for continually improving the quality of ASC software products. The plan defines the ASC program software quality practices and provides mappings of these practices to Sandia Corporate Requirements CPR001.3.2 and CPR001.3.6 and to a Department of Energy document, ''ASCI Software Quality Engineering: Goals, Principles, and Guidelines''. This document also identifies ASC management and software project teams' responsibilities in implementing the software quality practices and in assessing progress towards achieving their software quality goals.
Sandia National Laboratories Advanced Simulation and Computing (ASC) software quality plan. Part 1: ASC software quality engineering practices, Version 2.0
The purpose of the Sandia National Laboratories Advanced Simulation and Computing (ASC) Software Quality Plan is to clearly identify the practices that are the basis for continually improving the quality of ASC software products. The plan defines the ASC program software quality practices and provides mappings of these practices to Sandia Corporate Requirements CPR 1.3.2 and 1.3.6 and to a Department of Energy document, ASCI Software Quality Engineering: Goals, Principles, and Guidelines. This document also identifies ASC management and software project teams' responsibilities in implementing the software quality practices and in assessing progress towards achieving their software quality goals.
Spatial variability of brittle material properties to address mesh dependencies of conventional damage models
Abstract not provided.
Sonic Infrared (IR) Imaging and Fluorescent Penetrant Inspection Probability of Detection (POD) Comparison
Abstract not provided.
Peridynamic View of Crack Initiation
Abstract not provided.
Electrical Effects from Transient Neutron Irradiation of Silicon Devices
Abstract not provided.
QCS: a system for querying, clustering, and summarizing documents
Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence 'trimming' and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
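For the clustering stage, QCS uses generalized spherical k-means. A minimal numpy sketch of plain spherical k-means on unit-normalized document vectors (illustrative only; the function name and details are ours, not the QCS implementation):

```python
import numpy as np

def spherical_kmeans(X, k, iters=50, seed=0):
    """Cluster rows of X by cosine similarity (spherical k-means)."""
    rng = np.random.default_rng(seed)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-normalize documents
    C = X[rng.choice(len(X), size=k, replace=False)]  # initial centroids
    for _ in range(iters):
        labels = np.argmax(X @ C.T, axis=1)           # nearest centroid by cosine
        for j in range(k):
            members = X[labels == j]
            if len(members):
                c = members.sum(axis=0)
                C[j] = c / np.linalg.norm(c)          # re-project onto the sphere
    return labels, C
```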
Bayesian Calibration of the QASPR Model
Abstract not provided.
Multimodal Reliability Assessment for Complex Engineering Applications using Sequential Kriging Optimization
Abstract not provided.
A Tool for Testing Supercomputer Software
Abstract not provided.
The Surfpack Software Library for Surrogate Modeling of Sparse Irregularly Spaced Multidimensional Data
Abstract not provided.
Generalizing Smoothed Aggregation on Algebraic Multigrid
Abstract not provided.
Design of and Comparison with Verification and Validation Benchmarks
Abstract not provided.
Solutions to the Thermal Problem prepared for the ASC Validation Challenge Workshop: The Importance of Assumptions
Abstract not provided.
Atomistic-to-continuum coupling for heat transfer in solids
Abstract not provided.
Measuring progress in order-verification
Abstract not provided.
A multilevel preconditioner for FEM modeling of semiconductor devices
Abstract not provided.
The role of mesh quality in theoretical bounds for finite elements: a survey
Abstract not provided.
DFS: a simple yet difficult benchmark for conventional architectures
Abstract not provided.
A Constitutive Model for Simultaneous Crosslinking and Scission in Rubber Networks
Abstract not provided.
Relating Atomistic-to-Continuum Coupling and Domain Decomposition
Abstract not provided.
Formulation and Optimization of Robust Sensor Placement Problems for Contaminant Warning Systems
Abstract not provided.
Parallel unstructured volume rendering in ParaView
Abstract not provided.
Spanning the length scales with peridynamics
Abstract not provided.
A Minimal Linux Environment for High Performance Computing Systems (Presentation)
Abstract not provided.
Measuring Progress in Premo Order Verification
Abstract not provided.
Nonlinear Solution Techniques for the Analysis of Large-Scale Reaction/Transport Systems
Abstract not provided.
The DAKOTA Toolkit and its use in Computational Experiments
Abstract not provided.
Using Reconfigurable Functional Units with Scientific Applications
Abstract not provided.
High performance computing for the application of molecular theories to biological systems
Abstract not provided.
First-principles approach to the charge-transport characteristics of monolayer molecular-electronics devices: Application to hexanedithiolate devices
Physical Review B - Condensed Matter and Materials Physics
We report on the development of an accurate first-principles computational scheme for the charge transport characteristics of molecular monolayer junctions and its application to hexanedithiolate (C6DT) devices. Starting from the Gaussian basis set density-functional calculations of a junction model in the slab geometry and corresponding two bulk electrodes, we obtain the transmission function using the matrix Green's function method and analyze the nature of transmission channels via atomic projected density of states. Within the developed formalism, by treating isolated molecules with the supercell approach, we can investigate the current-voltage characteristics of single and parallel molecular wires in a consistent manner. For the case of single C6DT molecules stretched between Au(111) electrodes, we obtain reasonable quantitative agreement of computed conductance with a recent scanning tunneling microscope experiment result. Comparing the charge transport properties of C6DT single molecules and their monolayer counterparts in the stretched and tilted geometries, we find that the effect of intermolecular coupling and molecule tilting on the charge transport characteristics is negligible in these devices. We contrast this behavior to that of the π -conjugated biphenyldithiolate devices we have previously considered and discuss the relative importance of molecular cores and molecule-electrode contacts for the charge transport in those devices. © 2006 The American Physical Society.
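For context, transmission-based transport calculations of this kind evaluate the standard Green's-function and Landauer expressions (textbook formulas, not equations quoted from the paper):

```latex
T(E) = \mathrm{Tr}\!\left[ \Gamma_L(E)\, G^{r}(E)\, \Gamma_R(E)\, G^{a}(E) \right],
\qquad
I(V) = \frac{2e}{h} \int T(E) \left[ f_L(E) - f_R(E) \right] dE ,
```

with $G^{r}$/$G^{a}$ the retarded/advanced Green's functions of the junction and $\Gamma_{L,R}$ the electrode broadening matrices.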
Parallel parameter study of the Wigner-Poisson equations for RTDs
Computers and Mathematics with Applications
We will discuss a parametric study of the solution of the Wigner-Poisson equations for resonant tunneling diodes. These structures exhibit self-sustaining oscillations in certain operating regimes. We will describe the engineering consequences of our study and how it is a significant advance from some previous work, which used much coarser grids. We use LOCA and other packages in the Trilinos framework from Sandia National Laboratory to enable efficient parallelization of the solution methods and to perform bifurcation analysis of this model. We report on the parallel efficiency and scalability of our implementation. © 2006 Elsevier Ltd. All rights reserved.
External Review: ParaView
Abstract not provided.
On the Placement of Imperfect Sensors in Municipal Water Networks
Abstract not provided.
Contaminant Mixing at Pipe Joints in a Small-Scale Network: Comparison Between Experiments and Computational Fluid Dynamics Models
Abstract not provided.
1411 Department Review 2006
Abstract not provided.
Advancing the Research and Integration of Invasive Optimization Technology
Abstract not provided.
Large Scale Intrusive Optimization and Applications
Abstract not provided.
Fracture Fragmentation and Penetration Modeling with Peridynamics
Abstract not provided.
High Fidelity Computational Fluid Dynamics for Mixing in Water Distribution Systems
Abstract not provided.
Converged Simulations of a TeraHertz Oscillator
Abstract not provided.
Spatial variability of material properties to address mesh dependencies of damage models
Abstract not provided.
A Scalable Optimization Interface for Numerical Simulation Applied to the Next Generation Supercomputer
Abstract not provided.
Topics on the growth of RT and turbulence with modulated wires and spectroscopic dopants
Abstract not provided.
Zoltan 2.0: Data-Management Services for Parallel Applications -- User's Guide
Abstract not provided.
Translation from Fine-grained to Coarse-grained Parallelism
Abstract not provided.
The Impacts of Message Rate on Applications Programming
Abstract not provided.
Benchmarking MPI: The Challenges of Getting it Right
Abstract not provided.
Architectures and APIs: Assessing Requirements for Delivering FPGA Performance to Applications
Abstract not provided.
A Simple Synchronous Distributed-Memory Algorithm for the HPCC RandomAccess Benchmark
Abstract not provided.
Software Quality Assurance Procedures for MELCOR
Abstract not provided.
NIC Architecture Research
Abstract not provided.
Designing Contaminant Warning Systems
Abstract not provided.
Automatic Differentiation of C++ Codes for Large-Scale Scientific Computing
Abstract not provided.
Sandia's Unmanned Ground Vehicles Mobile Manipulation and Cooperative Control
Abstract not provided.
Supercomputing System Design Through Simulation
Abstract not provided.
Zoltan 2.0 Release announcement
Abstract not provided.
Fairness of Job Scheduling in Cplant
Abstract not provided.
Peridynamic Modeling of Impact and Penetration
Abstract not provided.
Trilinos Brief Overview
Abstract not provided.
Algorithms and Enabling Technologies
Abstract not provided.
Automatic Differentiation for Enabling Predictive Simulations
Abstract not provided.
Peridynamics Capabilities and Applications
Abstract not provided.
Current Emphases and Future Themes for ASC Verification & Validation
Abstract not provided.
Overview of Algorithms and Enabling Technologies
Abstract not provided.
Substructured Multibody Molecular Dynamics
Abstract not provided.
Robust Sensor Placement for Realistic Surveillance Problems
Abstract not provided.
Solvers R&D: Algorithms to Z-Pinch
Abstract not provided.
Computation, Computers, Information & Mathematics - Center 1400
Abstract not provided.
An Evolutionary Path towards Virtual Shared Memory with Random Access
Scalability of Graph Algorithms on Eldorado
Abstract not provided.
A multiscale discontinuous Galerkin method with the computational structure of a continuous Galerkin method
Computer Methods in Applied Mechanics and Engineering
Proliferation of degrees-of-freedom has plagued discontinuous Galerkin methodology from its inception over 30 years ago. This paper develops a new computational formulation that combines the advantages of discontinuous Galerkin methods with the data structure of their continuous Galerkin counterparts. The new method uses local, element-wise problems to project a continuous finite element space into a given discontinuous space, and then applies a discontinuous Galerkin formulation. The projection leads to parameterization of the discontinuous degrees-of-freedom by their continuous counterparts and has a variational multiscale interpretation. This significantly reduces the computational burden and, at the same time, little or no degradation of the solution occurs. In fact, the new method produces improved solutions compared with the traditional discontinuous Galerkin method in some situations. © 2005 Elsevier B.V. All rights reserved.
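Schematically, the construction can be summarized as follows (our notation, inferred from the abstract):

```latex
\text{find } u^{c} \in V_h^{c} : \quad
B_{\mathrm{DG}}\!\left( P u^{c},\, P v^{c} \right) = L\!\left( P v^{c} \right)
\quad \forall\, v^{c} \in V_h^{c} ,
```

where $P : V_h^{c} \to V_h^{d}$ is the element-local projection of the continuous space into the discontinuous one, so the discontinuous degrees of freedom are parameterized by their continuous counterparts.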
Least squares preconditioners for stabilized discretizations of the Navier-Stokes equations
Proposed for publication in the SIAM Journal on Scientific Computing.
Iterative optimized effective potential and exact exchange calculations at finite temperature
We report the implementation of an iterative scheme for calculating the Optimized Effective Potential (OEP). Given an energy functional that depends explicitly on the Kohn-Sham wave functions, and therefore, implicitly on the local effective potential appearing in the Kohn-Sham equations, a gradient-based minimization is used to find the potential that minimizes the energy. Previous work has shown how to find the gradient of such an energy with respect to the effective potential in the zero-temperature limit. We discuss a density-matrix-based derivation of the gradient that generalizes the previous results to the finite temperature regime, and we describe important optimizations used in our implementation. We have applied our OEP approach to the Hartree-Fock energy expression to perform Exact Exchange (EXX) calculations. We report our EXX results for common semiconductors and ordered phases of hydrogen at zero and finite electronic temperatures. We also discuss issues involved in the implementation of forces within the OEP/EXX approach.
Next generation of scientists workshop
Abstract not provided.
Bayesian methods in engineering design problems
Abstract not provided.
The Portals 3.3 message passing interface document, revision 2.1
Abstract not provided.
Verification of LHS distributions
This document provides verification test results for normal, lognormal, and uniform distributions that are used in Sandia's Latin Hypercube Sampling (LHS) software. The purpose of this testing is to verify that the sample values being generated in LHS are distributed according to the desired distribution types. The testing of distribution correctness is done by examining summary statistics, graphical comparisons using quantile-quantile plots, and formal statistical tests such as the Chi-square test, the Kolmogorov-Smirnov test, and the Anderson-Darling test. The overall results from the testing indicate that the generation of normal, lognormal, and uniform distributions in LHS is acceptable.
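A minimal sketch of this kind of check, drawing a one-dimensional Latin hypercube sample and applying the Kolmogorov-Smirnov test (a scipy-based illustration, not the LHS software itself):

```python
import numpy as np
from scipy import stats

def lhs_normal(n, mu=0.0, sigma=1.0, seed=0):
    """1-D Latin hypercube sample of N(mu, sigma^2): one uniform draw per
    equal-probability stratum, mapped through the inverse CDF."""
    rng = np.random.default_rng(seed)
    u = (np.arange(n) + rng.random(n)) / n          # one point per stratum
    rng.shuffle(u)
    return stats.norm.ppf(u, loc=mu, scale=sigma)

sample = lhs_normal(1000)
ks_stat, p_value = stats.kstest(sample, "norm")     # compare against N(0, 1)
print(f"KS statistic = {ks_stat:.4f}, p-value = {p_value:.3f}")
```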
Multilinear operators for higher-order decompositions
We propose two new multilinear operators for expressing the matrix compositions that are needed in the Tucker and PARAFAC (CANDECOMP) decompositions. The first operator, which we call the Tucker operator, is shorthand for performing an n-mode matrix multiplication for every mode of a given tensor and can be employed to concisely express the Tucker decomposition. The second operator, which we call the Kruskal operator, is shorthand for the sum of the outer-products of the columns of N matrices and allows a divorce from a matricized representation and a very concise expression of the PARAFAC decomposition. We explore the properties of the Tucker and Kruskal operators independently of the related decompositions. Additionally, we provide a review of the matrix and tensor operations that are frequently used in the context of tensor decompositions.
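For third-order tensors the two operators reduce to one-line einsum expressions (an illustrative numpy rendering in our own variable names):

```python
import numpy as np

G = np.random.rand(3, 4, 5)                     # core tensor
A = np.random.rand(6, 3)                        # mode-1 factor
B = np.random.rand(7, 4)                        # mode-2 factor
C = np.random.rand(8, 5)                        # mode-3 factor

# Tucker operator: an n-mode matrix multiplication in every mode.
tucker = np.einsum("abc,ia,jb,kc->ijk", G, A, B, C)   # shape (6, 7, 8)

# Kruskal operator: sum of outer products of corresponding columns
# of factor matrices sharing the same column count r.
r = 3
A2, B2, C2 = np.random.rand(6, r), np.random.rand(7, r), np.random.rand(8, r)
kruskal = np.einsum("ir,jr,kr->ijk", A2, B2, C2)      # shape (6, 7, 8)
```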
1433 News Note: POP Science Runs
Abstract not provided.
Nonlinear algebraic multigrid for constrained solid mechanics problems using Trilinos
Abstract not provided.
Robust optimization of contaminant sensor placement for community water systems
Mathematical Programming
We present a series of related robust optimization models for placing sensors in municipal water networks to detect contaminants that are maliciously or accidentally injected. We formulate sensor placement problems as mixed-integer programs, for which the objective coefficients are not known with certainty. We consider a restricted absolute robustness criterion that is motivated by natural restrictions on the uncertain data, and we define three robust optimization models that differ in how the coefficients in the objective vary. Under one set of assumptions there exists a sensor placement that is optimal for all admissible realizations of the coefficients. Under other assumptions, we can apply sorting to solve each worst-case realization efficiently, or we can apply duality to integrate the worst-case outcome and have one integer program. The most difficult case is where the objective parameters are bilinear, and we prove its complexity is NP-hard even under simplifying assumptions. We consider a relaxation that provides an approximation, giving an overall guarantee of near-optimality when used with branch-and-bound search. We present preliminary computational experiments that illustrate the computational complexity of solving these robust formulations on sensor placement applications.
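A schematic of the robust model class studied (a simplified rendering in our notation; the paper's formulations differ in how the objective coefficients vary):

```latex
\min_{x \in \{0,1\}^{n}} \; \max_{c \in \mathcal{U}} \;
\sum_{a \in \mathcal{A}} c_a \, d_a(x)
\qquad \text{s.t.} \qquad \sum_{i=1}^{n} x_i \le p ,
```

where $x_i = 1$ places a sensor at location $i$, $d_a(x)$ is the impact of contamination scenario $a$ under placement $x$, $\mathcal{U}$ is the admissible set of uncertain objective coefficients, and $p$ is the sensor budget.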
Multilinear algebra for analyzing data with multiple linkages
Abstract not provided.
GNEP Simulation Laboratory: GNEP Enterprise Model (GEM)
Abstract not provided.
A multi-scale Q1/P0 approach to Lagrangian shock hydrodynamics
A new multi-scale, stabilized method for Q1/P0 finite element computations of Lagrangian shock hydrodynamics is presented. Instabilities (of hourglass type) are controlled by a stabilizing operator derived using the variational multi-scale analysis paradigm. The resulting stabilizing term takes the form of a pressure correction. With respect to currently implemented hourglass control approaches, the novelty of the method resides in its residual-based character. The stabilizing residual has a definite physical meaning, since it embeds a discrete form of the Clausius-Duhem inequality. Effectively, the proposed stabilization samples and acts to counter the production of entropy due to numerical instabilities. The proposed technique is applicable to materials with no shear strength, for which there exists a caloric equation of state. The stabilization operator is incorporated into a mid-point, predictor/multi-corrector time integration algorithm, which conserves mass, momentum and total energy. Encouraging numerical results in the context of compressible gas dynamics confirm the potential of the method.
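In generic variational-multiscale form, such a residual-based pressure correction can be written as (our schematic notation, not the paper's exact operator):

```latex
\tilde{p}_K = p_K - \tau_K \, \mathcal{R}_K ,
```

where $\mathcal{R}_K$ is the element-level stabilizing residual embedding a discrete Clausius-Duhem inequality and $\tau_K$ a stabilization parameter; the correction vanishes as the discrete entropy-production residual does, so consistent solutions are unaffected.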
The Verdict geometric quality library
Verdict is a collection of subroutines for evaluating the geometric qualities of triangles, quadrilaterals, tetrahedra, and hexahedra using a variety of metrics. A metric is a real number assigned to one of these shapes depending on its particular vertex coordinates. These metrics are used to evaluate the input to finite element, finite volume, boundary element, and other types of solvers that approximate the solution to partial differential equations defined over regions of space. The geometric qualities of these regions are usually strongly tied to the accuracy these solvers are able to obtain in their approximations. The subroutines are written in C++ and have a simple C interface. Each metric may be evaluated individually or in combination. When multiple metrics are evaluated at once, they share common calculations to lower the cost of the evaluation.
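Verdict itself is C++ with a C interface; as an illustration of what a vertex-coordinate quality metric computes, here is the classical triangle radius ratio in Python (our example, not Verdict code):

```python
import numpy as np

def triangle_radius_ratio(p0, p1, p2):
    """Circumradius-to-inradius ratio, scaled so an equilateral triangle
    scores 1; the value grows as the triangle degrades."""
    a = np.linalg.norm(p1 - p2)
    b = np.linalg.norm(p2 - p0)
    c = np.linalg.norm(p0 - p1)
    s = 0.5 * (a + b + c)                                         # semi-perimeter
    area = np.sqrt(max(s * (s - a) * (s - b) * (s - c), 1e-300))  # Heron's formula
    R = a * b * c / (4.0 * area)                                  # circumradius
    r = area / s                                                  # inradius
    return R / (2.0 * r)

p = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.5, np.sqrt(3) / 2])]
print(triangle_radius_ratio(*p))   # ~1.0 for this equilateral triangle
```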
Electronic structure of intrinsic defects in crystalline germanium telluride
Physical Review B - Condensed Matter and Materials Physics
Germanium telluride undergoes rapid transition between polycrystalline and amorphous states under either optical or electrical excitation. While the crystalline phases are predicted to be semiconductors, polycrystalline germanium telluride always exhibits p -type metallic conductivity. We present a study of the electronic structure and formation energies of the vacancy and antisite defects in both known crystalline phases. We show that these intrinsic defects determine the nature of free-carrier transport in crystalline germanium telluride. Germanium vacancies require roughly one-third the energy of the other three defects to form, making this by far the most favorable intrinsic defect. While the tellurium antisite and vacancy induce gap states, the germanium counterparts do not. A simple counting argument, reinforced by integration over the density of states, predicts that the germanium vacancy leads to empty states at the top of the valence band, thus giving a complete explanation of the observed p -type metallic conduction.
The variational multiscale framework for discontinuous Galerkin methods
Abstract not provided.
On a viscoplastic model for rocks with mechanism-dependent characteristic times
Proposed for publication in Acta Geotechnica.
This paper summarizes the results of a theoretical and experimental program at Sandia National Laboratories aimed at identifying and modeling key physical features of rocks and rock-like materials at the laboratory scale over a broad range of strain rates. The mathematical development of a constitutive model is discussed and model predictions versus experimental data are given for a suite of laboratory tests. Concurrent pore collapse and cracking at the microscale are seen as competitive micromechanisms that give rise to the well-known macroscale phenomenon of a transition from volumetric compaction to dilatation under quasistatic triaxial compression. For high-rate loading, this competition between pore collapse and microcracking also seems to account for recently identified differences in strain-rate sensitivity between uniaxial-strain 'plate slap' data compared to uniaxial-stress Kolsky bar data. A description is given of how this work supports ongoing efforts to develop a predictive capability in simulating deformation and failure of natural geological materials, including those that contain structural features such as joints and other spatial heterogeneities.
HPC InfiniBand requirements : lessons learned from five years of building InfiniBand clusters
Abstract not provided.
The complexity of sensor placement in municipal water networks
Abstract not provided.
Parallel hypergraph partitioning for irregular problems
Abstract not provided.
MD simulations of chemically reacting networks : analysis of permanent set
The Independent Network Model (INM) has proven to be a useful tool for understanding the development of permanent set in strained elastomers. Our previous work showed the applicability of the INM to our simulations of polymer systems crosslinking in strained states. This study looks at the INM applied to theoretical models incorporating entanglement effects, including Flory's constrained junction model and more recent tube models. The effect of entanglements has been treated as a separate network formed at gelation, with additional curing treated as traditional phantom contributions. Theoretical predictions are compared with large-scale molecular dynamics simulations.
Parallel space-time solutions of PDE applications
Abstract not provided.
A survey of Asian life scientists: the state of biosciences, laboratory biosecurity, and biosafety in Asia
Over 300 Asian life scientists were surveyed to provide insight into work with infectious agents. This report provides the reader with a more complete understanding of the current practices employed to study infectious agents by laboratories located in Asian countries--segmented by level of biotechnology sophistication. The respondents have a variety of research objectives and study over 60 different pathogens and toxins. Many of the respondents indicated that their work was hampered by lack of adequate resources and the difficulty of accessing critical resources. The survey results also demonstrate that there appears to be better awareness of laboratory biosafety issues compared to laboratory biosecurity. Perhaps not surprisingly, many of these researchers work with pathogens and toxins under less stringent laboratory biosafety and biosecurity conditions than would be typical for laboratories in the West.
Technical assessment of Navitar Zoom 6000 optic and Sony HDC-X310 camera for MEMS presentations and training
Size-Dependent Softening in Structural Ceramics Including Correlations with Mesoscale Interferometry in Impact Experiments
Abstract not provided.
Genomes to Life Project Quarterly Report April 2005
This SAND report provides the technical progress through April 2005 of the Sandia-led project, "Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling," funded by the DOE Office of Science Genomics:GTL Program. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO2 are important terms in the global environmental response to anthropogenic atmospheric inputs of CO2 and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. In this project, we will investigate the carbon sequestration behavior of Synechococcus Sp., an abundant marine cyanobacteria known to be important to environmental responses to carbon dioxide levels, through experimental and computational methods. This project is a combined experimental and computational effort with emphasis on developing and applying new computational tools and methods. Our experimental effort will provide the biology and data to drive the computational efforts and include significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microarray experiments. Computational tools will be essential to our efforts to discover and characterize the function of the molecular machines of Synechococcus. To this end, molecular simulation methods will be coupled with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes. In addition, we will develop a set of novel capabilities for inference of regulatory pathways in microbial genomes across multiple sources of information through the integration of computational and experimental technologies. These capabilities will be applied to Synechococcus regulatory pathways to characterize their interaction map and identify component proteins in these - 4 -pathways. We will also investigate methods for combining experimental and computational results with visualization and natural language tools to accelerate discovery of regulatory pathways. The ultimate goal of this effort is develop and apply new experimental and computational methods needed to generate a new level of understanding of how the Synechococcus genome affects carbon fixation at the global scale. Anticipated experimental and computational methods will provide ever-increasing insight about the individual elements and steps in the carbon fixation process, however relating an organism's genome to its cellular response in the presence of varying environments will require systems biology approaches. Thus a primary goal for this effort is to integrate the genomic data generated from experiments and lower level simulations with data from the existing body of literature into a whole cell model. We plan to accomplish this by developing and applying a set of tools for capturing the carbon fixation behavior of complex of Synechococcus at different levels of resolution. 
Finally, the explosion of data being produced by high-throughput experiments requires data analysis and models which are more computationally complex, more heterogeneous, and require coupling to ever increasing amounts of experimentally obtained data in varying formats. These challenges are unprecedented in high performance scientific computing and necessitate the development of a companion computational infrastructure to support this effort. More information about this project can be found at www.genomes-to-life.org
A rational approach to biosafety and biosecurity
Abstract not provided.
Effects of alcohols on lipid bilayers from molecular theory
Abstract not provided.
Open source high performance floating-point modules
Given the logic density of modern FPGAs, it is feasible to use FPGAs for floating-point applications. However, it is important that any floating-point units that are used be highly optimized. This paper introduces an open source library of highly optimized floating-point units for Xilinx FPGAs. The units are fully IEEE compliant and achieve approximately 230 MHz operation frequency for double-precision add and multiply in a Xilinx Virtex-2-Pro FPGA (-7 speed grade). This speed is achieved with a 10 stage adder pipeline and a 12 stage multiplier pipeline. The area requirement is 571 slices for the adder and 905 slices for the multiplier.
Preconditioners for the space-time solution of large-scale PDE applications
Abstract not provided.
PyTrilinos : a parallel python interface to Trilinos
PyTrilinos provides Python access to selected Trilinos packages. Though emerging from its early stages, it aims at portability and completeness of coverage, and supports parallelism, rapid prototyping, application development, unit testing, and numeric compatibility (migrating to NumPy). PyTrilinos complements and supplements the SciPy package.
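A small usage sketch in the spirit of the Epetra bindings (hedged: constructor and method names follow PyTrilinos examples of this era and may differ between versions):

```python
# Run under MPI, e.g.: mpirun -np 4 python pytrilinos_demo.py
from PyTrilinos import Epetra

comm = Epetra.PyComm()            # wraps the MPI (or serial) communicator
n = 100
emap = Epetra.Map(n, 0, comm)     # n global entries, index base 0, distributed
x = Epetra.Vector(emap)           # distributed vector over that map
x.PutScalar(1.0)                  # fill locally owned entries with ones
print("processes:", comm.NumProc())
```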
Biosciences Laboratory Biosecurity and Biosafety in Asia
Abstract not provided.
Petaflops, exaflops, and zettaflops for climate modeling
Abstract not provided.
Multi-core processors : coping with the inevitable
Abstract not provided.
Implications of application usage characteristics for collective communication offload
International Journal of High Performance Computing and Networking
The global, synchronous nature of some collective operations implies that they will become the bottleneck when scaling to hundreds of thousands of nodes. One approach improves collective performance using a programmable network interface to directly implement collectives. While these implementations improve micro-benchmark performance, accelerating applications will require deeper understanding of application behaviour. We describe several characteristics of applications that impact collective communication performance. We analyse network resource usage data to guide the design of collective offload engines and their associated programming interfaces. In particular, we provide an analysis of the potential benefit of non-blocking collective communication operations for MPI. © 2006 Inderscience Enterprises Ltd.
Enabling fluid-structural strong thermal coupling within a multi-physics environment
Collection of Technical Papers - 44th AIAA Aerospace Sciences Meeting
We demonstrate use of a Jacobian-Free Newton-Krylov solver to enable strong thermal coupling at the interface between a solid body and an external compressible fluid. Our method requires only information typically used in loose coupling based on successive substitution and is implemented within a multi-physics framework. We present results for two external flows over thermally conducting solid bodies obtained using both loose and strong coupling strategies. Performance of the two strategies is compared to elucidate both advantages and caveats associated with strong coupling.
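The Jacobian-free ingredient is the standard matrix-free directional-difference approximation of Jacobian-vector products (the textbook JFNK formula, not quoted from the paper):

```latex
J(u)\, v \;\approx\; \frac{F(u + \epsilon v) - F(u)}{\epsilon} ,
```

so Krylov iterations on the coupled fluid/solid residual $F$ need only residual evaluations, which is why the method requires only the information already exchanged in loose, successive-substitution coupling.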
Intelligent nonlinear solvers for computational fluid dynamics
Abstract not provided.
Peridynamic analysis of damage and failure in composites
Abstract not provided.
Analysis of type IV collagen gene expression in autosomal recessive and X-linked forms of Alport syndrome in the English Cocker Spaniel and mixed-breed dog
Proposed for publication in Gene.
Abstract not provided.
Enabling fluid-structural strong thermal coupling within a multi-physics environment
Abstract not provided.
Automated expert modeling for automated student evaluation
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
This paper presents automated expert modeling for automated student evaluation, or AEMASE (pronounced "amaze"). This technique grades students by comparing their actions to a model of expert behavior. The expert model is constructed with machine learning techniques, avoiding the costly and time-consuming process of manual knowledge elicitation and expert system implementation. A brief summary of after action review (AAR) and intelligent tutoring systems (ITS) provides background for a prototype AAR application with a learning expert model. A validation experiment confirms that the prototype accurately grades student behavior on a tactical aircraft maneuver application. Finally, several topics for further research are proposed. © Springer-Verlag Berlin Heidelberg 2006.
Large data visualization using client/server architecture in the paraview framework
Abstract not provided.
Recycling Krylov subspaces for sequences of linear systems
SIAM Journal on Scientific Computing
Many problems in science and engineering require the solution of a long sequence of slowly changing linear systems. We propose and analyze two methods that significantly reduce the total number of matrix-vector products required to solve all systems. We consider the general case where both the matrix and right-hand side change, and we make no assumptions regarding the change in the right-hand sides. Furthermore, we consider general nonsingular matrices, and we do not assume that all matrices are pairwise close or that the sequence of matrices converges to a particular matrix. Our methods work well under these general assumptions, and hence form a significant advancement with respect to related work in this area. We can reduce the cost of solving subsequent systems in the sequence by recycling selected subspaces generated for previous systems. We consider two approaches that allow for the continuous improvement of the recycled subspace at low cost. We consider both Hermitian and non-Hermitian problems, and we analyze our algorithms both theoretically and numerically to illustrate the effects of subspace recycling. We also demonstrate the effectiveness of our algorithms for a range of applications from computational mechanics, materials science, and computational physics. © 2006 Society for Industrial and Applied Mathematics.
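Schematically, recycling in the GCRO style works as follows (our summary of the standard construction, not the paper's exact algorithms):

```latex
x = x_0 + U \left( C^{*} r_0 \right) + z ,
\qquad
\left( I - C C^{*} \right) A\, z \approx \left( I - C C^{*} \right) r_0 ,
\qquad
C = A U, \;\; C^{*} C = I ,
```

where $\mathrm{range}(U)$ is the subspace recycled from earlier systems: the solution component in $\mathrm{range}(U)$ comes from a cheap projection, and a Krylov method handles only the projected remainder while $U$ is continuously improved at low cost.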
Automatic differentiation of C++ codes for large-scale scientific computing
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
We discuss computing first derivatives for models based on elements, such as large-scale finite-element PDE discretizations, implemented in the C++ programming language. We use a hybrid technique of automatic differentiation (AD) and manual assembly, with local element-level derivatives computed via AD and manually summed into the global derivative. C++ templating and operator overloading work well for both forward- and reverse-mode derivative computations. We found that AD derivative computations compared favorably in time to finite differencing for a scalable finite-element discretization of a convection-diffusion problem in two dimensions. © Springer-Verlag Berlin Heidelberg 2006.
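The paper works with C++ templates; the same forward-mode operator-overloading idea in a minimal Python sketch (dual numbers; illustrative, not the paper's code):

```python
class Dual:
    """Forward-mode AD value: carries f(x) and f'(x) together."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val,
                    self.der * o.val + self.val * o.der)  # product rule
    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x + 1      # f'(x) = 6x + 2

y = f(Dual(2.0, 1.0))                 # seed dx/dx = 1
print(y.val, y.der)                   # 17.0 14.0
```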
Circuit simulation: unique solution requirements
Abstract not provided.
The surfpack software library for surrogate modeling of sparse irregularly spaced multidimensional data
Collection of Technical Papers - 11th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference
Surfpack is a general-purpose software library of multidimensional function approximation methods for applications such as data visualization, data mining, sensitivity analysis, uncertainty quantification, and numerical optimization. Surfpack is primarily intended for use on sparse, irregularly-spaced, n-dimensional data sets where classical function approximation methods are not applicable. Surfpack is under development at Sandia National Laboratories, with a public release of Surfpack version 1.0 in August 2006. This paper provides an overview of Surfpack's function approximation methods along with some of its software design attributes. In addition, this paper provides some simple examples to illustrate the utility of Surfpack for data trend analysis, data visualization, and optimization. Copyright © 2006 by the American Institute of Aeronautics and Astronautics, Inc.
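As an illustration of the kind of surrogate such a library builds on sparse, irregularly spaced data, a Gaussian radial-basis-function interpolant in plain numpy (our example; unrelated to the actual Surfpack API):

```python
import numpy as np

def rbf_fit(X, y, eps=1.0):
    """Weights of a Gaussian RBF interpolant through scattered points X (n x d)."""
    r2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.linalg.solve(np.exp(-eps * r2), y)

def rbf_eval(X, w, Xnew, eps=1.0):
    r2 = ((Xnew[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-eps * r2) @ w

rng = np.random.default_rng(1)
X = rng.random((30, 2))                          # sparse, irregularly spaced sites
y = np.sin(3 * X[:, 0]) * np.cos(3 * X[:, 1])    # sampled trend
w = rbf_fit(X, y)
print(rbf_eval(X, w, np.array([[0.5, 0.5]])))    # surrogate prediction
```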
Measuring MPI send and receive overhead and application availability in high performance network interfaces
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
In evaluating new high-speed network interfaces, the usual metrics of latency and bandwidth are commonly measured and reported. There are numerous other message passing characteristics that can have a dramatic effect on application performance that should be analyzed when evaluating a new interconnect. One such metric is overhead, which dictates the networks ability to allow the application to perform non-message passing work while a transfer is taking place. A method for measuring overhead, and hence calculating application availability, is presented. Results for several next-generation network interfaces are also presented. © Springer-Verlag Berlin Heidelberg 2006.
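A sketch of the measurement idea using mpi4py (our illustration with invented helper names; the paper measures native MPI interfaces): post a non-blocking send, do a fixed amount of computation, and compare against the same computation with no communication in flight.

```python
# Run with two ranks, e.g.: mpirun -np 2 python overhead_probe.py
import time
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
buf = np.zeros(1 << 20)                      # ~8 MB message

def busy_work(n):
    s = 0.0
    for i in range(n):
        s += i * 0.5
    return s

if comm.Get_rank() == 0:
    t0 = time.perf_counter()
    req = comm.Isend(buf, dest=1, tag=0)     # post non-blocking send
    busy_work(2_000_000)                     # overlapped computation
    req.Wait()
    total = time.perf_counter() - t0

    t0 = time.perf_counter()
    busy_work(2_000_000)                     # same work, no transfer in flight
    work = time.perf_counter() - t0

    print(f"overhead = {total - work:.4f} s, availability = {work / total:.2%}")
else:
    comm.Recv(buf, source=0, tag=0)
```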
ALEGRA-HEDP validation strategy
This report presents an initial validation strategy for specific SNL pulsed power program applications of the ALEGRA-HEDP radiation-magnetohydrodynamics computer code. The strategy is written to be (1) broadened and deepened with future evolution of particular specifications given in this version; (2) broadly applicable to computational capabilities other than ALEGRA-HEDP directed at the same pulsed power applications. The content and applicability of the document are highly constrained by the R&D thrust of the SNL pulsed power program. This means that the strategy has significant gaps, indicative of the flexibility required to respond to an ongoing experimental program that is heavily engaged in phenomena discovery.
Constitutive models for rubber networks undergoing simultaneous crosslinking and scission
Constitutive models for chemically reacting networks are formulated based on a generalization of the independent network hypothesis. These models account for the coupling between chemical reaction and strain histories, and have been tested by comparison with microscopic molecular dynamics simulations. An essential feature of these models is the introduction of stress transfer functions that describe the interdependence between crosslinks formed and broken at various strains. Efforts are underway to implement these constitutive models into the finite element code Adagio. Preliminary results are shown that illustrate the effects of changing crosslinking and scission rates and history.
Algorithm and simulation development in support of response strategies for contamination events in air and water systems
Chemical/Biological/Radiological (CBR) contamination events pose a considerable threat to our nation's infrastructure, especially in large internal facilities, external flows, and water distribution systems. Because physical security can only be enforced to a limited degree, deployment of early warning systems is being considered. However, to achieve reliable and efficient functionality, several complex questions must be answered: (1) where should sensors be placed, (2) how can sparse sensor information be efficiently used to determine the location of the original intrusion, (3) what are the model and data uncertainties, (4) how should these uncertainties be handled, and (5) how can our algorithms and forward simulations be sufficiently improved to achieve real-time performance? This report presents the results of a three-year algorithm and application development effort to support the identification, mitigation, and risk assessment of CBR contamination events. The main thrust of this investigation was to develop (1) computationally efficient algorithms for strategically placing sensors, (2) a process for identifying contamination events from sparse observations, (3) characterization of uncertainty through developing accurate demand forecasts and through investigating uncertain simulation model parameters, (4) risk assessment capabilities, and (5) reduced order modeling methods. The development effort was focused on water distribution systems, large internal facilities, and outdoor areas.
A software and hardware architecture for a modular, portable, extensible reliability availability and serviceability system
Abstract not provided.
Latent semantic analysis and Fiedler embeddings
Proposed for publication in Linear Algebra and Its Applications.
Abstract not provided.
Peridynamic modeling of fracture, impact, and penetration
Abstract not provided.
A comparison of parallel block multi-level preconditioners for the incompressible Navier-Stokes equations
Abstract not provided.
BEC: a virtual shared memory parallel programming environment
Some language issues in high performance computing: translation from fine-grained parallelism to coarse-grained parallelism
Abstract not provided.
Gaussian processes in trust-region optimization methods
Abstract not provided.
Parallel hex mesh generation: an overview
Abstract not provided.
Modeling and simulation technology readiness levels
This report summarizes the results of an effort to establish a framework for assigning and communicating technology readiness levels (TRLs) for the modeling and simulation (ModSim) capabilities at Sandia National Laboratories. This effort was undertaken as a special assignment for the Weapon Simulation and Computing (WSC) program office led by Art Hale, and lasted from January to September 2006. This report summarizes the results, conclusions, and recommendations, and is intended to help guide the program office in their decisions about the future direction of this work. The work was broken out into several distinct phases, starting with establishing the scope and definition of the assignment. These are characterized in a set of key assertions provided in the body of this report. Fundamentally, the assignment involved establishing an intellectual framework for TRL assignments to Sandia's modeling and simulation capabilities, including the development and testing of a process to conduct the assignments. To that end, we proposed a methodology for both assigning and understanding the TRLs, and outlined some of the restrictions that need to be placed on this process and the expected use of the result. One of the first assumptions we overturned was the notion of a ''static'' TRL--rather we concluded that problem context was essential in any TRL assignment, and that leads to dynamic results (i.e., a ModSim tool's readiness level depends on how it is used, and by whom). While we leveraged the classic TRL results from NASA, DoD, and Sandia's NW program, we came up with a substantially revised version of the TRL definitions, maintaining consistency with the classic level definitions and the Predictive Capability Maturity Model (PCMM) approach. In fact, we substantially leveraged the foundation the PCMM team provided, and augmented that as needed. Given the modeling and simulation TRL definitions and our proposed assignment methodology, we conducted four ''field trials'' to examine how this would work in practice. The results varied substantially, but did indicate that establishing the capability dependencies and making the TRL assignments was manageable and not particularly time consuming. The key differences arose in perceptions of how this information might be used, and what value it would have (opinions ranged from negative to positive value). The use cases and field trial results are included in this report. Taken together, the results suggest that we can make reasonably reliable TRL assignments, but that using those without the context of the information that led to those results (i.e., examining the measures suggested by the PCMM table, and extended for ModSim TRL purposes) produces an oversimplified result--that is, you cannot really boil things down to just a scalar value without losing critical information.
Semi-infinite target penetration by ogive-nose penetrators: ALEGRA/SHISM code predictions for ideal and non-ideal impacts
American Society of Mechanical Engineers, Pressure Vessels and Piping Division (Publication) PVP
The physics of ballistic penetration mechanics is of great interest in penetrator and counter-measure design. The phenomenology associated with these events can be quite complex, and a significant number of studies have been conducted, ranging from purely experimental work to 'engineering' models based on empirical and/or analytical descriptions to fully-coupled penetrator/target, thermo-mechanical numerical simulations. Until recently, however, there appeared to be a paucity of numerical studies considering 'non-ideal' impacts [1]. The goal of this work is to demonstrate the SHISM algorithm implemented in the ALEGRA Multi-Material ALE (Arbitrary Lagrangian Eulerian) code [13]. The SHISM algorithm models the three-dimensional continuum solid mechanics response of the target and penetrator in a fully coupled manner. This capability allows for the study of 'non-ideal' impacts (e.g., pitch, yaw, and/or obliquity of the target/penetrator pair). In this work, predictions using the SHISM algorithm are compared to previously published experimental results for selected ideal and non-ideal impacts of metal penetrator-target pairs. These results show good agreement between predicted and measured maximum depth-of-penetration (DOP) for ogive-nose penetrators with striking velocities in the 0.5 to 1.5 km/s range. Ideal impact simulations demonstrate convergence in predicted DOP for the velocity range considered. A theory is advanced to explain the disagreement between predicted and measured DOP at higher striking velocities; it attributes the observed discrepancies to uncertainties in angle-of-attack. It is noted that the material models and associated parameters used here were unmodified from those in the literature; hence, no tuning of models was performed to match experimental data. Copyright © 2005 by ASME.
Kevlar and Carbon Composite body armor - Analysis and testing
American Society of Mechanical Engineers, Pressure Vessels and Piping Division (Publication) PVP
Kevlar materials make excellent body armor due to their fabric-like flexibility and ultra-high tensile strength. Carbon composites are made up of many layers of carbon AS-4 material impregnated with epoxy. Fiber orientation is bidirectional, oriented at 0° and 90°. They also have ultra-high tensile strength but can be made into relatively hard armor pieces. Once many layers are cut and assembled, they can be ergonomically shaped in a mold during the heated curing process. Kevlar and carbon composites can be used together to produce light and effective body armor. This paper will focus on computer analysis and laboratory testing of a Kevlar/carbon composite cross-section proposed for body armor development. The carbon composite is inserted between layers of Kevlar. The computer analysis was performed with a Lagrangian transversely isotropic material model for both the Kevlar and the carbon composite. The computer code employed is AUTODYN. Both the computer analysis and the laboratory testing utilized different sizes of hardened steel fragments impacting the armor cross-section. The steel fragments are right-circular cylinders. Laboratory testing was undertaken by firing various sizes of hardened steel fragments at square test coupons of Kevlar layers and heat-cured carbon composites. The V50 velocity for the various fragment sizes was determined from the testing. This V50 data can be used to compare the body armor design with other previously designed armor systems. AUTODYN [1] computer simulations of the fragment impacts were compared to the experimental results and used to evaluate and guide the overall design process. This paper will include the detailed transversely isotropic computer simulations of the Kevlar/carbon composite cross-section as well as the experimental results and a comparison between the two. Conclusions will be drawn about the design process and the validity of current computer modeling methods for Kevlar and carbon composites. Copyright © 2005 by ASME.
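For readers unfamiliar with the V50 metric used above: a common estimate takes the ballistic limit as the average of an equal number of the highest partial-penetration and lowest complete-penetration striking velocities. A minimal sketch of that bookkeeping, with hypothetical shot data rather than the paper's measurements:

    # Minimal sketch of a common V50 estimate: average an equal number of the
    # highest partial-penetration and lowest complete-penetration velocities.
    # Shot data below are hypothetical, not from the paper.

    def v50(partials, completes, n=3):
        """Average the n highest partial and n lowest complete velocities (m/s)."""
        highest_partials = sorted(partials)[-n:]
        lowest_completes = sorted(completes)[:n]
        sample = highest_partials + lowest_completes
        return sum(sample) / len(sample)

    partial_velocities = [410.0, 432.0, 445.0, 451.0]   # projectile stopped
    complete_velocities = [448.0, 460.0, 471.0, 495.0]  # projectile passed through
    print(f"Estimated V50: {v50(partial_velocities, complete_velocities):.1f} m/s")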
Enhancing NIC performance for MPI using processing-in-memory
Proceedings - 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2005
Processing-in-Memory (PIM) technology encompasses a range of research leveraging a tight coupling of memory and processing. The most unique features of the technology are extremely wide paths to memory, extremely low memory latency, and wide functional units. Many PIM researchers are also exploring extremely fine-grained multi-threading capabilities. This paper explores a mechanism for leveraging these features of PIM technology to enhance commodity architectures in a seemingly mundane way: accelerating MPI. Modern network interfaces leverage simple processors to offload portions of the MPI semantics, particularly the management of posted receive and unexpected message queues. Without adding cost or increasing clock frequency, using PIMs in the network interface can enhance performance. The results are a significant decrease in latency and increase in small message bandwidth, particularly when long queues are present.
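The queue management this abstract refers to can be pictured with a toy model: an arriving message is matched against the posted-receive queue in order, and unmatched arrivals are appended to an unexpected-message queue that later receives must search first. A schematic Python sketch (matching reduced to source and tag; real MPI matching also involves communicators and ordering rules):

    # Toy model of the MPI matching the abstract describes: an arriving message
    # is matched against posted receives in order; if none matches, it is
    # queued as "unexpected" for a future receive to find.
    from collections import deque

    ANY = -1  # stand-in for MPI_ANY_SOURCE / MPI_ANY_TAG

    posted = deque()      # receives posted by the application, in posting order
    unexpected = deque()  # messages that arrived before a matching receive

    def post_recv(source, tag):
        for i, (src, t, payload) in enumerate(unexpected):
            if source in (src, ANY) and tag in (t, ANY):
                del unexpected[i]
                return payload            # matched an already-arrived message
        posted.append((source, tag))
        return None                       # will match a future arrival

    def on_arrival(src, tag, payload):
        for i, (want_src, want_tag) in enumerate(posted):
            if want_src in (src, ANY) and want_tag in (tag, ANY):
                del posted[i]
                return payload            # delivered to a posted receive
        unexpected.append((src, tag, payload))
        return None

    on_arrival(src=2, tag=7, payload=b"early")   # no receive posted yet
    print(post_recv(source=ANY, tag=7))          # b'early', from the unexpected queue

Walking these queues is exactly the pointer-chasing, memory-bound work that wide, low-latency PIM memory paths are suited to accelerating.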
Computational stability study of 3D flow in a differentially heated 8:1:1 cavity
3rd M.I.T. Conference on Computational Fluid and Solid Mechanics
The critical Rayleigh number Ra_cr of the Hopf bifurcation that signals the limit of steady flows in a differentially heated 8:1:1 cavity is computed. The two-dimensional analog of this problem was the subject of a comprehensive set of benchmark calculations that included the estimation of Ra_cr [1]. In this work we begin to answer the question of whether the 2D results carry over into 3D models. For the case of the 2D model being extruded to a depth of 1, with no-slip/no-penetration and adiabatic boundary conditions placed at these walls, the steady flow and destabilizing eigenvectors qualitatively match those from the 2D model. A mesh resolution study extending to a 20-million-unknown model shows that the presence of these walls delays the first critical Rayleigh number from 3.06 × 10^5 to 5.13 × 10^5. © 2005 Elsevier Ltd.
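As background to this abstract: the critical Rayleigh number is the parameter value at which the real part of the leading eigenvalue of the linearized steady-flow operator crosses zero. A toy Python illustration of that detection idea, using a hypothetical 2x2 Jacobian in place of the paper's large discretized flow operator:

    # Toy illustration of how a Hopf point such as Ra_cr is located: bisect for
    # the parameter value where the leading eigenvalue's real part crosses zero.
    # The 2x2 Jacobian below is hypothetical; the paper works with large
    # discretized operators and iterative eigensolvers instead.
    import numpy as np

    def leading_real_part(ra):
        # Hypothetical Jacobian: a complex pair crosses the imaginary axis at ra = 5.
        J = np.array([[ra - 5.0, -2.0],
                      [2.0, ra - 5.0]])
        return max(np.linalg.eigvals(J).real)

    lo, hi = 0.0, 10.0                 # bracket: stable at lo, unstable at hi
    for _ in range(60):                # bisection on the stability boundary
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if leading_real_part(mid) < 0 else (lo, mid)
    print(f"critical parameter ~ {0.5 * (lo + hi):.6f}")  # ~5.0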
Considering the relative importance of network performance and network features
Proceedings of the International Conference on Parallel Processing
Latency and bandwidth are usually considered the dominant factors in parallel application performance; however, recent studies have indicated that support for independent progress in MPI can also have a significant impact on application performance. This paper leverages the Cplant system at Sandia National Labs to compare a faster, vendor-provided MPI library without independent progress to an internally developed MPI library that sacrifices some performance to provide independent progress. The results are surprising. Although some applications see significant negative impacts from the reduced network performance, others are more sensitive to the presence of independent progress. © 2005 IEEE.
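The pattern where independent progress matters can be sketched directly: one rank starts a non-blocking send and then computes for a long stretch without calling into MPI; a library without independent progress may not advance the transfer until the next MPI call. A hypothetical mpi4py sketch of that pattern (not the paper's benchmark code):

    # Sketch of the pattern where independent progress matters. Assumes mpi4py;
    # run with: mpirun -n 2 python this_script.py
    # Without independent progress, rank 1's receive may stall until rank 0's
    # Wait(); with it, the transfer can complete during the compute phase.
    import time
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    if comm.rank == 0:
        buf = bytearray(64 * 1024 * 1024)    # large message
        req = comm.Isend(buf, dest=1, tag=0)
        time.sleep(5.0)                      # long compute phase, no MPI calls
        req.Wait()                           # only here can a non-progressing
                                             # library finish the transfer
    elif comm.rank == 1:
        buf = bytearray(64 * 1024 * 1024)
        t0 = time.time()
        comm.Recv(buf, source=0, tag=0)
        print(f"receive completed after {time.time() - t0:.1f}s")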
An analysis of the double-precision floating-point FFT on FPGAs
Proceedings - 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, FCCM 2005
Advances in FPGA technology have led to dramatic improvements in double precision floating-point performance. Modern FPGAs boast several GigaFLOPs of raw computing power. Unfortunately, this computing power is distributed across 30 floating-point units with over 10 cycles of latency each. The user must find two orders of magnitude more parallelism than is typically exploited in a single microprocessor; thus, it is not clear that the computational power of FPGAs can be exploited across a wide range of algorithms. This paper explores three implementation alternatives for the Fast Fourier Transform (FFT) on FPGAs. The algorithms are compared in terms of sustained performance and memory requirements for various FFT sizes and FPGA sizes. The results indicate that FPGAs are competitive with microprocessors in terms of performance and that the "correct" FFT implementation varies based on the size of the transform and the size of the FPGA. © 2005 IEEE.
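For reference, the butterfly structure whose parallelism such FPGA designs exploit is that of the radix-2 Cooley-Tukey FFT: each stage consists of independent butterflies that can be spread across functional units. A plain Python reference sketch, not one of the paper's three implementations:

    # Minimal iterative radix-2 Cooley-Tukey FFT, to show the butterfly
    # structure whose per-stage parallelism an FPGA design distributes
    # across its floating-point units.
    import cmath

    def fft(x):
        n = len(x)
        assert n and n & (n - 1) == 0, "length must be a power of two"
        # bit-reversal permutation of the input
        x, j = list(x), 0
        for i in range(1, n):
            bit = n >> 1
            while j & bit:
                j ^= bit
                bit >>= 1
            j |= bit
            if i < j:
                x[i], x[j] = x[j], x[i]
        # log2(n) stages; the butterflies within each stage are independent
        size = 2
        while size <= n:
            step = cmath.exp(-2j * cmath.pi / size)
            for start in range(0, n, size):
                w = 1.0
                for k in range(start, start + size // 2):
                    u, v = x[k], x[k + size // 2] * w
                    x[k], x[k + size // 2] = u + v, u - v
                    w *= step
            size *= 2
        return x

    print(fft([1, 1, 1, 1, 0, 0, 0, 0]))  # matches numpy.fft.fft on this input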
Perspectives on optimization under uncertainty: Algorithms and applications
This paper provides an overview of several approaches to formulating and solving optimization under uncertainty (OUU) engineering design problems. In addition, the topic of high-performance computing and OUU is addressed, with a discussion of the coarse- and fine-grained parallel computing opportunities in the various OUU problem formulations. The OUU approaches covered here are: sampling-based OUU, surrogate model-based OUU, analytic reliability-based OUU (also known as reliability-based design optimization), polynomial chaos-based OUU, and stochastic perturbation-based OUU.
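Of the formulations listed, sampling-based OUU is the simplest to sketch: an outer design loop wraps an inner Monte Carlo loop over the uncertain inputs, and each inner batch of samples is an independent unit of the coarse-grained parallel work the paper discusses. A minimal Python sketch with a hypothetical objective and a crude grid-search outer loop:

    # Minimal sketch of sampling-based OUU: outer design search, inner Monte
    # Carlo over uncertain inputs. The response function, bounds, sample size,
    # and risk weighting below are hypothetical placeholders.
    import random
    import statistics

    def response(design, xi):
        # Hypothetical simulation response with uncertain input xi.
        return (design - 2.0) ** 2 + 0.5 * design * xi

    def ouu_objective(design, n_samples=2000, risk_weight=1.0, seed=0):
        rng = random.Random(seed)  # fixed seed: common random numbers keep the
                                   # outer search's objective smooth
        samples = [response(design, rng.gauss(0.0, 1.0)) for _ in range(n_samples)]
        return statistics.mean(samples) + risk_weight * statistics.stdev(samples)

    # Crude outer optimizer: grid search over one design variable. Each call to
    # ouu_objective is an independent, embarrassingly parallel sample batch.
    best = min((ouu_objective(d / 10.0), d / 10.0) for d in range(0, 41))
    print(f"best design ~ {best[1]:.1f}, objective ~ {best[0]:.3f}")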
A comparison of floating point and logarithmic number systems for FPGAs
Proceedings - 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, FCCM 2005
There have been many papers proposing the logarithmic number system (LNS) as an alternative to floating point because of its simpler multiplication, division, and exponentiation computations [1,4-9,13]. However, this advantage comes at the cost of complicated, inexact addition and subtraction, as well as the need to convert between the formats. In this work, we created a parameterized LNS library of computational units and compared them to an existing floating point library. Specifically, we considered multiplication, division, addition, subtraction, and format conversion to determine when one format should be used over the other and when it is advantageous to change formats during a calculation. © 2005 IEEE.
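The trade-off the abstract describes is easy to see in code: in LNS, multiplication and division reduce to fixed-point addition and subtraction of the stored logarithms, while addition requires evaluating the nonlinear function log2(1 + 2^d). A minimal Python sketch, with double-precision floats standing in for the fixed-point log representation an FPGA would use:

    # Sketch of the LNS trade-off: cheap multiply, expensive add.
    # A value x is stored as (sign, log2|x|).
    import math

    def to_lns(x):
        return (1 if x >= 0 else -1, math.log2(abs(x)))

    def from_lns(v):
        sign, e = v
        return sign * 2.0 ** e

    def lns_mul(a, b):                     # cheap: just add the logs
        return (a[0] * b[0], a[1] + b[1])

    def lns_add(a, b):                     # expensive: needs log2(1 + 2^d)
        (sa, ea), (sb, eb) = (a, b) if a[1] >= b[1] else (b, a)
        d = eb - ea                        # d <= 0 by the ordering above
        if sa == sb:
            return (sa, ea + math.log2(1.0 + 2.0 ** d))
        # cancellation case; a zero result (d == 0) needs a special case
        # that this sketch omits
        return (sa, ea + math.log2(1.0 - 2.0 ** d))

    x, y = to_lns(6.0), to_lns(7.0)
    print(from_lns(lns_mul(x, y)))   # 42.0
    print(from_lns(lns_add(x, y)))   # 13.0, within rounding

In hardware, the log2(1 + 2^d) term is what drives the lookup tables and interpolation logic whose cost the paper weighs against floating-point adders.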
A comparison of Navier-Stokes and network models to predict chemical transport in municipal water distribution systems
World Water Congress 2005: Impacts of Global Climate Change - Proceedings of the 2005 World Water and Environmental Resources Congress
Reversible logic for supercomputing
2005 Computing Frontiers Conference
This paper is about making reversible logic a reality for supercomputing. Reversible logic offers a way to exceed certain basic limits on the performance of computers, yet a powerful case will have to be made to justify its substantial development expense. This paper explores the limits of current, irreversible logic for supercomputers, thus establishing a threshold above which reversible logic is the only solution. Problems above this threshold are discussed, with the science and mitigation of global warming treated in detail. To further develop the idea of using reversible logic in supercomputing, a design for a 1 Zettaflops supercomputer, as required for addressing global climate warming, is presented. Creating such a design, however, requires deviations from the mainstream of both the software for climate simulation and the research directions of reversible logic. These deviations provide direction on how to make reversible logic practical. Copyright 2005 ACM.
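As a pointer to what "reversible logic" means here: a reversible gate computes a bijection on its input states, so it erases no information and in principle need not pay the Landauer energy cost per erased bit that irreversible gates do. The standard universal example is the Toffoli (controlled-controlled-NOT) gate; the short check below confirms it merely permutes its eight input states:

    # A reversible gate is a bijection on its inputs. The Toffoli gate flips
    # its target bit c only when both controls a and b are 1; checking that it
    # permutes all 8 input states shows no information is erased.
    from itertools import product

    def toffoli(a, b, c):
        return a, b, c ^ (a & b)

    states = list(product((0, 1), repeat=3))
    images = [toffoli(*s) for s in states]
    assert sorted(images) == states          # bijective: no merged states
    print("Toffoli is a permutation of its 8 input states")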
The ParaView Guide (2.4) Introduction to Parallel Computing and Visualization
Abstract not provided.
The ParaView Guide (2.4) Parallel ParaView
Abstract not provided.
Response surface (Meta-model) methods and applications
Abstract not provided.
Molecular simulations of beta-amyloid protein near hydrated lipids (PECASE)
We performed molecular dynamics simulations of beta-amyloid (Aβ) protein and the Aβ fragment (31-42) in bulk water and near hydrated lipids to study the mechanism of neurotoxicity associated with the aggregation of the protein. We constructed full atomistic models using Cerius2 and ran simulations using LAMMPS. MD simulations with different conformations and positions of the protein fragment were performed. Thermodynamic properties were compared with previous literature and the results were analyzed. Longer simulations and data analyses based on the free energy profiles along the distance between the protein and the interface are ongoing.
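The "free energy profiles along the distance" mentioned above are typically estimated from the simulation's distance histogram via F(d) = -kT ln P(d). A minimal Python sketch of that analysis step, using synthetic distances in place of LAMMPS trajectory output:

    # Sketch of a free energy profile from sampled distances: histogram the
    # protein-interface distance, then F(d) = -kT ln P(d). The distances
    # below are synthetic stand-ins, not LAMMPS output.
    import math
    import random

    kT = 0.593  # kcal/mol at ~298 K
    random.seed(1)
    distances = [abs(random.gauss(1.5, 0.4)) for _ in range(20000)]  # nm

    nbins, dmax = 30, 3.0
    counts = [0] * nbins
    for d in distances:
        if d < dmax:
            counts[int(d / dmax * nbins)] += 1

    total = sum(counts)
    for i, c in enumerate(counts):
        if c:
            f = -kT * math.log(c / total)          # F(d) = -kT ln P(d)
            print(f"d = {(i + 0.5) * dmax / nbins:.2f} nm  F = {f:6.2f} kcal/mol")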
New Automatic Differentiation Tools Expedite Code Development and Enable New Design Algorithms
Abstract not provided.
Error Estimation Approaches for Progressive Response Surfaces - More Results
Abstract not provided.
Laboratory Biosecurity and Biosafety in the Middle East
Abstract not provided.
Advanced Modeling and Simulation for Fluid Flow and Heat & Mass Transfer Applications
Abstract not provided.
What's New in NOX...Plus New Techniques for Solving Large-Scale Steady-State and Transient Stability Analysis Problems with LOCA
Abstract not provided.
Quantum Programming for Classical Programmers
Abstract not provided.
Scalable InfiniBand Cluster Architectures
Abstract not provided.