Publications Search

LDRD final report : massive multithreading applied to national infrastructure and informatics

Barrett, Brian; Hendrickson, Bruce A.; Laviolette, Randall A.; Leung, Vitus J.; Mackey, Greg E.; Murphy, Richard C.; Phillips, Cynthia A.; Pinar, Ali P.

Large relational datasets such as national-scale social networks and power grids present different computational challenges than do physical simulations. Sandia's distributed-memory supercomputers are well suited for solving problems concerning the latter, but not the former. The reason is that problems such as pattern recognition and knowledge discovery on large networks are dominated by memory latency and not by computation. Furthermore, most memory requests in these applications are very small, and when the datasets are large, most requests miss the cache. The result is extremely low utilization. We are unlikely to be able to grow out of this problem with conventional architectures. As the power density of microprocessors has approached that of a nuclear reactor in the past two years, we have seen a leveling of Moore's Law. Building larger and larger microprocessor-based supercomputers is not a solution for informatics and network infrastructure problems since the additional processors are utilized to only a tiny fraction of their capacity. An alternative solution is to use the paradigm of massive multithreading with a large shared memory. There is only one instance of this paradigm today: the Cray MTA-2. The proposal team has unique experience with and access to this machine. The XMT, which is now being delivered, is a Red Storm machine with up to 8192 multithreaded 'Threadstorm' processors and 128 TB of shared memory. For many years, the XMT will be the only way to address very large graph problems efficiently, and future generations of supercomputers will include multithreaded processors. Roughly 10 MTA processors can process a simple short paths problem in the time taken by the Gordon Bell Prize-nominated distributed memory code on 32,000 processors of Blue Gene/Light.
We have developed algorithms and open-source software for the XMT, and have modified that software to run some of these algorithms on other multithreaded platforms such as the Sun Niagara and Opteron multi-core chips.

TYPE SAND Report YEAR 2009

DOI OSTI

Palacios and Kitten : high performance operating systems for scalable virtualized and native supercomputing

Pedretti, Kevin T.T.; Levenhagen, Michael; Brightwell, Ronald B.

Palacios and Kitten are new open source tools that enable applications, whether ported or not, to achieve scalable high performance on large machines. They provide a thin layer over the hardware to support both full-featured virtualized environments and native code bases. Kitten is an OS under development at Sandia that implements a lightweight kernel architecture to provide predictable behavior and increased flexibility on large machines, while also providing Linux binary compatibility. Palacios is a VMM that is under development at Northwestern University and the University of New Mexico. Palacios, which can be embedded into Kitten and other OSes, supports existing, unmodified applications and operating systems by using virtualization that leverages hardware technologies. We describe the design and implementation of both Kitten and Palacios. Our benchmarks show that they provide near-native, scalable performance. Palacios and Kitten provide an incremental path to using supercomputer resources without compromising performance.

TYPE SAND Report YEAR 2009

DOI OSTI

Waste Forms and Systems Integrated Performance and Safety Codes System Design Specification

Edwards, Harold C.; Freeze, Geoffrey; Schultz, Peter A.; Arguello, Jose G.; Bartlett, Roscoe; Wang, Yifeng

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Statistical Theory of the List Experiment to Measure Socially Sensitive Attitudes

Siefert, Christopher

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

An Implementation of the Generalized Finite Element Method for Large Scale Modeling and Simulation of Polycrystalline Ferroelectric Ceramics

Robbins, Joshua; Voth, Thomas E.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

A Comparison of Intrusive Stochastic Galerkin Methods for Uncertainty Quantification of Stochastic PDEs

Phipps, Eric T.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

A Nodal-based Variational Multiscale Method for Lagrangian Shock Hydrodynamics

Computer Methods in Applied Mechanics and Engineering

Shadid, John N.; Love, Edward; Rider, William J.

Abstract not provided.

TYPE Journal Article YEAR 2009

OSTI

HPC application fault-tolerance using transparent redundant computation

Ferreira, Kurt; Riesen, Rolf; Oldfield, Ron; Brightwell, Ronald B.; Laros, James H.; Pedretti, Kevin P.

As the core count of HPC machines continues to grow, issues such as fault tolerance and reliability are becoming limiting factors for application scalability. Current techniques to ensure progress across faults, for example coordinated checkpoint-restart, are unsuitable for machines of this scale due to their predicted high overheads. In this study, we present the design and implementation of a novel system for ensuring reliability which uses transparent, rank-level, redundant computation. Using this system, we show the overheads involved in redundant computation for a number of real-world HPC applications. Additionally, we relate the communication characteristics of an application to the overheads observed.

TYPE Conference YEAR 2009

OSTI

CAT Workshop 2009 Poster

Mitchell, Scott A.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

The subsystem functional scheme: The Armiento-Mattsson 2005 (AM05) functional and beyond

Wills, Ann E.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Risks and Metrics in Influence Ops Modeling

Trucano, Timothy G.; Backus, George A.; Hills, Richard G.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Structural simulation toolkit

Rodrigues, Arun

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Persistent homology for parameter sensitivity in large-scale text-analysis (informatics) graphs

Dunlavy, Daniel M.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

First principles site occupation and migration of helium in Beta-phase erbium hydride

Snow, Clark S.; Wixom, Ryan R.; Schultz, Peter A.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Diverging Color Maps for Scientific Visualization (Expanded)

Moreland, Kenneth D.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Diverging Color Maps for Scientific Visualization

Moreland, Kenneth D.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

An Optimization Approach for Fitting Canonical Tensor Decompositions

Acar Ataman, Evrim N.; Kolda, Tamara G.; Dunlavy, Daniel M.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Understanding the neurophysiology of analogy-making through computational modeling

Speed, Ann E.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Viscoplasticity using Peridynamics

Foster, John T.; Silling, Stewart

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Projective Integration for Simulating Multiple Timescale Diffusion Processes in Solids

Wagner, Gregory J.; Zhou, Xiaowang; Plimpton, Steven J.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Integrating error estimation, adaptivity, and optimization

Van Bloemen Waanders, Bart

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Catamount Lightweight Kernel

Brightwell, Ronald B.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Numerical Approaches for the Quadratic Eigenvalue Problem on Large Structural Acoustic Systems

Reese, Garth M.; Walsh, Timothy W.; Baker, Christopher G.; Jones, Andrea N.A.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Algebraic connectivity and graph robustness

Feddema, John T.

Recent papers have used Fiedler's definition of algebraic connectivity to show that network robustness, as measured by node-connectivity and edge-connectivity, can be increased by increasing the algebraic connectivity of the network. By the definition of algebraic connectivity, the second smallest eigenvalue of the graph Laplacian is a lower bound on the node-connectivity. In this paper we show that for circular random lattice graphs and mesh graphs algebraic connectivity is a conservative lower bound, and that increases in algebraic connectivity actually correspond to a decrease in node-connectivity. This means that the networks are actually less robust with respect to node-connectivity as the algebraic connectivity increases. However, an increase in algebraic connectivity seems to correlate well with a decrease in the characteristic path length of these networks - which would result in quicker communication through the network. Applications of these results are then discussed for perimeter security.
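
The bound mentioned in this abstract can be checked on small graphs. The following is a minimal sketch, not the report's code: it estimates the second smallest Laplacian eigenvalue with a shifted power iteration that projects out the constant vector. The function names, the iteration count, and the choice of an 8-node cycle and path as test graphs are all assumptions made for illustration.

```python
import math


def laplacian(n, edges):
    """Dense graph Laplacian (degree matrix minus adjacency) of an undirected graph."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L


def algebraic_connectivity(n, edges, iters=2000):
    """Estimate the second smallest Laplacian eigenvalue of a connected graph.

    Power iteration is applied to c*I - L, where c is twice the maximum
    degree (an upper bound on the largest eigenvalue of L). Removing the
    mean at each pass keeps the iterate orthogonal to the constant vector,
    which is the eigenvector for eigenvalue 0.
    """
    L = laplacian(n, edges)
    c = 2.0 * max(L[i][i] for i in range(n))
    x = [float(i) for i in range(n)]  # hypothetical deterministic start vector
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        y = [c * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(yi * yi for yi in y))
        x = [yi / norm for yi in y]
    # Rayleigh quotient of L at the converged vector.
    Lx = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(x[i] * Lx[i] for i in range(n)) / sum(xi * xi for xi in x)


if __name__ == "__main__":
    cycle = [(i, (i + 1) % 8) for i in range(8)]  # node-connectivity 2
    path = [(i, i + 1) for i in range(7)]         # node-connectivity 1
    print("cycle:", algebraic_connectivity(8, cycle))
    print("path: ", algebraic_connectivity(8, path))
```

For both test graphs the estimate stays below the node-connectivity, consistent with the lower-bound property the abstract cites. The dense matrix and fixed iteration count are only reasonable for toy graphs; a sparse eigensolver would be the usual choice at scale.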

TYPE SAND Report YEAR 2009

DOI OSTI

Recent Experiences on Performance and Scalability of SNL Applications on Red Storm and TLCC

Rajan, Mahesh; Doerfler, Douglas W.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Copy of IEEE Vis 2009 ParaView Tutorial Plugins

Moreland, Kenneth D.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

IceT users' guide and reference

Moreland, Kenneth D.

The Image Composition Engine for Tiles (IceT) is a high-performance sort-last parallel rendering library. In addition to providing accelerated rendering for a standard display, IceT provides the unique ability to generate images for tiled displays. The overall resolution of the display may be several times larger than any viewport that may be rendered by a single machine. This document is an overview of the user interface to IceT.

TYPE SAND Report YEAR 2009

DOI OSTI

Globalized Newton-Krylov Solvers Applied to Large-scale Simulation of Navier-Stokes and Magneto-hydrodynamic Systems

Pawlowski, Roger

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Sandia Simulation and Networking

Hemmert, Karl S.; Rodrigues, Arun

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Summary of Modifications to Tabular EOS Material Driver

Carpenter, John H.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Self-diffusion in Mo using the AM05 density functional

Mattsson, Thomas; Wills, Ann E.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Density Functional Theory (DFT) Simulations of Shocked Liquid Xenon

Mattsson, Thomas; Magyar, Rudolph J.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Simulating Lifetime Diabetes Risk among Mexican-Americans Living along the US-Mexico Border: An Agent-Based Modeling Approach

Watson, Jean-Paul; Diegert, Carl; Rintoul, Mark D.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Brief Announcement: The Impact of Classical Electronics Constraints on a Solid-State Logical Qubit Memory

Levy, James E.; Ganti, Anand; Phillips, Cynthia A.; Hamlet, Benjamin R.; Carroll, M.S.; Landahl, Andrew J.; Gurrieri, Thomas; Carr, Robert D.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Trilinos Tutorial

Pawlowski, Roger

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Access to external resources using service-node proxies

Wilson, Andrew T.; Davidson, George W.; Ulmer, Craig; Oldfield, Ron

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

On Scaling I/O for Commodity Clusters

Rudish, Donald W.; Cranford, Scott C.; Ward, Harry L.; Allan, Benjamin A.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Red Storm / Cray XT4: A Superior Architecture for Scalability

Doerfler, Douglas W.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Rethinking a Pythonic Modeling Architecture

Hart, William E.; Watson, Jean-Paul

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Evolution of Biosecurity

Gaudioso, Jennifer M.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

LOCA and Other Trilinos Tools for Analysis of Large-Scale Dynamical Systems

Phipps, Eric T.; Salinger, Andrew G.; Pawlowski, Roger

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Recent Advances in Non-Intrusive Polynomial Chaos and Stochastic Collocation Methods for Uncertainty Analysis and Design

Eldred, Michael

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Using adversary text to detect adversary phase changes

Doser, Adele; Speed, Ann E.; Warrender, Christina E.

The purpose of this work was to help develop a research roadmap and small proof of concept for addressing key problems and gaps from the perspective of using text analysis methods as a primary tool for detecting when a group is undergoing a phase change. Self-organizing map (SOM) techniques were used to analyze text data obtained from the world-wide web. Statistical studies indicate that it may be possible to predict phase changes, as well as detect whether or not an example of writing can be attributed to a group of interest.

TYPE SAND Report YEAR 2009

DOI OSTI

Catamount N-Way Performance on XT5

Brightwell, Ronald B.; Kelly, Suzanne M.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

The Canonical Tensor Decomposition and Its Applications to Data Analysis

Acar Ataman, Evrim N.; Kolda, Tamara G.; Dunlavy, Daniel M.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Using adversary text to detect adversary phase changes

Doser, Adele; Speed, Ann E.; Warrender, Christina E.

Abstract not provided.

TYPE SAND Report YEAR 2009

DOI OSTI

Application Performance on a Mildly Heterogeneous Supercomputer

Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

NEAMS: Overview of Verification

Stewart, James

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Decision Support for Integrated Water-Energy Planning

Tidwell, Vincent C.; Kobos, Peter; Malczynski, Leonard A.; Hart, William E.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Parallel phase model : a programming model for high-end parallel machines with manycores

Brightwell, Ronald B.; Heroux, Michael A.; Wen, Zhaofang

This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.

TYPE SAND Report YEAR 2009

DOI OSTI

An Algebraic Multigrid Method for Compatible Least-Squares Formulations of Div-Curl Equations

Siefert, Christopher; Bochev, Pavel B.; Peterson, Kara J.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

What's New in ParaView (DOECGF 2009)

Moreland, Kenneth D.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

UltraVis Overview for DOECGF 2009

Moreland, Kenneth D.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

A semantic disambiguation algorithm to reason about cars from the shapes in over-segmentation of high-resolution orthophotos

Diegert, Carl

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

A scalable and adaptable solution framework within components of the Community Climate System Model

Springer Lecture Notes

Rouson, Damian R.; Salinger, Andrew G.

Abstract not provided.

TYPE Journal Article YEAR 2009

OSTI

Memory in Silico: Building a Neuromimetic Episodic Cognitive Model

Taylor, Shawn E.; Bernard, Michael; Vineyard, Craig M.; Verzi, Stephen J.; Morrow, James D.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

An extensible operating system design for large-scale parallel machines

Riesen, Rolf; Ferreira, Kurt

Running untrusted user-level code inside an operating system kernel was studied in the 1990s but never really caught on. We believe the time has come to resurrect kernel extensions for operating systems that run on highly-parallel clusters and supercomputers. The reason is that the usage model for these machines differs significantly from that of a desktop machine or a server. In addition, vendors are starting to add features such as floating-point accelerators, multicore processors, and reconfigurable compute elements. An operating system for such machines must be adaptable to the requirements of specific applications and provide abstractions to access next-generation hardware features, without sacrificing performance or scalability.

TYPE SAND Report YEAR 2009

DOI OSTI

Red Storm/XT4: A Superior Architecture for Scalability

Doerfler, Douglas W.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Adjoint based optimization and adaptivity for flow and transport problems

Carnes, Brian R.; Bartlett, Roscoe

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Timed-Run Scheduling

Leung, Vitus J.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

A Flexible Approach for the Statistical Visualization of Ensemble Data

Potter, Kristin C.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

LSAView: A Tool for Visual Exploration of Latent Semantic Modeling

Dunlavy, Daniel M.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Algorithmic properties of the midpoint predictor-corrector time integrator

Love, Edward; Scovazzi, Guglielmo S.; Rider, William J.

Algorithmic properties of the midpoint predictor-corrector time integration algorithm are examined. In the case of a finite number of iterations, the errors in angular momentum conservation and incremental objectivity are controlled by the number of iterations performed. Exact angular momentum conservation and exact incremental objectivity are achieved in the limit of an infinite number of iterations. A complete stability and dispersion analysis of the linearized algorithm is detailed. The main observation is that stability depends critically on the number of iterations performed.
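
The dependence on iteration count described in this abstract can be seen in a toy setting. The sketch below is an assumption-laden illustration, not the report's algorithm: it treats the scheme as a fixed number of corrector passes on the implicit midpoint rule, starting from the predictor y_new = y, and applies it to a unit harmonic oscillator. The names `midpoint_pc_step`, `oscillator`, and `energy_after` are hypothetical.

```python
def midpoint_pc_step(f, y, h, iterations):
    """Advance y by one step of size h.

    Starts from the predictor y_new = y and applies the corrector
    y_new = y + h * f((y + y_new) / 2) a fixed number of times.
    """
    y_new = list(y)
    for _ in range(iterations):
        y_mid = [0.5 * (a + b) for a, b in zip(y, y_new)]
        y_new = [a + h * d for a, d in zip(y, f(y_mid))]
    return y_new


def oscillator(y):
    """Right-hand side of q' = p, p' = -q (unit harmonic oscillator)."""
    q, p = y
    return [p, -q]


def energy_after(steps, h, iterations):
    """Energy (q^2 + p^2) / 2 after integrating from q = 1, p = 0."""
    y = [1.0, 0.0]
    for _ in range(steps):
        y = midpoint_pc_step(oscillator, y, h, iterations)
    return 0.5 * (y[0] ** 2 + y[1] ** 2)


if __name__ == "__main__":
    for k in (1, 2, 3, 4, 50):
        print(k, "iterations -> energy", energy_after(100, 0.1, k))
```

For this linear test problem the energy grows with one or two corrector passes, decays slowly with three or four, and is conserved to rounding error once the iteration has converged. That is a property of this toy problem only and says nothing about the hydrodynamics setting the report analyzes.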

TYPE SAND Report YEAR 2009

DOI OSTI

Link Prediction on Evolving Data using Tensor Factorizations

Acar Ataman, Evrim N.; Kolda, Tamara G.; Dunlavy, Daniel M.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Simulation & Modeling

Rodrigues, Arun

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Can we continue to build supercomputers out of processors optimized for laptops?

Murphy, Richard C.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Energy Minimizing Algebraic Multigrid for Systems of Partial Differential Equations

Tuminaro, Raymond S.; Hu, Jonathan J.; Cyr, Eric C.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Verification of complex codes

Ober, Curtis C.

Over the past several years, verifying and validating complex codes at Sandia National Laboratories has become a major part of code development. These aspects tackle two important parts of simulation modeling: determining if the models have been correctly implemented - verification, and determining if the correct models have been selected - validation. In this talk, we will focus on verification and discuss the basics of code verification and its application to a few codes and problems at Sandia.

TYPE Conference YEAR 2009

OSTI

Notes on a gap in advancing geospatial image processing methods for NA-22

Diegert, Carl

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Model-free Learning and Control in a Mobile Robot

Rohrer, Brandon R.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Xyce Parallel Electronic Simulator : reference guide, version 4.1

Keiter, Eric R.; Mei, Ting; Russo, Thomas V.; Pawlowski, Roger; Schiek, Richard; Santarelli, Keith R.; Coffey, Todd S.; Thornquist, Heidi K.

This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.

TYPE SAND Report YEAR 2009

DOI OSTI

Xyce Parallel Electronic Simulator : users' guide, version 4.1

Keiter, Eric R.; Mei, Ting; Russo, Thomas V.; Pawlowski, Roger; Schiek, Richard; Santarelli, Keith R.; Coffey, Todd S.; Thornquist, Heidi K.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. 
As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.

TYPE SAND Report YEAR 2009

DOI OSTI

CSSE Simulation Tools Quarterly Update FY2009 Q2

Rodrigues, Arun; Adalsteinsson, Helgi; Cranford, Scott C.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Enabling immersive simulation

Abbott, Robert G.; Basilico, Justin D.; Glickman, Matthew R.; Hart, Derek; Whetzel, Jonathan H.

The object of the 'Enabling Immersive Simulation for Complex Systems Analysis and Training' LDRD has been to research, design, and engineer a capability to develop simulations which (1) provide a rich, immersive interface for participation by real humans (exploiting existing high-performance game-engine technology wherever possible), and (2) can leverage Sandia's substantial investment in high-fidelity physical and cognitive models implemented in the Umbra simulation framework. We report here on these efforts. First, we describe the integration of Sandia's Umbra modular simulation framework with the open-source Delta3D game engine. Next, we report on Umbra's integration with Sandia's Cognitive Foundry, specifically to provide for learning behaviors for 'virtual teammates' directly from observed human behavior. Finally, we describe the integration of Delta3D with the ABL behavior engine, and report on research into establishing the theoretical framework that will be required to make use of tools like ABL to scale up to increasingly rich and realistic virtual characters.

TYPE SAND Report YEAR 2009

DOI OSTI

EEG analyses with SOBI

Glickman, Matthew R.

The motivating vision behind Sandia's MENTOR/PAL LDRD project has been that of systems which use real-time psychophysiological data to support and enhance human performance, both of individuals and of groups. Because relevant and significant psychophysiological data are a necessary prerequisite to such systems, this LDRD has focused on identifying and refining such signals. The project has focused in particular on EEG (electroencephalogram) data as a promising candidate signal because it (potentially) provides a broad window on brain activity with relatively low cost and logistical constraints. We report here on two analyses performed on EEG data collected in this project using the SOBI (Second Order Blind Identification) algorithm to identify two independent sources of brain activity: one in the frontal lobe and one in the occipital. The first study looks at directional influences between the two components, while the second study looks at inferring gender based upon the frontal component.

TYPE SAND Report YEAR 2009

DOI OSTI

An optimization approach for fitting canonical tensor decompositions

Acar Ataman, Evrim N.; Dunlavy, Daniel M.

Tensor decompositions are higher-order analogues of matrix decompositions and have proven to be powerful tools for data analysis. In particular, we are interested in the canonical tensor decomposition, otherwise known as the CANDECOMP/PARAFAC decomposition (CPD), which expresses a tensor as the sum of component rank-one tensors and is used in a multitude of applications such as chemometrics, signal processing, neuroscience, and web analysis. The task of computing the CPD, however, can be difficult. The typical approach is based on alternating least squares (ALS) optimization, which can be remarkably fast but is not very accurate. Previously, nonlinear least squares (NLS) methods have also been recommended; existing NLS methods are accurate but slow. In this paper, we propose the use of gradient-based optimization methods. We discuss the mathematical calculation of the derivatives and further show that they can be computed efficiently, at the same cost as one iteration of ALS. Computational experiments demonstrate that the gradient-based optimization methods are much more accurate than ALS and orders of magnitude faster than NLS.
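
To make the objective in this abstract concrete, here is a minimal sketch for a three-way tensor, assuming the usual least-squares form f = 0.5 * ||X - [[A, B, C]]||^2. It computes f and its gradient with respect to the first factor matrix A using plain loops. The function names are hypothetical, and the naive loops make no attempt at the efficient computation the report describes.

```python
def cp_model(A, B, C):
    """Tensor with entries M[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r]."""
    R = len(A[0])
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(R))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]


def cp_objective_and_grad_A(X, A, B, C):
    """Return f = 0.5 * sum of squared residuals, and df/dA.

    With residual e[i][j][k] = M[i][j][k] - X[i][j][k], the gradient entry is
    df/dA[i][r] = sum over j, k of e[i][j][k] * B[j][r] * C[k][r].
    """
    R = len(A[0])
    M = cp_model(A, B, C)
    f = 0.0
    G = [[0.0] * R for _ in range(len(A))]
    for i in range(len(A)):
        for j in range(len(B)):
            for k in range(len(C)):
                e = M[i][j][k] - X[i][j][k]
                f += 0.5 * e * e
                for r in range(R):
                    G[i][r] += e * B[j][r] * C[k][r]
    return f, G


if __name__ == "__main__":
    B = [[0.7, -1.0], [0.2, 0.4], [1.5, 0.1]]
    C = [[1.0, 1.0], [-0.5, 0.3]]
    X = cp_model([[0.9, 0.4], [-0.2, 1.8]], B, C)  # synthetic rank-2 data
    A = [[1.0, 0.5], [-0.3, 2.0]]                  # perturbed starting guess
    f, G = cp_objective_and_grad_A(X, A, B, C)
    print("objective:", f)
    print("gradient wrt A:", G)
```

The gradients with respect to B and C follow by symmetry. In a gradient-based fit, all three would be stacked and passed to a general-purpose optimizer.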

TYPE SAND Report YEAR 2009

DOI OSTI

The Dual-Use Dilemma

Gaudioso, Jennifer M.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Biosecurity Policy Drivers

Gaudioso, Jennifer M.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

The Design for Tractable Analysis (DTA) Framework: A Methodology for the Analysis and Simulation of Complex Systems

International Journal of Decision Support System Technology (IJDSST)

Linebarger, John M.; De Spain, Mark J.; Mcdonald, Michael J.; Spencer, Floyd W.; Cloutier, Robert J.

The Design for Tractable Analysis (DTA) framework was developed to address the analysis of complex systems and so-called “wicked problems.” DTA is distinctive because it treats analytic processes as key artifacts that can be created and improved through formal design processes. Systems (or enterprises) are analyzed as a whole, in conjunction with decomposing them into constituent elements for domain-specific analyses that are informed by the whole. After using the Systems Modeling Language (SysML) to frame the problem in the context of stakeholder needs, DTA harnesses the Design Structure Matrix (DSM) to structure the analysis of the system and address questions about the emergent properties of the system. The novel use of DSM to “design the analysis” makes DTA particularly suitable for addressing the interdependent nature of complex systems. The use of DTA is demonstrated by a case study of sensor grid placement decisions to secure assets at a fixed site. © 2009, IGI Global. All rights reserved.

TYPE Journal Article YEAR 2009

Scopus OSTI

Low-dimensional modeling for spatial developing free shear layers

Barone, Matthew F.; Van Bloemen Waanders, Bart

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

SPECIAL FINITE ELEMENT METHODS BASED ON COMPONENT MODE SYNTHESIS TECHNIQUES

ESAIM: Mathematical Modelling and Numerical Analysis

Lehoucq, Rich

Abstract not provided.

TYPE Journal Article YEAR 2009

OSTI

Peridynamic modeling of the dynamic response of heterogeneous media

Silling, Stewart; Lehoucq, Rich

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

I/O trace data from homme_cam_3_2_59 code runs

Ward, Harry L.

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Calculation of chemical reaction energies using the AM05 density functional

Journal of Computational Chemistry

Wills, Ann E.; Janssen, Curtis L.

Abstract not provided.

TYPE Journal Article YEAR 2009

OSTI

Improving The Semidefinite Programming Bound To Max Cut

Operations Research Letters

Carr, Robert D.

Abstract not provided.

TYPE Journal Article YEAR 2009

OSTI

Verification Validation Uncertainty Quantification Predictive Modeling and Simulation: Integration of NW Capabilities into NEAMS

Stewart, James

Abstract not provided.

TYPE Presentation YEAR 2009

OSTI

Interoperable mesh components for large-scale, distributed-memory simulations

Journal of Physics: Conference Series

Devine, Karen; Diachin, L.; Kraftcheck, J.; Jansen, K.E.; Leung, Vitus J.; Luo, X.; Miller, M.; Ollivier-Gooch, C.; Ovcharenko, A.; Sahni, O.; Shephard, M.S.; Tautges, T.; Xie, T.; Zhou, M.

SciDAC applications have a demonstrated need for advanced software tools to manage the complexities associated with sophisticated geometry, mesh, and field manipulation tasks, particularly as computer architectures move toward the petascale. In this paper, we describe a software component - an abstract data model and programming interface - designed to provide support for parallel unstructured mesh operations. We describe key issues that must be addressed to successfully provide high-performance, distributed-memory unstructured mesh services and highlight some recent research accomplishments in developing new load balancing and MPI-based communication libraries appropriate for leadership class computing. Finally, we give examples of the use of parallel adaptive mesh modification in two SciDAC applications. © 2009 IOP Publishing Ltd.

TYPE Conference YEAR 2009

Scopus OSTI

DOE's Institute for Advanced Architecture and Algorithms: An application-driven approach

Journal of Physics: Conference Series

Murphy, Richard C.

This paper describes an application-driven methodology for understanding the impact of future architecture decisions at the end of the MPP era. Fundamental transistor device limitations combined with application performance characteristics have driven the switch to multicore/multithreaded architectures. Designing large-scale supercomputers to match application demands is particularly challenging since performance characteristics are highly counter-intuitive. In fact, data movement, more than FLOPS, dominates. This work discusses some basic performance analysis for a set of DOE applications, the limits of CMOS technology, and the impact of both on future architectures. © 2009 IOP Publishing Ltd.

TYPE Conference YEAR 2009

OSTI Scopus

Current trends in parallel computation and the implications for modeling and optimization

Computer Aided Chemical Engineering

Siirola, John D.

Process Systems Engineering (PSE) is built on the application of computational tools to the solution of physical engineering problems. Over the course of its nearly five-decade history, advances in PSE have relied roughly equally on advancements in desktop computing technology and developments of new tools and approaches for representing and solving problems (Westerberg, 2004). Just as desktop computing development over that period focused on increasing the net serial instruction rate, tool development in PSE has emphasized creating faster general-purpose serial algorithms. However, in recent years the increase in net serial instruction rate has slowed dramatically, with processors first reaching an effective upper limit for clock speed and now approaching apparent limits for microarchitecture efficiency. Current trends in desktop processor development suggest that future performance gains will occur primarily through exploitation of parallelism. For PSE to continue to leverage the "free" advancements from desktop computing technology in the future, the PSE toolset will need to embrace the use of parallelization. Unfortunately, "parallelization" is more than just identifying multiple things to do at once. Parallel algorithm design has two fundamental challenges: first, to match the characteristics of the parallelizable problem workload to the capabilities of the hardware platform, and second, to properly balance parallel computation with the overhead of communication and synchronization on that platform. The performance of any parallel algorithm is thus a strong function of how well the characteristics of the problem and algorithm match those of the hardware platform on which it will run. This has led to a proliferation of highly specialized parallel hardware platforms, each designed around specific problems or problem classes.
While every platform has its own unique characteristics, we can group current approaches into six basic classes: symmetric multiprocessing (SMP), networks of workstations (NOW), massively parallel processing (MPP), specialized coprocessors, multi-threaded shared memory, and hybrids that combine components of the first five classes. Perhaps the most familiar of these is the SMP architecture, which forms the bulk of the current desktop and workstation market. These systems have multiple processing units (processors and/or cores) controlled by a single operating system image and sharing a single common shared memory space. While SMP systems provide only a modest level of parallelism (typically 2-16 processing units), the existence of shared memory and full-featured processing units makes them perhaps the most straightforward development platform. A challenge of SMP platforms is the discrepancy between the speed of the processor and the memory system: both latency and overall memory bandwidth limitations can lead to processors idling waiting for data. Clusters, a generic term for coordinated groups of independent computers (nodes) connected with high-speed networks, provide the opportunity for a radically different level of parallelism, with the largest clusters having over 25,000 nodes and 100,000 processing units. The challenge with clusters is that memory is distributed across independent nodes. Communication and coordination among nodes must be explicitly managed and occur over a relatively high-latency network interconnect. Efficient use of this architecture requires applications that decompose into pseudo-independent components that run with high computation-to-communication ratios. The level to which systems utilize commodity components distinguishes the two main types of cluster architectures, with NOW nodes running commodity network interconnects and operating systems and MPP nodes using specialized or proprietary network layers or microkernels.
Specialized coprocessors, including graphics processing units (GPU) and the Cell Broadband Engine (Cell), are gaining popularity as scientific computing platforms. These platforms employ non-general-purpose, dependent processing units to speed fine-grained, repetitive processing. Architecturally, they are reminiscent of vector computing, combining very fast access to a small amount of local memory with processing elements implementing either a single-instruction-multiple-data (SIMD) (GPU) or a pipelined (Cell) model. As application developers must explicitly manage both parallelism on the coprocessor and the movement of data to and from the coprocessor memory space, these architectures can be some of the most challenging to program. Finally, multi-threaded shared memory (MTSM) systems represent a fundamental departure from traditional distributed memory systems like NOW and MPP. Instead of a collection of independent nodes and memory spaces, an MTSM system runs a single system image across all nodes, combining all node memory into a single coherent shared memory space. To a developer, the MTSM appears to be a single very large SMP. However, unlike an SMP that uses caches to reduce the latency of a memory access, the MTSM tolerates latency by using a large number of concurrent threads. While this architecture lends itself to problems that are not readily decomposable, effective utilization of MTSM systems requires applications to run hundreds - or thousands - of concurrent threads. The proliferation of specialized parallel computing architectures presents several significant challenges for developers of parallel modeling and optimization applications. Foremost is the challenge of selecting the "appropriate" platform to target when developing the application.
While it is clear that architectural characteristics can significantly affect the performance of an algorithm, relatively few rules or heuristics exist for selecting a platform based solely on application characteristics. A contributing challenge is that different architectures employ fundamentally different programming paradigms, libraries, and tools. Knowledge and experience on one platform do not necessarily translate to other platforms. This also complicates the process of directly comparing platform performance, as applications are rarely portable: software designed for one platform rarely compiles on another without modification, and the modifications may require a redesign of the fundamental parallelization approach. A final challenge is effectively communicating parallel results. While the relatively homogeneous environment of serial desktop computing facilitated extremely terse descriptions of a test platform, often limited to processor make and clock speed, reporting results for parallel architectures must include not only processor information but also, depending on the architecture, operating system, network interconnect, coprocessor make, model, and interconnect, and node configuration. There are numerous examples of algorithms and applications designed explicitly to leverage specific architectural features of parallel systems. While by no means comprehensive, three current representative efforts are the development of parallel branch and bound algorithms, distributed collaborative optimization algorithms, and multithreaded parallel discrete event simulation. PICO, the Parallel Integer and Combinatorial Optimizer (Eckstein, et al., 2001), is a scalable parallel mixed-integer linear optimizer.
Designed explicitly for cluster environments (both NOW and MPP), PICO leverages the synergy between the inherently decomposable branch and bound tree search and the independent nature of the nodes within a cluster by distributing the independent sub-problems for the tree search across the nodes of the cluster. In contrast, agent-based collaborative optimization (Siirola, et al., 2004, 2007) matches traditionally non-decomposable nonlinear programming algorithms to high-latency clusters (e.g., NOWs or Grids) by replicating serial search algorithms intact and unmodified across the independent nodes of the cluster. The system then enforces collaboration through sharing intermediate "solutions" to the common problem. This creates a decomposable artificial meta-algorithm with a high computation-to-communication ratio that can scale efficiently on large, high-latency, low-bandwidth cluster environments. For modeling applications, efficiently parallelizing discrete event simulations has presented a longstanding challenge, with several decades of study and literature (Perumalla, 2006). The central challenge in parallelizing discrete event simulations on traditional distributed memory clusters is efficiently synchronizing the simulation time across the processing nodes during a simulation. A promising alternative approach leverages the Cray XMT (formerly called Eldorado; Feo, et al., 2005). The XMT implements an MTSM architecture and provides a single shared memory space across all nodes, greatly simplifying the time synchronization challenge. Further, the fine-grained parallelism provided by the architecture opens new opportunities for additional parallelism beyond simple event parallelization, for example, parallelizing the event queue management.
While these three examples are a small subset of current parallel algorithm design, they demonstrate the impact that parallel architectures have had and will continue to have on future developments for modeling and optimization in PSE. © 2009 Elsevier B.V. All rights reserved.
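
The trade-off this abstract describes, between parallel computation and the overhead of communication and synchronization, can be illustrated with a toy cost model. The sketch below is not taken from the paper: the function names, the linear overhead term, and the example numbers are all assumptions chosen only to show why a high computation-to-communication ratio matters.

```python
def parallel_time(work, procs, overhead_per_peer):
    """Toy run-time model (an assumption, not from the paper).

    The work divides evenly across `procs` processing units, and each
    unit pays a fixed communication/synchronization cost for every
    other unit it must coordinate with.
    """
    compute = work / procs
    communicate = overhead_per_peer * (procs - 1)
    return compute + communicate


def efficiency(work, procs, overhead_per_peer):
    """Parallel efficiency: serial time / (procs * parallel time)."""
    serial = parallel_time(work, 1, overhead_per_peer)
    return serial / (procs * parallel_time(work, procs, overhead_per_peer))


def comp_to_comm_ratio(work, procs, overhead_per_peer):
    """Computation-to-communication ratio under the same toy model."""
    return (work / procs) / (overhead_per_peer * (procs - 1))


if __name__ == "__main__":
    # Hypothetical workload: 1000 units of work, 1 unit of overhead per peer.
    for procs in (2, 10, 50):
        print(procs,
              round(efficiency(1000.0, procs, 1.0), 3),
              round(comp_to_comm_ratio(1000.0, procs, 1.0), 2))
```

Under these assumptions, efficiency stays near 1 while the computation-to-communication ratio is large and falls as more processing units are added to a fixed workload. Real platforms have very different overhead behavior, so this is only a way to reason about the trend.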

TYPE Conference YEAR 2009

OSTI Scopus

Networks Grand Challenge LDRD External Advisory Board Meeting

Rountree, Suzanne L.K.

Abstract not provided.

TYPE Presentation YEAR 2008

OSTI

Magnetic-pulse-driven Rayleigh-Taylor instability in plastically deforming metals

Niederhaus, John H.J.; Alexander, Charles S.; Haill, Thomas A.; Vogler, Tracy J.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

Multilevel Project

Tuminaro, Raymond S.; Hu, Jonathan J.; Siefert, Christopher

Abstract not provided.

TYPE Presentation YEAR 2008

OSTI

A New Parallel Strategy for Transistor-Level Circuit Simulation

Keiter, Eric R.; Thornquist, Heidi K.; Day, David M.; Boman, Erik G.; Hoekstra, Robert J.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

Climate Changes in the Arctic and The Challenge to USCG Operations

Mitchiner, John L.; Strickland, James H.; Heermann, Philip D.; Sanzero, George V.

Abstract not provided.

TYPE Presentation YEAR 2008

OSTI

Performance of an MPI-only semiconductor device simulator on a quad socket/quad core InfiniBand platform

Shadid, John N.

This preliminary study considers the scaling and performance of a finite element (FE) semiconductor device simulator on a capacity cluster with 272 compute nodes based on a homogeneous multicore node architecture utilizing 16 cores. The inter-node communication backbone for this Tri-Lab Linux Capacity Cluster (TLCC) machine is an InfiniBand interconnect. The nonuniform memory access (NUMA) nodes consist of 2.2 GHz quad socket/quad core AMD Opteron processors. The performance results for this study are obtained with a FE semiconductor device simulation code (Charon) that is based on a fully-coupled Newton-Krylov solver with domain decomposition and multilevel preconditioners. Scaling and multicore performance results are presented for large-scale problems of 100+ million unknowns on up to 4096 cores. A parallel scaling comparison is also presented with the Cray XT3/4 Red Storm capability platform. The results indicate that an MPI-only programming model for utilizing the multicore nodes is reasonably efficient on all 16 cores per compute node. However, the results also indicate that the multilevel preconditioner, which is critical for large-scale capability-type simulations, scales better on the Red Storm machine than on the TLCC machine.

TYPE SAND Report YEAR 2008

DOI OSTI

An Extensible Operating System Design for Large-Scale Parallel Machines

Riesen, Rolf; Ferreira, Kurt

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

Modeling Populations of Interest in Order to Simulate Cultural Response to Influence Activities

Bernard, Michael; Backus, George A.; Glickman, Matthew R.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

DAKOTA Training 2008: Optimization and Calibration

Adams, Brian M.; Swiler, Laura P.; Eldred, Michael; Gay, David M.

Abstract not provided.

TYPE Presentation YEAR 2008

OSTI

On the two-domain equations for gas chromatography

Romero, Louis; Parks, Michael L.

We present an analysis of gas chromatographic columns where the stationary phase is not assumed to be a thin uniform coating along the walls of the cross section. We also give an asymptotic analysis assuming that the parameter $\beta = K D^{\mathrm{II}} \rho^{\mathrm{II}} / (D^{\mathrm{I}} \rho^{\mathrm{I}})$ is small. Here $K$ is the partition coefficient, and $D^{i}$ and $\rho^{i}$, $i = \mathrm{I}, \mathrm{II}$, are the diffusivity and density in the mobile ($i = \mathrm{I}$) and stationary ($i = \mathrm{II}$) regions.

TYPE SAND Report YEAR 2008

DOI OSTI
