Publications

Results 5201–5400 of 9,998

Quantum mechanical studies of carbon structures

Ward, Donald K.; Zhou, Xiaowang Z.; Bartelt, Norman C.; Foster, Michael E.; Schultz, Peter A.; Wang, Bryan M.; Mccarty, Kevin F.

Carbon nanostructures, such as nanotubes and graphene, are of considerable interest due to their unique mechanical and electrical properties. These materials exhibit extremely high strength and conductivity when defects created during synthesis are minimized. Atomistic modeling is one technique for high-resolution studies of defect formation and mitigation. To enable simulations of the mechanical behavior and growth mechanisms of carbon nanostructures, a high-fidelity analytical bond-order potential for carbon is needed. To generate inputs for developing such a potential, we performed quantum mechanical calculations of various carbon structures.

More Details

Roadmap for Peridynamic Software Implementation

Littlewood, David J.

The application of peridynamics for engineering analysis requires an efficient and robust software implementation. Key elements include processing of the discretization, the proximity search for identification of pairwise interactions, evaluation of the constitutive model, application of a bond-damage law, and contact modeling. Additional requirements may arise from the choice of time integration scheme, for example estimation of the maximum stable time step for explicit schemes, and construction of the tangent stiffness matrix for many implicit approaches. This report summarizes progress to date on the software implementation of the peridynamic theory of solid mechanics. Discussion is focused on parallel implementation of the meshfree discretization scheme of Silling and Askari [33] in three dimensions, although much of the discussion applies to computational peridynamics in general.
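
As a concrete illustration of the proximity search step, here is a minimal brute-force sketch in Python (function and variable names are ours, not Peridigm's; production codes use spatial search structures such as k-d trees rather than the O(n²) scan shown here):

```python
import numpy as np

def build_neighbor_lists(points, horizon):
    """For each node, find all other nodes within the peridynamic
    horizon. Brute-force O(n^2) scan, for illustration only."""
    n = len(points)
    neighbors = []
    for i in range(n):
        dist = np.linalg.norm(points - points[i], axis=1)
        mask = (dist <= horizon) & (np.arange(n) != i)
        neighbors.append(np.nonzero(mask)[0])
    return neighbors

# Example: neighbor lists for a small random 3D point cloud
pts = np.random.rand(100, 3)
nbrs = build_neighbor_lists(pts, horizon=0.25)
```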

More Details

Time Series Discord Detection in Medical Data using a Parallel Relational Database

Woodbridge, Diane W.; Laros, James H.; Wilson, Andrew T.; Goldstein, Richard

Recent advances in sensor technology have made continuous real-time health monitoring available in both hospital and non-hospital settings. Because high-frequency medical sensors generate enormous volumes of data, storing and processing continuous medical data is an emerging big-data problem, and detecting anomalies in real time is especially important for detecting and preventing patient emergencies. A time series discord is a subsequence that has the maximum difference to the rest of the time series subsequences, meaning that it exhibits abnormal or unusual data trends. In this study, we implemented two versions of time series discord detection algorithms on a high-performance parallel database management system (DBMS) and applied them to 240 Hz waveform data collected from 9,723 patients. The initial brute-force version of the discord detection algorithm takes each possible subsequence and calculates its distance to the nearest non-self match to find the biggest discords in the time series. For the heuristic version of the algorithm, a combination of an array and a trie structure was applied to order the time series data for better time efficiency. The study results showed efficient data loading, decoding, and discord searches over a large amount of data, benefiting from the time series discord detection algorithm and the architectural characteristics of the parallel DBMS, including data compression, data pipelining, and task scheduling.
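
A minimal Python sketch of the brute-force version described above (names are ours): it scores each length-m subsequence by its distance to the nearest non-self (non-overlapping) match and returns the subsequence with the largest such score:

```python
import numpy as np

def brute_force_discord(series, m):
    """Top-1 discord: the length-m subsequence whose nearest
    non-self match is farthest away. O(n^2) distance computations."""
    n = len(series) - m + 1
    subs = np.array([series[i:i + m] for i in range(n)])
    best_idx, best_dist = -1, -np.inf
    for i in range(n):
        dists = np.linalg.norm(subs - subs[i], axis=1)
        nonself = dists[np.abs(np.arange(n) - i) >= m]  # skip overlaps
        if nonself.size and nonself.min() > best_dist:
            best_idx, best_dist = i, nonself.min()
    return best_idx, best_dist
```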

More Details

Active Learning in the Era of Big Data

Jamieson, Kevin; Davis, Warren L.

Active learning methods automatically adapt data collection by selecting the most informative samples in order to accelerate machine learning. Because of this, testing and comparing active learning algorithms in the real world requires collecting new datasets (adaptively), rather than simply applying algorithms to benchmark datasets, as is the norm in (passive) machine learning research. To facilitate the development, testing, and deployment of active learning for real applications, we have built an open-source software system for large-scale active learning research and experimentation. The system, called NEXT, provides a unique platform for real-world, reproducible active learning research. This paper details the challenges of building the system and demonstrates its capabilities with several experiments. The results show how experimentation can help expose strengths and weaknesses of active learning algorithms, in sometimes unexpected and enlightening ways.
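
NEXT's own algorithm implementations are not shown here; as a generic illustration of the core idea (query the sample the current model is least certain about), consider this hypothetical uncertainty-sampling round using a scikit-learn classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def next_query_index(X_labeled, y_labeled, X_pool):
    """One active-learning round for binary classification: fit on
    the labeled set, then pick the pool point whose predicted class
    probability is closest to 0.5 (most uncertain)."""
    model = LogisticRegression().fit(X_labeled, y_labeled)
    proba = model.predict_proba(X_pool)[:, 1]
    return int(np.argmin(np.abs(proba - 0.5)))
```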

More Details

Introduction to the Special Issue on Innovative Applications of Artificial Intelligence 2014

AI Magazine

Stracuzzi, David J.; Gunning, David

This issue features expanded versions of articles selected from the 2014 AAAI Conference on Innovative Applications of Artificial Intelligence held in Quebec City, Canada. We present a selection of four articles describing deployed applications plus two more articles that discuss work on emerging applications.

More Details

Training neural hardware with noisy components

Proceedings of the International Joint Conference on Neural Networks

Rothganger, Fredrick R.; Evans, Brian R.; Aimone, James B.; DeBenedictis, Erik

Some next-generation computing devices may consist of resistive memory arranged as a crossbar. Currently, the dominant approach is to use crossbars as the weight matrix of a neural network and to use learning algorithms that require small incremental weight updates, such as gradient descent (for example, backpropagation). Using real-world measurements, we demonstrate that resistive memory devices are unlikely to support such learning methods. As an alternative, we offer a random search algorithm tailored to the measured characteristics of our devices.
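
The abstract does not spell out the tailored algorithm; as a sketch of the general idea (our simplification, not the paper's method), a random-search step proposes a coarse random perturbation of the crossbar weights and keeps it only if the measured loss improves, so no small, precise weight increments are required:

```python
import numpy as np

def random_search_step(W, loss_fn, rng, scale=0.1):
    """Accept a random perturbation of the weight matrix if it
    lowers the measured loss; otherwise keep the old weights."""
    W_new = W + scale * rng.standard_normal(W.shape)
    return W_new if loss_fn(W_new) < loss_fn(W) else W

# Usage: rng = np.random.default_rng(0); W = random_search_step(W, loss, rng)
```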

More Details

Repeated play of the SVM game as a means of adaptive classification

Proceedings of the International Joint Conference on Neural Networks

Vineyard, Craig M.; Verzi, Stephen J.; James, Conrad D.; Aimone, James B.; Heileman, Gregory L.

The field of machine learning strives to develop algorithms that, through learning, lead to generalization; that is, the ability of a machine to perform a task that it was not explicitly trained for. An added challenge arises when the problem domain is dynamic or non-stationary with the data distributions or categorizations changing over time. This phenomenon is known as concept drift. Game-theoretic algorithms are often iterative by nature, consisting of repeated game play rather than a single interaction. Effectively, rather than requiring extensive retraining to update a learning model, a game-theoretic approach can adjust strategies as a novel approach to concept drift. In this paper we present a variant of our Support Vector Machine (SVM) Game classifier which may be used in an adaptive manner with repeated play to address concept drift, and show results of applying this algorithm to synthetic as well as real data.

More Details

Modeling of hydride precipitation and re-orientation

Tikare, Veena T.; Weck, Philippe F.; Mitchell, John A.

In this report, we present a thermodynamics-based model of hydride precipitation in Zr-based claddings. The model considers the state of the cladding immediately following drying, after removal from cooling pools, and describes the evolution of precipitate formation upon cooling. The pilgering process used to form Zr-based cladding imparts strong crystallographic and grain-shape texture: the basal plane of the hexagonal α-Zr grains is strongly aligned with the rolling direction, which is also the long axis of the tubular cladding, and the grains are elongated, being approximately twice as long parallel to the rolling direction as in the orthogonal directions.

More Details

Evaluating Moving Target Defense with PLADD

Jones, Stephen T.; Outkin, Alexander V.; Gearhart, Jared L.; Hobbs, Jacob A.; Siirola, John D.; Phillips, Cynthia A.; Verzi, Stephen J.; Tauritz, Daniel T.; Mulder, Samuel A.; Naugle, Asmeret B.

This project evaluates the effectiveness of moving target defense (MTD) techniques using a new game we have designed, called PLADD, inspired by the game FlipIt [28]. PLADD extends FlipIt by incorporating what we believe are key MTD concepts. We have analyzed PLADD and proven the existence of a defender strategy that pushes a rational attacker out of the game, demonstrated how limited the strategies available to an attacker are in PLADD, and derived analytic expressions for the expected utility of the game’s players in multiple game variants. We have created an algorithm for finding a defender’s optimal PLADD strategy. We show that in the special case of achieving deterrence in PLADD, MTD is not always cost effective and that its optimal deployment may shift abruptly from not using MTD at all to using it as aggressively as possible. We believe our effort provides basic, fundamental insights into the use of MTD, but conclude that a truly practical analysis requires model selection and calibration based on real scenarios and empirical data. We propose several avenues for further inquiry, including (1) agents with adaptive capabilities more reflective of real world adversaries, (2) the presence of multiple, heterogeneous adversaries, (3) computational game theory-based approaches such as coevolution to allow scaling to the real world beyond the limitations of analytical analysis and classical game theory, (4) mapping the game to real-world scenarios, (5) taking player risk into account when designing a strategy (in addition to expected payoff), (6) improving our understanding of the dynamic nature of MTD-inspired games by using a martingale representation, defensive forecasting, and techniques from signal processing, and (7) using adversarial games to develop inherently resilient cyber systems.

More Details

Three-dimensional fully-coupled electrical and thermal transport model of dynamic switching in oxide memristors

ECS Transactions (Online)

Gao, Xujiao G.; Mamaluy, Denis M.; Mickel, Patrick R.; Marinella, Matthew J.

In this paper, we present a fully-coupled electrical and thermal transport model for oxide memristors that solves simultaneously the time-dependent continuity equations for all relevant carriers, together with the time-dependent heat equation including Joule heating sources. The model captures all the important processes that drive memristive switching and is applicable to simulating switching behavior in a wide range of oxide memristors. The model is applied to simulate ON switching in a 3D filamentary TaOx memristor. Simulation results show that, for uniform vacancy density in the OFF state, vacancies fill the conduction filament until saturation and then fill a gap formed in the Ta electrode during ON switching; furthermore, the ON-switching time strongly depends on applied voltage, and the ON-to-OFF current ratio is sensitive to the filament vacancy density in the OFF state.
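
The governing equations are not given in the abstract; schematically, a fully coupled electro-thermal model of this kind pairs carrier continuity equations with a heat equation driven by Joule heating (our notation, not necessarily the paper's exact formulation):

```latex
\frac{\partial n_i}{\partial t} + \nabla \cdot \mathbf{J}_i = G_i - R_i,
\qquad
\rho c_p \frac{\partial T}{\partial t} = \nabla \cdot (\kappa \nabla T)
  + \mathbf{J} \cdot \mathbf{E},
```

where the n_i are the densities of the relevant carriers (e.g., oxygen vacancies), J_i their fluxes, and the J·E term is the Joule heating source that couples the electrical and thermal systems.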

More Details

PANTHER. Pattern ANalytics To support High-performance Exploitation and Reasoning

Czuchlewski, Kristina R.; Hart, William E.

Sandia has approached the analysis of big datasets with an integrated methodology that uses computer science, image processing, and human factors to exploit critical patterns and relationships in large datasets despite the variety and rapidity of information. The work is part of a three-year LDRD Grand Challenge called PANTHER (Pattern ANalytics To support High-performance Exploitation and Reasoning). To maximize data analysis capability, Sandia pursued scientific advances across three key technical domains: (1) geospatial-temporal feature extraction via image segmentation and classification; (2) geospatial-temporal analysis capabilities tailored to identify and process new signatures more efficiently; and (3) domain-relevant models of human perception and cognition informing the design of analytic systems. Our integrated results include advances in geographical information systems (GIS) in which we discover activity patterns in noisy, spatial-temporal datasets using geospatial-temporal semantic graphs. We employed computational geometry and machine learning to allow us to extract and predict spatial-temporal patterns and outliers from large aircraft and maritime trajectory datasets. We automatically extracted static and ephemeral features from real, noisy synthetic aperture radar imagery for ingestion into a geospatial-temporal semantic graph. We worked with analysts and investigated analytic workflows to (1) determine how experiential knowledge evolves and is deployed in high-demand, high-throughput visual search workflows, and (2) better understand visual search performance and attention. Through PANTHER, Sandia's fundamental rethinking of key aspects of geospatial data analysis permits the extraction of much richer information from large amounts of data. The project results enable analysts to examine mountains of historical and current data that would otherwise go untouched, while also gaining meaningful, measurable, and defensible insights into overlooked relationships and patterns. The capability is directly relevant to the nation's nonproliferation remote-sensing activities and has broad national security applications for military and intelligence-gathering organizations.

More Details

ASC Trilab L2 Codesign Milestone 2015

Trott, Christian R.; Hammond, Simon D.; Dinge, Dennis D.; Lin, Paul L.; Vaughan, Courtenay T.; Cook, Jeanine C.; Rajan, Mahesh R.; Edwards, Harold C.; Hoekstra, Robert J.

For the FY15 ASC L2 Trilab Codesign milestone, Sandia National Laboratories performed two main studies. The first study investigated three topics (performance, cross-platform portability, and programmer productivity) when using OpenMP directives and the RAJA and Kokkos programming models, available from LLNL and SNL respectively. The focus of this first study was the LULESH mini-application developed and maintained by LLNL. In the coming sections of the report the reader will find performance comparisons (and a demonstration of portability) for a variety of mini-application implementations produced during this study with varying levels of optimization. Of note is that the implementations utilized optimizations across a number of programming models, to help ensure that claims that Kokkos can provide native-class application performance are valid. The second study performed during FY15 is a performance assessment of the MiniAero mini-application developed by Sandia. This mini-application was developed by the SIERRA Thermal-Fluid team at Sandia for the purpose of learning the Kokkos programming model, and so is available in only a single implementation. For this report we studied its performance and scaling on a number of machines, with the intent of providing insight into potential performance issues that may be experienced when similar algorithms are deployed on the forthcoming Trinity ASC ATS platform.

More Details

Predicting growth of graphene nanostructures using high-fidelity atomistic simulations

Bartelt, Norman C.; Mccarty, Kevin F.; Foster, Michael E.; Schultz, Peter A.; Zhou, Xiaowang Z.; Ward, Donald K.

In this project we developed the atomistic models needed to predict how graphene grows when carbon is deposited on metal and semiconductor surfaces. We first calculated energies of many carbon configurations using first-principles electronic structure calculations and then used these energies to construct an empirical bond-order potential that enables comprehensive molecular dynamics simulation of growth. We validated our approach by comparing our predictions to experiments of graphene growth on Ir, Cu, and Ge. The robustness of our understanding of graphene growth will enable high-quality graphene to be grown on novel substrates, which will expand the number of potential types of graphene electronic devices.

More Details

Efficient Probability of Failure Calculations for QMU using Computational Geometry LDRD 13-0144 Final Report

Mitchell, Scott A.; Ebeida, Mohamed S.; Romero, Vicente J.; Swiler, Laura P.; Rushdi, Ahmad A.; Abdelkader, Ahmad

This SAND report summarizes our work on the Sandia National Laboratory LDRD project titled "Efficient Probability of Failure Calculations for QMU using Computational Geometry" which was project #165617 and proposal #13-0144. This report merely summarizes our work. Those interested in the technical details are encouraged to read the full published results, and contact the report authors for the status of the software and follow-on projects.

More Details

Strong Local-Nonlocal Coupling for Integrated Fracture Modeling

Littlewood, David J.; Silling, Stewart A.; Mitchell, John A.; Seleson, Pablo D.; Bond, Stephen D.; Parks, Michael L.; Turner, Daniel Z.; Burnett, Damon J.; Ostien, Jakob O.; Gunzburger, Max

Peridynamics, a nonlocal extension of continuum mechanics, is unique in its ability to capture pervasive material failure. Its use in the majority of system-level analyses carried out at Sandia, however, is severely limited, due in large part to computational expense and the challenge posed by the imposition of nonlocal boundary conditions. Combined analyses in which peridynamics is employed only in regions susceptible to material failure are therefore highly desirable, yet available coupling strategies have remained severely limited. This report is a summary of the Laboratory Directed Research and Development (LDRD) project "Strong Local-Nonlocal Coupling for Integrated Fracture Modeling," completed within the Computing and Information Sciences (CIS) Investment Area at Sandia National Laboratories. A number of challenges inherent to coupling local and nonlocal models are addressed. A primary result is the extension of peridynamics to facilitate a variable nonlocal length scale. This approach, termed the peridynamic partial stress, can greatly reduce the mathematical incompatibility between local and nonlocal equations through reduction of the peridynamic horizon in the vicinity of a model interface. A second result is the formulation of a blending-based coupling approach that may be applied either as the primary coupling strategy, or in combination with the peridynamic partial stress. This blending-based approach is distinct from general blending methods, such as the Arlequin approach, in that it is specific to the coupling of peridynamics and classical continuum mechanics. Facilitating the coupling of peridynamics and classical continuum mechanics has also required innovations aimed directly at peridynamic models. Specifically, the properties of peridynamic constitutive models near domain boundaries and shortcomings in available discretization strategies have been addressed. The results are a class of position-aware peridynamic constitutive laws for dramatically improved consistency at domain boundaries, and an enhancement to the meshfree discretization applied to peridynamic models that removes irregularities at the limit of the nonlocal length scale and dramatically improves convergence behavior. Finally, a novel approach for modeling ductile failure has been developed, motivated by the desire to apply coupled local-nonlocal models to a wide variety of materials, including ductile metals, which have received minimal attention in the peridynamic literature. Software implementation of the partial-stress coupling strategy, the position-aware peridynamic constitutive models, and the strategies for improving the convergence behavior of peridynamic models was completed within the Peridigm and Albany codes, developed at Sandia National Laboratories and made publicly available under the open-source 3-clause BSD license.

More Details

Towards Accurate Application Characterization for Exascale (APEX)

Hammond, Simon D.

Sandia National Laboratories has been engaged in hardware and software codesign activities for a number of years; indeed, it might be argued that prototyping of clusters as far back as the CPLANT machines, and many large capability resources including ASCI Red and RedStorm, were examples of codesigned solutions. As the research supporting our codesign activities has moved closer to investigating on-node runtime behavior, a natural hunger has grown for detailed analysis of both hardware and algorithm performance from the perspective of low-level operations. The Application Characterization for Exascale (APEX) LDRD was a project conceived to address some of these concerns. Primarily, the research was intended to focus on generating accurate and reproducible low-level performance metrics using tools that could scale to production-class code bases. Alongside this research was an advocacy and analysis role associated with evaluating tools for production use, working with leading industry vendors to develop and refine solutions required by our code teams, and directly engaging with production code developers to form a context for the application analysis and a bridge to the research community within Sandia. On each of these accounts significant progress has been made, particularly, as this report will cover, in the low-level analysis of operations for important classes of algorithms. This report summarizes the development of a collection of tools under the APEX research program and leaves to other SAND and L2 milestone reports the description of codesign progress with Sandia's production users and developers.

More Details

Preliminary Results on Uncertainty Quantification for Pattern Analytics

Stracuzzi, David J.; Brost, Randolph B.; Chen, Maximillian G.; Malinas, Rebecca; Peterson, Matthew G.; Phillips, Cynthia A.; Robinson, David G.; Woodbridge, Diane W.

This report summarizes preliminary research into uncertainty quantification for pattern analytics within the context of the Pattern Analytics to Support High-Performance Exploitation and Reasoning (PANTHER) project. The primary focus of PANTHER was to make large quantities of remote sensing data searchable by analysts. The work described in this report adds nuance to both the initial data preparation steps and the search process. Search queries are transformed from "does the specified pattern exist in the data?" to "how certain is the system that the returned results match the query?" We show example results for both data processing and search, and discuss a number of possible improvements for each.

More Details

PANTHER. Trajectory Analysis

Laros, James H.; Wilson, Andrew T.; Valicka, Christopher G.; Kegelmeyer, William P.; Shead, Timothy M.; Czuchlewski, Kristina R.; Newton, Benjamin D.

We want to organize a body of trajectories in order to identify, search for, classify, and predict behavior among objects such as aircraft and ships. Existing comparison functions such as the Fréchet distance are computationally expensive and yield counterintuitive results in some cases. We propose an approach using feature vectors whose components represent succinctly the salient information in trajectories. These features incorporate basic information such as total distance traveled and distance between start/stop points, as well as geometric features related to the properties of the convex hull, trajectory curvature, and general distance geometry. Additionally, these features can generally be mapped easily to behaviors of interest to humans searching large databases. Most of these geometric features are invariant under rigid transformation. We demonstrate the use of different subsets of these features to identify trajectories similar to an exemplar, cluster a database of several hundred thousand trajectories, predict destination, and apply unsupervised machine learning algorithms.
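
As a toy illustration of the feature-vector idea (our own minimal features, not the report's full set), the simplest components can be computed directly from the polyline:

```python
import numpy as np

def trajectory_features(path):
    """Basic rigid-transform-invariant features for an (n, 2) path:
    total distance traveled, start-to-end distance, and their ratio
    (a crude straightness/curvature proxy)."""
    steps = np.linalg.norm(np.diff(path, axis=0), axis=1)
    total = steps.sum()
    endpoint = np.linalg.norm(path[-1] - path[0])
    return np.array([total, endpoint, endpoint / total if total else 1.0])
```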

More Details

ASC ATDM Level 2 Milestone #5325: Asynchronous Many-Task Runtime System Analysis and Assessment for Next Generation Platforms

Baker, Gavin M.; Bettencourt, Matthew T.; Bova, S.W.; Franko, Ken; Gamell, Marc; Grant, Ryan E.; Hammond, Simon D.; Hollman, David S.; Knight, Samuel K.; Kolla, Hemanth K.; Lin, Paul L.; Olivier, Stephen L.; Sjaardema, Gregory D.; Slattengren, Nicole L.; Teranishi, Keita T.; Wilke, Jeremiah J.; Bennett, Janine C.; Clay, Robert L.; Kale, Laxmikant; Jain, Nikhil; Mikida, Eric; Aiken, Alex; Bauer, Michael; Lee, Wonchan; Slaughter, Elliott; Treichler, Sean; Berzins, Martin; Harman, Todd; Humphreys, Alan; Schmidt, John; Sunderland, Dan; Mccormick, Pat; Gutierrez, Samuel; Schulz, Martin; Gamblin, Todd; Bremer, Peer T.

This report provides in-depth information and analysis to help create a technical road map for developing next-generation programming models and runtime systems that support Advanced Simulation and Computing (ASC) workload requirements. The focus herein is on asynchronous many-task (AMT) models and runtime systems, which are of great interest in the context of exascale computing, as they hold the promise to address key issues associated with future extreme-scale computer architectures. This report includes a thorough qualitative and quantitative examination of three best-of-class AMT runtime systems – Charm++, Legion, and Uintah – all of which are in use as part of the Predictive Science Academic Alliance Program II (PSAAP-II) Centers. The studies focus on each runtime's programmability, performance, and mutability. Through the experiments and analysis presented, several overarching findings emerge. From a performance perspective, AMT runtimes show tremendous potential for addressing extreme-scale challenges. Empirical studies show an AMT runtime can mitigate performance heterogeneity inherent to the machine itself, and that Message Passing Interface (MPI) and AMT runtimes perform comparably under balanced conditions. From a programmability and mutability perspective, however, none of the runtimes in this study are currently ready for use in developing production-ready Sandia ASC applications. The report concludes by recommending a co-design path forward, wherein application, programming model, and runtime system developers work together to define requirements and solutions. Such a requirements-driven co-design approach benefits the community as a whole, with widespread community engagement mitigating risk for both application developers and high-performance computing runtime system developers.

More Details

PDE Constrained Optimization for Digital Image Correlation

Turner, Daniel Z.; Lehoucq, Richard B.; Garavito-Garzon, Carlos A.

The purpose of this report is to investigate a partial differential equation (PDE) constrained optimization approach for estimating the velocity field given image data for use within digital image correlation (DIC). We first introduce the problem and the standard DIC approach and then demonstrate why the DIC problem is ill-posed and introduce a standard regularization of the problem. We also demonstrate that the functional used is sensitive and robust via a sequence of experiments given by a stochastic model inducing the PDE constraint.
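
In schematic form (our notation; the report's exact formulation may differ), the PDE-constrained estimate of the velocity field v solves

```latex
\min_{v} \;\; \tfrac{1}{2} \, \| I(\cdot, T) - I_1 \|_{L^2}^2
  + \tfrac{\alpha}{2} \, \| \nabla v \|_{L^2}^2
\quad \text{subject to} \quad
\partial_t I + v \cdot \nabla I = 0, \;\; I(\cdot, 0) = I_0,
```

where I_0 and I_1 are the reference and deformed images and the α-term is the regularization introduced to address the ill-posedness noted above.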

More Details

Sensitivity Analysis of OECD Benchmark Tests in BISON

Swiler, Laura P.; Gamble, Kyle; Schmidt, Rodney C.; Williamson, Richard

This report summarizes a NEAMS (Nuclear Energy Advanced Modeling and Simulation) project focused on sensitivity analysis of a fuels performance benchmark problem. The benchmark problem was defined by the Uncertainty Analysis in Modeling working group of the Nuclear Science Committee, part of the Nuclear Energy Agency of the Organization for Economic Cooperation and Development (OECD). The benchmark problem involved steady-state behavior of a fuel pin in a Pressurized Water Reactor (PWR). The problem was created in the BISON Fuels Performance code. Dakota was used to generate and analyze 300 samples of 17 input parameters defining core boundary conditions, manufacturing tolerances, and fuel properties. There were 24 responses of interest, including fuel centerline temperatures at a variety of locations and burnup levels, fission gas released, axial elongation of the fuel pin, etc. Pearson and Spearman correlation coefficients and Sobol' variance-based indices were used to perform the sensitivity analysis. This report summarizes the process and presents results from this study.
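
For reference, the first-order Sobol' index used in such variance-based studies measures the fraction of output variance attributable to input X_i alone:

```latex
S_i = \frac{\operatorname{Var}\big( \mathbb{E}[\, Y \mid X_i \,] \big)}{\operatorname{Var}(Y)} .
```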

More Details

High Performance Computing - Power Application Programming Interface Specification

Laros, James H.; Kelly, Suzanne M.; Grant, Ryan E.; Olivier, Stephen L.; Levenhagen, Michael J.; DeBonis, David D.

Achieving practical exascale supercomputing will require massive increases in energy efficiency. The bulk of this improvement will likely be derived from hardware advances such as improved semiconductor device technologies and tighter integration, hopefully resulting in more energy efficient computer architectures. Still, software will have an important role to play. With every generation of new hardware, more power measurement and control capabilities are exposed. Many of these features require software involvement to maximize feature benefits. This trend will allow algorithm designers to add power and energy efficiency to their optimization criteria. Similarly, at the system level, opportunities now exist for energy-aware scheduling to meet external utility constraints such as time of day cost charging and power ramp rate limitations. Finally, future architectures might not be able to operate all components at full capability for a range of reasons including temperature considerations or power delivery limitations. Software will need to make appropriate choices about how to allocate the available power budget given many, sometimes conflicting considerations.

More Details

The Promise of Quantum Simulation

ACS Nano

Muller, Richard P.; Blume-Kohout, Robin J.

Quantum simulations promise to be one of the primary applications of quantum computers, should one be constructed. This article briefly summarizes the history of quantum simulation in light of the recent result of Wang and co-workers, demonstrating calculation of the ground and excited states for a HeH+ molecule, and concludes with a discussion of why this and other recent progress in the field suggest that quantum simulations of quantum chemistry have a bright future.

More Details

A new class of finite element variational multiscale turbulence models for incompressible magnetohydrodynamics

Journal of Computational Physics

Sondak, D.; Shadid, John N.; Oberai, A.A.; Pawlowski, Roger P.; Cyr, Eric C.; Smith, Thomas M.

New large eddy simulation (LES) turbulence models for incompressible magnetohydrodynamics (MHD) derived from the variational multiscale (VMS) formulation for finite element simulations are introduced. The new models include the variational multiscale formulation, a residual-based eddy viscosity model, and a mixed model that combines both of these component models. Each model contains terms that are proportional to the residual of the incompressible MHD equations and is therefore numerically consistent. Moreover, each model is also dynamic, in that its effect vanishes when this residual is small. The new models are tested on the decaying MHD Taylor Green vortex at low and high Reynolds numbers. The evaluation of the models is based on comparisons with available data from direct numerical simulations (DNS) of the time evolution of energies as well as energy spectra at various discrete times. A numerical study, on a sequence of meshes, is presented that demonstrates that the large eddy simulation approaches the DNS solution for these quantities with spatial mesh refinement.
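
In broad strokes (standard residual-based VMS, not necessarily the paper's specific closure), each field is decomposed into resolved and unresolved scales, and the unresolved scales are modeled in terms of the resolved-scale residual:

```latex
\mathbf{u} = \bar{\mathbf{u}} + \mathbf{u}', \qquad
\mathbf{u}' \approx -\tau \, \mathcal{R}(\bar{\mathbf{u}}),
```

so the model contribution vanishes as the residual R(ū) goes to zero, which is the numerical consistency and dynamic behavior described above.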

More Details

Optimal adiabatic scaling and the processor-in-memory-and-storage architecture (OAS+PIMS)

Proceedings of the 2015 IEEE/ACM International Symposium on Nanoscale Architectures, NANOARCH 2015

DeBenedictis, Erik; Cook, Jeanine C.; Hoemmen, Mark F.; Metodi, Tzvetan S.

We discuss a new approach to computing that retains the possibility of exponential growth while making substantial use of the existing technology. The exponential improvement path of Moore's Law has been the driver behind the computing approach of Turing, von Neumann, and FORTRAN-like languages. Performance growth is slowing at the system level, even though further exponential growth should be possible. We propose two technology shifts as a remedy, the first being the formulation of a scaling rule for scaling into the third dimension. This involves use of circuit-level energy efficiency increases using adiabatic circuits to avoid overheating. However, this scaling rule is incompatible with the von Neumann architecture. The second technology shift is a computer architecture and programming change to an extremely aggressive form of Processor-In-Memory (PIM) architecture, which we call Processor-In-Memory-and-Storage (PIMS). Theoretical analysis shows that the PIMS architecture is compatible with the 3D scaling rule, suggesting both immediate benefit and a long-term improvement path.

More Details

Modeling of Arctic Storms with a Variable High-Resolution General Circulation Model

Roesler, Erika L.; Bosler, Peter A.; Taylor, Mark A.

The Department of Energy’s (DOE) Biological and Environmental Research project, “Water Cycle and Climate Extremes Modeling,” is improving our understanding and modeling of regional details of the Earth’s water cycle. Sandia is using high-resolution model behavior to investigate storms in the Arctic.

More Details

Assessing a mini-application as a performance proxy for a finite element method engineering application

Concurrency and Computation: Practice and Experience

Lin, Paul L.; Heroux, Michael A.; Williams, Alan B.; Barrett, Richard F.

The performance of a large-scale, production-quality science and engineering application (‘app’) is often dominated by a small subset of the code. Even within that subset, computational and data access patterns are often repeated, so that an even smaller portion can represent the performance-impacting features. If application developers, parallel computing experts, and computer architects can together identify this representative subset and then develop a small mini-application (‘miniapp’) that captures these primary performance characteristics, then the miniapp can be used both to improve the performance of the app and to provide a tool for co-design for the high-performance computing community. However, a critical question is whether a miniapp can effectively capture key performance behavior of an app. This study provides a comparison of an implicit finite element semiconductor device modeling app on unstructured meshes with an implicit finite element miniapp on unstructured meshes. The goal is to assess whether the miniapp is predictive of the performance of the app. Single-compute-node performance is compared, as well as scaling up to 16,000 cores. Results indicate that the miniapp can be reasonably predictive of the performance characteristics of the app for a single iteration of the solver on a single compute node.

More Details

Phase Detection with Hidden Markov Models for DVFS on Many-Core Processors

Proceedings - International Conference on Distributed Computing Systems

Booth, Joshua D.; Kotra, Jagadish; Zhao, Hui; Kandemir, Mahmut; Raghavan, Padma

The energy concerns of many-core processors are increasing with the number of cores. We provide a new method that reduces the energy consumption of an application on many-core processors by identifying unique segments to which to apply dynamic voltage and frequency scaling (DVFS). Our method, phase-based voltage and frequency scaling (PVFS), hinges on the identification of phases, i.e., segments of code with unique performance and power attributes, using hidden Markov models. In particular, we demonstrate the use of this method to target hardware components on many-core processors such as the Network-on-Chip (NoC). PVFS uses these phases to construct a static power schedule that uses DVFS to reduce energy with minimal performance penalty. This general scheme can be used with a variety of performance and power metrics to match the needs of the system and application. More importantly, the flexibility of the general scheme allows for targeting of the unique hardware components of future many-core processors. We provide an in-depth analysis of PVFS applied to five threaded benchmark applications, and demonstrate the advantage of using PVFS for 4 to 32 cores in a single socket. Empirical results of PVFS show a reduction of up to 10.1% of total energy while impacting total time by at most 2.7% across all core counts. Furthermore, PVFS outperforms standard coarse-grain time-driven DVFS, while scaling better in terms of energy savings with increasing core counts.
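
The paper's HMM formulation is not detailed in the abstract; as a toy stand-in, Viterbi decoding over a stream of discretized performance-counter readings recovers a most-likely phase sequence (all parameters here are hypothetical):

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely phase sequence for observation symbols `obs`,
    given initial probs pi (k,), phase transition matrix A (k, k),
    and emission probs B (k, m)."""
    k, n = len(pi), len(obs)
    logp = np.log(pi) + np.log(B[:, obs[0]])
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        cand = logp[:, None] + np.log(A)   # cand[i, j]: phase i -> j
        back[t] = cand.argmax(axis=0)
        logp = cand.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logp.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]  # one phase label per observation
```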

More Details

High-Performance Graph Analytics on Manycore Processors

Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015

Slota, George M.; Rajamanickam, Sivasankaran R.; Madduri, Kamesh

The divergence in the computer architecture landscape has resulted in different architectures being considered mainstream at the same time. For application and algorithm developers, a dilemma arises when one must focus on using underlying architectural features to extract the best performance on each of these architectures, while writing portable code at the same time. We focus on this problem with graph analytics as our target application domain. In this paper, we present an abstraction-based methodology for performance-portable graph algorithm design on manycore architectures. We demonstrate our approach by systematically optimizing algorithms for the problems of breadth-first search, color propagation, and strongly connected components. We use Kokkos, a manycore library and programming model, for prototyping our algorithms. Our portable implementation of the strongly connected components algorithm on the NVIDIA Tesla K40M is up to 3.25× faster than a state-of-the-art parallel CPU implementation on a dual-socket Sandy Bridge compute node.
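
The Kokkos implementations themselves are not reproduced here; as a minimal illustration of one of the three kernels, color (label) propagation reduces to repeated min-label sweeps over the edge list, a pattern whose independent per-edge work is what makes it amenable to manycore parallelism (sequential Python sketch):

```python
def color_propagation(num_vertices, edges):
    """Iteratively assign each endpoint the smaller of the two labels
    until a fixed point; converges to one label per connected
    component of the undirected graph."""
    label = list(range(num_vertices))
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            lo = min(label[u], label[v])
            if label[u] != lo or label[v] != lo:
                label[u] = label[v] = lo
                changed = True
    return label
```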

More Details

Cooperative Computing for Autonomous Data Centers

Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015

Berry, Jonathan W.; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared; Smith, Randy

We present a new distributed model for graph computations motivated by limited information sharing. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a polylogarithmic-size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We have results for both models for s-t connectivity, one of the simplest graph problems that requires global information in the worst case. In the limited-sharing model, our results exploit social network structure. Standard communication complexity gives polynomial lower bounds on s-t connectivity for general graphs. However, if the graph for each data center has a giant component and these giant components intersect, then we can overcome this lower bound, computing s-t connectivity while exchanging O(log² n) bits for a constant number of data centers. We can also test the assumption that the giant components overlap using O(log² n) bits, provided the (unknown) overlap is sufficiently large. The second result is in the low-trust model. We give a secure multi-party computation (MPC) algorithm that 1) does not make cryptographic assumptions when there are 3 or more entities, and 2) is efficient, especially when compared to the usual garbled circuit approach. The entities learn only the yes/no answer. No party learns anything about the others' graphs, not even node names. This algorithm does not require any special graph structure. This secure MPC result for s-t connectivity is one of the first that involves a few parties computing on large inputs, instead of many parties computing on a few local values.

More Details

Modeling Mathematical Programs with Equilibrium Constraints in Pyomo

Hart, William E.; Siirola, John D.

We describe new capabilities for modeling MPEC problems within the Pyomo modeling software. These capabilities include new modeling components that represent complementarity conditions, modeling transformations for re-expressing models with complementarity conditions in other forms, and meta-solvers that apply transformations and numeric optimization solvers to optimize MPEC problems. We illustrate the breadth of Pyomo's modeling capabilities for MPEC problems, and we describe how Pyomo's meta-solvers can perform local and global optimization of MPEC problems.
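
A minimal sketch of these capabilities (we believe `pyomo.mpec` provides the `Complementarity` component, the `complements` expression, and the `mpec.simple_nonlinear` transformation; treat the details as indicative rather than verbatim from the report):

```python
import pyomo.environ as pyo
from pyomo.mpec import Complementarity, complements

model = pyo.ConcreteModel()
model.x = pyo.Var()
model.y = pyo.Var()
model.obj = pyo.Objective(expr=(model.x - 1) ** 2 + (model.y - 1) ** 2)

# A complementarity condition: y >= 0 is complementary to y - x >= 0
model.compl = Complementarity(expr=complements(model.y >= 0,
                                               model.y - model.x >= 0))

# One of the transformations described: re-express the complementarity
# condition as standard nonlinear constraints, after which an ordinary
# NLP solver (or one of the MPEC meta-solvers) can be applied.
pyo.TransformationFactory('mpec.simple_nonlinear').apply_to(model)
```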

More Details