Publications

Results 8376–8400 of 9,998

Search results

Increasing fault resiliency in a message-passing environment

Ferreira, Kurt; Oldfield, Ron; Stearley, Jon S.; Laros, James H.; Pedretti, Kevin T.T.; Brightwell, Ronald B.

Petaflops systems will have tens to hundreds of thousands of compute nodes, which increases the likelihood of faults. Applications use checkpoint/restart to recover from these faults, but even under ideal conditions, applications running on more than 30,000 nodes will likely spend more than half of their total run time saving checkpoints, restarting, and redoing work that was lost. We created a library that performs redundant computations on additional nodes allocated to the application. An active node and its redundant partner form a node bundle, which will only fail, and cause an application restart, when both nodes in the bundle fail. The goal of this library is to learn whether this can be done entirely at the user level, what requirements this library places on a Reliability, Availability, and Serviceability (RAS) system, and what its impact on performance and run time is. We find that our redundant MPI layer library imposes a relatively modest performance penalty for applications, but that it greatly reduces the number of application interrupts. This reduction in interrupts leads to huge savings in restart and rework time. For large-scale applications the savings compensate for the performance loss and the additional nodes required for redundant computations.
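
The node-bundle idea in this abstract can be illustrated with a rough back-of-the-envelope sketch. It is not the authors' model: it assumes node failures are independent, that a bundle fails only if both of its nodes fail within the same interval, and that failed partners are not replaced. The function name and the numbers used (30,000 active nodes, a per-node failure probability of 1e-4) are hypothetical.

```python
def interrupt_probability(n_active: int, p_node: float, redundant: bool) -> float:
    """Probability that at least one unit fails during one interval.

    A "unit" is a single node without redundancy, or a two-node bundle
    with redundancy. Assumes independent failures (illustrative only).
    """
    # A bundle fails only when both of its nodes fail in the interval.
    p_unit = p_node ** 2 if redundant else p_node
    # The application is interrupted if any one of its units fails.
    return 1.0 - (1.0 - p_unit) ** n_active


if __name__ == "__main__":
    n, p = 30_000, 1e-4  # hypothetical node count and failure probability
    print("no redundancy:  ", interrupt_probability(n, p, redundant=False))
    print("with redundancy:", interrupt_probability(n, p, redundant=True))
```

Under these assumptions, squaring the per-unit failure probability is what drives down the interrupt rate, at the cost of doubling the node count.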

Graph algorithms in the titan toolkit

Mclendon, William

Graph algorithms are a key component in a wide variety of intelligence analysis activities. The Graph-Based Informatics for Non-Proliferation and Counter-Terrorism project addresses the critical need of making these graph algorithms accessible to Sandia analysts in a manner that is both intuitive and effective. Specifically we describe the design and implementation of an open source toolkit for doing graph analysis, informatics, and visualization that provides Sandia with novel analysis capability for non-proliferation and counter-terrorism.

Final report LDRD project 105816 : model reduction of large dynamic systems with localized nonlinearities

Lehoucq, Rich; Dohrmann, Clark R.; Segalman, Daniel J.

Advanced computing hardware and software written to exploit massively parallel architectures greatly facilitate the computation of extremely large problems. On the other hand, these tools, though enabling higher fidelity models, have often resulted in much longer run-times and turn-around-times in providing answers to engineering problems. The impediments include smaller elements and consequently smaller time steps, much larger systems of equations to solve, and the inclusion of nonlinearities that had been ignored in days when lower fidelity models were the norm. The research effort reported focuses on accelerating the analysis process for structural dynamics through combinations of model reduction and mitigation of some factors that lead to over-meshing.

A comparison of Lagrangian/Eulerian approaches for tracking the kinematics of high deformation solid motion

Ames, Thomas L.; Robinson, Allen C.

The modeling of solids is most naturally placed within a Lagrangian framework because it requires constitutive models which depend on knowledge of the original material orientations and subsequent deformations. Detailed kinematic information is needed to ensure material frame indifference which is captured through the deformation gradient F. Such information can be tracked easily in a Lagrangian code. Unfortunately, not all problems can be easily modeled using Lagrangian concepts due to severe distortions in the underlying motion. Either a Lagrangian/Eulerian or a pure Eulerian modeling framework must be introduced. We discuss and contrast several Lagrangian/Eulerian approaches for keeping track of the details of material kinematics.

Investigating Methods of Supporting Dynamically Linked Executables on High Performance Computing Platforms

Laros, James H.; Kelly, Suzanne M.; Levenhagen, Michael; Pedretti, Kevin T.T.

Shared libraries have become ubiquitous and are used to achieve great resource efficiencies on many platforms. The same properties that enable efficiencies on time-shared computers and convenience on small clusters prove to be great obstacles to scalability on large clusters and High Performance Computing platforms. In addition, Light Weight operating systems such as Catamount have historically not supported the use of shared libraries specifically because they hinder scalability. In this report we will outline the methods of supporting shared libraries on High Performance Computing platforms using Light Weight kernels that we investigated. The considerations necessary to evaluate utility in this area are many and sometimes conflicting. While our initial path forward has been determined based on this evaluation, we consider this effort ongoing and remain prepared to re-evaluate any technology that might provide a scalable solution. This report is an evaluation of a range of possible methods of supporting dynamically linked executables on capability class High Performance Computing platforms. Efforts are ongoing and extensive testing at scale is necessary to evaluate performance. While performance is a critical driving factor, supporting whatever method is used in a production environment is an equally important and challenging task.

Improving performance via mini-applications

Doerfler, Douglas W.; Crozier, Paul; Edwards, Harold C.; Williams, Alan B.; Rajan, Mahesh; Keiter, Eric R.; Thornquist, Heidi K.

Application performance is determined by a combination of many choices: hardware platform, runtime environment, languages and compilers used, algorithm choice and implementation, and more. In this complicated environment, we find that the use of mini-applications - small self-contained proxies for real applications - is an excellent approach for rapidly exploring the parameter space of all these choices. Furthermore, use of mini-applications enriches the interaction between application, library and computer system developers by providing explicit functioning software and concrete performance results that lead to detailed, focused discussions of design trade-offs, algorithm choices and runtime performance issues. In this paper we discuss a collection of mini-applications and demonstrate how we use them to analyze and improve application performance on new and future computer platforms.

Efficient algorithms for mixed aleatory-epistemic uncertainty quantification with application to radiation-hardened electronics. Part I, algorithms and benchmark results

Eldred, Michael; Swiler, Laura P.

This report documents the results of an FY09 ASC V&V Methods level 2 milestone demonstrating new algorithmic capabilities for mixed aleatory-epistemic uncertainty quantification. Through the combination of stochastic expansions for computing aleatory statistics and interval optimization for computing epistemic bounds, mixed uncertainty analysis studies are shown to be more accurate and efficient than previously achievable. Part I of the report describes the algorithms and presents benchmark performance results. Part II applies these new algorithms to UQ analysis of radiation effects in electronic devices and circuits for the QASPR program.

Quantitative resilience analysis through control design

Vugrin, Eric; Camphouse, Russell; Sunderland, Daniel

Critical infrastructure resilience has become a national priority for the U.S. Department of Homeland Security. System resilience has been studied for several decades in many different disciplines, but no standards or unifying methods exist for critical infrastructure resilience analysis. Few quantitative resilience methods exist, and those existing approaches tend to be rather simplistic and, hence, not capable of sufficiently assessing all aspects of critical infrastructure resilience. This report documents the results of a late-start Laboratory Directed Research and Development (LDRD) project that investigated the development of quantitative resilience through application of control design methods. Specifically, we conducted a survey of infrastructure models to assess what types of control design might be applicable for critical infrastructure resilience assessment. As a result of this survey, we developed a decision process that directs the resilience analyst to the control method that is most likely applicable to the system under consideration. Furthermore, we developed optimal control strategies for two sets of representative infrastructure systems to demonstrate how control methods could be used to assess the resilience of the systems to catastrophic disruptions. We present recommendations for future work to continue the development of quantitative resilience analysis methods.

Toward improved branch prediction through data mining

Hemmert, Karl S.

Data mining and machine learning techniques can be applied to computer system design to aid in optimizing design decisions, improving system runtime performance. Data mining techniques have been investigated in the context of branch prediction. Specifically, a comparison of traditional branch predictor performance has been made to data mining algorithms. Additionally, the possibility that additional features available within the architectural state might serve to further improve branch prediction has been evaluated. Results show that data mining techniques indicate potential for improved branch prediction, especially when register file contents are included as a feature set.

Scalable analysis tools for sensitivity analysis and UQ (3160) results

Ice, Lisa; Fabian, Nathan; Moreland, Kenneth D.; Bennett, Janine C.; Thompson, David; Karelitz, David B.

The 9/30/2009 ASC Level 2 Scalable Analysis Tools for Sensitivity Analysis and UQ (Milestone 3160) contains feature recognition capability required by the user community for certain verification and validation tasks focused around sensitivity analysis and uncertainty quantification (UQ). These feature recognition capabilities include crater detection, characterization, and analysis from CTH simulation data; the ability to call fragment and crater identification code from within a CTH simulation; and the ability to output fragments in a geometric format that includes data values over the fragments. The feature recognition capabilities were tested extensively on sample and actual simulations. In addition, a number of stretch criteria were met including the ability to visualize CTH tracer particles and the ability to visualize output from within an S3D simulation.

A fully implicit method for 3D quasi-steady state magnetic advection-diffusion

Siefert, Christopher; Robinson, Allen C.

We describe the implementation of a prototype fully implicit method for solving three-dimensional quasi-steady state magnetic advection-diffusion problems. This method allows us to solve the magnetic advection-diffusion equations in an Eulerian frame with a fixed, user-prescribed velocity field. We have verified the correctness of the method and implementation on two standard verification problems, the Solberg-White magnetic shear problem and the Perry-Jones-White rotating cylinder problem.

Highly scalable linear solvers on thousands of processors

Siefert, Christopher; Tuminaro, Raymond S.; Domino, Stefan P.; Robinson, Allen C.

In this report we summarize research into new parallel algebraic multigrid (AMG) methods. We first provide an introduction to parallel AMG. We then discuss our research in parallel AMG algorithms for very large scale platforms. We detail significant improvements in the AMG setup phase to a matrix-matrix multiplication kernel. We present a smoothed aggregation AMG algorithm with fewer communication synchronization points, and discuss its links to domain decomposition methods. Finally, we discuss a multigrid smoothing technique that utilizes two message passing layers for use on multicore processors.

Neural assembly models derived through nano-scale measurements

Fan, Hongyou; Forsythe, James C.; Branda, Catherine; Warrender, Christina E.; Schiek, Richard

This report summarizes accomplishments of a three-year project focused on developing technical capabilities for measuring and modeling neuronal processes at the nanoscale. It was successfully demonstrated that nanoprobes could be engineered that were biocompatible, could be biofunctionalized, and responded within the range of voltages typically associated with a neuronal action potential. Furthermore, the Xyce parallel circuit simulator was employed and models incorporated for simulating the ion channel and cable properties of neuronal membranes. The ultimate objective of the project had been to employ nanoprobes in vivo, with the nematode C. elegans, and derive a simulation based on the resulting data. Techniques were developed allowing the nanoprobes to be injected into the nematode and the neuronal response recorded. To the authors' knowledge, this is the first occasion in which nanoparticles have been successfully employed as probes for recording neuronal response in an in vivo animal experimental protocol.

Building more powerful less expensive supercomputers using Processing-In-Memory (PIM) LDRD final report

Murphy, Richard C.

This report details the accomplishments of the 'Building More Powerful Less Expensive Supercomputers Using Processing-In-Memory (PIM)' LDRD ('PIM LDRD', number 105809) for FY07-FY09. Latency dominates all levels of supercomputer design. Within a node, increasing memory latency, relative to processor cycle time, limits CPU performance. Between nodes, the same increase in relative latency impacts scalability. Processing-In-Memory (PIM) is an architecture that directly addresses this problem using enhanced chip fabrication technology and machine organization. PIMs combine high-speed logic and dense, low-latency, high-bandwidth DRAM, and lightweight threads that tolerate latency by performing useful work during memory transactions. This work examines the potential of PIM-based architectures to support mission critical Sandia applications and an emerging class of more data intensive informatics applications. This work has resulted in a stronger architecture/implementation collaboration between 1400 and 1700. Additionally, key technology components have impacted vendor roadmaps, and we are in the process of pursuing these new collaborations. This work has the potential to impact future supercomputer design and construction, reducing power and increasing performance. This final report is organized as follows: this summary chapter discusses the impact of the project (Section 1), provides an enumeration of publications and other public discussion of the work (Section 1), and concludes with a discussion of future work and impact from the project (Section 1). The appendix contains reprints of the refereed publications resulting from this work.

Final Report on LDRD project 130784 : functional brain imaging by tunable multi-spectral Event-Related Optical Signal (EROS)

Hsu, Alan Y.; Speed, Ann E.

Functional brain imaging is of great interest for understanding correlations between specific cognitive processes and underlying neural activity. This understanding can provide the foundation for developing enhanced human-machine interfaces, decision aids, and enhanced cognition at the physiological level. The functional near infrared spectroscopy (fNIRS) based event-related optical signal (EROS) technique can provide direct, high-fidelity measures of temporal and spatial characteristics of neural networks underlying cognitive behavior. However, current EROS systems are hampered by poor signal-to-noise ratio (SNR) and depth of measure, limiting areas of the brain and associated cognitive processes that can be investigated. We propose to investigate a flexible, tunable, multi-spectral fNIRS EROS system which will provide up to 10x greater SNR as well as improved spatial and temporal resolution through significant improvements in electronics, optoelectronics and optics, as well as contribute to the physiological foundation of higher-order cognitive processes and provide the technical foundation for miniaturized portable neuroimaging systems.
