Publications Search

Next-Generation Capabilities for Large-Scale Scientific Visualization

Moreland, Kenneth D.; Fabian, Nathan D.

Abstract not provided.

More Details

TYPE Presentation YEAR 2012

OSTI

PyTrilinos: Recent Advances in the Python Interface to Trilinos

Scientific Programming

Spotz, William S.

PyTrilinos is a set of Python interfaces to compiled Trilinos packages. This collection supports serial and parallel dense linear algebra, serial and parallel sparse linear algebra, direct and iterative linear solution techniques, algebraic and multilevel preconditioners, nonlinear solvers and continuation algorithms, eigensolvers and partitioning algorithms. Also included are a variety of related utility functions and classes, including distributed I/O, coloring algorithms and matrix generation. PyTrilinos vector objects are compatible with the popular NumPy Python package. As a Python front end to compiled libraries, PyTrilinos takes advantage of the flexibility and ease of use of Python, and the efficiency of the underlying C++, C and Fortran numerical kernels. This paper covers recent, previously unpublished advances in the PyTrilinos package.

More Details

TYPE Journal Article YEAR 2012

OSTI DOI

Automating embedded analysis capabilities and managing software complexity in multiphysics simulation, Part I: Template-based generic programming

Scientific Programming

Pawlowski, Roger P.; Phipps, Eric T.; Salinger, Andrew G.

An approach for incorporating embedded simulation and analysis capabilities in complex simulation codes through template-based generic programming is presented. This approach relies on templating and operator overloading within the C++ language to transform a given calculation into one that can compute a variety of additional quantities that are necessary for many state-of-the-art simulation and analysis algorithms. An approach for incorporating these ideas into complex simulation codes through general graph-based assembly is also presented. These ideas have been implemented within a set of packages in the Trilinos framework and are demonstrated on a simple problem from chemical engineering. © 2012 - IOS Press and the authors. All rights reserved.

More Details

TYPE Journal Article YEAR 2012

Scopus OSTI DOI

Peridynamic simulation of damage evolution for structural health monitoring

ASME International Mechanical Engineering Congress and Exposition, Proceedings (IMECE)

Littlewood, David J.; Mish, Kyran D.; Pierson, Kendall H.

Modal-based methods for structural health monitoring require the identification of characteristic frequencies associated with a structure's primary modes of failure. A major difficulty is the extraction of damage-related frequency shifts from the large set of often benign frequency shifts observed experimentally. In this study, we apply peridynamics in combination with modal analysis for the prediction of characteristic frequency shifts throughout the damage evolution process. Peridynamics, a nonlocal extension of continuum mechanics, is unique in its ability to capture progressive material damage. The application of modal analysis to peridynamic models enables the tracking of structural modes and characteristic frequencies over the course of a simulation. Shifts in characteristic frequencies resulting from evolving structural damage can then be isolated and utilized in the analysis of frequency responses observed experimentally. We present a methodology for quasi-static peridynamic analyses, including the solution of the eigenvalue problem for identification of structural modes. Repeated solution of the eigenvalue problem over the course of a transient simulation yields a data set from which critical shifts in modal frequencies can be isolated. The application of peridynamics to modal analysis is demonstrated on the benchmark problem of a simply-supported beam. The computed natural frequencies of an undamaged beam are found to agree well with the classical local solution. Analyses in the presence of cracks of various lengths are shown to reveal frequency shifts associated with structural damage. Copyright © 2012 by ASME.

More Details

TYPE Conference YEAR 2012

OSTI Scopus

Dax toolkit: A proposed framework for data analysis and visualization at extreme scale

1st IEEE Symposium on Large-Scale Data Analysis and Visualization 2011, LDAV 2011 - Proceedings

Moreland, Kenneth D.; Ayachit, Utkarsh; Geveci, Berk; Ma, Kwan L.

Experts agree that the exascale machine will comprise processors that contain many cores, which in turn will necessitate a much higher degree of concurrency. Software will require a minimum of a 1,000 times more concurrency. Most parallel analysis and visualization algorithms today work by partitioning data and running mostly serial algorithms concurrently on each data partition. Although this approach lends itself well to the concurrency of current high-performance computing, it does not exhibit the appropriate pervasive parallelism required for exascale computing. The data partitions are too small and the overhead of the threads is too large to make effective use of all the cores in an extreme-scale machine. This paper introduces a new visualization framework designed to exhibit the pervasive parallelism necessary for extreme scale machines. We demonstrate the use of this system on a GPU processor, which we feel is the best analog to an exascale node that we have available today. © 2011 IEEE.

More Details

TYPE Conference YEAR 2011

Scopus OSTI

First principles predictions of intrinsic defects in aluminum arsenide, AlAs

Materials Research Society Symposium Proceedings

Schultz, Peter A.

The structures, energies, and energy levels of a comprehensive set of simple intrinsic point defects in aluminum arsenide are predicted using density functional theory (DFT). The calculations incorporate explicit and rigorous treatment of charged supercell boundary conditions. The predicted defect energy levels, computed as total energy differences, do not suffer from the DFT band gap problem, spanning the experimental gap despite the Kohn-Sham eigenvalue gap being much smaller than experiment. Defects in AlAs exhibit a surprising complexity - with a greater range of charge states, bistabilities, and multiple negative-U systems - that would be impossible to resolve with experiment alone. The simulation results can be used to populate defect physics models in III-V semiconductor device simulations with reliable quantities in those cases where experimental data is lacking, as in AlAs. © 2011 Materials Research Society.

More Details

TYPE Conference YEAR 2011

OSTI Scopus

Decision making under epistemic uncertainty for a complex mechanical system

Collection of Technical Papers - AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference

Urbina, Angel U.; Swiler, Laura P.

This paper explores various frameworks to quantify and propagate sources of epistemic and aleatoric uncertainty within the context of decision making for assessing system performance relative to design margins of a complex mechanical system. If sufficient data is available for characterizing aleatoric-type uncertainties, probabilistic methods are commonly used for computing response distribution statistics based on input probability distribution specifications. Conversely, for epistemic uncertainties, data is generally too sparse to support objective probabilistic input descriptions, leading to either subjective probabilistic descriptions (e.g., assumed priors in Bayesian analysis) or non-probabilistic methods based on interval specifications. Among the techniques examined in this work are (1) Interval analysis, (2) Dempster-Shafer Theory of Evidence, (3) a second-order probability (SOP) analysis in which the aleatory and epistemic variables are treated separately, and a nested iteration is performed, typically sampling epistemic variables on the outer loop, then sampling over aleatory variables on the inner loop and (4) a Bayesian approach where plausible prior distributions describing the epistemic variable are created and updated using available experimental data. This paper compares the results and the information provided by different methods to enable decision making in the context of performance assessment when epistemic uncertainty is considered.

More Details

TYPE Conference YEAR 2011

Scopus OSTI

An initial comparison of methods for representing and aggregating experimental uncertainties involving sparse data

Collection of Technical Papers - AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference

Romero, Vicente J.; Swiler, Laura P.; Urbina, Angel U.

This paper discusses the handling and treatment of uncertainties corresponding to relatively few data samples in experimental characterization of random quantities. The importance of this topic extends beyond experimental uncertainty to situations where the derived experimental information is used for model validation or calibration. With very sparse data it is not practical to have a goal of accurately estimating the underlying variability distribution (probability density function, PDF). Rather, a pragmatic goal is that the uncertainty representation should be conservative so as to bound a desired percentage of the actual PDF, say 95% included probability, with reasonable reliability. A second, opposing objective is that the representation not be overly conservative; that it minimally over-estimate the random-variable range corresponding to the desired percentage of the actual PDF. The performance of a variety of uncertainty representation techniques is tested and characterized in this paper according to these two opposing objectives. An initial set of test problems and results is presented here from a larger study currently underway.

More Details

TYPE Conference YEAR 2011

OSTI Scopus

A hybrid-hybrid solver for manycore platforms

SC'11 - Proceedings of the 2011 High Performance Computing Networking, Storage and Analysis Companion, Co-located with SC'11

Rajamanickam, Sivasankaran R.; Boman, Erik G.; Heroux, Michael A.

With the increasing levels of parallelism in a compute node, it is important to exploit multiple levels of parallelism even within a single compute node. We present ShyLU (pro- nounced\Shy-loo"for Scalable Hybrid LU), a\hybrid-hybrid" solver for general sparse linear systems that is hybrid in two ways: First, it combines direct and iterative methods. The iterative method is based on approximate Schur com- plements. Second, the solver uses two levels of parallelism via hybrid programming (MPI+threads). Our solver is use- ful both in shared-memory environments and on large par- allel computers with distributed memory (as a subdomain solver). We compare the robustness of ShyLU against other algebraic preconditioners. ShyLU scales well up to 192 cores for a given problem size. We compare at MPI performance of ShyLU against a hybrid implementation. We conclude that on present multicore nodes at MPI is better. However, for future manycore machines (48 or more cores) hybrid/ hi- erarchical algorithms and implementations are important for sustained performance. Copyright is held by the author/owner(s).

More Details

TYPE Conference YEAR 2011

OSTI Scopus

Flexible approximate counting

ACM International Conference Proceeding Series

Mitchell, Scott A.; Day, David M.

Approximate counting [18] is useful for data stream and database summarization. It can help in many settings that allow only one pass over the data, want low memory usage, and can accept some relative error. Approximate counters use fewer bits; we focus on 8-bits but our results are general. These small counters represent a sparse sequence of larger numbers. Counters are incremented probabilistically based on the spacing between the numbers they represent. Our contributions are a customized distribution of counter values and efficient strategies for deciding when to increment them. At run-time, users may independently select the spacing (accuracy) of the approximate counter for small, medium, and large values. We allow the user to select the maximum number to count up to, and our algorithm will select the exponential base of the spacing. These provide additional flexibility over both classic and Csurös's [4] floating-point approximate counting. These provide additional structure, a useful schema for users, over Kruskal and Greenberg [13]. We describe two new and efficient strategies for incrementing approximate counters: use a deterministic countdown or sample from a geometric distribution. In Csurös's all increments are powers of two, so random bits rather than full random numbers can be used. We also provide the option to use powers-of-two but retain flexibility. We show when each strategy is fastest in our implementation. © 2011 ACM.

More Details

TYPE Conference YEAR 2011

OSTI Scopus

On the preservation of total enthalpy in SUPG methods

20th AIAA Computational Fluid Dynamics Conference 2011

Bova, S.W.; Kirk, Benjamin S.

We analyze the artificial dissipation introduced by a streamline-upwind Petrov-Galerkin finite element method and consider its effect on the conservation of total enthalpy for the Euler and laminar Navier-Stokes equations. We also consider the chemically reacting case. We demonstrate that in general, total enthalpy is not conserved for the important special case of the steady-state Euler equations. A modification to the artificial dissipation is proposed and shown to significantly improve the conservation of total enthalpy.

More Details

TYPE Conference YEAR 2011

Scopus OSTI

Scalable stabilized fe formulations for simulating turbulent reacting flows in light water reactors

11AIChE - 2011 AIChE Annual Meeting, Conference Proceedings

Pawlowski, Roger P.; Shadid, John N.; Smith, Tom M.; Cyr, Eric C.

This presentation will discuss progress towards developing a large-scale parallel CFD capability using stabilized finite element formulations to simulate turbulent reacting flow and heat transfer in light water nuclear reactors (LWRs). Numerical simultation plays a critical role in the design, certification, and operation of LWRs. The Consortium for Advanced Simulation of Light Water Reactors is a U. S. Department of Energy Innovation Hub that is developing a virtual reactor toolkit that will incorporate science-based models, state-of-the-art numerical methods, modern computational science and engineering practices, and uncertainty quantification (UQ) and validation against operating pressurized water reactors. It will couple state-of-the-art fuel performance, neutronics, thermal-hydraulics (T-H), and structural models with existing tools for systems and safety analysis and will be designed for implementation on both today's leadership-class computers and next-generation advanced architecture platforms. We will first describe the finite element discretization utilizing PSPG, SUPG, and discontinuity capturing stabilization. We will then discuss our initial turbulence modeling formulations (LES and URANS) and the scalable fully implicit, fully coupled solution methods that are used to solve the challenging systems. These include globalized Newton-Krylov methods for solving the nonlinear systems of equaitons and preconditioned Krylov techniques. The preconditioners are based on fully-coupled algebraic multigrid and approximate block factorization preconditioners. We will discuss how these methods provide a powerful integration path for multiscale coupling to the neutronics and structures applications. Initial results on scalabiltiy will be presented. Finally we will comment on our use of embedded technology and how this capbaility impacts the application of implicit methods, sensitivity analysis and UQ.

More Details

TYPE Conference YEAR 2011

Scopus OSTI

A plausibility-based approach to incremental inference

AAAI Fall Symposium - Technical Report

Stracuzzi, David J.

Inference techniques play a central role in many cognitive systems. They transform low-level observations of the environment into high-level, actionable knowledge which then gets used by mechanisms that drive action, problem-solving, and learning. This paper presents an initial effort at combining results from AI and psychology into a pragmatic and scalable computational reasoning system. Our approach combines a numeric notion of plausibility with first-order logic to produce an incremental inference engine that is guided by heuristics derived from the psychological literature. We illustrate core ideas with detailed examples and discuss the advantages of the approach with respect to cognitive systems.

More Details

TYPE Conference YEAR 2011

OSTI Scopus

Extending scalability of collective IO through nessie and staging

PDSW'11 - Proceedings of the 6th Parallel Data Storage Workshop, Co-located with SC'11

Lofstead, Jay; Oldfield, Ron A.; Kordenbrock, Todd; Reiss, Charles

The increasing fidelity of scientific simulations as they scale towards exascale sizes is straining the proven IO techniques championed throughout terascale computing. Chief among the successful IO techniques is the idea of collective IO where processes coordinate and exchange data prior to writing to storage in an effort to reduce the number of small, independent IO operations. As well as collective IO works for efficiently creating a data set in the canonical order, 3-D domain decompositions prove troublesome due to the amount of data exchanged prior to writing to storage. When each process has a tiny piece of a 3-D simulation space rather than a complete 'pencil' or 'plane', 2-D or 1-D domain decompositions respectively, the communication overhead to rearrange the data can dwarf the time spent actually writing to storage [27]. Our approach seeks to transparently increase scalability and performance while maintaining both the IO routines in the application and the final data format in the storage system. Accomplishing this leverages both the Nessie [23] RPC framework and a staging area with staging services. Through these tools, we employ a variety of data processing operations prior to invoking the native API to write data to storage yielding as much as a 3X performance improvement over the native calls. © 2011 ACM.

More Details

TYPE Conference YEAR 2011

Scopus OSTI

The Anisotropic Elastic-Decohesive Constitutive Model in CICE

Peterson, Kara J.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

A framework for architecture-level power area and thermal simulation and its application to network-on-chip design exploration

SIGMETRICS Performance Evaluation Review

Hsieh, Mingyu H.; Rodrigues, Arun

Abstract not provided.

More Details

TYPE Journal Article YEAR 2011

OSTI

System Implications of Memory Reliability in Exascale Computing

Hsieh, Mingyu H.; Rodrigues, Arun

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

DAKOTA : a multilevel parallel object-oriented framework for design optimization, parameter estimation, uncertainty quantification, and sensitivity analysis

Adams, Brian M.; Bohnhoff, William J.; Dalbey, Keith D.; Eddy, John P.; Eldred, Michael S.; Hough, Patricia D.; Lefantzi, Sophia L.; Swiler, Laura P.; Vigil, Dena V.

The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic expansion methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a theoretical manual for selected algorithms implemented within the DAKOTA software. It is not intended as a comprehensive theoretical treatment, since a number of existing texts cover general optimization theory, statistical analysis, and other introductory topics. Rather, this manual is intended to summarize a set of DAKOTA-related research publications in the areas of surrogate-based optimization, uncertainty quantification, and optimization under uncertainty that provide the foundation for many of DAKOTA's iterative analysis capabilities.

More Details

TYPE SAND Report YEAR 2011

DOI OSTI

Fault-tolerant quantum computing QEC 2011 tutorial

Landahl, Andrew J.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Trinity Architecture & Design

Doerfler, Douglas W.; Hemmert, Karl S.

Abstract not provided.

More Details

TYPE Presentation YEAR 2011

OSTI

Efficient Source Inversion Methodologies using Regional Transport Models

Safta, Cosmin S.; Sargsyan, Khachik S.; Bambha, Ray B.; Michelsen, Hope A.; Debusschere, Bert D.; Najm, H.N.; van Bloemen Waanders, Bart G.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Coarse-Grain Simulation of Networks-on-Chip using SST/Macro

Hendry, Gilbert H.; Hammond, Simon D.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Uncertainty segregation and tensor-product-type approximation in reduced-dimensional stochastic analysis of coupled systems

International Journal for Numerical Methods in Engineering

Phipps, Eric T.; Red-Horse, John R.

Abstract not provided.

More Details

TYPE Journal Article YEAR 2011

OSTI

Dax Design/Basic Vector Classes

Moreland, Kenneth D.

Abstract not provided.

More Details

TYPE Presentation YEAR 2011

OSTI

Shock compression of hydrocarbon foam to 200 GPa: experiments mesoscale modeling and atomistic simulations

Physical Review B

Root, Seth R.; Haill, Thomas A.; Lane, James M.; Thompson, Aidan P.; Grest, Gary S.; Mattsson, Thomas M.

Abstract not provided.

More Details

TYPE Journal Article YEAR 2011

OSTI

A high resolution Lagrangian method using nonlinear hybridization and hyperviscosity

Proposed for publication in Computers & Fluids.

Rider, William J.; Love, Edward L.; Scovazzi, Guglielmo S.

Abstract not provided.

More Details

TYPE Journal Article YEAR 2011

OSTI

QIP 2012: Fault-tolerant quantum computing with color codes

Landahl, Andrew J.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Nuclear Energy Advanced Modeling and Simulation Waste Integrated Performance and Safety Codes (NEAMS Waste IPSC)

Schultz, Peter A.

The objective of the U.S. Department of Energy Office of Nuclear Energy Advanced Modeling and Simulation Waste Integrated Performance and Safety Codes (NEAMS Waste IPSC) is to provide an integrated suite of computational modeling and simulation (M&S) capabilities to quantitatively assess the long-term performance of waste forms in the engineered and geologic environments of a radioactive-waste storage facility or disposal repository. Achieving the objective of modeling the performance of a disposal scenario requires describing processes involved in waste form degradation and radionuclide release at the subcontinuum scale, beginning with mechanistic descriptions of chemical reactions and chemical kinetics at the atomic scale, and upscaling into effective, validated constitutive models for input to high-fidelity continuum scale codes for coupled multiphysics simulations of release and transport. Verification and validation (V&V) is required throughout the system to establish evidence-based metrics for the level of confidence in M&S codes and capabilities, including at the subcontiunuum scale and the constitutive models they inform or generate. This Report outlines the nature of the V&V challenge at the subcontinuum scale, an approach to incorporate V&V concepts into subcontinuum scale modeling and simulation (M&S), and a plan to incrementally incorporate effective V&V into subcontinuum scale M&S destined for use in the NEAMS Waste IPSC work flow to meet requirements of quantitative confidence in the constitutive models informed by subcontinuum scale phenomena.

More Details

TYPE SAND Report YEAR 2011

DOI OSTI

Large-Scale Interactive Visualization with ParaView

International Journal of Computational Science and Engineering (IJCSE)

Moreland, Kenneth D.

Abstract not provided.

More Details

TYPE Journal Article YEAR 2011

OSTI

Embedded UQ and QoI/Adjoints in Drekar: New Directions

Pawlowski, Roger P.; Shadid, John N.; Wildey, Timothy M.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Dax Design/Basic Control Environment API

Moreland, Kenneth D.

Abstract not provided.

More Details

TYPE Presentation YEAR 2011

OSTI

Numerical study of a matrix-free trust-region SQP method for equality constrained optimization

Ridzal, Denis R.; Aguilo Valentin, Miguel A.

This is a companion publication to the paper 'A Matrix-Free Trust-Region SQP Algorithm for Equality Constrained Optimization' [11]. In [11], we develop and analyze a trust-region sequential quadratic programming (SQP) method that supports the matrix-free (iterative, in-exact) solution of linear systems. In this report, we document the numerical behavior of the algorithm applied to a variety of equality constrained optimization problems, with constraints given by partial differential equations (PDEs).

More Details

TYPE SAND Report YEAR 2011

OSTI DOI

How will VERA-CFD couple into VERA?

Pawlowski, Roger P.; Summers, Randall M.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

VERA MultiphysicsCoupling with LIME: Code Requirements

Pawlowski, Roger P.; Belcourt, Kenneth N.; Hooper, Russell H.; Schmidt, Rodney C.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Pathogen detection in clinical samples by high-throughput sequencing

Solberg, Owen D.; Misra, Milind; Schoeniger, Joseph S.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Precision Neutral Computation Enables Efficient Robust Algorithms

Parks, Michael L.; Heroux, Michael A.; Day, David M.; Frischknecht, Amalie F.

Abstract not provided.

More Details

TYPE Presentation YEAR 2011

OSTI

Semiclassical Poisson and Self-Consistent Poisson-Schrodinger Solvers in QCAD

Gao, Xujiao G.; Nielsen, Erik N.; Young, Ralph W.; Salinger, Andrew G.; Muller, Richard P.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Development of few-electron Si quantum dots for use as qubits

Muller, Richard P.; Bishop, Nathaniel B.; Lu, Tzu-Ming L.; Pluym, Tammy P.; Bielejec, Edward S.; Lilly, Michael L.; Landahl, Andrew J.; Carroll, Malcolm; Young, Ralph W.; Nielsen, Erik N.; Rahman, Rajib R.; Witzel, Wayne W.; Gao, Xujiao G.; Tracy, Lisa A.; Bussmann, Ezra B.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Quantum control of hybrid nuclear-electronic qubits

Nature

Witzel, Wayne W.

Abstract not provided.

More Details

TYPE Journal Article YEAR 2011

OSTI

Code verification for the extended finite element method: the compound cohesionless impact problem

Niederhaus, John H.; Voth, Thomas E.; Mosso, Stewart J.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

CHALLENGES IN PARALLEL GRAPH PROCESSING

Parallel Processing Letters

Hendrickson, Bruce A.; Berry, Jonathan W.

Graph algorithms are becoming increasingly important for solving many problems in scientific computing, data mining and other domains. As these problems grow in scale, parallel computing resources are required to meet their computational and memory requirements. Unfortunately, the algorithms, software, and hardware that have worked well for developing mainstream parallel scientific applications are not necessarily effective for large-scale graph problems. In this paper we present the inter-relationships between graph problems, software, and parallel hardware in the current state of the art and discuss how those issues present inherent challenges in solving large-scale graph problems. The range of these challenges suggests a research agenda for the development of scalable high-performance software for graph problems.

More Details

TYPE Journal Article YEAR 2011

OSTI DOI

Enabling flexible collective communication offload with triggered operations

Proceedings - Symposium on the High Performance Interconnects, Hot Interconnects

Underwood, Keith D.; Coffman, Jerrie; Larsen, Roy; Hemmert, Karl S.; Barrett, Brian W.; Brightwell, Ronald B.; Levenhagen, Michael J.

Low latency collective communications are key to application scalability. As systems grow larger, minimizing collective communication time becomes increasingly challenging. Offload is an effective technique for accelerating collective operations; however, algorithms for collective communication constantly evolve such that flexible implementations are critical. This paper presents triggered operations-a semantic building block that allows the key components of collective communications to be offloaded while allowing the host side software to define the algorithm. Simulations are used to demonstrate the performance improvements achievable through the offload of MPI-Allreduce using these building blocks. © 2011 IEEE.

More Details

TYPE Conference YEAR 2011

Scopus OSTI