Publications

Results 7001–7200 of 9,998

Search results

Efficient expression templates for operator overloading-based automatic differentiation

Lecture Notes in Computational Science and Engineering

Phipps, Eric T.; Pawlowski, Roger P.

Expression templates are a well-known set of techniques for improving the efficiency of operator overloading-based forward mode automatic differentiation schemes in the C++ programming language by translating the differentiation from individual operators to whole expressions. However, standard expression template approaches result in a large amount of duplicate computation, particularly for large expression trees, degrading their performance. In this paper we describe several techniques for improving the efficiency of expression templates and their implementation in the automatic differentiation package Sacado (Phipps et al., Advances in automatic differentiation, Lecture notes in computational science and engineering, Springer, Berlin, 2008; Phipps and Gay, Sacado automatic differentiation package. http://trilinos.sandia.gov/packages/sacado/, 2011). We demonstrate their improved efficiency through test functions as well as their application to differentiation of a large-scale fluid dynamics simulation code. © 2012 Springer-Verlag.
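For readers unfamiliar with the baseline this abstract starts from, the following is a minimal sketch of operator overloading-based forward mode automatic differentiation. It is written in Python for brevity, so it shows only the operator-by-operator scheme and not expression templates, which are a C++ technique; the names `Dual`, `_lift`, `sin`, and `derivative` are illustrative assumptions and are not part of Sacado.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to one input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)


def derivative(f, x0):
    """Evaluate df/dx at x0 by seeding the input derivative with 1."""
    return f(Dual(x0, 1.0)).dot


def f(x):
    return x * x * sin(x) + 3 * x
```

Each overloaded operator here creates a new intermediate `Dual`, so derivative work is done one operator at a time. That per-operator cost is what the expression-level approach described in the abstract aims to reduce.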

Brief announcement: Subgraph Isomorphism on a MultiThreaded shared memory architecture

Annual ACM Symposium on Parallelism in Algorithms and Architectures

Ralph, Claire C.; Leung, Vitus J.; McLendon, William C.

Graph algorithms tend to suffer poor performance due to the irregularity of access patterns within general graph data structures, arising from poor data locality, which translates to high memory latency. The result is that advances in high-performance solutions for graph algorithms are most likely to come through advances in both architectures and algorithms. Specialized massively multithreaded (MMT) shared memory machines offer a potentially transformative environment in which to approach the problem. Here, we explore the challenges of implementing Subgraph Isomorphism (SI) algorithms based on the Ullmann and VF2 algorithms in the Cray XMT environment, where issues of memory contention, scheduling, and compiler parallelizability must be optimized. Copyright is held by the author/owner(s).
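To make the problem statement concrete, here is a minimal serial sketch of subgraph matching by plain backtracking. It is not Ullmann's matrix refinement, not VF2's feasibility rules, and not the multithreaded implementation the abstract refers to; the function name `find_subgraph_match` and the adjacency-dict graph representation are assumptions made for illustration. It looks for a non-induced match, meaning every pattern edge must map onto a target edge.

```python
def find_subgraph_match(pattern, target):
    """Return a dict mapping pattern nodes to target nodes, or None.

    Both graphs are undirected, given as {node: set_of_neighbours}.
    """
    # Place high-degree pattern nodes first so dead ends show up early.
    p_nodes = sorted(pattern, key=lambda u: -len(pattern[u]))
    mapping = {}
    used = set()

    def extend(i):
        if i == len(p_nodes):
            return True
        u = p_nodes[i]
        for v in target:
            # Skip targets already taken or with too few neighbours.
            if v in used or len(target[v]) < len(pattern[u]):
                continue
            # Every already-mapped neighbour of u must be adjacent to v.
            if all(mapping[w] in target[v]
                   for w in pattern[u] if w in mapping):
                mapping[u] = v
                used.add(v)
                if extend(i + 1):
                    return True
                del mapping[u]
                used.remove(v)
        return False

    return dict(mapping) if extend(0) else None
```

The neighbour lookups `mapping[w] in target[v]` jump around the target graph's adjacency sets in a data-dependent order, which is a small-scale instance of the irregular access pattern the abstract describes.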

Accelerated Cartesian expansions for the rapid solution of periodic multiscale problems

IEEE Transactions on Antennas and Propagation

Baczewski, Andrew D.; Dault, Daniel L.; Shanker, Balasubramaniam

We present an algorithm for the fast and efficient solution of integral equations that arise in the analysis of scattering from periodic arrays of PEC objects, such as multiband frequency selective surfaces (FSS) or metamaterial structures. Our approach relies upon the method of Accelerated Cartesian Expansions (ACE) to rapidly evaluate the requisite potential integrals. ACE is analogous to FMM in that it can be used to accelerate the matrix vector product used in the solution of systems discretized using MoM. Here, ACE provides linear scaling in both CPU time and memory. Details regarding the implementation of this method within the context of periodic systems are provided, as well as results that establish error convergence and scalability. We also demonstrate the applicability of this algorithm by studying several exemplary electrically dense systems.

Goal-oriented adaptivity and multilevel preconditioning for the Poisson-Boltzmann equation

Journal of Scientific Computing

Aksoylu, Burak; Bond, Stephen D.; Cyr, Eric C.; Holst, Michael

In this article, we develop goal-oriented error indicators to drive adaptive refinement algorithms for the Poisson-Boltzmann equation. Empirical results for the solvation free energy linear functional demonstrate that goal-oriented indicators are not sufficient on their own to lead to a superior refinement algorithm. To remedy this, we propose a problem-specific marking strategy using the solvation free energy computed from the solution of the linear regularized Poisson-Boltzmann equation. The convergence of the solvation free energy using this marking strategy, combined with goal-oriented refinement, compares favorably to adaptive methods using an energy-based error indicator. Due to the use of adaptive mesh refinement, it is critical to use multilevel preconditioning in order to maintain optimal computational complexity. We use variants of the classical multigrid method, which can be viewed as generalizations of the hierarchical basis multigrid and Bramble-Pasciak-Xu (BPX) preconditioners. © 2011 Springer Science+Business Media (outside the USA).

Adaptive tabulation for verified equations of state

AIP Conference Proceedings

Carpenter, John H.

A new adaptive tabulation scheme for multi-phase equations of state (EOS) is described. Adaptation allows verification that a table represents an EOS model to some desired accuracy at a much lower computational cost than standard tables. Computational efficiency is provided through the use of a quad-tree representation. Using both rectangular and triangular interpolation regions results in accurate descriptions of phase boundaries. The new format is demonstrated on a representative multi-phase EOS model. © 2012 American Institute of Physics.
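The idea of refining a table only where interpolation is inaccurate can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme from the paper: it uses only rectangular cells with bilinear interpolation (no triangular regions), accepts a cell by probing the error at five points (a heuristic, not a verified bound), and stores leaves in a flat list searched linearly rather than descending a tree. The names `bilinear`, `build_table`, and `lookup` are hypothetical.

```python
def bilinear(cell, x, y):
    """Bilinear interpolation from the four corner values of a cell."""
    x0, x1, y0, y1, f00, f10, f01, f11 = cell
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    return (f00 * (1 - tx) * (1 - ty) + f10 * tx * (1 - ty)
            + f01 * (1 - tx) * ty + f11 * tx * ty)


def build_table(f, x0, x1, y0, y1, tol, max_depth=12):
    """Recursively split a cell into four until the probed error <= tol."""
    cell = (x0, x1, y0, y1, f(x0, y0), f(x1, y0), f(x0, y1), f(x1, y1))
    xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
    # Probe the cell centre and the four edge midpoints.
    probes = [(xm, ym), (xm, y0), (xm, y1), (x0, ym), (x1, ym)]
    err = max(abs(bilinear(cell, x, y) - f(x, y)) for x, y in probes)
    if err <= tol or max_depth == 0:
        return [cell]
    leaves = []
    for xa, xb in ((x0, xm), (xm, x1)):
        for ya, yb in ((y0, ym), (ym, y1)):
            leaves += build_table(f, xa, xb, ya, yb, tol, max_depth - 1)
    return leaves


def lookup(leaves, x, y):
    """Interpolate at (x, y) using the first leaf cell that contains it."""
    for cell in leaves:
        if cell[0] <= x <= cell[1] and cell[2] <= y <= cell[3]:
            return bilinear(cell, x, y)
    raise ValueError("point outside table")
```

A function that is already bilinear is captured by a single cell, while a function with a kink forces refinement only near the kink, which is where the cost saving over a uniform table comes from.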

Molecular dynamics simulation of dynamic response of beryllium

AIP Conference Proceedings

Thompson, Aidan P.; Lane, James M.; Desjarlais, Michael P.

The response of beryllium to dynamic loading has been extensively studied, both experimentally and theoretically, due to its importance in several technological areas. We use a MEAM empirical potential to examine the melt transition. MD simulations of equilibrated two-phase systems were used to calculate the HCP melting curve up to 300 GPa. This was found to agree well with previous ab initio calculations. The Hugoniostat method was used to examine dynamic compression along the two principal orientations of the HCP crystal. In both directions, the melting transition occurred at 230 GPa and 5000 K, consistent with the equilibrium melting curve. Direct NEMD simulations of uniaxial compression show a transition to an amorphous material at shocked states that lie below the equilibrium melt curve. © 2012 American Institute of Physics.

Optimization-based modeling with applications to transport: Part 3. Computational studies

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Ridzal, Denis R.; Young, Joseph G.; Bochev, Pavel B.; Peterson, Kara J.

This paper is the final of three related articles that develop and demonstrate a new optimization-based framework for computational modeling. The framework uses optimization and control ideas to assemble and decompose multiphysics operators and to preserve their fundamental physical properties in the discretization process. One application of the framework is in the formulation of robust algorithms for optimization-based transport (OBT). Based on the theoretical foundations established in Part 1 and the optimization algorithm for the solution of the remap subproblem, derived in Part 2, this paper focuses on the application of OBT to a set of benchmark transport problems. Numerical comparisons with two other transport schemes based on incremental remapping, featuring flux-corrected remap and the linear reconstruction with van Leer limiting, respectively, demonstrate that OBT is a competitive transport algorithm. © 2012 Springer-Verlag.

Towards Automated Memory Model Generation Via Event Tracing

Computer Journal

Hammond, Simon D.

The importance of memory performance and capacity is a growing concern for high performance computing laboratories around the world. It has long been recognized that improvements in processor speed exceed the rate of improvement in dynamic random access memory speed and, as a result, memory access times can be the limiting factor in high performance scientific codes. The use of multi-core processors exacerbates this problem with the rapid growth in the number of cores not being matched by similar improvements in memory capacity, increasing the likelihood of memory contention. In this paper, we present WMTools, a lightweight memory tracing tool and analysis framework for parallel codes, which is able to identify peak memory usage and also analyse per-function memory use over time. An evaluation of WMTools, in terms of its effectiveness and also its overheads, is performed using nine established scientific applications/benchmark codes representing a variety of programming languages and scientific domains. We also show how WMTools can be used to automatically generate a parameterized memory model for one of these applications, a two-dimensional non-linear magnetohydrodynamics application, Lare2D. Through the memory model we are able to identify an unexpected growth term which becomes dominant at scale. With a refined model we are able to predict memory consumption with under 7% error.
