Publications

Results 7001–7200 of 9,998

Search results

Efficient expression templates for operator overloading-based automatic differentiation

Lecture Notes in Computational Science and Engineering

Phipps, Eric T.; Pawlowski, Roger

Expression templates are a well-known set of techniques for improving the efficiency of operator overloading-based forward mode automatic differentiation schemes in the C++ programming language by translating the differentiation from individual operators to whole expressions. However, standard expression template approaches result in a large amount of duplicate computation, particularly for large expression trees, degrading their performance. In this paper we describe several techniques for improving the efficiency of expression templates and their implementation in the automatic differentiation package Sacado (Phipps et al., Advances in automatic differentiation, Lecture notes in computational science and engineering, Springer, Berlin, 2008; Phipps and Gay, Sacado automatic differentiation package. http://trilinos.sandia.gov/packages/sacado/, 2011). We demonstrate their improved efficiency through test functions as well as their application to differentiation of a large-scale fluid dynamics simulation code. © 2012 Springer-Verlag.
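
For readers unfamiliar with the baseline that expression templates improve on, the following is a minimal sketch of operator overloading-based forward mode differentiation. It is written in Python for brevity and is not Sacado's API; the names `Dual`, `sin`, and `derivative` are hypothetical and chosen only for illustration. Each overloaded operator differentiates on its own and allocates a temporary object, which is the per-operator cost that expression templates aim to avoid by handling whole expressions.

```python
import math


class Dual:
    """A value paired with its derivative with respect to one input."""

    def __init__(self, val, dot=0.0):
        self.val = val  # function value
        self.dot = dot  # derivative value

    @staticmethod
    def _lift(x):
        # Treat plain numbers as constants (derivative zero).
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual._lift(other)
        # Sum rule; note that a new temporary Dual is created per operator.
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def sin(x):
    """Sine with the chain rule applied to the derivative part."""
    x = Dual._lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)


def derivative(f, x0):
    """Seed the input derivative with 1 and read off df/dx at x0."""
    return f(Dual(x0, 1.0)).dot
```

For example, `derivative(lambda x: x * x + 3 * x, 2.0)` evaluates the derivative 2x + 3 at x = 2 and builds three temporaries along the way, one per operator.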

Defect reaction network in C-doped GaAs: Numerical predictions

Schultz, Peter A.

This Report characterizes the defect reaction network in carbon-doped, p-type GaAs deduced from first-principles density functional theory. The reaction network is deduced by following exothermic defect reactions starting with the initially mobile interstitial defects reacting with common displacement damage defects in C-doped GaAs until culminating in immobile reaction products. The defect reactions and reaction energies are tabulated, along with the properties of all the carbon-related defects in the reaction network. This Report serves to extend the results for intrinsic defects in: P.A. Schultz and O.A. von Lilienfeld, “Simple intrinsic defects in GaAs”, Modelling Simul. Mater. Sci. Eng., Vol. 17, 084007 (2009) and its numerical supplement in SAND 2012-2675, and the preliminary carbon defect network results in: P.A. Schultz, “First-principles defect chemistry for modeling irradiated GaAs and III-V semiconductors”, J. Rad. Effects, Res. and Eng. Vol. 30, p257 (2012).

Brief announcement: Subgraph Isomorphism on a MultiThreaded shared memory architecture

Annual ACM Symposium on Parallelism in Algorithms and Architectures

Leung, Vitus J.; Mclendon, William

Graph algorithms tend to suffer poor performance due to the irregularity of access patterns within general graph data structures, arising from poor data locality, which translates to high memory latency. The result is that advances in high-performance solutions for graph algorithms are most likely to come through advances in both architectures and algorithms. Specialized MMT shared memory machines offer a potentially transformative environment in which to approach the problem. Here, we explore the challenges of implementing Subgraph Isomorphism (SI) algorithms based on the Ullmann and VF2 algorithms in the Cray XMT environment, where issues of memory contention, scheduling, and compiler parallelizability must be optimized. Copyright is held by the author/owner(s).

Accelerated Cartesian expansions for the rapid solution of periodic multiscale problems

IEEE Transactions on Antennas and Propagation

Baczewski, Andrew D.; Dault, Daniel L.; Shanker, Balasubramaniam

We present an algorithm for the fast and efficient solution of integral equations that arise in the analysis of scattering from periodic arrays of PEC objects, such as multiband frequency selective surfaces (FSS) or metamaterial structures. Our approach relies upon the method of Accelerated Cartesian Expansions (ACE) to rapidly evaluate the requisite potential integrals. ACE is analogous to FMM in that it can be used to accelerate the matrix vector product used in the solution of systems discretized using MoM. Here, ACE provides linear scaling in both CPU time and memory. Details regarding the implementation of this method within the context of periodic systems are provided, as well as results that establish error convergence and scalability. In addition, we also demonstrate the applicability of this algorithm by studying several exemplary electrically dense systems.

Goal-oriented adaptivity and multilevel preconditioning for the Poisson-Boltzmann equation

Journal of Scientific Computing

Aksoylu, Burak; Bond, Stephen D.; Cyr, Eric C.; Holst, Michael

In this article, we develop goal-oriented error indicators to drive adaptive refinement algorithms for the Poisson-Boltzmann equation. Empirical results for the solvation free energy linear functional demonstrate that goal-oriented indicators are not sufficient on their own to lead to a superior refinement algorithm. To remedy this, we propose a problem-specific marking strategy using the solvation free energy computed from the solution of the linear regularized Poisson-Boltzmann equation. The convergence of the solvation free energy using this marking strategy, combined with goal-oriented refinement, compares favorably to adaptive methods using an energy-based error indicator. Due to the use of adaptive mesh refinement, it is critical to use multilevel preconditioning in order to maintain optimal computational complexity. We use variants of the classical multigrid method, which can be viewed as generalizations of the hierarchical basis multigrid and Bramble-Pasciak-Xu (BPX) preconditioners. © 2011 Springer Science+Business Media (outside the USA).

Adaptive tabulation for verified equations of state

AIP Conference Proceedings

Carpenter, John H.

A new adaptive tabulation scheme for multi-phase equations of state (EOS) is described. Adaptation allows verification that a table represents an EOS model to some desired accuracy at a much lower computational cost than standard tables. Computational efficiency is provided through the use of a quad-tree representation. Using both rectangular and triangular interpolation regions results in accurate descriptions of phase boundaries. The new format is demonstrated on a representative multi-phase EOS model. © 2012 American Institute of Physics.
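
The idea of refining a table until it reproduces a model to a requested accuracy can be sketched as follows. This is an illustrative assumption rather than the scheme from the paper: it uses only rectangular cells with bilinear interpolation (the abstract also mentions triangular regions), checks the error at five probe points per cell, and the names `bilinear` and `build_table` are hypothetical.

```python
def bilinear(cell, x, y):
    """Bilinear interpolation from the four corner values stored in a cell."""
    x0, y0, x1, y1, f00, f10, f01, f11 = cell
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    return (f00 * (1 - tx) * (1 - ty) + f10 * tx * (1 - ty)
            + f01 * (1 - tx) * ty + f11 * tx * ty)


def build_table(f, x0, y0, x1, y1, tol, max_depth=12):
    """Return the leaf cells of a quad-tree that tabulates f on a rectangle.

    A cell is split into four children while the interpolation error at the
    probe points exceeds tol and the depth limit has not been reached.
    """
    cell = (x0, y0, x1, y1, f(x0, y0), f(x1, y0), f(x0, y1), f(x1, y1))
    xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
    # Probe the cell center and the four edge midpoints.
    probes = [(xm, ym), (xm, y0), (xm, y1), (x0, ym), (x1, ym)]
    err = max(abs(f(px, py) - bilinear(cell, px, py)) for px, py in probes)
    if err <= tol or max_depth == 0:
        return [cell]
    leaves = []
    for ax, ay, bx, by in [(x0, y0, xm, ym), (xm, y0, x1, ym),
                           (x0, ym, xm, y1), (xm, ym, x1, y1)]:
        leaves += build_table(f, ax, ay, bx, by, tol, max_depth - 1)
    return leaves
```

A function that bilinear interpolation reproduces exactly, such as `lambda x, y: x + y`, yields a single cell, while a curved function is subdivided only as far as the tolerance requires. Probing a handful of points is a heuristic check, not a proof of accuracy over the whole cell.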

Molecular dynamics simulation of dynamic response of beryllium

AIP Conference Proceedings

Thompson, A.P.; Lane, James M.D.; Desjarlais, Michael P.

The response of beryllium to dynamic loading has been extensively studied, both experimentally and theoretically, due to its importance in several technological areas. We use a MEAM empirical potential to examine the melt transition. MD simulations of equilibrated two-phase systems were used to calculate the HCP melting curve up to 300 GPa. This was found to agree well with previous ab initio calculations. The Hugoniostat method was used to examine dynamic compression along the two principal orientations of the HCP crystal. In both directions, the melting transition occurred at 230 GPa and 5000 K, consistent with the equilibrium melting curve. Direct NEMD simulations of uniaxial compression show a transition to an amorphous material at shocked states that lie below the equilibrium melt curve. © 2012 American Institute of Physics.

Towards Automated Memory Model Generation Via Event Tracing

Computer Journal

Hammond, Simon

The importance of memory performance and capacity is a growing concern for high performance computing laboratories around the world. It has long been recognized that improvements in processor speed exceed the rate of improvement in dynamic random access memory speed and, as a result, memory access times can be the limiting factor in high performance scientific codes. The use of multi-core processors exacerbates this problem with the rapid growth in the number of cores not being matched by similar improvements in memory capacity, increasing the likelihood of memory contention. In this paper, we present WMTools, a lightweight memory tracing tool and analysis framework for parallel codes, which is able to identify peak memory usage and also analyse per-function memory use over time. An evaluation of WMTools, in terms of its effectiveness and also its overheads, is performed using nine established scientific applications/benchmark codes representing a variety of programming languages and scientific domains. We also show how WMTools can be used to automatically generate a parameterized memory model for one of these applications, a two-dimensional non-linear magnetohydrodynamics application, Lare2D. Through the memory model we are able to identify an unexpected growth term which becomes dominant at scale. With a refined model we are able to predict memory consumption with under 7% error.
