Performance measurement of parallel algorithms is well studied and well understood. However, a flaw in traditional performance metrics is that they rely on comparisons to serial performance with the same input. This comparison is convenient for theoretical complexity analysis but impossible to perform in large-scale empirical studies with data sizes far too large to run on a single serial computer. Consequently, scaling studies currently rely on ad hoc methods that, although effective, have no grounded mathematical models. In this position paper we advocate using a rate-based model that has a concrete meaning relative to speedup and efficiency and that can be used to unify strong and weak scaling studies.
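As a minimal sketch of what a rate-based metric can look like (illustrative notation; not necessarily the authors' exact formulation): define the rate R(N, P) = N / T(N, P), the amount of work N processed per unit time on P processors, and measure efficiency relative to any reference run (N_0, P_0) as E = [P_0 R(N, P)] / [P R(N_0, P_0)]. Strong scaling holds N = N_0 fixed while increasing P; weak scaling grows N in proportion to P; in either case the reference run need not be serial, so the metric remains computable for problem sizes that cannot fit on a single machine.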
Visual search has been an active area of research – empirically and theoretically – for a number of decades; however, much of that work is based on novice searchers performing basic tasks in a laboratory. This paper summarizes some of the issues associated with quantifying expert, domain-specific visual search behavior in operationally realistic environments.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Ciesko, Jan; Mateo, Sergi; Teruel, Xavier; Martorell, Xavier; Ayguade, Eduard; Labarta, Jesus; Duran, Alex; De Supinski, Bronis R.; Olivier, Stephen L.; Li, Kelvin; Eichenberger, Alexandre E.
Reductions represent a common algorithmic pattern in many scientific applications. OpenMP* has always supported them on parallel and worksharing constructs. OpenMP 3.0's tasking constructs enable new parallelization opportunities through the annotation of irregular algorithms. Unfortunately, the tasking model does not easily allow the expression of concurrent reductions, which limits the general applicability of the programming model to such algorithms. In this work, we present an extension to OpenMP that supports task-parallel reductions on task and taskgroup constructs to improve productivity and programmability. We present the specification of the feature and explore issues for programmers and software vendors regarding programming transparency as well as the impact on the current standard with respect to nesting, untied task support, and task data dependencies. Our performance evaluation demonstrates results comparable to hand-coded task reductions.
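A minimal sketch of the kind of task-parallel reduction such an extension enables, written here with the task_reduction/in_reduction clauses that were later standardized in OpenMP 5.0 (the paper's proposed syntax may differ):

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        long sum = 0;
        #pragma omp parallel
        #pragma omp single
        {
            /* The taskgroup scopes the reduction over all participating tasks. */
            #pragma omp taskgroup task_reduction(+: sum)
            {
                for (int i = 1; i <= 1000; ++i) {
                    /* Each task updates a private copy of sum; the copies are
                       combined when the taskgroup region ends. */
                    #pragma omp task in_reduction(+: sum) firstprivate(i)
                    sum += i;
                }
            }
        }
        printf("sum = %ld\n", sum); /* expect 500500 */
        return 0;
    }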
Electric distribution utilities, the companies that feed electricity to end users, are overseeing a technological transformation of their networks, installing sensors and other automated equipment that are fundamentally changing the way the grid operates. These grid modernization efforts will allow utilities to incorporate some of the newer technology available to the home user – such as solar panels and electric cars – which will result in a bi-directional flow of energy and information. How will this new flow of information affect control room operations? How will the increased automation associated with smart grid technologies influence control room operators' decisions? And how will changes in control room operations and operator decision making impact grid resilience? These questions have not been thoroughly studied, despite the enormous changes that are taking place. In this study, which involved collaborating with utility companies in the state of Vermont, the authors proposed to advance the science of control-room decision making by understanding the impact of distribution grid modernization on operator performance. Distribution control room operators were interviewed to understand daily tasks and decisions and to gain an understanding of how these impending changes will impact control room operations. Situation awareness was found to be a major contributor to successful control room operations. However, the impact of growing levels of automation due to smart grid technology on operators' situation awareness is not well understood. Future work includes performing a naturalistic field study in which operator situation awareness will be measured in real time during normal operations and correlated with the technological changes that are underway. The results of this future study will inform tools and strategies that will help system operators adapt to a changing grid, respond to critical incidents and maintain critical performance skills.
The elastic properties and mechanical stability of zirconium alloys and zirconium hydrides have been investigated within the framework of density functional perturbation theory. Results show that the lowest-energy cubic Pn3m polymorph of δ-ZrH1.5 does not satisfy all the Born requirements for mechanical stability, unlike its nearly degenerate tetragonal P42/mcm polymorph. Elastic moduli predicted with the Voigt-Reuss-Hill approximations suggest that the mechanical stability of α-Zr, Zr-alloy and Zr-hydride polycrystalline aggregates is limited by the shear modulus. According to both Pugh's and Poisson's ratios, α-Zr, Zr-alloy and Zr-hydride polycrystalline aggregates can be considered ductile. The Debye temperatures predicted for γ-ZrH, δ-ZrH1.5 and ε-ZrH2 are Θ_D = 299.7, 415.6 and 356.9 K, respectively, while Θ_D = 273.6, 284.2, 264.1 and 257.1 K for the α-Zr, Zry-4, ZIRLO and M5 matrices, respectively, suggesting that Zry-4 possesses the highest micro-hardness among the Zr matrices.
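For reference, the standard relations behind these statements (textbook definitions, not results of the paper): the Voigt-Reuss-Hill estimate is the average of the Voigt and Reuss bounds, G_VRH = (G_V + G_R)/2 and B_VRH = (B_V + B_R)/2; Pugh's criterion classifies a polycrystal as ductile when B/G > 1.75 (equivalently G/B < 0.57), and the Poisson's-ratio criterion when ν > 0.26; the Debye temperature follows from the mean sound velocity v_m as Θ_D = (h/k_B) [3n N_A ρ / (4π M)]^{1/3} v_m, with n the number of atoms per formula unit, M the molar mass and ρ the density, which is why a larger shear stiffness (and hence sound velocity) translates into a higher Θ_D and, empirically, higher micro-hardness.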
The reproducing kernel particle method (RKPM) is a meshfree method for computational solid mechanics that can be tailored for an arbitrary order of completeness and smoothness. The primary advantage of RKPM relative to standard finite element (FE) approaches is its capacity to model large deformations, material damage, and fracture. Additionally, the use of a meshfree approach offers great flexibility in the domain discretization process and reduces the complexity of mesh modifications such as adaptive refinement. We present an overview of the RKPM implementation in the Sierra/SolidMechanics analysis code, with a focus on verification, validation, and software engineering for massively parallel computation. Key details include the processing of meshfree discretizations within a FE code, RKPM solution approximation and domain integration, stress update and calculation of internal force, and contact modeling. The accuracy and performance of RKPM are evaluated using a set of benchmark problems. Solution verification, mesh convergence, and parallel scalability are demonstrated using a simulation of wave propagation along the length of a bar. Initial model validation is achieved through simulation of a Taylor bar impact test. The RKPM approach is shown to be a viable alternative to standard FE techniques that provides additional flexibility to the analyst community.
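A brief sketch of the reproducing kernel approximation that underlies RKPM (standard textbook form, independent of the Sierra/SolidMechanics implementation details): the field is approximated as u^h(x) = Σ_I Ψ_I(x) d_I, with shape functions Ψ_I(x) = H^T(x - x_I) M^{-1}(x) H(0) φ_a(x - x_I), where H is the vector of monomials up to the chosen completeness order, M(x) = Σ_I H(x - x_I) H^T(x - x_I) φ_a(x - x_I) is the moment matrix, and φ_a is a compactly supported kernel whose support size a and smoothness are selected by the analyst. The correction built from M^{-1} is what enforces exact reproduction of polynomials up to the selected order on an arbitrary particle distribution.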
We study several natural instances of the geometric hitting set problem for input consisting of sets of line segments (and rays, lines) having a small number of distinct slopes. These problems model path monitoring (e.g., on road networks) using the fewest sensors (the “hitting points”). We give approximation algorithms for cases including (i) lines of 3 slopes in the plane, (ii) vertical lines and horizontal segments, (iii) pairs of horizontal/vertical segments. We give hardness and hardness of approximation results for these problems. We prove that the hitting set problem for vertical lines and horizontal rays is polynomially solvable.
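For context, all of these variants instantiate the standard hitting set optimization (a generic formulation, not the paper's algorithms): given the input set S of segments, rays, or lines and a candidate point set P (for example, the pairwise intersection points), solve min Σ_{p∈P} x_p subject to Σ_{p∈s} x_p ≥ 1 for every s ∈ S, with x_p ∈ {0,1}. The approximation algorithms return a feasible selection whose size is provably within a bounded factor of the optimum for the cases listed, while the hardness results indicate that exact polynomial-time solutions are unlikely outside the special cases identified.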
Proceedings - 15th European Turbulence Conference, ETC 2015
Smith, Thomas M.; Christon, Mark A.; Baglietto, Emilio; Luo, Hong
Accurate simulation of turbulence remains one of the most challenging problems in nuclear reactor analysis and design. Due to limitations in computing resources, Reynolds-averaged Navier-Stokes (RANS) models continue to play an important role in reactor simulations. The Consortium for Advanced Simulation of Light Water Reactors (CASL) is a Department of Energy technology hub that is investing in research and development of a state-of-the-art computational fluid dynamics capability to meet the challenges of turbulent simulation of nuclear reactors. In this presentation, we assess several RANS eddy viscosity models appropriate for single-phase incompressible turbulent flows. Specifically, we compare the single-equation Spalart-Allmaras model to several variations of the k − ε model. The assessment takes into consideration elements of full-system reactor cores such as complex geometries, heterogeneous meshes, swirling flow, near-wall flow behavior, heat transfer, and robustness issues. The goal of this strategically oriented assessment is to provide an accurate and robust turbulent simulation capability for the CASL community. Metrics of performance will be constructed by comparing different models on a strategically chosen set of problems that represent reactor core sub-systems.
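As background on the closures being compared (standard definitions, not specific to this assessment): the k − ε family computes the eddy viscosity as ν_t = C_μ k²/ε with C_μ ≈ 0.09, obtaining the turbulent kinetic energy k and its dissipation rate ε from two additional transport equations, whereas Spalart-Allmaras solves a single transport equation for a modified viscosity ν̃ and sets ν_t = ν̃ f_v1.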
Previously, the current authors (Hopkins et al. 2015) described research in which subjects who were provided with a tool that facilitated their construction of a narrative account of events performed better in conducting cyber security forensic analysis. The narrative tool offered several distinct features. In the current paper, an analysis is reported that considered which features of the tool contributed to superior performance. This analysis revealed two features that accounted for a statistically significant portion of the variance in performance. The first feature provided a mechanism for subjects to identify suspected perpetrators of the crimes and their motives. The second feature involved the ability to create an annotated visuospatial diagram of clues regarding the crimes and their relationships to one another. Based on these results, guidance may be provided for the development of software tools meant to aid cyber security professionals in conducting forensic analysis.
The purpose of this paper is to consider the exit-time problem for a finite-range Markov jump process, i.e., the distance the particle can jump is bounded independently of its location. Such jump diffusions are expedient models for anomalous transport exhibiting super-diffusion or nonstandard normal diffusion. We refer to the associated deterministic equation as a volume-constrained nonlocal diffusion equation. The volume constraint is the nonlocal analogue of a boundary condition necessary to demonstrate that the nonlocal diffusion equation is well-posed and is consistent with the jump process. A critical aspect of the analysis is a variational formulation and a recently developed nonlocal vector calculus. This calculus allows us to pose nonlocal backward and forward Kolmogorov equations, the former equation granting the various moments of the exit-time distribution.
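A sketch of the backward equation referred to, in standard form for a jump process with jump-rate kernel γ (notation illustrative): the mean exit time u(x) from a domain Ω satisfies -Lu(x) = 1 for x in Ω, where Lu(x) = ∫ (u(y) - u(x)) γ(x, y) dy, together with the volume constraint u = 0 on the interaction domain surrounding Ω; because the jump range is finite, only a bounded layer around Ω enters the constraint. Higher moments of the exit time satisfy analogous equations with the preceding moment appearing on the right-hand side.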
An approach for building energy-stable Galerkin reduced order models (ROMs) for linear hyperbolic or incompletely parabolic systems of partial differential equations (PDEs) using continuous projection is developed. This method is an extension of earlier work by the authors specific to the equations of linearized compressible inviscid flow. The key idea is to apply to the PDEs a transformation induced by the Lyapunov function for the system, and to build the ROM in the transformed variables. For linear problems, the desired transformation is induced by a special inner product, termed the "symmetry inner product", which is derived herein for several systems of physical interest. Connections are established between the proposed approach and other stability-preserving model reduction methods, giving the paper a review flavor. More specifically, it is shown that a discrete counterpart of this inner product is a weighted L2 inner product obtained by solving a Lyapunov equation, first proposed by Rowley et al. and termed herein the "Lyapunov inner product". Comparisons between the symmetry inner product and the Lyapunov inner product are made, and the performance of ROMs constructed using these inner products is evaluated on several benchmark test cases.
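For concreteness, the Lyapunov inner product mentioned above takes the standard form introduced by Rowley et al. (the symmetry inner product derived in the paper is problem-specific and not reproduced here): for a stable linear system dx/dt = Ax, solve the Lyapunov equation A^T P + P A = -Q with Q symmetric positive definite; then ⟨x, y⟩_P = x^T P y defines a weighted L2 inner product, and Galerkin projection of the dynamics with respect to ⟨·,·⟩_P yields a reduced model whose energy ||x||_P² is non-increasing, hence stable.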
We present a spectral mimetic least-squares method for a model diffusion–reaction problem, which preserves key conservation properties of the continuum problem. Casting the model problem into a first-order system for two scalar and two vector variables shifts material properties from the differential equations to a pair of constitutive relations. We also use this system to motivate a new least-squares functional involving all four fields and show that its minimizer satisfies the differential equations exactly. Discretization of the four-field least-squares functional by spectral spaces compatible with the differential operators leads to a least-squares method in which the differential equations are also satisfied exactly. Additionally, the latter are reduced to purely topological relationships for the degrees of freedom that can be satisfied without reference to basis functions. Furthermore, numerical experiments confirm the spectral accuracy of the method and its local conservation.
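To indicate the style of first-order reformulation involved (a generic sketch; the paper's four-field system is analogous but not spelled out here): the scalar problem -∇·(κ∇φ) + cφ = f can be split by introducing the gradient u = ∇φ and the flux v, so that the equilibrium equation ∇·v + cφ = f and the kinematic relation u = ∇φ contain no material data, while the constitutive relation v = -κu carries κ. Least-squares minimization over all fields then allows the differential (topological) equations to be satisfied exactly, with the material properties confined to the constitutive relations.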
Current commercial software tools for transmission and generation investment planning have limited stochastic modeling capabilities. Because of this limitation, electric power utilities generally rely on scenario planning heuristics to identify potentially robust and cost effective investment plans for a broad range of system, economic, and policy conditions. Several research studies have shown that stochastic models perform significantly better than deterministic or heuristic approaches, in terms of overall costs. However, there is a lack of practical solution approaches to solve such models. In this paper we propose a scalable decomposition algorithm to solve stochastic transmission and generation planning problems, respectively considering discrete and continuous decision variables for transmission and generation investments. Given stochasticity restricted to loads and wind, solar, and hydro power output, we develop a simple scenario reduction framework based on a clustering algorithm, to yield a more tractable model. The resulting stochastic optimization model is decomposed on a scenario basis and solved using a variant of the Progressive Hedging (PH) algorithm. We perform numerical experiments using a 240-bus network representation of the Western Electricity Coordinating Council in the US. Although convergence of PH to an optimal solution is not guaranteed for mixed-integer linear optimization models, we find that it is possible to obtain solutions with acceptable optimality gaps for practical applications. Our numerical simulations are performed both on a commodity workstation and on a high-performance cluster. The results indicate that large-scale problems can be solved to a high degree of accuracy in at most two hours of wall clock time.
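For reference, the core Progressive Hedging iteration used for the scenario decomposition is standard (notation illustrative): at iteration k, each scenario s with probability p_s solves x_s^{k+1} = argmin_x c_s(x) + (w_s^k)^T x + (ρ/2)||x - x̄^k||², followed by the averaging step x̄^{k+1} = Σ_s p_s x_s^{k+1} and the multiplier update w_s^{k+1} = w_s^k + ρ(x_s^{k+1} - x̄^{k+1}); iteration stops once the scenario solutions agree (nonanticipativity) to within a tolerance. With integer transmission investment variables the method is a heuristic, which is why solution quality is reported as an optimality gap rather than proven convergence.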
The metrics used for evaluating energy saving techniques for future HPC systems are critical to the correct assessment of proposed methods. Current predictions forecast that overcoming reduced system reliability, increased power requirements and energy consumption will be a major design challenge for future systems. Modern runtime energy-saving research efforts do not take into account the energy spent providing reliability. They also do not account for the increase in the probability of failure during application execution due to runtime overhead from energy saving methods. While this is very reasonable for current systems, it is insufficient for future generation systems. By taking into account the energy consumption ramifications of increased runtimes on system reliability, better energy saving techniques can be developed. This paper demonstrates how to determine the impact of runtime energy conservation methods within the context of failure-prone large scale systems. In addition, a survey of several energy savings methodologies is conducted and an analysis is performed with respect to their effectiveness in an environment in which failures occur.
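A simple illustrative model of the interaction being highlighted (not the paper's analysis): if failures arrive with mean time between failures M and are exponentially distributed, the probability of at least one failure during an execution of length T is P_fail(T) = 1 - e^(-T/M). An energy-saving technique that stretches runtime from T to (1+δ)T therefore raises the failure probability and adds the energy consumed by any resulting recovery or recomputation, which can offset the nominal savings on large, failure-prone systems.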
Power and energy concerns are motivating chip manufacturers to consider future hybrid-core processor designs that may combine a small number of traditional cores optimized for single-thread performance with a large number of simpler cores optimized for throughput performance. This trend is likely to impact the way in which compute resources for network protocol processing functions are allocated and managed. In particular, the performance of MPI match processing is critical to achieving high message throughput. In this paper, we analyze the ability of simple and more complex cores to perform MPI matching operations for various scenarios in order to gain insight into how MPI implementations for future hybrid-core processors should be designed.
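A minimal sketch of the posted-receive matching loop whose cost is at issue (a simplified illustration of how MPI implementations typically match incoming messages, not code from any particular library):

    #include <stdbool.h>
    #include <stddef.h>

    /* Simplified posted-receive entry; real implementations carry more state. */
    typedef struct recv_entry {
        int src;                  /* wildcard MPI_ANY_SOURCE modeled as -1 */
        int tag;                  /* wildcard MPI_ANY_TAG modeled as -1 */
        int comm_id;
        struct recv_entry *next;
    } recv_entry;

    /* Walk the posted-receive queue until an entry matches the incoming
     * (src, tag, comm) triple. Queue length and per-entry comparison cost
     * determine the match time that simple and complex cores trade off. */
    static recv_entry *match_posted(recv_entry *head, int src, int tag, int comm_id)
    {
        for (recv_entry *e = head; e != NULL; e = e->next) {
            bool src_ok = (e->src == -1) || (e->src == src);
            bool tag_ok = (e->tag == -1) || (e->tag == tag);
            if (src_ok && tag_ok && e->comm_id == comm_id)
                return e;         /* first match wins, per MPI ordering rules */
        }
        return NULL;              /* no match: message joins the unexpected queue */
    }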
We use Sandia's Z machine and magnetically accelerated flyer plates to shock compress liquid krypton to 850 GPa and compare with results from density-functional theory (DFT) based simulations using the AM05 functional. We also employ quantum Monte Carlo calculations to motivate the choice of AM05. We conclude that the DFT results are sensitive to the quality of the pseudopotential in terms of scattering properties at high energy/temperature. A new Kr projector augmented wave potential was constructed with improved scattering properties which resulted in excellent agreement with the experimental results to 850 GPa and temperatures above 10 eV (110 kK). Finally, we present comparisons of our data from the Z experiments and DFT calculations to current equation of state models of krypton to determine the best model for high energy-density applications.
In this report we derive frequency-domain methods for inverse characterization of the constitutive parameters of viscoelastic materials. The inverse problem is cast in a PDE-constrained optimization framework with efficient computation of gradients and Hessian-vector products through matrix-free operations. The abstract optimization operators for first and second derivatives are derived from first principles. Various methods from the Rapid Optimization Library (ROL) are tested on the viscoelastic inversion problem. The methods described herein are applied to compute the viscoelastic bulk and shear moduli of a foam block model, which was recently used in experimental testing for viscoelastic property characterization.
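For orientation, the reduced-space structure used in such PDE-constrained inversions is standard (generic form; the report derives the viscoelastic-specific operators): minimize J(u, m) subject to the frequency-domain state equation c(u, m) = 0. Eliminating the state gives Ĵ(m) = J(u(m), m) with gradient ∇Ĵ = ∂_m J + (∂_m c)* λ, where the adjoint state solves (∂_u c)* λ = -∂_u J; a Hessian-vector product requires one additional linearized-state solve and one second-order adjoint solve, so neither the gradient nor the Hessian is ever formed as an explicit matrix.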
The goal of the workshop and this report is to identify common themes and standardize concepts for locality-preserving abstractions for exascale programming models.
Under shock compression, most porous materials exhibit lower densities at a given pressure than a fully dense sample of the same material. However, some porous materials exhibit an anomalous, or enhanced, densification under shock compression. We demonstrate a molecular mechanism that drives this behavior. We also present evidence from atomistic simulation that silicon belongs to this anomalous class of materials. Atomistic simulations indicate that local shear strain in the neighborhood of collapsing pores nucleates a local solid-solid phase transformation even when bulk pressures are below the thermodynamic phase transformation pressure. This metastable, local, and partial solid-solid phase transformation, which accounts for the enhanced densification in silicon, is driven by the local stress state near the void, not equilibrium thermodynamics. This mechanism may also explain the phenomenon in other covalently bonded materials.
Adult neurogenesis in the hippocampus is a notable process due not only to its uniqueness and potential impact on cognition but also to its localized vertical integration of different scales of neuroscience, ranging from molecular and cellular biology to behavior. Our review summarizes the recent research regarding the process of adult neurogenesis from these different perspectives, with particular emphasis on the differentiation and development of new neurons, the regulation of the process by extrinsic and intrinsic factors, and their ultimate function in the hippocampus circuit. Arising from a local neural stem cell population, new neurons progress through several stages of maturation, ultimately integrating into the adult dentate gyrus network. Furthermore, the increased appreciation of the full neurogenesis process, from genes and cells to behavior and cognition, makes neurogenesis both a unique case study for how scales in neuroscience can link together and suggests neurogenesis as a potential target for therapeutic intervention for a number of disorders.