The fundamental ideas of the high order compact method are combined with the generalized finite difference method. The result is a finite difference method that works on unstructured, nonuniform grids, and is more accurate than one would classically expect from the number of grid points employed.
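To make the generalized finite difference ingredient concrete (a sketch of the standard GFD construction; the specific high-order compact enhancement of this work is not reproduced here): at a node $x_0$ with scattered neighbors $x_1, \dots, x_m$, each neighbor value is expanded in a truncated Taylor series,

$$ u(x_j) \approx \sum_{|\alpha| \le p} \frac{(x_j - x_0)^{\alpha}}{\alpha!}\, D^{\alpha} u(x_0), \qquad j = 1, \dots, m, $$

and the derivatives $D^{\alpha} u(x_0)$ are recovered by a weighted least-squares fit over the stencil. High-order compact schemes then use the governing equation itself to eliminate leading truncation-error terms, which is how accuracy beyond the classical expectation for a given number of grid points is obtained.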
The neocortex is perhaps the highest region of the human brain, where auditory and visual perception takes place along with many important cognitive functions. An important research goal is to describe the mechanisms implemented by the neocortex. There is an apparent regularity in the structure of the neocortex [Brodmann 1909, Mountcastle 1957] which may help simplify this task. The work reported here addresses the problem of how to describe the putative repeated units ('cortical circuits') in a manner that is easily understood and manipulated, with the long-term goal of developing a mathematical and algorithmic description of their function. The approach is to reduce each algorithm to an enhanced perceptron-like structure and describe its computation using difference equations. We organize this algorithmic processing into larger structures based on physiological observations, and implement key modeling concepts in software that runs on parallel computing hardware.
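For concreteness, here is a minimal sketch of a perceptron-like unit whose computation is written as a difference equation. The specific update rule (a leaky integrator with a saturating output) and all names are illustrative assumptions, not the report's actual model:

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Illustrative discrete-time perceptron-like unit:
//   a[t+1] = (1 - lambda) * a[t] + lambda * (w . x[t])   (leaky integration)
//   y[t]   = 1 / (1 + exp(-a[t]))                        (saturating output)
// The leak rate and sigmoid are assumptions chosen for illustration only.
struct LeakyUnit {
    std::vector<double> w;   // input weights
    double a = 0.0;          // internal activation state
    double lambda = 0.2;     // leak/integration rate

    double step(const std::vector<double>& x) {
        double drive = 0.0;
        for (size_t i = 0; i < w.size(); ++i) drive += w[i] * x[i];
        a = (1.0 - lambda) * a + lambda * drive;  // the difference equation
        return 1.0 / (1.0 + std::exp(-a));        // instantaneous output
    }
};

int main() {
    LeakyUnit u{{0.5, -0.3, 1.1}};
    std::vector<double> x = {1.0, 0.0, 1.0};
    for (int t = 0; t < 5; ++t)
        std::printf("t=%d  y=%f\n", t, u.step(x));
}
```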
QMU stands for 'Quantification of Margins and Uncertainties'. QMU is a basic framework for consistency in integrating simulation, data, and/or subject-matter expertise to provide input into a risk-informed decision-making process. QMU is being applied to a wide range of NNSA stockpile issues, from performance to safety, and its implementation varies with lab and application focus. The Advanced Simulation and Computing (ASC) Program develops validated computational simulation tools to be applied in the context of QMU. The completeness aspect of QMU can benefit from the structured methodology and discipline of quantitative risk assessment (QRA)/probabilistic risk assessment (PRA). In characterizing uncertainties it is important to distinguish between those arising from incomplete knowledge ('epistemic' or systematic) and those arising from device-to-device variation ('aleatory' or random). The national security labs should investigate the utility of a probability of frequency (PoF) approach for presenting uncertainties in the stockpile. A QMU methodology is connected only if the interactions between failure modes are included. The design labs should continue to focus attention on quantifying epistemic uncertainties such as poorly modeled phenomena, numerical errors, coding errors, and systematic uncertainties in experiment. The NNSA and design labs should ensure that the certification plan for any RRW is supported by strong, timely peer review and by ongoing, transparent QMU-based documentation and analysis, in order to permit the confidence level necessary for eventual certification.
Training simulators have become increasingly popular tools for instructing humans on performance in complex environments. However, how to provide individualized and scenario-specific assessment and feedback to students remains largely an open question. To maximize training efficiency, new technologies are required that assist instructors in providing individually relevant instruction. Sandia National Laboratories has shown the feasibility of automated performance assessment tools, such as the Sandia-developed Automated Expert Modeling and Student Evaluation (AEMASE) software, through proof-of-concept demonstrations, a pilot study, and an experiment. In the pilot study, the AEMASE system, which automatically assesses student performance based on observed examples of good and bad performance in a given domain, achieved a high degree of agreement with a human grader (89%) in assessing tactical air engagement scenarios. In more recent work, we found that AEMASE achieved a high degree of agreement with human graders (83-99%) on three Navy E-2 domain-relevant performance metrics. The current study provides a rigorous empirical evaluation of the enhanced training effectiveness achievable with this technology. In particular, we assessed whether giving students feedback based on automated metrics would enhance training effectiveness and improve student performance. We trained two groups of employees (differentiated by type of feedback) on a Navy E-2 simulator and assessed their performance on three domain-specific performance metrics. We found that students given feedback via the AEMASE-based debrief tool performed significantly better than students given only instructor feedback on two of the three metrics. Future work will focus on extending these developments for automated assessment of teamwork.
The peridynamic model of solid mechanics treats internal forces within a continuum through interactions across finite distances. These forces are determined through a constitutive model that, in the case of an elastic material, permits the strain energy density at a point to depend on the collective deformation of all the material within some finite distance of it. The forces between points are evaluated from the Frechet derivative of this strain energy density with respect to the deformation map. The resulting equation of motion is an integro-differential equation written in terms of these interparticle forces, rather than the traditional stress tensor field. Recent work on peridynamics has elucidated the energy balance in the presence of these long-range forces. We have derived the appropriate analogue of stress power, called absorbed power, that leads to a satisfactory definition of internal energy. This internal energy is additive, allowing us to meaningfully define an internal energy density field in the body. An expression for the local first law of thermodynamics within peridynamics combines this mechanical component, the absorbed power, with heat transport. The global statement of the energy balance over a subregion can be expressed in a form in which the mechanical and thermal terms contain only interactions between the interior of the subregion and the exterior, in a form anticipated by Noll in 1955. The local form of this first law within peridynamics, coupled with the second law as expressed in the Clausius-Duhem inequality, is amenable to the Coleman-Noll procedure for deriving restrictions on the constitutive model for thermomechanical response. Using an idea suggested by Fried in the context of systems of discrete particles, this procedure leads to a dissipation inequality for peridynamics that has a surprising form. It also leads to a thermodynamically consistent way to treat damage within the theory, shedding light on how damage, including the nucleation and advance of cracks, should be incorporated into a constitutive model.
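For reference, the equation of motion described here is standard in the peridynamic literature. With $\mathcal{H}_x$ the neighborhood of $x$ within the interaction distance (the horizon) and $\underline{T}$ the force state obtained from the Fréchet derivative of the strain energy density, the state-based form reads

$$ \rho(x)\,\ddot{u}(x,t) = \int_{\mathcal{H}_x} \big\{ \underline{T}[x,t]\langle q - x \rangle - \underline{T}[q,t]\langle x - q \rangle \big\}\, dV_q + b(x,t), $$

in which the integral of pairwise force densities takes the place of the divergence of the stress tensor field in the classical theory.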
Data-Intensive Computing is parallel computing in which algorithms and software are designed around efficient access and traversal of a data set, and in which hardware requirements are dictated by data size as much as by desired run times; the typical task distills compact results from massive data.
This LDRD 149045 final report describes work that Sandians Scott A. Mitchell, Randall Laviolette, Shawn Martin, Warren Davis, Cindy Philips, and Danny Dunlavy performed in 2010. Prof. Afra Zomorodian provided insight. This was a small late-start LDRD. Several other ongoing efforts were leveraged, including the Networks Grand Challenge LDRD and the Computational Topology CSRF project; some of that leveraged work is described here. We proposed a sentence-mining technique that exploited both the distribution and the order of parts of speech (POS) in sentences of English-language documents. The ultimate goal was to discover 'call-to-action' framing documents hidden within a corpus of mostly expository documents, even if the documents were all on the same topic and used the same vocabulary. Using POS was novel, and we also took a novel approach to analyzing POS: we hypothesized that English follows a dynamical system in which the POS are trajectories from one state to another. We analyzed the sequences of POS using support vector machines and the cycles of POS using computational homology. We discovered that the POS were a very weak signal and did not support our hypothesis well; our original goal appeared to be unattainable with our original approach. We then turned our attention to an aspect of a more traditional approach to distinguishing documents. Latent Dirichlet Allocation (LDA) turns documents into bags of words and then into mixture-model points; a distance function is used to cluster groups of points to discover relatedness between documents. We performed a geometric and algebraic analysis of the most popular distance functions and made some significant and surprising discoveries, described in a separate technical report.
We implemented two numerical simulation capabilities essential to reliably predicting the effects of non-ideal explosives (NXs). As a first step toward treating the multiple, competing, multi-step reaction paths and slower kinetics of NXs, Sandia's CTH shock physics code was extended to include the TIGER thermochemical equilibrium solver as an in-line routine. To facilitate efficient exploration of the reaction pathways that need to be identified for the CTH simulations, we implemented in Sandia's LAMMPS molecular dynamics code the multi-scale shock technique (MSST), a reactive molecular dynamics method for simulating steady shock wave response. Our preliminary demonstrations of these two capabilities serve several purposes: (i) they demonstrate proof of principle for our approach; (ii) they illustrate the applicability of the new functionality; and (iii) they begin to characterize the use of the new functionality and identify where improvements will be needed for the ultimate capability to meet national security needs. Next steps are discussed.
The simplest conceptual model of cybersecurity implicitly views attackers and defenders as acting in isolation from one another: an attacker seeks to penetrate or disrupt a system that has been protected to a given level, while a defender attempts to thwart particular attacks. Such a model also views all non-malicious parties as having the same goal of preventing all attacks. But in fact, attackers and defenders are interacting parts of the same system, and different defenders have their own individual interests: defenders may be willing to accept some risk of successful attack if the cost of defense is too high. We have used game theory to develop models of how non-cooperative but non-malicious players in a network interact when there is a substantial cost associated with effective defensive measures. Although game theory has been applied in this area before, we have introduced some novel aspects of player behavior in our work, including: (1) A model of how players attempt to avoid the costs of defense and force others to assume these costs; (2) A model of how players interact when the cost of defending one node can be shared by other nodes; and (3) A model of the incentives for a defender to choose less expensive, but less effective, defensive actions.
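As a textbook-style illustration of the free-riding incentive behind models (1) and (2) (an illustrative example, not a model taken from this work): suppose either of two defenders can protect a shared resource at private cost $c$, and both suffer loss $L > c$ if neither defends. The normal-form payoffs (row player listed first) are

$$ \begin{array}{c|cc} & \text{Defend} & \text{Don't defend} \\ \hline \text{Defend} & (-c,\,-c) & (-c,\,0) \\ \text{Don't defend} & (0,\,-c) & (-L,\,-L) \end{array} $$

Since $0 < c < L$, the pure Nash equilibria are exactly the outcomes in which one player defends and the other does not: each player prefers that the other assume the cost of defense, which is the cost-shifting behavior these game-theoretic models formalize.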
This report describes the specification of a challenge problem and associated challenge milestones for the Waste Integrated Performance and Safety Codes (IPSC) supporting the U.S. Department of Energy (DOE) Office of Nuclear Energy Advanced Modeling and Simulation (NEAMS) Campaign. The NEAMS challenge problems are designed to demonstrate proof of concept and progress towards IPSC goals. The goal of the Waste IPSC is to develop an integrated suite of modeling and simulation capabilities to quantitatively assess the long-term performance of waste forms in the engineered and geologic environments of a radioactive waste storage or disposal system. The Waste IPSC will provide this simulation capability (1) for a range of disposal concepts, waste form types, engineered repository designs, and geologic settings, (2) for a range of time scales and distances, (3) with appropriate consideration of the inherent uncertainties, and (4) in accordance with robust verification, validation, and software quality requirements. To demonstrate proof of concept and progress towards these goals and requirements, a Waste IPSC challenge problem is specified that includes coupled thermal-hydrologic-chemical-mechanical (THCM) processes that describe (1) the degradation of a borosilicate glass waste form and the corresponding mobilization of radionuclides (i.e., the processes that produce the radionuclide source term), (2) the associated near-field physical and chemical environment for waste emplacement within a salt formation, and (3) radionuclide transport in the near field (i.e., through the engineered components - waste form, waste package, and backfill - and the immediately adjacent salt). The initial details of a set of challenge milestones that collectively comprise the full challenge problem are also specified.
Scientific applications use highly specialized data structures that require complex, latency-sensitive graphs of integer instructions for memory address calculations. Working with the University of Wisconsin, we have demonstrated significant differences between Sandia's applications and the industry-standard SPEC FP (Standard Performance Evaluation Corporation floating-point) suite. Specifically, integer dataflow performance is critical to overall system performance. To improve this performance, we have developed a configurable functional unit design that is capable of accelerating integer dataflow.
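To illustrate the kind of latency-sensitive integer address arithmetic at issue (a generic example of the pattern, not code from this study): a compressed-row sparse matrix-vector product spends much of its critical path in chained integer loads and index computations before any floating-point work can issue.

```cpp
#include <vector>

// Compressed row storage (CRS) sparse mat-vec: y = A*x.
// The integer work (loading row_ptr, loading col_idx, and forming the
// address of x[col_idx[k]]) is a dependent chain that gates the
// floating-point multiply-add, the behavior identified as critical above.
void crs_matvec(const std::vector<int>& row_ptr,
                const std::vector<int>& col_idx,
                const std::vector<double>& val,
                const std::vector<double>& x,
                std::vector<double>& y) {
    const int n = static_cast<int>(y.size());
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            sum += val[k] * x[col_idx[k]];  // indirect, integer-dependent load
        }
        y[i] = sum;
    }
}
```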
Diffusion, lossy wave, and Klein–Gordon equations find numerous applications in practical problems across a range of disciplines. The temporal dependence of all three Green's functions is characterized by an infinite tail. This implies that the cost of the spatio-temporal convolutions associated with evaluating the potentials scales as $O(N_s^2 N_t^2)$, where $N_s$ and $N_t$ are the number of spatial and temporal degrees of freedom, respectively. In this paper, we discuss two new methods to rapidly evaluate these spatio-temporal convolutions by exploiting their block-Toeplitz nature within the framework of accelerated Cartesian expansions (ACE). The first scheme identifies a convolution relation in time amongst ACE harmonics and uses the fast Fourier transform (FFT) for efficient evaluation of these convolutions. The second method exploits the rank deficiency of the ACE translation operators with respect to time and develops a recursive numerical compression scheme for the efficient representation and evaluation of temporal convolutions. It is shown that the cost of both methods scales as $O(N_s N_t \log^2 N_t)$. Furthermore, several numerical results are presented for the diffusion equation to validate the accuracy and efficacy of the fast algorithms developed here.
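The temporal structure being exploited can be written explicitly. After discretization, the potential at observer $m$ and time step $i$ takes the generic form

$$ \phi_m^i = \sum_{j=1}^{i} \sum_{n=1}^{N_s} G\big(\mathbf{r}_m - \mathbf{r}_n,\,(i-j)\,\Delta t\big)\, q_n^j, $$

a lower-triangular block-Toeplitz system in the time indices: the interaction blocks depend only on the difference $i - j$. This translation invariance in time is what allows either the FFT (first method) or a recursive low-rank compression of the translation operators (second method) to replace the naive $O(N_t^2)$ temporal work.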
The shallow water equations are used as a test for many atmospheric models because the solution mimics the horizontal aspects of atmospheric dynamics while the simplicity of the equations makes them useful for numerical experiments. This study describes a high-order element-based Galerkin method for the global shallow water equations using absolute vorticity, divergence, and fluid depth (atmospheric thickness) as the prognostic variables, while the wind field is a diagnostic variable that can be calculated from the stream function and velocity potential (the Laplacians of which are the vorticity and divergence, respectively). The numerical method employed to solve the shallow water system is based on the discontinuous Galerkin and spectral element methods. The discontinuous Galerkin method, which is inherently conservative, is used to solve the equations governing the two conservative variables, absolute vorticity and atmospheric thickness (mass). The spectral element method is used to solve the divergence equation and the Poisson equations for the velocity potential and the stream function. Time integration is done with an explicit strong-stability-preserving second-order Runge-Kutta scheme, with the wind field updated directly from the vorticity and divergence at each stage; the computational domain is the cubed sphere. A stable steady-state test is run and convergence results are provided, showing that the method is high-order accurate. Additionally, two tests without analytic solutions are run, with results comparable to previous high-resolution runs found in the literature.
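For reference, the vorticity-divergence form of the shallow water system used here is standard. With absolute vorticity $\eta$, divergence $\delta$, fluid depth $h$, kinetic energy $K = \tfrac{1}{2}|\mathbf{v}|^2$, and geopotential $\Phi = gh$ (over flat topography), the prognostic equations are

$$ \frac{\partial \eta}{\partial t} = -\nabla \cdot (\eta\,\mathbf{v}), \qquad \frac{\partial \delta}{\partial t} = \mathbf{k} \cdot \nabla \times (\eta\,\mathbf{v}) - \nabla^2 (K + \Phi), \qquad \frac{\partial h}{\partial t} = -\nabla \cdot (h\,\mathbf{v}), $$

and the diagnostic wind is recovered from $\mathbf{v} = \mathbf{k} \times \nabla \psi + \nabla \chi$ with $\nabla^2 \psi = \eta - f$ and $\nabla^2 \chi = \delta$. The vorticity and mass equations are in flux form, which is why they are assigned to the conservative discontinuous Galerkin discretization, while the divergence and Poisson equations are handled by the spectral element method.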
The outline of this presentation is: (1) High-level view of Zoltan; (2) Requirements, data models, and interface; (3) Load Balancing and Partitioning; (4) Matrix Ordering, Graph Coloring; (5) Utilities; (6) Isorropia; and (7) Zoltan2.
The objectives of this presentation are: (1) Learn how to partition a problem using Zoltan; (2) Understand the following: (a) Basic process of partitioning with Zoltan, (b) Setting Zoltan parameters, (c) Registering query functions, (d) Writing query functions, (e) Zoltan_LB_Partition and its input/output; and (3) Be able to integrate Zoltan into your own applications.
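A minimal sketch of that workflow using Zoltan's C API (the Mesh structure and its geometric query callbacks are illustrative assumptions; a real application registers callbacks matching its own data):

```cpp
#include <mpi.h>
#include <zoltan.h>

// Illustrative user data: points with 3-D coordinates owned by this rank.
struct Mesh { int num_local; ZOLTAN_ID_TYPE* gids; double* coords; };

// Query functions: Zoltan pulls problem data through these callbacks.
static int get_num_obj(void* data, int* ierr) {
    *ierr = ZOLTAN_OK;
    return static_cast<Mesh*>(data)->num_local;
}
static void get_obj_list(void* data, int num_gid, int num_lid,
                         ZOLTAN_ID_PTR gids, ZOLTAN_ID_PTR lids,
                         int wgt_dim, float* wgts, int* ierr) {
    Mesh* m = static_cast<Mesh*>(data);
    for (int i = 0; i < m->num_local; ++i) { gids[i] = m->gids[i]; lids[i] = i; }
    *ierr = ZOLTAN_OK;
}
static int get_num_geom(void* data, int* ierr) { *ierr = ZOLTAN_OK; return 3; }
static void get_geom(void* data, int num_gid, int num_lid,
                     ZOLTAN_ID_PTR gid, ZOLTAN_ID_PTR lid,
                     double* vec, int* ierr) {
    Mesh* m = static_cast<Mesh*>(data);
    int i = static_cast<int>(*lid);
    vec[0] = m->coords[3*i]; vec[1] = m->coords[3*i+1]; vec[2] = m->coords[3*i+2];
    *ierr = ZOLTAN_OK;
}

void partition(Mesh& mesh) {
    // (Zoltan_Initialize must have been called once at program startup.)
    struct Zoltan_Struct* zz = Zoltan_Create(MPI_COMM_WORLD);
    Zoltan_Set_Param(zz, "LB_METHOD", "RCB");       // geometric partitioner
    Zoltan_Set_Num_Obj_Fn(zz, get_num_obj, &mesh);  // register query functions
    Zoltan_Set_Obj_List_Fn(zz, get_obj_list, &mesh);
    Zoltan_Set_Num_Geom_Fn(zz, get_num_geom, &mesh);
    Zoltan_Set_Geom_Fn(zz, get_geom, &mesh);

    int changes, num_gid, num_lid, num_imp, num_exp;
    ZOLTAN_ID_PTR imp_g, imp_l, exp_g, exp_l;
    int *imp_proc, *imp_part, *exp_proc, *exp_part;
    Zoltan_LB_Partition(zz, &changes, &num_gid, &num_lid,
                        &num_imp, &imp_g, &imp_l, &imp_proc, &imp_part,
                        &num_exp, &exp_g, &exp_l, &exp_proc, &exp_part);
    /* ... migrate data according to the export lists ... */
    Zoltan_LB_Free_Part(&imp_g, &imp_l, &imp_proc, &imp_part);
    Zoltan_LB_Free_Part(&exp_g, &exp_l, &exp_proc, &exp_part);
    Zoltan_Destroy(&zz);
}
```

Zoltan_LB_Partition returns import/export lists describing which objects must move to realize the new partition; the application then migrates its data accordingly (or uses Zoltan's migration utilities to do so).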
We fabricated a split-gate-defined point contact in a double-gate enhancement-mode Si-MOS device, and implanted Sb donor atoms using a self-aligned process. E-beam lithography in combination with a timed implant gives us excellent control over the placement of dopant atoms, and acts as a stepping stone to focused ion beam implantation of single donors. Our approach allows us considerable latitude in experimental design in situ. We identified two resonance conditions in the point contact conductance as a function of split-gate voltage. Using tunneling spectroscopy, we probed their electronic structure as a function of temperature and magnetic field. We also determined the capacitive coupling between the resonant feature and several gates. Comparison between experimental values and extensive quasi-classical simulations constrains the location and energy of the resonant level. We discuss our results and how they may apply to resonant tunneling through a single donor.
The cubed sphere geometry, obtained by inscribing a cube in a sphere and mapping points between the two surfaces using a gnomonic (central) projection, is commonly used in atmospheric models because it is free of polar singularities and is well-suited for parallel computing. Global meshes on the cubed-sphere typically project uniform (square) grids from each face of the cube onto the sphere, and if refinement is desired then it is done with non-conforming meshes - overlaying the area of interest with a finer uniform mesh, which introduces so-called hanging nodes on edges along the boundary of the fine resolution area. An alternate technique is to tile each face of the cube with quadrilaterals without requiring the quads to be rectangular. These meshes allow for refinement in areas of interest with a conforming mesh, providing a smoother transition between high and low resolution portions of the grid than non-conforming refinement. The conforming meshes are demonstrated in HOMME, NCAR's High Order Method Modeling Environment, where two modifications have been made: the dependence on uniform meshes has been removed, and the ability to read arbitrary quadrilateral meshes from a previously-generated file has been added. Numerical results come from a conservative spectral element method modeling a selection of the standard shallow water test cases.
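For reference, the equiangular form of the gnomonic mapping is standard. On the face centered on the positive $x$-axis, with face coordinates $\alpha, \beta \in [-\pi/4, \pi/4]$ and sphere radius $R$,

$$ (X, Y) = (\tan \alpha,\ \tan \beta), \qquad (x, y, z) = \frac{R\,(1,\ X,\ Y)}{\sqrt{1 + X^2 + Y^2}}, $$

so great circles on the sphere map to straight lines on each face. The conforming-refinement meshes described here retain this projection but tile each face with general quadrilaterals rather than a uniform square grid.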
Periodic, coordinated checkpointing to disk is the most prevalent fault tolerance method used in modern large-scale, capability-class, high-performance computing (HPC) systems. Previous work has shown that as the system grows in size, the inherent synchronization of coordinated checkpoint/restart (CR) limits application scalability; at large node counts the application spends most of its time checkpointing instead of executing useful work. Furthermore, a single component failure forces an application restart from the last correct checkpoint. Suggested alternatives to coordinated CR include uncoordinated CR with message logging, redundant computation, and RAID-inspired, in-memory distributed checkpointing schemes. Each of these alternatives has differing overheads that depend on both the scale and the communication characteristics of the application. In this work, using the Structural Simulation Toolkit (SST) simulator, we compare the performance characteristics of each of these resilience methods for a number of HPC application patterns on a number of proposed exascale machines. The results of this work provide valuable guidance on the most efficient resilience methods for exascale systems.
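The scaling pressure on coordinated CR can be made concrete with the classical Young/Daly estimate (a standard first-order model, not a result of this paper). If writing a checkpoint costs $\delta$ seconds and the system mean time between failures is $M$, the overhead-minimizing checkpoint interval is approximately

$$ \tau_{\text{opt}} \approx \sqrt{2\,\delta M}, $$

so the fraction of time spent checkpointing grows like $\delta / \tau_{\text{opt}} = \sqrt{\delta / (2M)}$. Because $M$ shrinks roughly in proportion to component count while $\delta$ grows with aggregate memory, this fraction rises with machine size, which is the effect the alternative resilience methods aim to mitigate.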
Trilinos is an object-oriented software framework that enables the solution of large-scale, complex multiphysics engineering and scientific problems. Different Trilinos packages build on each other to create a stack providing the necessary capability: (1) Non-linear solver; (2) Linear solver/preconditioner; (3) Distributed linear algebra; and (4) Local linear algebra.
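A minimal sketch of one classic instantiation of layers (2) and (3) of this stack, using the Epetra distributed linear algebra package and the AztecOO Krylov solver (the 1-D Laplacian test problem and its parameters are illustrative assumptions):

```cpp
#include <mpi.h>
#include "Epetra_MpiComm.h"
#include "Epetra_Map.h"
#include "Epetra_CrsMatrix.h"
#include "Epetra_Vector.h"
#include "Epetra_LinearProblem.h"
#include "AztecOO.h"

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    Epetra_MpiComm comm(MPI_COMM_WORLD);

    const int N = 1000;                 // global problem size (assumed)
    Epetra_Map map(N, 0, comm);         // rows distributed across ranks
    Epetra_CrsMatrix A(Copy, map, 3);   // tridiagonal 1-D Laplacian

    double vals[3] = {-1.0, 2.0, -1.0};
    for (int i = 0; i < map.NumMyElements(); ++i) {
        int row = map.GID(i);
        int cols[3] = {row - 1, row, row + 1};
        if (row == 0)
            A.InsertGlobalValues(row, 2, &vals[1], &cols[1]);
        else if (row == N - 1)
            A.InsertGlobalValues(row, 2, &vals[0], &cols[0]);
        else
            A.InsertGlobalValues(row, 3, vals, cols);
    }
    A.FillComplete();

    Epetra_Vector x(map), b(map);
    b.PutScalar(1.0);

    // Layer (2): preconditioned Krylov solve on the distributed objects.
    Epetra_LinearProblem problem(&A, &x, &b);
    AztecOO solver(problem);
    solver.SetAztecOption(AZ_solver, AZ_cg);
    solver.SetAztecOption(AZ_precond, AZ_Jacobi);
    solver.Iterate(500, 1e-8);

    MPI_Finalize();
    return 0;
}
```

In a full multiphysics application, a nonlinear solver package such as NOX would sit on top of this linear stack, corresponding to layer (1).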
Enumerating triangles (3-cycles) in graphs is a kernel operation for social network analysis. For example, many community detection methods depend upon finding common neighbors of two related entities. We consider Cohen's simple and elegant solution for listing triangles: give each node a 'bucket.' Place each edge into the bucket of its endpoint of lowest degree, breaking ties consistently. Each node then checks each pair of edges in its bucket, testing for the adjacency that would complete a triangle. Cohen presents an informal argument that his algorithm should run well on real graphs. We formalize this argument by providing an analysis of the expected running time on a class of random graphs, including power law graphs. We consider a rigorously defined method for generating a random simple graph, the erased configuration model (ECM). In the ECM each node draws a degree independently from a marginal degree distribution, endpoints pair randomly, and we erase self-loops and multi-edges. If the marginal degree distribution has a finite second moment, it follows immediately that Cohen's algorithm runs in expected linear time. Furthermore, it can still run in expected linear time even when the degree distribution has such a heavy tail that the second moment is not finite. We prove that Cohen's algorithm runs in expected linear time when the marginal degree distribution has a finite $4/3$ moment and no vertex has degree larger than $\sqrt{n}$. In fact, we give the precise asymptotic value of the expected number of edge pairs per bucket. A finite $4/3$ moment is required: if it is unbounded, then so is the expected number of pairs. The marginal degree distribution of a power law graph has a bounded $4/3$ moment when its exponent $\alpha$ exceeds $7/3$. Thus for this class of power law graphs, with degree at most $\sqrt{n}$, Cohen's algorithm runs in expected linear time. This is precisely the value of $\alpha$ for which the clustering coefficient tends to zero asymptotically, and it is in the range that is relevant for the degree distribution of the World Wide Web.
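A minimal sketch of Cohen's bucketing algorithm as described above (an illustrative implementation; breaking degree ties by vertex ID is one consistent choice):

```cpp
#include <array>
#include <unordered_set>
#include <utility>
#include <vector>

// List all triangles in a simple undirected graph on vertices 0..n-1.
std::vector<std::array<int, 3>>
list_triangles(int n, const std::vector<std::pair<int, int>>& edges) {
    std::vector<int> deg(n, 0);
    for (auto& e : edges) { ++deg[e.first]; ++deg[e.second]; }

    // Adjacency as hash sets, for O(1) expected closing-edge tests.
    std::vector<std::unordered_set<int>> adj(n);
    for (auto& e : edges) {
        adj[e.first].insert(e.second);
        adj[e.second].insert(e.first);
    }

    // Consistent ordering: lower degree first, vertex ID breaks ties.
    auto lower = [&](int u, int v) {
        return deg[u] < deg[v] || (deg[u] == deg[v] && u < v);
    };

    // Place each edge in the bucket of its "lower" endpoint.
    std::vector<std::vector<int>> bucket(n);
    for (auto& e : edges) {
        int u = e.first, v = e.second;
        if (lower(u, v)) bucket[u].push_back(v);
        else             bucket[v].push_back(u);
    }

    // Check every pair of edges within a bucket for the closing edge.
    std::vector<std::array<int, 3>> tris;
    for (int u = 0; u < n; ++u)
        for (size_t i = 0; i < bucket[u].size(); ++i)
            for (size_t j = i + 1; j < bucket[u].size(); ++j) {
                int v = bucket[u][i], w = bucket[u][j];
                if (adj[v].count(w)) tris.push_back({u, v, w});
            }
    return tris;
}
```

Each triangle is reported exactly once, at the bucket of its lowest-ordered vertex, and the per-bucket pair checks are precisely the quantity whose expectation the analysis above bounds.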