This presentation included a discussion of challenges arising in parallel mesh management, as well as demonstrated solutions. They also described the broad range of software for mesh management and modification developed by the Interoperable Technologies for Advanced Petascale Simulations (ITAPS) team, and highlighted applications successfully using the ITAPS tool suite.
The Trilinos Project started approximately nine years ago as a small effort to enable research, development and ongoing support of small, related solver software efforts. The 'Tri' in Trilinos was intended to indicate the eventual three packages we planned to develop. In 2007 the project expanded its scope to include any package that was an enabling technology for technical computing. Presently the Trilinos repository contains over 55 packages covering a broad spectrum of reusable tools for constructing full-featured scalable scientific and engineering applications. Trilinos usage is now worldwide, and many applications have an explicit dependence on Trilinos for essential capabilities. Users come from other US laboratories, universities, industry and international research groups. Awareness and use of Trilinos is growing rapidly outside of Sandia. Members of the external research community are becoming more familiar with Trilinos, its design and collaborative nature. As a result, the Trilinos project is receiving an increasing number of requests from external community members who want to contribute to Trilinos as developers. To-date we have worked with external developers in an ad hoc fashion. Going forward, we want to develop a set of policies, procedures, tools and infrastructure to simplify interactions with external developers. As we go forward with multi-laboratory efforts such as CASL and X-Stack, and international projects such as IESP, we will need a more streamlined and explicit process for making external developers 'first-class citizens' in the Trilinos development community. This document is intended to frame the discussion for expanding the Trilinos community to all strategically important external members, while at the same time preserving Sandia's primary leadership role in the project.
This talk highlights some multigrid challenges that arise from several application areas including structural dynamics, fluid flow, and electromagnetics. A general framework is presented to help introduce and understand algebraic multigrid methods based on energy minimization concepts. Connections between algebraic multigrid prolongators and finite element basis functions are made to explored. It is shown how the general algebraic multigrid framework allows one to adapt multigrid ideas to a number of different situations. Examples are given corresponding to linear elasticity and specifically in the solution of linear systems associated with extended finite elements for fracture problems.
The most feasible way to disperse particles in a bulk material or control their packing at a substrate is through fluidization in a carrier that can be processed with well-known techniques such as spin, drip and spray coating, fiber drawing, and casting. The next stage in the processing is often solidification involving drying by solvent evaporation. While there has been significant progress in the past few years in developing discrete element numerical methods to model dense nanoparticle dispersion/suspension rheology which properly treat the hydrodynamic interactions of the solvent, these methods cannot at present account for the volume reduction of the suspension due to solvent evaporation. As part of LDRD project FY-101285 we have developed and implemented methods in the current suite of discrete element methods to remove solvent particles and volume, and hence solvent mass from the liquid/vapor interface of a suspension to account for volume reduction (solvent drying) effects. To validate the methods large scale molecular dynamics simulations have been carried out to follow the evaporation process at the microscopic scale.
Predictive Capability Maturity Model (PCMM) is a communication tool that must include a dicussion of the supporting evidence. PCMM is a tool for managing risk in the use of modeling and simulation. PCMM is in the service of organizing evidence to help tell the modeling and simulation (M&S) story. PCMM table describes what activities within each element are undertaken at each of the levels of maturity. Target levels of maturity can be established based on the intended application. The assessment is to inform what level has been achieved compared to the desired level, to help prioritize the VU activities & to allocate resources.
In this work, we developed a self-organizing map (SOM) technique for using web-based text analysis to forecast when a group is undergoing a phase change. By 'phase change', we mean that an organization has fundamentally shifted attitudes or behaviors. For instance, when ice melts into water, the characteristics of the substance change. A formerly peaceful group may suddenly adopt violence, or a violent organization may unexpectedly agree to a ceasefire. SOM techniques were used to analyze text obtained from organization postings on the world-wide web. Results suggest it may be possible to forecast phase changes, and determine if an example of writing can be attributed to a group of interest.
Imported oil exacerabates our trade deficit and funds anti-American regimes. Nuclear Energy (NE) is a demonstrated technology with high efficiency. NE's two biggest political detriments are possible accidents and nuclear waste disposal. For NE policy, proliferation is the biggest obstacle. Nuclear waste can be reduced through reprocessing, where fuel rods are separated into various streams, some of which can be reused in reactors. Current process developed in the 1950s is dirty and expensive, U/Pu separation is the most critical. Fuel rods are sheared and dissolved in acid to extract fissile material in a centrifugal contactor. Plants have many contacts in series with other separations. We have taken a science and simulation-based approach to develop a modern reprocessing plant. Models of reprocessing plants are needed to support nuclear materials accountancy, nonproliferation, plant design, and plant scale-up.
This report provides documentation for the completion of the Sandia portion of the ASC Level II Visualization on the platform milestone. This ASC Level II milestone is a joint milestone between Sandia National Laboratories and Los Alamos National Laboratories. This milestone contains functionality required for performing visualization directly on a supercomputing platform, which is necessary for peta-scale visualization. Sandia's contribution concerns in-situ visualization, running a visualization in tandem with a solver. Visualization and analysis of petascale data is limited by several factors which must be addressed as ACES delivers the Cielo platform. Two primary difficulties are: (1) Performance of interactive rendering, which is most computationally intensive portion of the visualization process. For terascale platforms, commodity clusters with graphics processors(GPUs) have been used for interactive rendering. For petascale platforms, visualization and rendering may be able to run efficiently on the supercomputer platform itself. (2) I/O bandwidth, which limits how much information can be written to disk. If we simply analyze the sparse information that is saved to disk we miss the opportunity to analyze the rich information produced every timestep by the simulation. For the first issue, we are pursuing in-situ analysis, in which simulations are coupled directly with analysis libraries at runtime. This milestone will evaluate the visualization and rendering performance of current and next generation supercomputers in contrast to GPU-based visualization clusters, and evaluate the performance of common analysis libraries coupled with the simulation that analyze and write data to disk during a running simulation. This milestone will explore, evaluate and advance the maturity level of these technologies and their applicability to problems of interest to the ASC program. Scientific simulation on parallel supercomputers is traditionally performed in four sequential steps: meshing, partitioning, solver, and visualization. Not all of these components are necessarily run on the supercomputer. In particular, the meshing and visualization typically happen on smaller but more interactive computing resources. However, the previous decade has seen a growth in both the need and ability to perform scalable parallel analysis, and this gives motivation for coupling the solver and visualization.
The peridynamic model of solid mechanics is a mathematical theory designed to provide consistent mathematical treatment of deformations involving discontinuities, especially cracks. Unlike the partial differential equations (PDEs) of the standard theory, the fundamental equations of the peridynamic theory remain applicable on singularities such as crack surfaces and tips. These basic relations are integro-differential equations that do not require the existence of spatial derivatives of the deformation, or even continuity of the deformation. In the peridynamic theory, material points in a continuous body separated from each other by finite distances can interact directly through force densities. The interaction between each pair of points is called a bond. The dependence of the force density in a bond on the deformation provides the constitutive model for a material. By allowing the force density in a bond to depend on the deformation of other nearby bonds, as well as its own deformation, a wide spectrum of material response can be modelled. Damage is included in the constitutive model through the irreversible breakage of bonds according to some criterion. This criterion determines the critical energy release rate for a peridynamic material. In this talk, we present a general discussion of the peridynamic method and recent progress in its application to penetration and fracture in nonlinearly elastic solids. Constitutive models are presented for rubbery materials, including damage evolution laws. The deformation near a crack tip is discussed and compared with results from the standard theory. Examples demonstrating the spontaneous nucleation and growth of cracks are presented. It is also shown how the method can be applied to anisotropic media, including fiber reinforced composites. Examples show prediction of impact damage in composites and comparison against experimental measurements of damage and delamination.
In a (future) quantum computer a single logical quantum bit (qubit) will be made of multiple physical qubits. These extra physical qubits implement mandatory extensive error checking. The efficiency of error correction will fundamentally influence the performance of a future quantum computer, both in latency/speed and in error threshold (the worst error tolerated for an individual gate). Executing this quantum error correction requires scheduling the individual operations subject to architectural constraints. Since our last talk on this subject, a team of researchers at Sandia National Labortories has designed a logical qubit architecture that considers all relevant architectural issues including layout, the effects of supporting classical electronics, and the types of gates that the underlying physical qubit implementation supports most naturally. This is a two-dimensional system where 2-qubit operations occur locally, so there is no need to calculate more complex qubit/information transportation. Using integer programming, we found a schedule of qubit operations that obeys the hardware constraints, implements the local-check code in the native gate set, and minimizes qubit idle periods. Even with an optimal schedule, however, parallel Monte Carlo simulation shows that there is no finite error probability for the native gates such that the error-correction system would be benecial. However, by adding dynamic decoupling, a series of timed pulses that can reverse some errors, we found that there may be a threshold. Thus finding optimal schedules for increasingly-refined scheduling problems has proven critical for the overall design of the logical qubit system. We describe the evolving scheduling problems and the ideas behind the integer programming-based solution methods. This talk assumes no prior knowledge of quantum computing.
Arctic sea ice is an important component of the global climate system and due to feedback effects the Arctic ice cover is changing rapidly. Predictive mathematical models are of paramount importance for accurate estimates of the future ice trajectory. However, the sea ice components of Global Climate Models (GCMs) vary significantly in their prediction of the future state of Arctic sea ice and have generally underestimated the rate of decline in minimum sea ice extent seen over the past thirty years. One of the contributing factors to this variability is the sensitivity of the sea ice to model physical parameters. A new sea ice model that has the potential to improve sea ice predictions incorporates an anisotropic elastic-decohesive rheology and dynamics solved using the material-point method (MPM), which combines Lagrangian particles for advection with a background grid for gradient computations. We evaluate the variability of the Los Alamos National Laboratory CICE code and the MPM sea ice code for a single year simulation of the Arctic basin using consistent ocean and atmospheric forcing. Sensitivities of ice volume, ice area, ice extent, root mean square (RMS) ice speed, central Arctic ice thickness, and central Arctic ice speed with respect to ten different dynamic and thermodynamic parameters are evaluated both individually and in combination using the Design Analysis Kit for Optimization and Terascale Applications (DAKOTA). We find similar responses for the two codes and some interesting seasonal variability in the strength of the parameters on the solution.
The two primary objectives of this LDRD project were to create a lightweight kernel (LWK) operating system(OS) designed to take maximum advantage of multi-core processors, and to leverage the virtualization capabilities in modern multi-core processors to create a more flexible and adaptable LWK environment. The most significant technical accomplishments of this project were the development of the Kitten lightweight kernel, the co-development of the SMARTMAP intra-node memory mapping technique, and the development and demonstration of a scalable virtualization environment for HPC. Each of these topics is presented in this report by the inclusion of a published or submitted research paper. The results of this project are being leveraged by several ongoing and new research projects.
MPI is the dominant programming model for distributed memory parallel computers, and is often used as the intra-node programming model on multi-core compute nodes. However, application developers are increasingly turning to hybrid models that use threading within a node and MPI between nodes. In contrast to MPI, most current threaded models do not require application developers to deal explicitly with data locality. With increasing core counts and deeper NUMA hierarchies seen in the upcoming LANL/SNL 'Cielo' capability supercomputer, data distribution poses an upper boundary on intra-node scalability within threaded applications. Data locality therefore has to be identified at runtime using static memory allocation policies such as first-touch or next-touch, or specified by the application user at launch time. We evaluate several existing techniques for managing data distribution using micro-benchmarks on an AMD 'Magny-Cours' system with 24 cores among 4 NUMA domains and argue for the adoption of a dynamic runtime system implemented at the kernel level, employing a novel page table replication scheme to gather per-NUMA domain memory access traces.
This report is a summary of the accomplishments of the 'Scalable Solutions for Processing and Searching Very Large Document Collections' LDRD, which ran from FY08 through FY10. Our goal was to investigate scalable text analysis; specifically, methods for information retrieval and visualization that could scale to extremely large document collections. Towards that end, we designed, implemented, and demonstrated a scalable framework for text analysis - ParaText - as a major project deliverable. Further, we demonstrated the benefits of using visual analysis in text analysis algorithm development, improved performance of heterogeneous ensemble models in data classification problems, and the advantages of information theoretic methods in user analysis and interpretation in cross language information retrieval. The project involved 5 members of the technical staff and 3 summer interns (including one who worked two summers). It resulted in a total of 14 publications, 3 new software libraries (2 open source and 1 internal to Sandia), several new end-user software applications, and over 20 presentations. Several follow-on projects have already begun or will start in FY11, with additional projects currently in proposal.
Statistical Latent Dirichlet Analysis produces mixture model data that are geometrically equivalent to points lying on a regular simplex in moderate to high dimensions. Numerous other statistical models and techniques also produce data in this geometric category, even though the meaning of the axes and coordinate values differs significantly. A distance function is used to further analyze these points, for example to cluster them. Several different distance functions are popular amongst statisticians; which distance function is chosen is usually driven by the historical preference of the application domain, information-theoretic considerations, or by the desirability of the clustering results. Relatively little consideration is usually given to how distance functions geometrically transform data, or the distances algebraic properties. Here we take a look at these issues, in the hope of providing complementary insight and inspiring further geometric thought. Several popular distances, {chi}{sup 2}, Jensen - Shannon divergence, and the square of the Hellinger distance, are shown to be nearly equivalent; in terms of functional forms after transformations, factorizations, and series expansions; and in terms of the shape and proximity of constant-value contours. This is somewhat surprising given that their original functional forms look quite different. Cosine similarity is the square of the Euclidean distance, and a similar geometric relationship is shown with Hellinger and another cosine. We suggest a geodesic variation of Hellinger. The square-root projection that arises in Hellinger distance is briefly compared to standard normalization for Euclidean distance. We include detailed derivations of some ratio and difference bounds for illustrative purposes. We provide some constructions that nearly achieve the worst-case ratios, relevant for contours.
While individual neurons function at relatively low firing rates, naturally-occurring nervous systems not only surpass manmade systems in computing power, but accomplish this feat using relatively little energy. It is asserted that the next major breakthrough in computing power will be achieved through application of neuromorphic approaches that mimic the mechanisms by which neural systems integrate and store massive quantities of data for real-time decision making. The proposed LDRD provides a conceptual foundation for SNL to make unique advances toward exascale computing. First, a team consisting of experts from the HPC, MESA, cognitive and biological sciences and nanotechnology domains will be coordinated to conduct an exercise with the outcome being a concept for applying neuromorphic computing to achieve exascale computing. It is anticipated that this concept will involve innovative extension and integration of SNL capabilities in MicroFab, material sciences, high-performance computing, and modeling and simulation of neural processes/systems.
This report summarizes activities undertaken during FY08-FY10 for the LDRD Peridynamics as a Rigorous Coarse-Graining of Atomistics for Multiscale Materials Design. The goal of our project was to develop a coarse-graining of finite temperature molecular dynamics (MD) that successfully transitions from statistical mechanics to continuum mechanics. The goal of our project is to develop a coarse-graining of finite temperature molecular dynamics (MD) that successfully transitions from statistical mechanics to continuum mechanics. Our coarse-graining overcomes the intrinsic limitation of coupling atomistics with classical continuum mechanics via the FEM (finite element method), SPH (smoothed particle hydrodynamics), or MPM (material point method); namely, that classical continuum mechanics assumes a local force interaction that is incompatible with the nonlocal force model of atomistic methods. Therefore FEM, SPH, and MPM inherit this limitation. This seemingly innocuous dichotomy has far reaching consequences; for example, classical continuum mechanics cannot resolve the short wavelength behavior associated with atomistics. Other consequences include spurious forces, invalid phonon dispersion relationships, and irreconcilable descriptions/treatments of temperature. We propose a statistically based coarse-graining of atomistics via peridynamics and so develop a first of a kind mesoscopic capability to enable consistent, thermodynamically sound, atomistic-to-continuum (AtC) multiscale material simulation. Peridynamics (PD) is a microcontinuum theory that assumes nonlocal forces for describing long-range material interaction. The force interactions occurring at finite distances are naturally accounted for in PD. Moreover, PDs nonlocal force model is entirely consistent with those used by atomistics methods, in stark contrast to classical continuum mechanics. Hence, PD can be employed for mesoscopic phenomena that are beyond the realms of classical continuum mechanics and atomistic simulations, e.g., molecular dynamics and density functional theory (DFT). The latter two atomistic techniques are handicapped by the onerous length and time scales associated with simulating mesoscopic materials. Simulating such mesoscopic materials is likely to require, and greatly benefit from multiscale simulations coupling DFT, MD, PD, and explicit transient dynamic finite element methods FEM (e.g., Presto). The proposed work fills the gap needed to enable multiscale materials simulations.
This report is a summary of the accomplishments of the 'Leveraging Multi-way Linkages on Heterogeneous Data' which ran from FY08 through FY10. The goal was to investigate scalable and robust methods for multi-way data analysis. We developed a new optimization-based method called CPOPT for fitting a particular type of tensor factorization to data; CPOPT was compared against existing methods and found to be more accurate than any faster method and faster than any equally accurate method. We extended this method to computing tensor factorizations for problems with incomplete data; our results show that you can recover scientifically meaningfully factorizations with large amounts of missing data (50% or more). The project has involved 5 members of the technical staff, 2 postdocs, and 1 summer intern. It has resulted in a total of 13 publications, 2 software releases, and over 30 presentations. Several follow-on projects have already begun, with more potential projects in development.