Publications Search

Weak scaling studies were performed for the explicit solid dynamics component of the ALEGRA code on two Cray supercomputer platforms during the period 2012-2015, involving a production-oriented hypervelocity impact problem. Results from these studies are presented, with analysis of the performance, scaling, and throughput of the code on these machines. The analysis demonstrates logarithmic scaling of the average CPU time per cycle up to core counts on the order of 10,000. At higher core counts, variable performance is observed, with significant upward excursions in compute time from the logarithmic trend. However, for core counts below 10,000, the results show a 3× improvement in simulation throughput and a 2× improvement in logarithmic scaling. This improvement is linked to improved memory performance on the Cray platforms, and to significant improvements made over this period to the data layout used by ALEGRA.
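As a rough illustration of what "logarithmic scaling" of the time per cycle means, the sketch below fits the model t = a + b ln(P) by ordinary least squares, where P is the core count. The function name `fit_log_scaling`, the model form, and the sample timings are assumptions made for illustration; they are not taken from the report.

```python
import math


def fit_log_scaling(cores, times):
    """Least-squares fit of t = a + b * ln(P); returns (a, b).

    Hypothetical helper: the report does not describe how its
    logarithmic trend was fitted.
    """
    if len(cores) != len(times) or len(cores) < 2:
        raise ValueError("need at least two (cores, time) pairs")
    xs = [math.log(p) for p in cores]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("core counts must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, times))
    b = sxy / sxx            # growth in time per unit of ln(P)
    a = mean_y - b * mean_x  # intercept
    return a, b


# Synthetic timings generated from the model itself (not measured data).
cores = [16, 128, 1024, 8192]
times = [2.0 + 0.5 * math.log(p) for p in cores]
a, b = fit_log_scaling(cores, times)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Under this model, a smaller slope b means the time per cycle grows more slowly as cores are added. Reading the abstract's "improvement in logarithmic scaling" as a reduction in this slope is one plausible interpretation, not something the abstract states.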

TYPE SAND Report YEAR 2015

DOI OSTI

Center for Computing Research Summer Research Proceedings 2015

Bradley, Andrew M.; Parks, Michael L.

The Center for Computing Research (CCR) at Sandia National Laboratories organizes a summer student program each summer, in coordination with the Computer Science Research Institute (CSRI) and Cyber Engineering Research Institute (CERI).

TYPE Other Report YEAR 2015

DOI OSTI

Time series discord detection in medical data using a parallel relational database

Proceedings - 2015 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2015

Woodbridge, Diane M.; Wilson, Andrew T.; Foulk, James W.; Goldstein, Richard H.

Recent advances in sensor technology have made continuous real-time health monitoring available in both hospital and non-hospital settings. Because high-frequency medical sensors generate a huge amount of data, storing and processing continuous medical data is an emerging big data area. In particular, detecting anomalies in real time is important for detecting and preventing patient emergencies. A time series discord is the subsequence that differs most from the rest of the time series subsequences, meaning that it has abnormal or unusual data trends. In this study, we implemented two versions of time series discord detection algorithms on a high performance parallel database management system (DBMS) and applied them to 240 Hz waveform data collected from 9,723 patients. The initial brute force version of the discord detection algorithm takes each possible subsequence and calculates its distance to the nearest non-self match to find the biggest discords in the time series. For the heuristic version of the algorithm, a combination of an array and a trie structure was applied to order the time series data and improve time efficiency. The study results showed efficient data loading, decoding, and discord searches over a large amount of data, benefiting from the time series discord detection algorithm and the architectural characteristics of the parallel DBMS, including data compression, data pipelining, and task scheduling.
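The brute force search described in the abstract can be sketched in a few lines. This is a minimal in-memory illustration, not the paper's parallel DBMS implementation. It assumes Euclidean distance on raw (non-normalized) windows and treats any overlapping window as a self match; the function name `brute_force_discord` and the sample series are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_discord(series, m):
    """Return (start_index, distance) of the top discord of length m.

    For every window, find the distance to its nearest non-self match
    (a window that does not overlap it), then keep the window whose
    nearest match is farthest away. Cost is O(n^2) distance evaluations.
    """
    n = len(series) - m + 1  # number of candidate windows
    if m < 1 or n < 2:
        raise ValueError("series too short for window length m")
    best_idx, best_dist = -1, -1.0
    for i in range(n):
        nearest = math.inf
        for j in range(n):
            if abs(i - j) < m:  # overlapping windows count as self matches
                continue
            d = euclidean(series[i:i + m], series[j:j + m])
            nearest = min(nearest, d)
        # Skip windows that have no non-overlapping partner at all.
        if nearest != math.inf and nearest > best_dist:
            best_idx, best_dist = i, nearest
    return best_idx, best_dist


# Synthetic periodic signal with one injected spike (not patient data).
signal = [0, 1, 0, 1, 0, 1, 0, 1, 5, 1, 0, 1, 0, 1, 0, 1]
idx, dist = brute_force_discord(signal, m=2)
print(idx, round(dist, 3))
```

The quadratic cost of this double loop is what motivates heuristic orderings such as the array-and-trie approach the abstract mentions, which aim to abandon unpromising candidates early.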

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

Applying uncertainty quantification and sensitivity analysis to large-scale hippocampal brain models [Poster]

Carlson, Kristofor D.; Aimone, James B.

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

Peridynamic Multiscale Finite Element Methods

Costa, Timothy; Bond, Stephen D.; Littlewood, David J.; Moore, Stan G.

The problem of computing quantum-accurate design-scale solutions to mechanics problems is rich with applications and serves as the background to modern multiscale science research. The problem can be broken into component problems comprised of communicating across adjacent scales, which when strung together create a pipeline for information to travel from quantum scales to design scales. Traditionally, this involves connections between a) quantum electronic structure calculations and molecular dynamics and between b) molecular dynamics and local partial differential equation models at the design scale. The second step, b), is particularly challenging since the appropriate scales of molecular dynamic and local partial differential equation models do not overlap. The peridynamic model for continuum mechanics provides an advantage in this endeavor, as the basic equations of peridynamics are valid at a wide range of scales limiting from the classical partial differential equation models valid at the design scale to the scale of molecular dynamics. In this work we focus on the development of multiscale finite element methods for the peridynamic model, in an effort to create a mathematically consistent channel for microscale information to travel from the upper limits of the molecular dynamics scale to the design scale. In particular, we first develop a Nonlocal Multiscale Finite Element Method which solves the peridynamic model at multiple scales to include microscale information at the coarse-scale. We then consider a method that solves a fine-scale peridynamic model to build element-support basis functions for a coarse-scale local partial differential equation model, called the Mixed Locality Multiscale Finite Element Method.
Given decades of research and development into finite element codes for the local partial differential equation models of continuum mechanics there is a strong desire to couple local and nonlocal models to leverage the speed and state of the art of local models with the flexibility and accuracy of the nonlocal peridynamic model. In the mixed locality method this coupling occurs across scales, so that the nonlocal model can be used to communicate material heterogeneity at scales inappropriate to local partial differential equation models. Additionally, the computational burden of the weak form of the peridynamic model is reduced dramatically by only requiring that the model be solved on local patches of the simulation domain which may be computed in parallel, taking advantage of the heterogeneous nature of next generation computing platforms. Additionally, we present a novel Galerkin framework, the 'Ambulant Galerkin Method', which represents a first step towards a unified mathematical analysis of local and nonlocal multiscale finite element methods, and whose future extension will allow the analysis of multiscale finite element methods that mix models across scales under certain assumptions of the consistency of those models.

TYPE SAND Report YEAR 2015

DOI OSTI

Construction of Mixed Formulations and Scalable Preconditioners for Magnetohydrodynamics

Cyr, Eric C.; Shadid, John N.; Phillips, Edward; Pawlowski, Roger; Seefeldt, Ben

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

Acceleration of Neural Algorithms using Nanoelectronic Resistive Memory Crossbars

Marinella, Matthew; Agarwal, Sapan; Hughart, David R.; Plimpton, Steven J.; Parekh, Ojas D.; Quach, Tu T.; Debenedictis, Erik; Goeke, Ronald S.; Hsia, Alexander W.; Aimone, James B.; James, Conrad D.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Heterogeneous Domain Decomposition methods for Nonlocal problems

D'Elia, Marta; Bochev, Pavel B.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Optimizing the Performance of Reactive Molecular Dynamics Simulations for Multi-core Architectures

IEEE Transactions on Parallel and Distributed Systems

Shan, Tzu-Ray; Aktulga, Hasan M.; Knight, Chris; Coffman, Paul; Jiang, Wei

Hybrid parallelism allows high performance computing applications to better leverage the increasing on-node parallelism of modern supercomputers. In this paper, we present a hybrid parallel implementation of the widely used LAMMPS/ReaxC package, where the construction of bonded and nonbonded lists and evaluation of complex ReaxFF interactions are implemented efficiently using OpenMP parallelism. Additionally, the performance of the QEq charge equilibration scheme is examined and a dual-solver is implemented. We present the performance of the resulting ReaxC-OMP package on a state-of-the-art multi-core architecture Mira, an IBM BlueGene/Q supercomputer. For system sizes ranging from 32 thousand to 16.6 million particles, speedups in the range of 1.5-4.5x are observed using the new ReaxC-OMP software. Sustained performance improvements have been observed for up to 262,144 cores (1,048,576 processes) of Mira with a weak scaling efficiency of 91.5% in larger simulations containing 16.6 million particles.
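For readers unfamiliar with the metric, weak scaling efficiency is conventionally the ratio of the baseline run time to the run time at scale when the work per core is held fixed. The sketch below assumes that standard definition. The function name and the timings are hypothetical, and the abstract does not state which baseline the authors used.

```python
def weak_scaling_efficiency(t_base, t_scaled):
    """Weak scaling efficiency: baseline time divided by time at scale.

    Both runs are assumed to use the same amount of work per core, so a
    perfectly scaling code would take the same time and score 1.0.
    """
    if t_base <= 0 or t_scaled <= 0:
        raise ValueError("timings must be positive")
    return t_base / t_scaled


# Hypothetical timings in seconds (not measurements from the paper).
efficiency = weak_scaling_efficiency(t_base=100.0, t_scaled=125.0)
print(f"{efficiency:.1%}")
```

On this definition, the reported 91.5% would mean the largest run took roughly 9% longer per step than the baseline at the same per-core workload.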

TYPE Journal Article YEAR 2015

OSTI