We present a new approach to the simulation of gravity-driven viscous fingering instabilities in porous media flow. These instabilities play a critical role during carbon sequestration in brine aquifers. Our approach is based on a nonlinear implementation of the discontinuous Galerkin method and possesses several key features. First, the method is inherently high order and is therefore well suited to studying unstable flow mechanisms. Second, it maintains high-order accuracy on completely unstructured meshes. The combination of these two features makes it an appealing strategy for simulating the challenging flow patterns and complex geometries of actual reservoirs and aquifers. This article includes an extensive set of verification studies on the stability and accuracy of the method, and also features a number of computations on unstructured grids and non-standard geometries.
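The abstract does not state the governing equations; as context, a commonly used model for gravity-driven miscible fingering in porous media (for example, dissolved CO2 convecting in brine) couples incompressible Darcy flow to solute transport. The equations and symbols below are this standard model, not necessarily the exact formulation discretized in the article.

```latex
% A commonly used model (an assumption here; the abstract does not state the
% governing equations): incompressible Darcy flow coupled to advection-diffusion
% of a solute concentration c, with fluid density rho depending on c.
\begin{align}
  \nabla \cdot \mathbf{u} &= 0, \\
  \mathbf{u} &= -\frac{k}{\mu}\left(\nabla p - \rho(c)\,\mathbf{g}\right), \\
  \phi\,\frac{\partial c}{\partial t} + \mathbf{u} \cdot \nabla c
    &= \nabla \cdot \left(\phi D\,\nabla c\right),
\end{align}
```

Here k is permeability, mu viscosity, phi porosity, and D diffusivity. Because rho(c) increases with dissolved CO2 concentration, denser fluid overlies lighter fluid and small perturbations grow into the fingers the high-order method is designed to resolve.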
Modeling geospatial information with semantic graphs enables search for sites of interest based on relationships between features, without requiring strong a priori models of feature shape or other intrinsic properties. Geospatial semantic graphs can be constructed from raw sensor data with suitable preprocessing to obtain a discretized representation. This report describes initial work toward extending geospatial semantic graphs to include temporal information, along with initial results from applying semantic graph techniques to SAR image data. We describe an efficient graph structure that includes geospatial and temporal information and is designed to support simultaneous spatial and temporal search queries. We also report a preliminary implementation of feature recognition, semantic graph modeling, and graph search based on input SAR data. The report concludes with lessons learned and suggestions for future improvements.
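As a minimal sketch of the idea, the structure below stores features with both a location and a validity interval and answers a combined spatial-temporal query. All names, fields, and the class layout are illustrative assumptions, not the report's actual implementation.

```python
# Minimal sketch (illustrative assumptions, not the report's design) of a
# geospatial-temporal semantic graph supporting combined spatial and
# temporal filtering over recognized features.
from dataclasses import dataclass, field

@dataclass
class Feature:
    fid: str
    label: str          # semantic class, e.g. "building", "road"
    centroid: tuple     # (x, y) in some projected coordinate system
    valid_time: tuple   # (t_start, t_end) when the feature was observed

@dataclass
class SemanticGraph:
    nodes: dict = field(default_factory=dict)   # fid -> Feature
    edges: list = field(default_factory=list)   # (fid_a, fid_b, relation)

    def add_feature(self, f: Feature):
        self.nodes[f.fid] = f

    def relate(self, a: str, b: str, relation: str):
        self.edges.append((a, b, relation))

    def query(self, bbox, time_window, label=None):
        """Return features inside bbox whose valid_time overlaps time_window."""
        (xmin, ymin, xmax, ymax), (t0, t1) = bbox, time_window
        return [f for f in self.nodes.values()
                if xmin <= f.centroid[0] <= xmax
                and ymin <= f.centroid[1] <= ymax
                and f.valid_time[0] <= t1 and t0 <= f.valid_time[1]
                and (label is None or f.label == label)]
```

In practice a spatial index (such as an R-tree) and an interval index over valid_time would replace the linear scan; the report's structure is presumably more sophisticated than this sketch.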
Mathematical modeling of anatomically constrained neural networks has provided significant insights into the response of networks to neurological disorders or injury. A logical extension of these models is to incorporate treatment regimens to investigate network responses to intervention. The addition of nascent neurons from stem cell precursors into damaged or diseased tissue has been used as a successful therapeutic tool in recent decades. Interestingly, models have been developed to examine the incorporation of new neurons into intact adult structures, particularly the dentate granule neurons of the hippocampus. These studies suggest that the unique properties of maturing neurons can impact circuit behavior in unanticipated ways. In this perspective, we first review the current status of models used to examine damaged CNS structures, with particular focus on cortical damage due to stroke. Second, we suggest that computational modeling of cell replacement therapies can be made feasible by implementing approaches taken by current models of adult neurogenesis. The development of such models is critical for generating hypotheses regarding transplant therapies and improving outcomes by tailoring transplants to desired effects.
Power and energy concerns are motivating chip manufacturers to consider future hybrid-core processor designs that combine a small number of traditional cores optimized for single-thread performance with a large number of simpler cores optimized for throughput performance. This trend is likely to impact the way compute resources for network protocol processing functions are allocated and managed. In particular, the performance of MPI match processing is critical to achieving high message throughput. In this paper, we analyze the ability of simple and more complex cores to perform MPI matching operations for various scenarios in order to gain insight into how MPI implementations for future hybrid-core processors should be designed.
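To make the matching operation concrete, the sketch below walks a posted-receive queue and matches an incoming message on (communicator, source, tag) with wildcard handling; this linear traversal is why matching cost depends so strongly on single-thread core performance. The code is an illustrative simplification, not taken from any MPI implementation.

```python
# Illustrative sketch of MPI receive-side match processing: an incoming
# message is matched against the posted-receive queue on (communicator,
# source, tag), honoring wildcards. Not any real MPI library's code.
from collections import deque

ANY_SOURCE = -1   # stands in for MPI_ANY_SOURCE
ANY_TAG = -1      # stands in for MPI_ANY_TAG

posted_receives = deque()   # each entry: dict with comm, source, tag, buffer

def try_match(comm, source, tag):
    """Scan the posted-receive queue in order; return the first match."""
    for recv in posted_receives:
        if recv["comm"] != comm:
            continue
        if recv["source"] not in (ANY_SOURCE, source):
            continue
        if recv["tag"] not in (ANY_TAG, tag):
            continue
        posted_receives.remove(recv)   # a matched receive leaves the queue
        return recv
    return None   # no match: the message joins the unexpected-message queue
```

Because the queue is searched in posting order, long queues amplify per-element latency, which is the scenario where a simple throughput-oriented core is most disadvantaged.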
Scalable parallel computing is essential for processing large scale-free (power-law) graphs. The distribution of data across processes becomes important on distributed-memory computers with thousands of cores. It has been shown that two-dimensional layouts (edge partitioning) can have significant advantages over traditional one-dimensional layouts. However, simple 2D block distribution does not exploit the structure of the graph, and more advanced 2D partitioning methods are too expensive for large graphs. We propose a new two-dimensional partitioning algorithm that combines graph partitioning with 2D block distribution. The computational cost of the algorithm is essentially the same as 1D graph partitioning. We study the performance of sparse matrix-vector multiplication (SpMV) for scale-free graphs from the web and social networks using several different partitioners and both 1D and 2D data layouts. We show that SpMV run time is reduced by exploiting the graph's structure. Contrary to popular belief, we observe that current graph and hypergraph partitioners often yield relatively good partitions on scale-free graphs. We demonstrate that our new 2D partitioning method consistently outperforms the other methods considered, for both SpMV and an eigensolver, on matrices with up to 1.6 billion nonzeros using up to 16,384 cores.
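The sketch below illustrates the general idea of combining a partitioner-produced vertex partition with a 2D process grid: each nonzero (i, j) is assigned to a grid process derived from the owners of i and j. The assignment rule, the stand-in partition, and all names are assumptions for illustration, not the paper's exact algorithm.

```python
# Simplified sketch of combining a graph partition with a 2D block layout
# for SpMV. The assignment rule below is an illustrative assumption, not
# the paper's algorithm.
def assign_nonzeros(edges, vertex_part, pr, pc):
    """Map each nonzero (i, j) to a process in a pr x pc grid, using a
    p-way vertex partition (p = pr * pc) from a graph partitioner."""
    owner = {}
    for (i, j) in edges:
        grid_row = vertex_part[i] % pr    # row block follows the owner of i
        grid_col = vertex_part[j] % pc    # column block follows the owner of j
        owner[(i, j)] = grid_row * pc + grid_col
    return owner

# Toy example: 6 vertices, 4 processes in a 2 x 2 grid.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
vertex_part = [0, 0, 1, 1, 2, 3]          # stand-in for a partitioner's output
print(assign_nonzeros(edges, vertex_part, pr=2, pc=2))
```

The point of such a scheme is that communication for SpMV is confined to process rows and columns (as in a 2D block layout) while the partitioner keeps tightly connected vertices together, at roughly the cost of a single 1D partitioning pass.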
Many of the Performance Shaping Factors (PSFs) used in Human Reliability Analysis (HRA) methods are not directly measurable or observable. Methods like SPAR-H require the analyst to assign values for all of the PSFs, regardless of whether a PSF is observable; this introduces subjectivity into the human error probability (HEP) calculation. One way to reduce the subjectivity of HRA estimates is to formally incorporate information about the probability of the PSFs into the methodology for calculating the HEP. This can be accomplished by encoding prior information in a Bayesian Network (BN) and updating the network with available observations. We translated an existing HRA methodology, SPAR-H, into a Bayesian Network to demonstrate the usefulness of the BN framework, focusing on the ability to incorporate prior information about PSF probabilities into the HRA process. This paper discusses how we produced the model by combining information from two sources, and how the BN model can be used to estimate HEPs despite missing observations. Use of the prior information allows HRA analysts to estimate HEPs from partial information, and to rely on the prior information (from data or the cognitive literature) when they are unable to determine the state of a particular PSF. The SPAR-H BN model is a starting point for future research to create a more robust HRA BN model using data from multiple sources.
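As a minimal sketch of the "marginalize unobserved PSFs, condition on observed ones" idea, the code below multiplies a nominal HEP by PSF multipliers, using the prior expectation of the multiplier whenever a PSF's state is missing. The PSF names, prior probabilities, and multiplier values are illustrative assumptions, not the report's calibrated model.

```python
# Minimal sketch (illustrative assumptions, not the report's actual BN) of
# folding prior PSF information into a SPAR-H-style HEP estimate: observed
# PSFs are conditioned on, unobserved PSFs are marginalized over their prior.
NOMINAL_HEP = 1e-3   # illustrative nominal human error probability

psfs = {
    "stress": {
        "prior": {"nominal": 0.7, "high": 0.25, "extreme": 0.05},
        "multiplier": {"nominal": 1.0, "high": 2.0, "extreme": 5.0},
    },
    "time_available": {
        "prior": {"nominal": 0.8, "barely_adequate": 0.2},
        "multiplier": {"nominal": 1.0, "barely_adequate": 10.0},
    },
}

def estimate_hep(observations):
    """observations: dict mapping PSF name -> observed level (may be partial)."""
    hep = NOMINAL_HEP
    for name, psf in psfs.items():
        if name in observations:                       # PSF state was observed
            hep *= psf["multiplier"][observations[name]]
        else:                                          # marginalize over the prior
            hep *= sum(p * psf["multiplier"][lvl]
                       for lvl, p in psf["prior"].items())
    return hep

print(estimate_hep({"stress": "high"}))   # time_available left unobserved
```

A full BN additionally captures dependencies among PSFs and supports updating the priors themselves from data, but the calculation above shows why partial observations still yield a usable HEP.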
The rise in node-level parallelism has increased interest in task-based parallel runtimes for a wide array of application areas. Applications exhibit a wide variety of task spawning patterns, which frequently change during execution depending on the algorithm or solver kernel in use. Task scheduling and load-balancing regimes, however, are often highly optimized for specific patterns. This paper uses four basic task spawning patterns to quantify the impact of specific scheduling policy decisions on execution time. We compare the behavior of six publicly available tasking runtimes: Intel Cilk, Intel Threading Building Blocks (TBB), Intel OpenMP, GCC OpenMP, Qthreads, and High Performance ParalleX (HPX). With the exception of Qthreads, the runtimes prove to have schedulers that are highly sensitive to application structure. No runtime provides the best performance in all cases, and those that provide the best performance in some cases unfortunately perform extremely poorly when the application structure does not match the scheduler's assumptions.
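To illustrate why spawning structure matters to a scheduler, the sketch below contrasts two common patterns: a flat spawn, where one producer submits all tasks up front, and a tree spawn, where tasks arrive through recursive subdivision. The pattern names and workload are assumptions, not the paper's benchmarks, and Python's thread pool merely stands in for a tasking runtime such as Cilk or TBB, which additionally let running tasks spawn tasks and steal work.

```python
# Illustrative sketch (not the paper's benchmark code) of two basic task
# spawning patterns whose scheduling behavior differs.
from concurrent.futures import ThreadPoolExecutor

def work(n):
    return sum(i * i for i in range(n))

def flat_spawn(pool, ntasks):
    # One producer submits every task up front: the scheduler sees a single
    # long queue of independent, ready tasks.
    return [pool.submit(work, 10_000) for _ in range(ntasks)]

def tree_spawn(pool, ntasks):
    # Tasks are created by recursive subdivision, so parallelism becomes
    # available gradually; this stresses load balancing very differently.
    if ntasks == 1:
        return [pool.submit(work, 10_000)]
    half = ntasks // 2
    return tree_spawn(pool, half) + tree_spawn(pool, ntasks - half)

with ThreadPoolExecutor(max_workers=4) as pool:
    for spawn in (flat_spawn, tree_spawn):
        total = sum(f.result() for f in spawn(pool, 64))
        print(spawn.__name__, total)
```

A scheduler tuned for one of these shapes (for example, a single shared queue versus per-worker deques with stealing) can behave very differently when handed the other, which is the sensitivity the paper quantifies.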
Sandia's parallel circuit simulator, Xyce, can address large-scale neuron simulations in a new way, extending the range over which one can perform high-fidelity, multi-compartment neuron simulations. This report documents the implementation of neuron devices in Xyce and their use in the simulation and analysis of neuron systems.
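The sketch below conveys the underlying idea of a "neuron device": a membrane compartment behaves like an RC circuit with a leak conductance, so it can be treated as a device in a circuit simulator. The toy model, its integration scheme, and all parameter values are illustrative assumptions, not Xyce's actual device implementation.

```python
# Minimal sketch of a leaky membrane compartment as an RC circuit element.
# Illustrative assumptions only; not Xyce's neuron device code.
C_M = 1.0       # membrane capacitance (uF/cm^2)
G_LEAK = 0.1    # leak conductance (mS/cm^2)
E_LEAK = -65.0  # leak reversal potential (mV)

def simulate(i_ext, dt=0.01, t_end=50.0):
    """Forward-Euler integration of C dV/dt = -g_L (V - E_L) + I_ext."""
    v, trace = E_LEAK, []
    for _ in range(int(t_end / dt)):
        dv = (-G_LEAK * (v - E_LEAK) + i_ext) / C_M
        v += dt * dv
        trace.append(v)
    return trace

print(simulate(i_ext=1.0)[-1])   # membrane settles near E_LEAK + I_ext / G_LEAK
```

A multi-compartment neuron couples many such compartments through axial resistances, which is exactly the kind of large coupled circuit a parallel circuit simulator is built to solve.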
This report presents a specification for the Portals 4.0 network programming interface. Portals 4.0 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system, and is well suited to massively parallel processing and embedded systems. Portals 4.0 represents an adaptation of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine. Sandia's Cplant cluster project motivated the development of Version 3.0, which was later extended to Version 3.3 as part of the Cray Red Storm machine and XT line. Version 4.0 targets the next generation of machines employing advanced network interface architectures that support enhanced offload capabilities.