We present a shape-first approach to finding automobiles and trucks in overhead images and include results from our analysis of an image from the Overhead Imaging Research Dataset [1]. For the OIRDS, our shape-first approach traces candidate vehicle outlines by exploiting knowledge about an overhead image of a vehicle: a vehicle's outline fits into a rectangle, this rectangle is sized to allow vehicles to use local roads, and rectangles from two different vehicles are disjoint. Our shape-first approach can efficiently process high-resolution overhead imaging over wide areas to provide tips and cues for human analysts, or for subsequent automatic processing using machine learning or other analysis based on color, tone, pattern, texture, size, and/or location (shape first). In fact, computationally-intensive complex structural, syntactic, and statistical analysis may be possible when a shape-first work flow sends a list of specific tips and cues down a processing pipeline rather than sending the whole of wide area imaging information. This data flow may fit well when bandwidth is limited between computers delivering ad hoc image exploitation and an imaging sensor. As expected, our early computational experiments find that the shape-first processing stage appears to reliably detect rectangular shapes from vehicles. More intriguing is that our computational experiments with six-inch GSD OIRDS benchmark images show that the shape-first stage can be efficient, and that candidate vehicle locations corresponding to features that do not include vehicles are unlikely to trigger tips and cues. We found that stopping with just the shape-first list of candidate vehicle locations, and then solving a weighted, maximal independent vertex set problem to resolve conflicts among candidate vehicle locations, often correctly traces the vehicles in an OIRDS scene.
The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a user's manual for the DAKOTA software and provides capability overviews and procedures for software execution, as well as a variety of example studies.
The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a reference manual for the commands specification for the DAKOTA software, providing input overviews, option descriptions, and example specifications.
The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a developers manual for the DAKOTA software and describes the DAKOTA class hierarchies and their interrelationships. It derives directly from annotation of the actual source code and provides detailed class documentation, including all member functions and attributes.
The ubiquitous use of raw pointers in higher-level code is the primary cause of all memory usage problems and memory leaks in C++ programs. This paper describes what might be considered a radical approach to the problem, which is to encapsulate the use of all raw pointers and all raw calls to new and delete in higher-level C++ code. Instead, a set of cooperating template classes developed in the Trilinos package Teuchos is used to encapsulate every use of raw C++ pointers in every use case where it appears in high-level code. Included in the set of memory management classes is the typical reference-counted smart pointer class similar to boost::shared_ptr (and therefore C++0x std::shared_ptr). However, what is missing in boost and the new standard library are non-reference-counted classes for the remaining use cases where raw C++ pointers would need to be used. These classes have a debug build mode where nearly all programmer errors are caught and gracefully reported at runtime. The default optimized build mode strips all runtime checks and allows the code to perform as efficiently as raw C++ pointers with reasonable usage. Also included is a novel approach for dealing with the circular references problem that imparts little extra overhead and is almost completely invisible to most of the code (unlike the boost and therefore C++0x approach). Rather than being a radical approach, encapsulating all raw C++ pointers is simply the logical progression of a trend in the C++ development and standards community that started with std::auto_ptr and is continued (but not finished) with std::shared_ptr in C++0x. Using the Teuchos reference-counted memory management classes allows one to remove unnecessary constraints in the use of objects by removing arbitrary lifetime ordering constraints, which are a type of unnecessary coupling [23].
The code one writes with these classes will be more likely to be correct on first writing, will be less likely to contain silent (but deadly) memory usage errors, and will be much more robust to later refactoring and maintenance. The level of debug-mode runtime checking provided by the Teuchos memory management classes is stronger in many respects than what is provided by memory checking tools like Valgrind and Purify while being much less expensive. However, tools like Valgrind and Purify perform a number of types of checks (like usage of uninitialized memory) that make these tools very valuable, and they therefore complement the Teuchos memory management debug-mode runtime checking. The Teuchos memory management classes and idioms largely address the technical issues in resolving the fragile built-in C++ memory management model (with the exception of circular references, which have no easy solution but can be managed as discussed). All that remains is to teach these classes and idioms and expand their usage in C++ codes. The long-term viability of C++ as a usable and productive language depends on it. Otherwise, if C++ is no safer than C, then is the greater complexity of C++ worth what one gets as extra features? Given that C is smaller and easier to learn than C++ and since most programmers don't know object-orientation (or templates or X, Y, and Z features of C++) all that well anyway, then what really are most programmers getting extra out of C++ that would outweigh the extra complexity of C++ over C? C++ zealots will argue this point, but the reality is that C++ popularity has peaked and is becoming less popular while the popularity of C has remained fairly stable over the last decade. Idioms like those advocated in this paper can help to avert this trend, but it will require wide community buy-in and a change in the way C++ is taught in order to have the greatest impact. To make these programs more secure, compiler vendors or static analysis tools (e.g. Klocwork) could implement a preprocessor-like language similar to OpenMP that would allow the programmer to declare (in comments) that certain blocks of code should be 'pointer-free' or allow smaller blocks to be 'pointers allowed'. This would significantly improve the robustness of code that uses the memory management classes described here.
Achieving the next three orders of magnitude performance increase to move from petascale to exascale computing will require significant advancements in several fundamental areas. Recent studies have outlined many of the hardware and software challenges that will need to be addressed. In this paper, we examine these challenges with respect to high-performance networking. We describe the repercussions of anticipated changes to computing and networking hardware and discuss the impact that alternative parallel programming models will have on the network software stack. We also present some ideas on possible approaches that address some of these challenges.
We present a new model for closing a system of Lagrangian hydrodynamics equations for a two-material cell with a single velocity model. We describe a new approach that is motivated by earlier work of Delov and Sadchikov and of Goncharov and Yanilkin. Using a linearized Riemann problem to initialize volume fraction changes, we require that each material satisfy its own pdV equation, which breaks the overall energy balance in the mixed cell. To enforce this balance, we redistribute the energy discrepancy by assuming that the corresponding pressure change in each material is equal. This multiple-material model is packaged as part of a two-step time integration scheme. We compare results of our approach with other models and with corresponding pure-material calculations, on two-material test problems with ideal-gas or stiffened-gas equations of state.
We demonstrate a new semantic method for automatic analysis of wide-area, high-resolution overhead imagery to tip and cue human intelligence analysts to human activity. In the open demonstration, we find and trace cars and rooftops. Our methodology, extended to analysis of voxels, may be applicable to understanding morphology and to automatic tracing of neurons in large-scale, serial-section TEM datasets. We defined an algorithm and software implementation that efficiently finds all combinations of image blobs that satisfy given shape semantics, where image blobs are formed as a general-purpose, first step that 'oversegments' image pixels into blobs of similar pixels. We will demonstrate the remarkable power (ROC) of this combinatorial-based work flow for automatically tracing any automobiles in a scene by applying semantics that require a subset of image blobs to fill out a rectangular shape, with width and height in given intervals. In most applications we find that the new combinatorial-based work flow produces alternative (overlapping) tracings of possible objects (e.g. cars) in a scene. To force an estimation (tracing) of a consistent collection of objects (cars), a quick-and-simple greedy algorithm is often sufficient. We will demonstrate a more powerful resolution method: we produce a weighted graph from the conflicts in all of our enumerated hypotheses, and then solve a maximal independent vertex set problem on this graph to resolve conflicting hypotheses. This graph computation is almost certain to be necessary to adequately resolve multiple, conflicting neuron topologies into a set that is most consistent with a TEM dataset.
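The quick-and-simple greedy resolution mentioned above can be sketched as follows. This is only an illustration of a greedy heuristic for a weighted independent set on a conflict graph; it returns a maximal, not necessarily maximum-weight, set and is not the solver used in the work. The function name, the hypothesis labels, and the scores are all hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def greedy_weighted_independent_set(
    weights: Dict[Hashable, float],
    conflicts: Iterable[Tuple[Hashable, Hashable]],
) -> List[Hashable]:
    """Pick a maximal set of mutually non-conflicting hypotheses.

    Hypotheses are visited in order of decreasing weight; one is kept
    only if it conflicts with nothing kept so far.
    """
    # Build the adjacency sets of the conflict graph.
    neighbors: Dict[Hashable, Set[Hashable]] = {h: set() for h in weights}
    for u, v in conflicts:
        neighbors[u].add(v)
        neighbors[v].add(u)

    chosen: List[Hashable] = []
    blocked: Set[Hashable] = set()
    for h in sorted(weights, key=lambda name: weights[name], reverse=True):
        if h not in blocked:
            chosen.append(h)
            blocked.add(h)
            blocked.update(neighbors[h])  # rule out everything that overlaps h
    return chosen


# Hypothetical example: four candidate car tracings with made-up
# shape-fit scores; an edge means two tracings overlap in the image.
scores = {"car_a": 0.9, "car_b": 0.8, "car_c": 0.6, "car_d": 0.5}
overlaps = [("car_a", "car_b"), ("car_b", "car_c"), ("car_c", "car_d")]
selected = greedy_weighted_independent_set(scores, overlaps)
print(selected)
```

On larger conflict graphs the greedy choice can miss the best-scoring consistent set, which is presumably why an independent-set solve on the weighted graph is the stronger option.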
Los Alamos and Sandia National Laboratories have formed a new high performance computing center, the Alliance for Computing at the Extreme Scale (ACES). The two labs will jointly architect, develop, procure and operate capability systems for DOE's Advanced Simulation and Computing Program. This presentation will discuss a petascale production capability system, Cielo, that will be deployed in late 2010, and a new partnership with Cray on advanced interconnect technologies.
Finding the optimal (lightest, least expensive, etc.) design for an engineered component that meets or exceeds a specified level of reliability is a problem of obvious interest across a wide spectrum of engineering fields. Various methods for this reliability-based design optimization problem have been proposed. Unfortunately, this problem is rarely solved in practice because, regardless of the method used, solving the problem is too expensive or the final solution is too inaccurate to ensure that the reliability constraint is actually satisfied. This is especially true for engineering applications involving expensive, implicit, and possibly nonlinear performance functions (such as large finite element models). The Efficient Global Reliability Analysis method was recently introduced to improve both the accuracy and efficiency of reliability analysis for this type of performance function. This paper explores how this new reliability analysis method can be used in a design optimization context to create a method of sufficient accuracy and efficiency to enable the use of reliability-based design optimization as a practical design tool.
We consider the problem of placing sensors in a municipal water network when we can choose both the location of sensors and the sensitivity and specificity of the contamination warning system. Sensor stations in a municipal water distribution network continuously send sensor output information to a centralized computing facility, and event detection systems at the control center determine when to signal an anomaly worthy of response. Although most sensor placement research has assumed perfect anomaly detection, signal analysis software has parameters that control the tradeoff between false alarms and false negatives. We describe a nonlinear sensor placement formulation, which we heuristically optimize with a linear approximation that can be solved as a mixed-integer linear program. We report the results of initial experiments on a real network and discuss tradeoffs between early detection of contamination incidents, and control of false alarms.
We will discuss general mathematical ideas arising in the problems of laser beam shaping and splitting. We will be particularly concerned with questions concerning the scaling and symmetry of such systems.
A method is obtained for deriving peridynamic material models for a sequence of increasingly coarsened descriptions of a body. The starting point is a known detailed, small scale linearized state-based description. Each successively coarsened model excludes some of the material present in the previous model, and the length scale increases accordingly. This excluded material, while not present explicitly in the coarsened model, is nevertheless taken into account implicitly through its effect on the forces in the coarsened material. Numerical examples demonstrate that the method accurately reproduces the effective elastic properties of a composite as well as the effect of a small defect in a homogeneous medium.
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, as exhaustively as possible, the device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.
The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. Specific requirements include, among others, the ability to solve extremely large circuit problems by supporting large-scale parallel computing platforms, improved numerical performance, and object-oriented code design and implementation. The Xyce release notes describe: hardware and software requirements; new features and enhancements; any defects fixed since the last release; and current known defects and defect workarounds. For up-to-date information not available at the time these notes were produced, please visit the Xyce web page at http://www.cs.sandia.gov/xyce.
The noble gas xenon is a particularly interesting element. At standard pressure xenon is an fcc solid which melts at 161 K and then boils at 165 K, thus displaying a rather narrow liquid range on the phase diagram. On the other hand, under pressure the melting point is significantly higher: 3000 K at 30 GPa. Under shock compression, electronic excitations become important at 40 GPa. Finally, xenon forms stable molecules with fluorine (XeF2), suggesting that the electronic structure is significantly more complex than expected for a noble gas. With these reasons in mind, we studied the xenon Hugoniot using DFT/QMD and validated the simulations with multi-Mbar shock compression experiments. The results show that existing equation of state models lack fidelity and so we developed a wide-range free-energy based equation of state using experimental data and results from first-principles simulations.
Developing quantum chemistry programs on the coming generation of exascale computers will be a difficult task. The programs will need to be fault-tolerant and minimize the use of global operations. This work explores the use of a task-based model that uses a data-centric approach to allocate work to different processes as it applies to quantum chemistry. After introducing the key problems that appear when trying to parallelize a complicated quantum chemistry method such as coupled-cluster theory, we discuss the implications of that model as it pertains to the computational kernel of a coupled-cluster program - matrix multiplication. Also, we discuss the extensions that would be required to build a full coupled-cluster program using the task-based model. Current programming models for high-performance computing are fault-intolerant and use global operations. Those properties are unsustainable as computers scale to millions of CPUs; instead one must recognize that these systems will be hierarchical in structure and prone to constant faults, and that global operations will be infeasible. The FAST-OS HARE project is introducing a scale-free computing model to address these issues. This model is hierarchical and fault-tolerant by design, allows for the clean overlap of computation and communication (reducing the network load), does not require checkpointing, and avoids the complexity of many HPC runtimes. Development of an algorithm within this model requires a change in focus from imperative programming to a data-centric approach. Quantum chemistry (QC) algorithms, in particular electronic structure methods, are an ideal test bed for this computing model. These methods describe the distribution of electrons in a molecule, which determines the properties of the molecule. The computational cost of these methods is high, scaling quartically or higher in the size of the molecule, which is why QC applications are major users of HPC resources.
The complexity of these algorithms means that MPI alone is insufficient to achieve parallel scaling; QC developers have been forced to use alternative approaches to achieve scalability and would be receptive to radical shifts in the programming paradigm. Initial work in adapting the simplest QC method, Hartree-Fock, to this new programming model indicates that the approach is beneficial for QC applications. However, the advantages of being able to scale to exascale computers are greatest for the computationally most expensive algorithms; within QC these are the high-accuracy coupled-cluster (CC) methods. Parallel coupled-cluster programs are available; however, they are based on the conventional MPI paradigm. Much of the effort is spent handling the complicated data dependencies between the various processors, especially as the size of the problem becomes large. The current paradigm will not survive the move to exascale computers. Here we discuss the initial steps toward designing and implementing a CC method within this model. First, we introduce the general concepts behind a CC method, focusing on the aspects that make these methods difficult to parallelize with conventional techniques. Then we outline the computational core of the CC method - a matrix multiply - within the task-based approach that the FAST-OS project is designed to take advantage of. Finally, we outline the general setup to implement the simplest CC method in this model, linearized CC doubles (LinCC).
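To make the idea of a matrix multiply expressed as independent tasks concrete, here is a minimal sketch in which each output tile is one task that reads only the data it needs. It is an assumption-laden illustration: a Python thread pool stands in for a task runtime, there is no fault tolerance or hierarchy, and the names `tile_task` and `tiled_matmul` are hypothetical rather than part of the FAST-OS HARE model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Matrix = List[List[float]]


def tile_task(a: Matrix, b: Matrix, rows: range, cols: range) -> Tuple[range, range, Matrix]:
    """Compute one output tile C[rows, cols] from the matching rows of A and columns of B."""
    inner = len(b)
    tile = [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in cols] for i in rows]
    return rows, cols, tile


def tiled_matmul(a: Matrix, b: Matrix, tile: int = 2, workers: int = 4) -> Matrix:
    """Multiply A (n x p) by B (p x m) by submitting one task per output tile."""
    n, m = len(a), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tasks share no output data, so they can run in any order.
        futures = [
            pool.submit(tile_task, a, b, range(i, min(i + tile, n)), range(j, min(j + tile, m)))
            for i in range(0, n, tile)
            for j in range(0, m, tile)
        ]
        for fut in futures:
            rows, cols, block = fut.result()
            # Copy the finished tile into its place in C.
            for bi, i in enumerate(rows):
                for bj, j in enumerate(cols):
                    c[i][j] = block[bi][bj]
    return c


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
B = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
C = tiled_matmul(A, B)
print(C)
```

Because every tile is self-contained, a failed task could in principle be resubmitted without touching the others, which is the property a data-centric model aims to exploit.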
In this work, we developed a self-organizing map (SOM) technique for using web-based text analysis to forecast when a group is undergoing a phase change. By 'phase change', we mean that an organization has fundamentally shifted attitudes or behaviors. For instance, when ice melts into water, the characteristics of the substance change. A formerly peaceful group may suddenly adopt violence, or a violent organization may unexpectedly agree to a ceasefire. SOM techniques were used to analyze text obtained from organization postings on the world-wide web. Results suggest it may be possible to forecast phase changes, and determine if an example of writing can be attributed to a group of interest.
In order to provide large quantities of high-reliability disk-based storage, it has become necessary to aggregate disks into fault-tolerant groups based on the RAID methodology. Most RAID levels do provide some fault tolerance, but there are certain classes of applications that require increased levels of fault tolerance within an array. Some of these applications include embedded systems in harsh environments that have a low level of serviceability, or uninhabited data centers servicing cloud computing. When describing RAID reliability, the Mean Time To Data Loss (MTTDL) calculations will often assume that the time to replace a failed disk is relatively low, or even negligible compared to rebuild time. For platforms that are in remote areas collecting and processing data, it may be impossible to access the system to perform system maintenance for long periods. A disk may fail early in a platform's life, but not be replaceable for much longer than typical for RAID arrays. Service periods may be scheduled at intervals on the order of months, or the platform may not be serviced until the end of a mission in progress. Further, this platform may be subject to extreme conditions that can accelerate wear and tear on a disk, requiring even more protection from failures. We have created a high parity RAID implementation that uses a Graphics Processing Unit (GPU) to compute more than two blocks of parity information per stripe, allowing extra parity to eliminate or reduce the requirement for rebuilding data between service periods. While this type of controller is highly effective for RAID 6 systems, an important benefit is the ability to incorporate more parity into a RAID storage system. Such RAID levels, as yet unnamed, can tolerate the failure of three or more disks (depending on configuration) without data loss. 
While this RAID system certainly has applications in embedded systems running applications in the field, similar benefits can be obtained for servers that are engineered for storage density, with less regard for serviceability or maintainability. A storage brick can be designed to have a MTTDL that extends well beyond the useful lifetime of the hardware used, allowing the disk subsystem to require less service throughout the lifetime of a compute resource. This approach is similar to the Xiotech ISE. Such a design can be deliberately placed remotely (without frequent support) in order to provide colocation, or meet cost goals. For workloads where reliability is key, but conditions are sub-optimal for routine serviceability, a high-parity RAID can provide extra reliability in extraordinary situations. For example, for installations requiring very high Mean Time To Repair, the extra parity can eliminate certain problems with maintaining hot spares, increasing overall reliability. Furthermore, in situations where disk reliability is reduced because of harsh conditions, extra parity can guard against early data loss due to lowered Mean Time To Failure. If used through an iSCSI interface with a streaming workload, it is possible to gain all of these benefits without impacting performance.
In a recent acquisition by DOE/NNSA, several large capacity computing clusters called TLCC have been installed at the DOE labs: SNL, LANL and LLNL. The TLCC architecture, with ccNUMA, multi-socket, multi-core nodes and an InfiniBand interconnect, is representative of the trend in HPC architectures. This paper examines application performance on TLCC, contrasting it with Red Storm/Cray XT4. TLCC and Red Storm share similar AMD processors and memory DIMMs. Red Storm, however, has single-socket nodes and a custom interconnect. Micro-benchmarks and performance analysis tools help explain the causes of the observed performance differences. Control of processor and memory affinity on TLCC with the numactl utility is shown to result in significant performance gains and is essential to attenuate the detrimental impact of OS interference and cache-coherency overhead. While previous studies have investigated the impact of affinity control mostly in the context of small SMP systems, the focus of this paper is on highly parallel MPI applications.
There has been a concerted effort since 2007 to establish a dashboard of metrics for the Science, Technology, and Engineering (ST&E) work at Sandia National Laboratories. These metrics are to provide a self assessment mechanism for the ST&E Strategic Management Unit (SMU) to complement external expert review and advice and various internal self assessment processes. The data and analysis will help ST&E Managers plan, implement, and track strategies and work in order to support the critical success factors of nurturing core science and enabling laboratory missions. The purpose of this SAND report is to provide a guide for those who want to understand the ST&E SMU metrics process. This report provides an overview of why the ST&E SMU wants a dashboard of metrics, some background on metrics for ST&E programs from existing literature and past Sandia metrics efforts, a summary of work completed to date, specifics on the portfolio of metrics that have been chosen and the implementation process that has been followed, and plans for the coming year to improve the ST&E SMU metrics process.
This abstract explores the potential advantages of discontinuous Galerkin (DG) methods for the time-domain inversion of media parameters within the earth's interior. In particular, DG methods enable local polynomial refinement to better capture localized geological features within an area of interest while also allowing the use of unstructured meshes that can accurately capture discontinuous material interfaces. This abstract describes our initial findings when using DG methods combined with Runge-Kutta time integration and adjoint-based optimization algorithms for full-waveform inversion. Our initial results suggest that DG methods allow great flexibility in matching the media characteristics (faults, ocean bottom and salt structures) while also providing higher fidelity representations in target regions. Time-domain inversion using discontinuous Galerkin on unstructured meshes and with local polynomial refinement is shown to better capture localized geological features and accurately capture discontinuous-material interfaces. These approaches provide the ability to surgically refine representations in order to improve predicted models for specific geological features. Our future work will entail automated extensions to directly incorporate local refinement and adaptive unstructured meshes within the inversion process.
Importance sampling is an unbiased sampling method used to sample random variables from different densities than originally defined. These importance sampling densities are constructed to pick 'important' values of input random variables to improve the estimation of a statistical response of interest, such as a mean or probability of failure. Conceptually, importance sampling is very attractive: for example one wants to generate more samples in a failure region when estimating failure probabilities. In practice, however, importance sampling can be challenging to implement efficiently, especially in a general framework that will allow solutions for many classes of problems. We are interested in the promises and limitations of importance sampling as applied to computationally expensive finite element simulations which are treated as 'black-box' codes. In this paper, we present a customized importance sampler that is meant to be used after an initial set of Latin Hypercube samples has been taken, to help refine a failure probability estimate. The importance sampling densities are constructed based on kernel density estimators. We examine importance sampling with respect to two main questions: is importance sampling efficient and accurate for situations where we can only afford small numbers of samples? And does importance sampling require the use of surrogate methods to generate a sufficient number of samples so that the importance sampling process does increase the accuracy of the failure probability estimate? We present various case studies to address these questions.
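The basic reweighting idea can be illustrated on a toy problem with a known answer. The sketch below estimates P(X > 4) for a standard normal X by sampling from a normal density shifted into the failure region; this shifted density is an assumption chosen for simplicity and stands in for the kernel-density-based importance densities described above. The function name and parameters are hypothetical.

```python
import math
import random


def failure_probability_is(threshold: float, n_samples: int, seed: int = 0) -> float:
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Samples are drawn from the shifted density N(threshold, 1), so about
    half of them land in the failure region, and each failing sample is
    reweighted by the likelihood ratio p(x) / q(x).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Ratio of the N(0, 1) density to the N(threshold, 1) density at x.
            weight = math.exp(-threshold * x + 0.5 * threshold ** 2)
            total += weight
    return total / n_samples


# Closed-form tail probability, used here only to check the estimate.
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
estimate = failure_probability_is(threshold=4.0, n_samples=20000)
print(f"importance sampling: {estimate:.3e}, exact: {exact:.3e}")
```

Plain Monte Carlo with the same 20000 samples would be expected to see fewer than one failure on average, since the exact probability is on the order of 1e-5, which is the motivation for concentrating samples where failures occur.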
Extreme-scale parallel systems will require alternative methods for applications to maintain current levels of uninterrupted execution. Redundant computation is one approach to consider, if the benefits of increased resiliency outweigh the cost of consuming additional resources. We describe a transparent redundancy approach for MPI applications and detail two different implementations that provide the ability to tolerate a range of failure scenarios, including loss of application processes and connectivity. We compare these two approaches and show performance results from micro-benchmarks that bound worst-case message passing performance degradation. We propose several enhancements that could lower the overhead of providing resiliency through redundancy.
Arctic sea ice plays an important role in global climate by reflecting solar radiation and insulating the ocean from the atmosphere. Due to feedback effects, the Arctic sea ice cover is changing rapidly. To accurately model this change, high-resolution calculations must incorporate: (1) the annual cycle of growth and melt due to radiative forcing; (2) mechanical deformation due to surface winds, ocean currents and Coriolis forces; and (3) localized effects of leads and ridges. We have demonstrated a new mathematical algorithm for solving the sea ice governing equations using the material-point method (MPM) with an elastic-decohesive constitutive model. An initial comparison with the LANL CICE code indicates that the ice edge is sharper using MPM, but that many of the overall features are similar.
The problem of missing data is ubiquitous in domains such as biomedical signal processing, network traffic analysis, bibliometrics, social network analysis, chemometrics, computer vision, and communication networks, all domains in which data collection is subject to occasional errors. Moreover, these data sets can be quite large and have more than two axes of variation, e.g., sender, receiver, time. Many applications in those domains aim to capture the underlying latent structure of the data; in other words, they need to factorize data sets with missing entries. If we cannot address the problem of missing data, many important data sets will be discarded or improperly analyzed. Therefore, we need a robust and scalable approach for factorizing multi-way arrays (i.e., tensors) in the presence of missing data. We focus on one of the most well-known tensor factorizations, CANDECOMP/PARAFAC (CP), and formulate the CP model as a weighted least squares problem that models only the known entries. We develop an algorithm called CP-WOPT (CP Weighted OPTimization) using a first-order optimization approach to solve the weighted least squares problem. Based on extensive numerical experiments, our algorithm is shown to successfully factor tensors with noise and up to 70% missing data. Moreover, our approach is significantly faster than the leading alternative and scales to larger problems. To show the real-world usefulness of CP-WOPT, we illustrate its applicability on a novel EEG (electroencephalogram) application where missing data is frequently encountered due to disconnections of electrodes.
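The weighted least squares formulation in this abstract can be sketched for a three-way tensor as follows. This is an illustrative reading, not the authors' implementation. It assumes a binary weight tensor W (1 for known entries, 0 for missing ones) and factor matrices A, B, C with a shared number of columns; the function names are hypothetical. A first-order optimizer would consume the returned objective value and gradients.

```python
import numpy as np


def cp_full(A, B, C):
    """Assemble the full tensor from CP factor matrices (one column per component)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def weighted_cp_objective_and_grad(X, W, A, B, C):
    """Objective 0.5 * ||W * (X - [[A, B, C]])||^2 and its gradients.

    Only entries with a nonzero weight contribute, so missing entries
    (weight 0) are never modeled.
    """
    T = W * (X - cp_full(A, B, C))  # weighted residual
    f = 0.5 * np.sum(T**2)
    G = W * T  # equals T when W is binary
    grad_A = -np.einsum("ijk,jr,kr->ir", G, B, C)
    grad_B = -np.einsum("ijk,ir,kr->jr", G, A, C)
    grad_C = -np.einsum("ijk,ir,jr->kr", G, A, B)
    return f, grad_A, grad_B, grad_C
```

Because the residual is masked before any contraction, the cost of each evaluation in this dense sketch does not shrink with the amount of missing data; a sparse implementation that touches only the known entries would be needed to realize that saving.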
We examine several conducting spheres moving through a magnetic field gradient. An analytical approximation is derived and an experiment is conducted to verify the analytical solution. The experiment is simulated as well to produce a numerical result. Both the low and high magnetic Reynolds number regimes are studied. Deformation of the sphere is noted in the high Reynolds number case. It is suggested that this deformation effect could be useful for designing or enhancing present protection systems against space debris.
This talk discusses the unique demands that informatics applications, particularly graph-theoretic applications, place on computer systems. These applications tend to pose significant data movement challenges for conventional systems. Worse, underlying technology trends are moving computers to cost-driven optimization points that exacerbate the problem. The X-caliber architecture is an economically viable counter-example to conventional architectures based on the integration of innovative technologies that support the data movement requirements of large-scale informatics applications. This talk will discuss the technology drivers and architectural features of the platform, and present analysis showing the benefits for informatics applications, as well as our traditional science and engineering HPC applications.
There is considerable interest in achieving a 1000-fold increase in supercomputing power in the next decade, but the challenges are formidable. In this paper, the authors discuss some of the driving science and security applications that require Exascale computing (a million trillion operations per second). Key architectural challenges include power, memory, interconnection networks and resilience. The paper summarizes ongoing research aimed at overcoming these hurdles. Topics of interest are architecture-aware and scalable algorithms, system simulation, 3D integration, new approaches to system-directed resilience and new benchmarks. Although significant progress is being made, a broader international program is needed.
The image created in reflected light DIC can often be interpreted as a true three-dimensional representation of the surface geometry, provided a clear distinction can be realized between raised and lowered regions in the specimen. It may be helpful if our definition of saliency embraces work on the human visual system (HVS) as well as the more abstract work on saliency, as it is certain that understanding by humans will always stand between recording of a useful signal from all manner of sensors and so-called actionable intelligence. A DARPA/DSO program lays down this requirement in a current program (Kruse 2010): The vision for the Neurotechnology for Intelligence Analysts (NIA) Program is to revolutionize the way that analysts handle intelligence imagery, increasing both the throughput of imagery to the analyst and overall accuracy of the assessments. Current computer-based target detection capabilities cannot process vast volumes of imagery with the speed, flexibility, and precision of the human visual system.
The objective of this project is to investigate the complex fracture of ice and understand its role within larger ice sheet simulations and global climate change. At the present time, ice fracture is not explicitly considered within ice sheet models due in part to large computational costs associated with the accurate modeling of this complex phenomenon. However, fracture not only plays an extremely important role in regional behavior but also influences ice dynamics over much larger zones in ways that are currently not well understood. Dramatic illustrations of fracture-induced phenomena most notably include the recent collapse of ice shelves in Antarctica (e.g. partial collapse of the Wilkins shelf in March of 2008 and the diminishing extent of the Larsen B shelf from 1998 to 2002). Other fracture examples include ice calving (fracture of icebergs), which is presently approximated in simplistic ways within ice sheet models, and the draining of supraglacial lakes through a complex network of cracks, a so-called ice sheet plumbing system, that is believed to cause accelerated ice sheet flows due essentially to lubrication of the contact surface with the ground. These dramatic changes are emblematic of the ongoing change in the Earth's polar regions and highlight the important role of fracturing ice. To model ice fracture, a simulation capability will be designed centered around extended finite elements and solved by specialized multigrid methods on parallel computers. In addition, appropriate dynamic load balancing techniques will be employed to ensure an approximately equal amount of work for each processor.