Resource Monitoring and Management with OVIS to Enable HPC in Cloud Computing Environments
This document describes how to obtain, install, use, and enjoy a better life with OVIS version 2.0. The OVIS project targets scalable, real-time analysis of very large data sets. We characterize the behaviors of elements and aggregations of elements (e.g., across space and time) in data sets in order to detect anomalous behaviors. We are particularly interested in determining anomalous behaviors that can serve as advance indicators of significant events, so that notification can be made or action taken. The OVIS open source tool (BSD license) is available for download at ovis.ca.sandia.gov.

While we intend for it to support a variety of application domains, the OVIS tool was initially developed for, and continues to be primarily tuned for, the investigation of High Performance Computing (HPC) cluster system health. In this application it is intended to be both a system administrator tool for monitoring and a system engineer tool for exploring the system state in depth. OVIS 2.0 provides a variety of statistical tools for examining the behavior of elements in a cluster (e.g., nodes, racks) and associated resources (e.g., storage appliances and network switches). It calculates and reports model values and outliers relative to those models. Additionally, it provides an interactive 3D physical view in which the cluster elements can be colored by raw element values (e.g., temperatures, memory errors) or by the comparison of those values to a given model. Together, the analysis tools and the visual display allow the user to easily identify abnormal or outlier behaviors.

The OVIS project envisions the tool, when applied to compute cluster monitoring, being used in conjunction with the scheduler or resource manager to enable intelligent resource utilization. For example, nodes deemed less healthy (that is, nodes exhibiting outlier behavior in some variable, or set of variables, that has been shown to correlate with future failure) can be discovered and assigned to shorter-duration or less important jobs. Further, applications with fault-tolerant capabilities can invoke those mechanisms on demand, upon notification that a node is exhibiting impending-failure conditions, rather than performing them (e.g., checkpointing) unnecessarily at regular intervals.
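Although OVIS's statistical engines are far richer, the core idea of comparing raw element values against a fitted model can be sketched in a few lines. The following C++ fragment is an illustration only, not OVIS code: the names and the plain mean/standard-deviation model are invented for exposition.

    #include <cmath>
    #include <string>
    #include <vector>

    // One monitored cluster element, e.g., a node with a temperature reading.
    struct Element
    {
      std::string name;
      double value;
    };

    // Flag elements whose value deviates from a fitted mean/sigma model
    // by more than `threshold` standard deviations.
    std::vector<std::string> flagOutliers(const std::vector<Element>& elems,
                                          double threshold)
    {
      std::vector<std::string> outliers;
      if (elems.empty())
        return outliers;

      // Fit the model: sample mean and standard deviation over all elements.
      double mean = 0.0;
      for (const Element& e : elems)
        mean += e.value;
      mean /= elems.size();

      double var = 0.0;
      for (const Element& e : elems)
        var += (e.value - mean) * (e.value - mean);
      const double sigma = std::sqrt(var / elems.size());

      // Compare each element against the model.
      for (const Element& e : elems)
        if (sigma > 0.0 && std::fabs(e.value - mean) / sigma > threshold)
          outliers.push_back(e.name);
      return outliers;
    }

In OVIS itself the models can be conditioned on space and time, and the flagged elements would feed the notification, visualization, and scheduler integration described above.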
This is a progress report on polynomial system solving for statistical modeling. This quarter we developed our first model of shock response data, as well as an algorithm for identifying, in polynomial time, the chamber cone containing a polynomial system in n variables with n+k terms; this is a significant improvement over previous algorithms, all of which have exponential worst-case complexity. We have implemented and verified the chamber cone algorithm for the case n+3 and are working to extend the implementation to handle arbitrary k. Later sections of this report explain chamber cones in more detail; the next section provides an overview of the project and of how the current progress fits into it.
In this report, we present the novel functionality of parallel, edge-based tetrahedral mesh refinement that we have implemented in MOAB. The theoretical basis for this work is contained in [PT04, PT05, TP06], while information on design, performance, and operation specific to MOAB is contained herein. Because MOAB is intended mainly for use in pre-processing and simulation (as opposed to the post-processing bent of the previous papers), the primary use case is different: rather than refining elements with non-linear basis functions, the goal is to increase the number of degrees of freedom in some region in order to more accurately represent the solution to some system of equations that cannot be solved analytically. MOAB's unique mesh representation also impacts the algorithm. This introduction contains a brief review of streaming edge-based tetrahedral refinement; a sketch of the underlying subdivision appears below. The remainder of the report is broken into three sections: design and implementation, performance, and conclusions. Appendix A contains instructions for end users (simulation authors) on how to employ the refiner.
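As background, one widely used edge-based scheme is uniform 1:8 ("red") subdivision, in which every edge midpoint becomes a new vertex. The C++ sketch below is a minimal illustration under our own naming, not MOAB's implementation; in particular, the choice of the m02-m13 octahedron diagonal is ours and affects element quality.

    #include <array>
    #include <map>
    #include <utility>
    #include <vector>

    using Point = std::array<double, 3>;
    using Tet = std::array<int, 4>; // vertex indices into a coordinate list

    // Midpoint of an edge, cached so that elements sharing an edge reuse the
    // same new vertex; this is what keeps edge-based refinement conforming.
    static int edgeMidpoint(int a, int b, std::vector<Point>& pts,
                            std::map<std::pair<int, int>, int>& cache)
    {
      std::pair<int, int> key = std::minmax(a, b);
      auto it = cache.find(key);
      if (it != cache.end())
        return it->second;
      const Point pa = pts[a];
      const Point pb = pts[b];
      pts.push_back({ 0.5 * (pa[0] + pb[0]),
                      0.5 * (pa[1] + pb[1]),
                      0.5 * (pa[2] + pb[2]) });
      const int idx = static_cast<int>(pts.size()) - 1;
      cache[key] = idx;
      return idx;
    }

    // Uniform 1:8 subdivision of one tetrahedron: four corner children plus
    // four children obtained by splitting the interior octahedron along one
    // of its three diagonals (here m02-m13).
    std::vector<Tet> refineTet(const Tet& t, std::vector<Point>& pts,
                               std::map<std::pair<int, int>, int>& cache)
    {
      const int m01 = edgeMidpoint(t[0], t[1], pts, cache);
      const int m02 = edgeMidpoint(t[0], t[2], pts, cache);
      const int m03 = edgeMidpoint(t[0], t[3], pts, cache);
      const int m12 = edgeMidpoint(t[1], t[2], pts, cache);
      const int m13 = edgeMidpoint(t[1], t[3], pts, cache);
      const int m23 = edgeMidpoint(t[2], t[3], pts, cache);
      return {
        { t[0], m01, m02, m03 }, { t[1], m01, m12, m13 }, // corner children
        { t[2], m02, m12, m23 }, { t[3], m03, m13, m23 },
        { m02, m13, m01, m12 }, { m02, m13, m12, m23 },   // octahedron children
        { m02, m13, m23, m03 }, { m02, m13, m03, m01 },
      };
    }

Orientation and element-quality bookkeeping, and the streaming and parallel aspects that are the report's actual subject, are deliberately omitted.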
This report summarizes the existing statistical engines in VTK/Titan and presents the parallel versions thereof that have already been implemented. The ease of use of these parallel engines is illustrated by means of C++ code snippets. Furthermore, this report justifies the design of these engines with parallel scalability in mind; this theoretical property is then verified with test runs that demonstrate optimal parallel speed-up with up to 200 processors.
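For flavor, a serial use of one such engine might look like the following minimal sketch. It assumes VTK's current vtkDescriptiveStatistics interface; the Titan-era calls may have differed in detail.

    #include <vtkDescriptiveStatistics.h>
    #include <vtkDoubleArray.h>
    #include <vtkNew.h>
    #include <vtkStatisticsAlgorithm.h>
    #include <vtkTable.h>

    int main()
    {
      // Assemble a one-column table of observations.
      vtkNew<vtkDoubleArray> metric;
      metric->SetName("Metric");
      for (int i = 0; i < 1000; ++i)
        metric->InsertNextValue(0.001 * i);

      vtkNew<vtkTable> table;
      table->AddColumn(metric);

      // Learn a descriptive-statistics model (mean, variance, skewness, ...)
      // for the requested column, then derive the final statistics from it.
      vtkNew<vtkDescriptiveStatistics> stats;
      stats->SetInputData(vtkStatisticsAlgorithm::INPUT_DATA, table);
      stats->AddColumn("Metric");
      stats->SetLearnOption(true);
      stats->SetDeriveOption(true);
      stats->Update();
      return 0;
    }

The parallel engines (e.g., vtkPDescriptiveStatistics) expose the same interface and differ in aggregating partial models across processes, which is what the scalability argument of this report rests on.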
We present a formula for the pairwise update of arbitrary-order centered statistical moments. This formula is of particular interest for computing such moments in parallel over large-scale, distributed data sets. As a corollary, we indicate a specialization of this formula for incremental updates, of particular interest to streaming implementations. Finally, we provide pairwise and incremental update formulas for the covariance. Centered statistical moments are among the most widely used tools in descriptive statistics, so it is essential for statistical analysis packages that robust and efficient algorithms be devised and implemented. However, robustness and speed of execution, in this context as well as in others, tend to be orthogonal: for instance, it is well known that one-pass algorithms which compute centered moments from sums of powers, for the sake of execution speed, lead to unacceptable numerical instability.
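For the second-order case, the pairwise formula is well known: merging disjoint partitions A and B with counts n_A and n_B, means mu_A and mu_B, and second centered moments M2_A and M2_B gives n = n_A + n_B, delta = mu_B - mu_A, mu = mu_A + delta * n_B / n, and M2 = M2_A + M2_B + delta^2 * n_A * n_B / n. The C++ sketch below is our own illustration of that special case; the report itself generalizes the construction to arbitrary order.

    #include <cstdint>

    // Summary of one data partition: count, mean, and the second centered
    // moment M2 = sum over the partition of (x - mean)^2.
    struct Moments
    {
      std::int64_t n = 0;
      double mean = 0.0;
      double M2 = 0.0;
    };

    // Pairwise update: combine the summaries of two disjoint partitions.
    Moments merge(const Moments& a, const Moments& b)
    {
      Moments r;
      r.n = a.n + b.n;
      if (r.n == 0)
        return r;
      const double nA = static_cast<double>(a.n);
      const double nB = static_cast<double>(b.n);
      const double n = static_cast<double>(r.n);
      const double delta = b.mean - a.mean;
      r.mean = a.mean + delta * nB / n;
      r.M2 = a.M2 + b.M2 + delta * delta * nA * nB / n;
      return r;
    }

    // Incremental (streaming) specialization: a single observation x is a
    // partition of size one, so the update reduces to Welford's recurrence.
    Moments update(const Moments& a, double x)
    {
      return merge(a, Moments{ 1, x, 0.0 });
    }

The sample variance follows as M2 / (n - 1), and the analogous merge of the centered co-moment sum yields the covariance updates mentioned above.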
This report specifies the way in which Gauss points shall be named and ordered when storing them in an EXODUS II file so that they may be properly interpreted by visualization tools. This naming convention covers hexahedra and tetrahedra. Future revisions of this document will cover quadrilaterals, triangles, and shell elements.
Current work on the Integrated Stockpile Evaluation (ISE) project is evidence of Sandia's commitment to maintaining the integrity of the nuclear weapons stockpile. In this report, we undertake a key element in that process: the development of an analytical framework for determining the reliability of the stockpile in a realistic environment of time-variance, inherent uncertainty, and sparse available information. This framework is probabilistic in nature and is founded on a novel combination of classical and computational Bayesian analysis, Bayesian networks, and polynomial chaos expansions. We note that, while the focus of the effort is stockpile-related, the framework is applicable to any reasonably structured hierarchical system, including systems with feedback.
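For reference, a polynomial chaos expansion represents an uncertain model response as a truncated spectral series in orthogonal polynomials of standard random variables. In generic form (the report's particular basis and truncation are not reproduced here):

    Y \approx \sum_{i=0}^{P} c_i \, \Psi_i(\boldsymbol{\xi}),
    \qquad
    \langle \Psi_i \Psi_j \rangle = \langle \Psi_i^2 \rangle \, \delta_{ij},

where \boldsymbol{\xi} is a vector of independent standard random variables (e.g., standard Gaussians paired with Hermite polynomials \Psi_i) and the deterministic coefficients c_i carry the uncertainty in the response Y.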
The purpose of this report is to define a standard interface for storing and retrieving novel, non-traditional partial differential equation (PDE) discretizations. Although it focuses specifically on finite elements where state is associated with edges and faces of volumetric elements rather than nodes and the elements themselves (as implemented in ALEGRA), the proposed interface should be general enough to accommodate most discretizations, including hp-adaptive finite elements and even mimetic techniques that define fields over arbitrary polyhedra. This report reviews the representation of edge and face elements as implemented by ALEGRA. It then specifies a convention for storing these elements in EXODUS files by extending the EXODUS API to include edge and face blocks in addition to element blocks. Finally, it presents several techniques for rendering edge and face elements using VTK and ParaView, including the use of VTK's generic dataset interface for interpolating values interior to edges and faces.
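As a concrete illustration of the kind of interface proposed, the C++ sketch below writes a small block of linear edge elements. It assumes the entity-typed calls (ex_put_block, ex_put_conn, EX_EDGE_BLOCK) found in current EXODUS releases, which grew out of this line of work; the block id, sizes, and connectivity are invented for the example, and error checking is elided.

    #include <array>
    #include "exodusII.h"

    // Write one block of 2-node ("EDGE2") edge elements into an open
    // EXODUS file identified by `exoid`.
    void writeEdgeBlock(int exoid)
    {
      const ex_entity_id blockId = 100; // arbitrary example block id
      const int numEdges = 2;           // two edges in this block
      const int nodesPerEdge = 2;       // linear edges

      // Declare the edge block: entries are edges, each defined by 2 nodes.
      ex_put_block(exoid, EX_EDGE_BLOCK, blockId, "EDGE2", numEdges,
                   nodesPerEdge, 0 /* edges per entry */,
                   0 /* faces per entry */, 0 /* attributes */);

      // Node connectivity, one row of 1-based node ids per edge.
      const std::array<int, 4> conn = { 1, 2, 2, 3 };
      ex_put_conn(exoid, EX_EDGE_BLOCK, blockId, conn.data(),
                  nullptr, nullptr);
    }

Face blocks follow the same pattern with EX_FACE_BLOCK and a face topology string (e.g., "TRI3"); fields stored over these entities can then be rendered with the VTK and ParaView techniques surveyed in the report.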