Albany: Integrating Algorithmic Components to Build Advanced Applications
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Using a novel formal methods approach, we have generated computer-veri ed proofs of major theorems pertinent to the quantum phase estimation algorithm. This was accomplished using our Prove-It software package in Python. While many formal methods tools are available, their practical utility is limited. Translating a problem of interest into these systems and working through the steps of a proof is an art form that requires much expertise. One must surrender to the preferences and restrictions of the tool regarding how mathematical notions are expressed and what deductions are allowed. Automation is a major driver that forces restrictions. Our focus, on the other hand, is to produce a tool that allows users the ability to con rm proofs that are essentially known already. This goal is valuable in itself. We demonstrate the viability of our approach that allows the user great exibility in expressing state- ments and composing derivations. There were no major obstacles in following a textbook proof of the quantum phase estimation algorithm. There were tedious details of algebraic manipulations that we needed to implement (and a few that we did not have time to enter into our system) and some basic components that we needed to rethink, but there were no serious roadblocks. In the process, we made a number of convenient additions to our Prove-It package that will make certain algebraic manipulations easier to perform in the future. In fact, our intent is for our system to build upon itself in this manner.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Proceedings - IEEE International Conference on Cluster Computing, ICCC
A broad range of physical phenomena in science and engineering can be explored using finite difference and volume based application codes. Incorporating Adaptive Mesh Refinement (AMR) into these codes focuses attention on the most critical parts of a simulation, enabling increased numerical accuracy of the solution while limiting memory consumption. However, adaptivity comes at the cost of increased runtime complexity, which is particularly challenging on emerging and expected future architectures. In order to explore the design space offered by new computing environments, we have developed a proxy application called miniAMR. MiniAMR exposes a range of the important issues that will significantly impact the performance potential of full application codes. In this paper, we describe miniAMR, demonstrate what is designed to represent in a full application code, and illustrate how it can be used to exploit future high performance computing architectures. To ensure an accurate understanding of what miniAMR is intended to represent, we compare it with CTH, a shock hydrodynamics code in heavy use throughout several computational science and engineering communities.
Proceedings - IEEE International Conference on Cluster Computing, ICCC
High-performance computing systems are shifting away from traditional interconnect topologies to exploit new technologies and to reduce interconnect power consumption. The Dragonfly topology is one promising candidate for new systems, with several variations already in production. It is hierarchical, with local links forming groups and global links joining the groups. At each level, the interconnect is a clique, with a link between each pair of switches in a group and a link between each pair of groups. This paper shows that the intergroup links can be made in meaningfully different ways. We evaluate three previously-proposed approaches for link organization (called global link arrangements) in two ways. First, we use bisection bandwidth, an important and commonly-used measure of the potential for communication bottlenecks. We show that the global link arrangements often give bisection bandwidths differing by 10s of percent, with the specific separation varying based on the relative bandwidths of local and global links. For the link bandwidths used in a current Dragonfly implementation, it is 33%. Second, we show that the choice of global link arrangement can greatly impact the regularity of task mappings for nearest neighbor stencil communication patterns, an important pattern in scientific applications.
ACM International Conference Proceeding Series
In recent work we quantified the anticipated performance boost when a sorting algorithm is modified to leverage user- Addressable "near-memory," which we call scratchpad. This architectural feature is expected in the Intel Knight's Land- ing processors that will be used in DOE's next large-scale supercomputer. This paper expands our analytical study of the scratch- pad to consider k-means clustering, a classical data-analysis technique that is ubiquitous in the literature and in prac- Tice. We present new theoretical results using the model introduced in [13], which measures memory transfers and assumes that computations are memory-bound. Our the- oretical results indicate that scratchpad-aware versions of k-means clustering can expect performance boosts for high- dimensional instances with relatively few cluster centers. These constraints may limit the practical impact of scratch- pad for k-means acceleration, so we discuss their origins and practical implications. We corroborate our theory with ex- perimental runs on a system instrumented to mimic one with scratchpad memory. We also contribute a semi-formalization of the computa- Tional properties that are necessary and sufficient to predict a performance boost from scratchpad-aware variants of al- gorithms. We have observed and studied these properties in the context of sorting, and now clustering. We conclude with some thoughts on the application of these properties to new areas. Specifically, we believe that dense linear algebra has similar properties to k-means, while sparse linear algebra and FFT computations are more sim-ilar to sorting. The sparse operations are more common in scientific computing, so we expect scratchpad to have signif- icant impact in that area.
Statistical Analysis and Data Mining
This study aimed to organize a body of trajectories in order to identify, search for and classify both common and uncommon behaviors among objects such as aircraft and ships. Existing comparison functions such as the Fréchet distance are computationally expensive and yield counterintuitive results in some cases. We propose an approach using feature vectors whose components represent succinctly the salient information in trajectories. These features incorporate basic information such as the total distance traveled and the distance between start/stop points as well as geometric features related to the properties of the convex hull, trajectory curvature and general distance geometry. Additionally, these features can generally be mapped easily to behaviors of interest to humans who are searching large databases. Most of these geometric features are invariant under rigid transformation. Furthermore, we demonstrate the use of different subsets of these features to identify trajectories similar to an exemplar, cluster a database of several hundred thousand trajectories and identify outliers.