Publications Search

Cooperative computing for autonomous data centers

Berry, Jonathan; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared; Smith, Randy D.

Abstract not provided.

TYPE Conference Poster YEAR 2016

DOI OSTI

Finding Non-Human Nodes in Social Networks Using Only Topology

Berry, Jonathan; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared

TYPE Conference Poster YEAR 2015

OSTI

A task-based linear algebra Building Blocks approach for scalable graph analytics

2015 IEEE High Performance Extreme Computing Conference, HPEC 2015

Wolf, Michael; Berry, Jonathan; Stark, Dylan T.

It is challenging to obtain scalable HPC performance on real applications, especially for data science applications with irregular memory access and computation patterns. To drive co-design efforts in architecture, system, and application design, we are developing miniapps representative of data science workloads. These in turn stress the state of the art in Graph BLAS-like Graph Algorithm Building Blocks (GABB). In this work, we outline a Graph BLAS-like, linear algebra based approach to miniTri, one such miniapp. We describe a task-based prototype implementation and give initial scalability results.
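
To illustrate what a linear-algebra formulation of triangle counting can look like, here is a minimal sketch. It is not the miniTri formulation or the authors' task-based prototype; it uses a standard identity instead: if L is the strictly lower-triangular part of the adjacency matrix, the number of triangles equals the sum of the entries of (L·L) masked by L. The function name and the dense list-of-lists matrix are assumptions made for readability.

```python
def count_triangles_linear_algebra(n, edges):
    """Count triangles as sum((L @ L) * L), with L the strictly lower-triangular adjacency matrix.

    Vertices are assumed to be integers 0..n-1 (an illustrative convention).
    """
    # Build L: L[i][j] = 1 iff {i, j} is an edge and i > j.
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        if u != v:
            L[max(u, v)][min(u, v)] = 1

    total = 0
    for i in range(n):
        for j in range(i):
            if L[i][j]:  # the mask: only entries where L itself is nonzero
                # (L @ L)[i][j] counts vertices k with i > k > j adjacent to both.
                total += sum(L[i][k] * L[k][j] for k in range(n))
    return total
```

Each triangle i > k > j contributes exactly once, at entry (i, j). A dense matrix is used only to keep the sketch short; a practical version would use sparse storage.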

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

k-Means Clustering for Two-Level Memory Systems

Berry, Jonathan; Bender, M.; Hammond, Simon; Moore, B.; Phillips, Cynthia; Moseley, B.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

K-Means clustering on two-level memory systems

ACM International Conference Proceeding Series

Bender, Michael A.; Berry, Jonathan; Hammond, Simon; Moore, Branden J.; Moseley, Benjamin; Phillips, Cynthia A.

In recent work we quantified the anticipated performance boost when a sorting algorithm is modified to leverage user-addressable "near-memory," which we call scratchpad. This architectural feature is expected in the Intel Knight's Landing processors that will be used in DOE's next large-scale supercomputer. This paper expands our analytical study of the scratchpad to consider k-means clustering, a classical data-analysis technique that is ubiquitous in the literature and in practice. We present new theoretical results using the model introduced in [13], which measures memory transfers and assumes that computations are memory-bound. Our theoretical results indicate that scratchpad-aware versions of k-means clustering can expect performance boosts for high-dimensional instances with relatively few cluster centers. These constraints may limit the practical impact of scratchpad for k-means acceleration, so we discuss their origins and practical implications. We corroborate our theory with experimental runs on a system instrumented to mimic one with scratchpad memory. We also contribute a semi-formalization of the computational properties that are necessary and sufficient to predict a performance boost from scratchpad-aware variants of algorithms. We have observed and studied these properties in the context of sorting, and now clustering. We conclude with some thoughts on the application of these properties to new areas. Specifically, we believe that dense linear algebra has similar properties to k-means, while sparse linear algebra and FFT computations are more similar to sorting. The sparse operations are more common in scientific computing, so we expect scratchpad to have significant impact in that area.

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation

Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2015

Bender, Michael A.; Berry, Jonathan; Hammond, Simon; Hemmert, Karl S.; McCauley, Samuel; Moore, Branden J.; Moseley, Benjamin; Phillips, Cynthia A.; Resnick, David R.; Rodrigues, Arun

A fundamental challenge for supercomputer architecture is that processors cannot be fed data from DRAM as fast as CPUs can consume it. Therefore, many applications are memory-bandwidth bound. As the number of cores per chip increases, and traditional DDR DRAM speeds stagnate, the problem is only getting worse. A variety of non-DDR 3D memory technologies (Wide I/O 2, HBM) offer higher bandwidth and lower power by stacking DRAM chips on the processor or nearby on a silicon interposer. However, such a packaging scheme cannot contain sufficient memory capacity for a node. It seems likely that future systems will require at least two levels of main memory: high-bandwidth, low-power memory near the processor and low-bandwidth high-capacity memory further away. This near memory will probably not have significantly faster latency than the far memory. This, combined with the large size of the near memory (multiple GB) and power constraints, may make it difficult to treat it as a standard cache. In this paper, we explore some of the design space for a user-controlled multi-level main memory. We present algorithms designed for the heterogeneous bandwidth, using streaming to exploit data locality. We consider algorithms for the fundamental application of sorting. Our algorithms asymptotically reduce memory-block transfers under certain architectural parameter settings. We use and extend Sandia National Laboratories' SST simulation capability to demonstrate the relationship between increased bandwidth and improved algorithmic performance. Memory access counts from simulations corroborate predicted performance. This co-design effort suggests implementing two-level main memory systems may improve memory performance in fundamental applications.

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

Cooperative Computing for Autonomous Data Centers

Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015

Berry, Jonathan; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared; Smith, Randy D.

We present a new distributed model for graph computations motivated by limited information sharing. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a polylogarithmic size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We have results for both models for s-t connectivity, one of the simplest graph problems that requires global information in the worst case. In the limited-sharing model, our results exploit social network structure. Standard communication complexity gives polynomial lower bounds on s-t connectivity for general graphs. However, if the graph for each data center has a giant component and these giant components intersect, then we can overcome this lower bound, computing s-t connectivity while exchanging O(log^2 n) bits for a constant number of data centers. We can also test the assumption that the giant components overlap using O(log^2 n) bits provided the (unknown) overlap is sufficiently large. The second result is in the low-trust model. We give a secure multi-party computation (MPC) algorithm that 1) does not make cryptographic assumptions when there are 3 or more entities, and 2) is efficient, especially when compared to the usual garbled circuit approach. The entities learn only the yes/no answer. No party learns anything about the others' graph, not even node names. This algorithm does not require any special graph structure. This secure MPC result for s-t connectivity is one of the first that involves a few parties computing on large inputs, instead of many parties computing on a few local values.

TYPE Conference Poster YEAR 2015

OSTI Scopus

Cooperative computing for autonomous data centers

Berry, Jonathan; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared; Smith, Randy D.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Cyber Graph Queries for Geographically Distributed Data Centers

Berry, Jonathan; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared

We present new algorithms for a distributed model for graph computations, motivated by limited information sharing, that we discussed in earlier work. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a polylogarithmic size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We have algorithms for s-t connectivity in both models. We also give an algorithm in the low-communication model for finding a planted clique. This is an anomaly-detection problem: finding a subgraph that is larger and denser than expected. For both of the low-communication algorithms, we exploit structural properties of social networks to prove performance bounds better than what is possible for general graphs. For s-t connectivity, we use known properties. For planted clique, we propose a new property: a bounded number of triangles per node. This property is based upon evidence from the social science literature. We found that classic examples of social networks do not have the bounded-triangles property. This is because many social networks contain elements that are non-human, such as accounts for a business, or other automated accounts. We describe some initial attempts to distinguish human nodes from automated nodes in social networks based only on topological properties.

TYPE SAND Report YEAR 2015

DOI OSTI

Why do simple algorithms for triangle enumeration work in the real world?

Internet Mathematics

Berry, Jonathan; Fostvedt, Luke A.; Nordman, Daniel J.; Phillips, Cynthia A.; Comandur, Seshadhri; Wilson, Alyson G.

Listing all triangles is a fundamental graph operation. Triangles can have important interpretations in real-world graphs, especially social and other interaction networks. Despite the lack of provably efficient (linear, or slightly superlinear) worst-case algorithms for this problem, practitioners run simple, efficient heuristics to find all triangles in graphs with millions of vertices. How are these heuristics exploiting the structure of these special graphs to provide major speedups in running time? We study one of the most prevalent algorithms used by practitioners. A trivial algorithm enumerates all paths of length 2, and checks if each such path is incident to a triangle. A good heuristic is to enumerate only those paths of length 2 in which the middle vertex has the lowest degree. It is easily implemented and is empirically known to give remarkable speedups over the trivial algorithm. We study the behavior of this algorithm over graphs with heavy-tailed degree distributions, a defining feature of real-world graphs. The erased configuration model (ECM) efficiently generates a graph with asymptotically (almost) any desired degree sequence. We show that the expected running time of this algorithm over the distribution of graphs created by the ECM is controlled by the ℓ_{4/3}-norm of the degree sequence. Norms of the degree sequence are a measure of the heaviness of the tail, and it is precisely this feature that allows nontrivial speedups of simple triangle enumeration algorithms. As a corollary of our main theorem, we prove expected linear-time performance for degree sequences following a power law with exponent α ≥ 7/3, and nontrivial speedup whenever α ∈ (2, 3).
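
The lowest-degree-middle-vertex heuristic described in this abstract can be sketched in a few lines. This is an illustrative version, not the authors' code: the function name is hypothetical, vertex labels are assumed to be mutually comparable (for example, integers), and ties in degree are broken by label, which is one common convention.

```python
from collections import defaultdict
from itertools import combinations


def enumerate_triangles(edges):
    """List each triangle once by expanding only length-2 paths
    whose middle vertex has the lowest degree of the three."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # Rank vertices by (degree, label) so that degree ties are broken consistently.
    rank = {v: (len(nbrs), v) for v, nbrs in adj.items()}

    triangles = []
    for v, nbrs in adj.items():
        # Keep only neighbors ranked above v; v is then the lowest-ranked
        # vertex of every path a - v - b generated below.
        higher = [w for w in nbrs if rank[w] > rank[v]]
        for a, b in combinations(higher, 2):
            if b in adj[a]:  # the path closes into a triangle
                triangles.append(tuple(sorted((v, a, b))))
    return triangles
```

Because every triangle has exactly one lowest-ranked vertex, each triangle is reported exactly once. A high-degree hub generates paths only through its few higher-ranked neighbors, which is where the speedup over the trivial algorithm comes from.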

TYPE Conference YEAR 2015

Scopus OSTI

Workshop on Streaming Graph Algorithms (WSGA) Overview

Berry, Jonathan; Phillips, Cynthia A.; Hendrickson, Bruce A.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Finding a planted clique in a distributed social network

Berry, Jonathan; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared

TYPE Presentation YEAR 2014

OSTI

Sandia Software for Networks from DARPA GRAPHS Program

Kolda, Tamara G.; Plantenga, Todd; Pinar, Ali P.; Comandur, Seshadhri; Berry, Jonathan; Jha, Madhav; Phillips, Cynthia A.

Abstract not provided.

TYPE Conference YEAR 2014

OSTI

Graph Exploration: to Linear Algebra (and Beyond?)

Berry, Jonathan; Wolf, Michael; Stark, Dylan T.

Abstract not provided.

TYPE Conference YEAR 2014

OSTI

Graph Algorithms for Autonomous Distributed Data Centers

Berry, Jonathan; Phillips, Cynthia A.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Statistically significant relational data mining

Berry, Jonathan; Leung, Vitus J.; Phillips, Cynthia A.; Pinar, Ali P.; Robinson, David G.

This report summarizes the work performed under the project "Statistically significant relational data mining." The goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concentrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second is a set of statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.

TYPE SAND Report YEAR 2014

DOI OSTI

Maintaining Connected Components for Infinite Graph Streams

Berry, Jonathan; Phillips, Cynthia A.; Plimpton, Steven J.; Shead, Timothy M.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI DOI

Community Detection: A Bayesian Approach and the Challenge of Evaluation

Berry, Jonathan; Dunlavy, Daniel M.; Phillips, Cynthia A.; Robinson, David G.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Sensor Placement for Municipal Water Networks

Berry, Jonathan; Hart, William E.; Phillips, Cynthia A.; Watson, Jean-Paul

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI DOI

Maintaining connected components for infinite graph streams

Proc. of 2nd Int. Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, BigMine 2013 - Held in Conj. with SIGKDD 2013 Conf.

Berry, Jonathan; Phillips, Cynthia A.; Plimpton, Steven J.; Shead, Timothy M.

We present an algorithm to maintain the connected components of a graph that arrives as an infinite stream of edges. We formalize the algorithm on X-stream, a new parallel theoretical computational model for infinite streams. Connectivity-related queries, including component spanning trees, are supported with some latency, returning the state of the graph at the time of the query. Because an infinite stream may eventually exceed the storage limits of any number of finite-memory processors, we assume an aging command or daemon where "uninteresting" edges are removed when the system nears capacity. Following an aging command, the system will block queries until its data structures are repaired, but edges will continue to be accepted from the stream, never dropped. The algorithm will not fail unless a model-specific constant fraction of the aggregate memory across all processors is full. In normal operation, it will not fail unless aggregate memory is completely full. Unlike previous theoretical streaming models designed for finite graphs that assume a single shared-memory machine or require arbitrary-size intermediate files, X-stream distributes a graph over a ring network of finite-memory processors. Though the model is synchronous and reminiscent of systolic algorithms, our implementation uses an asynchronous message-passing system. We argue the correctness of our X-stream connected components algorithm, and give preliminary experimental results on synthetic and real graph streams.
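
For readers unfamiliar with the underlying problem, the sketch below maintains connected components over an insert-only edge stream on a single machine, using a standard union-find structure. It is not the X-stream algorithm: it assumes unbounded memory, has no aging or edge removal, and is not distributed. The class and method names are hypothetical.

```python
class StreamingComponents:
    """Single-machine union-find over an insert-only edge stream (no aging)."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def _find(self, v):
        # Register unseen vertices as singleton components.
        if v not in self.parent:
            self.parent[v] = v
            self.size[v] = 1
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every vertex on the path directly at the root.
        while self.parent[v] != root:
            nxt = self.parent[v]
            self.parent[v] = root
            v = nxt
        return root

    def add_edge(self, u, v):
        """Process one edge from the stream."""
        ru, rv = self._find(u), self._find(v)
        if ru == rv:
            return
        # Union by size: attach the smaller tree under the larger one.
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]

    def connected(self, u, v):
        """Answer a connectivity query against the edges seen so far."""
        return self._find(u) == self._find(v)
```

Union-find supports edge insertions only. Handling the edge removals implied by an aging command requires repairing component structure, which is the harder case the abstract addresses and which this sketch leaves out.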

TYPE Presentation YEAR 2013

OSTI Scopus