Publications Search

Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation

Journal of Parallel and Distributed Computing

Berry, Jonathan; Bender, Michael A.; Hammond, Simon; Hemmert, Karl S.; Mccauley, Samuel; Moore, Branden J.; Moseley, Benjamin; Phillips, Cynthia A.; Resnick, David R.; Rodrigues, Arun

A challenge in computer architecture is that processors often cannot be fed data from DRAM as fast as CPUs can consume it. Therefore, many applications are memory-bandwidth bound. With this motivation and the realization that traditional architectures (with all DRAM reachable only via bus) are insufficient to feed groups of modern processing units, vendors have introduced a variety of non-DDR 3D memory technologies (Hybrid Memory Cube (HMC),Wide I/O 2, High Bandwidth Memory (HBM)). These offer higher bandwidth and lower power by stacking DRAM chips on the processor or nearby on a silicon interposer. We will call these solutions “near-memory,” and if user-addressable, “scratchpad.” High-performance systems on the market now offer two levels of main memory: near-memory on package and traditional DRAM further away. In the near term we expect the latencies near-memory and DRAM to be similar. Thus, it is natural to think of near-memory as another module on the DRAM level of the memory hierarchy. Vendors are expected to offer modes in which the near memory is used as cache, but we believe that this will be inefficient. In this paper, we explore the design space for a user-controlled multi-level main memory. Our work identifies situations in which rewriting application kernels can provide significant performance gains when using near-memory. We present algorithms designed for two-level main memory, using divide-and-conquer to partition computations and streaming to exploit data locality. We consider algorithms for the fundamental application of sorting and for the data analysis kernel k-means. Our algorithms asymptotically reduce memory-block transfers under certain architectural parameter settings. We use and extend Sandia National Laboratories’ SST simulation capability to demonstrate the relationship between increased bandwidth and improved algorithmic performance. Memory access counts from simulations corroborate predicted performance improvements for our sorting algorithm. In contrast, the k-means algorithm is generally CPU bound and does not improve when using near-memory except under extreme conditions. These conditions require large instances that rule out SST simulation, but we demonstrate improvements by running on a customized machine with high and low bandwidth memory. These case studies in co-design serve as positive and cautionary templates, respectively, for the major task of optimizing the computational kernels of many fundamental applications for two-level main memory systems.

More Details

TYPE Journal Article YEAR 2017

DOI OSTI Scopus

Stateful Streaming in Distributed Memory Supercomputers

Berry, Jonathan

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

OSTI

Hierarchical Task-Data Parallelism using Kokkos and Qthreads

Edwards, Harold C.; Olivier, Stephen L.; Berry, Jonathan; Mackey, Greg E.; Rajamanickam, Sivasankaran; Wolf, Michael; Kim, Kyungjoo; Stelle, George W.

This report describes a new capability for hierarchical task-data parallelism using Sandia's Kokkos and Qthreads, and evaluation of this capability with sparse matrix Cholesky factorization and social network triangle enumeration mini-applications. Hierarchical task-data parallelism consists of a collection of tasks with executes-after dependences where each task contains data parallel operations performed on a team of hardware threads. The collection of tasks and dependences form a directed acyclic graph of tasks - a task DAG. Major challenges of this research and development effort include: portability and performance across multicore CPU; manycore Intel Xeon Phi, and NVIDIA GPU architectures; scalability with respect to hardware concurrency and size of the task DAG; and usability of the application programmer interface (API).

More Details

TYPE SAND Report YEAR 2016

DOI OSTI

Cooperative Computing for Autonomous Data Centers Storing Social Network Data

Berry, Jonathan

More Details

TYPE Conference Poster YEAR 2016

OSTI

Social Network Analysis Research at Sandia (a non-comprehensive survey)

Berry, Jonathan

More Details

TYPE Presentation YEAR 2016

OSTI

Anti-persistence on persistent storage: History-independent sparse tables and dictionaries

Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems

Bender, Michael A.; Berry, Jonathan; Johnson, Rob; Kroeger, Thomas; Mccauley, Samuel; Phillips, Cynthia A.; Simon, Bertrand; Singh, Shikha; Zage, David J.

We present history-independent alternatives to a B-tree, the primary indexing data structure used in databases. A data structure is history independent (HI) if it is impossible to deduce any information by examining the bit representation of the data structure that is not already available through the API. We show how to build a history-independent cache-oblivious B-tree and a history-independent external-memory skip list. One of the main contributions is a data structure we build on the way - a history-independent packed-memory array (PMA). The PMA supports efficient range queries, one of the most important operations for answering database queries. Our HI PMA matches the asymptotic bounds of prior non-HI packed-memory arrays and sparse tables. Specifically, a PMA maintains a dynamic set of elements in sorted order in a linearsized array. Inserts and deletes take an amortized O(log2 N) element moves with high probability. Simple experiments with our implementation of HI PMAs corroborate our theoretical analysis. Comparisons to regular PMAs give preliminary indications that the practical cost of adding history-independence is not too large. Our HI cache-oblivious B-tree bounds match those of prior non-HI cache-oblivious B-trees. Searches take O(logB N) I/Os; inserts and deletes take O(log2N/B + logB N) amortized I/Os with high probability; and range queries returning k elements take O(logB N + k/B) I/Os. Our HI external-memory skip list achieves optimal bounds with high probability, analogous to in-memory skip lists: O(logB N) I/Os for point queries and amortized O(logB N) I/Os for inserts/deletes. Range queries returning k elements run in O(logB N + k/B) I/Os. In contrast, the best possible high-probability bounds for inserting into the folklore B-skip list, which promotes elements with probability 1/B, is just Θ(log N) I/Os. This is no better than the bounds one gets from running an inmemory skip list in external memory.

More Details

TYPE Conference Poster YEAR 2016

DOI OSTI Scopus

Cooperative computing for autonomous data centers

Berry, Jonathan; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared; Smith, Randy D.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2016

DOI OSTI

Finding Non-Human Nodes in Social Networks Using Only Topology

Berry, Jonathan; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared

More Details

TYPE Conference Poster YEAR 2015

OSTI

A task-based linear algebra Building Blocks approach for scalable graph analytics

2015 IEEE High Performance Extreme Computing Conference, HPEC 2015

Wolf, Michael; Berry, Jonathan; Stark, Dylan T.

It is challenging to obtain scalable HPC performance on real applications, especially for data science applications with irregular memory access and computation patterns. To drive co-design efforts in architecture, system, and application design, we are developing miniapps representative of data science workloads. These in turn stress the state of the art in Graph BLAS-like Graph Algorithm Building Blocks (GABB). In this work, we outline a Graph BLAS-like, linear algebra based approach to miniTri, one such miniapp. We describe a task-based prototype implementation and give initial scalability results.

More Details

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

A task-based linear algebra Building Blocks approach for scalable graph analytics

2015 IEEE High Performance Extreme Computing Conference Hpec 2015

Wolf, Michael; Berry, Jonathan; Stark, Dylan T.

It is challenging to obtain scalable HPC performance on real applications, especially for data science applications with irregular memory access and computation patterns. To drive co-design efforts in architecture, system, and application design, we are developing miniapps representative of data science workloads. These in turn stress the state of the art in Graph BLAS-like Graph Algorithm Building Blocks (GABB). In this work, we outline a Graph BLAS-like, linear algebra based approach to miniTri, one such miniapp. We describe a task-based prototype implementation and give initial scalability results.

More Details

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

k-Means Clustering for Two-Level Memory Systems

Berry, Jonathan; Bender, M.; Hammond, Simon; Moore, B.; Phillips, Cynthia; Moseley, B.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

K-Means clustering on two-level memory systems

ACM International Conference Proceeding Series

Bender, Michael A.; Berry, Jonathan; Hammond, Simon; Moore, Branden J.; Moseley, Benjamin; Phillips, Cynthia A.

In recent work we quantified the anticipated performance boost when a sorting algorithm is modified to leverage user- Addressable "near-memory," which we call scratchpad. This architectural feature is expected in the Intel Knight's Land- ing processors that will be used in DOE's next large-scale supercomputer. This paper expands our analytical study of the scratch- pad to consider k-means clustering, a classical data-analysis technique that is ubiquitous in the literature and in prac- Tice. We present new theoretical results using the model introduced in [13], which measures memory transfers and assumes that computations are memory-bound. Our the- oretical results indicate that scratchpad-aware versions of k-means clustering can expect performance boosts for high- dimensional instances with relatively few cluster centers. These constraints may limit the practical impact of scratch- pad for k-means acceleration, so we discuss their origins and practical implications. We corroborate our theory with ex- perimental runs on a system instrumented to mimic one with scratchpad memory. We also contribute a semi-formalization of the computa- Tional properties that are necessary and sufficient to predict a performance boost from scratchpad-aware variants of al- gorithms. We have observed and studied these properties in the context of sorting, and now clustering. We conclude with some thoughts on the application of these properties to new areas. Specifically, we believe that dense linear algebra has similar properties to k-means, while sparse linear algebra and FFT computations are more sim-ilar to sorting. The sparse operations are more common in scientific computing, so we expect scratchpad to have signif- icant impact in that area.

More Details

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation

Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2015

Bender, Michael A.; Berry, Jonathan; Hammond, Simon; Hemmert, Karl S.; Mccauley, Samuel; Moore, Branden J.; Moseley, Benjamin; Phillips, Cynthia A.; Resnick, David R.; Rodrigues, Arun

A fundamental challenge for supercomputer architecture is that processors cannot be fed data from DRAM as fast as CPUs can consume it. Therefore, many applications are memory-bandwidth bound. As the number of cores per chip increases, and traditional DDR DRAM speeds stagnate, the problem is only getting worse. A variety of non-DDR 3D memory technologies (Wide I/O 2, HBM) offer higher bandwidth and lower power by stacking DRAM chips on the processor or nearby on a silicon interposer. However, such a packaging scheme cannot contain sufficient memory capacity for a node. It seems likely that future systems will require at least two levels of main memory: high-bandwidth, low-power memory near the processor and low-bandwidth high-capacity memory further away. This near memory will probably not have significantly faster latency than the far memory. This, combined with the large size of the near memory (multiple GB) and power constraints, may make it difficult to treat it as a standard cache. In this paper, we explore some of the design space for a user-controlled multi-level main memory. We present algorithms designed for the heterogeneous bandwidth, using streaming to exploit data locality. We consider algorithms for the fundamental application of sorting. Our algorithms asymptotically reduce memory-block transfers under certain architectural parameter settings. We use and extend Sandia National Laboratories' SST simulation capability to demonstrate the relationship between increased bandwidth and improved algorithmic performance. Memory access counts from simulations corroborate predicted performance. This co-design effort suggests implementing two-level main memory systems may improve memory performance in fundamental applications.

More Details

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

Cooperative Computing for Autonomous Data Centers

Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015

Berry, Jonathan; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared; Smith, Randy D.

We present a new distributed model for graph computations motivated by limited information sharing. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a poly logarithmic size subgraph, 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centres' have results for both models for s-t connectivity, one of the simplest graph problems that requires global information in the worst case. In the limited-sharing model, our results exploit social network structure. Standard communication complexity gives polynomial lower bounds on s-t connectivity for general graphs. However, if the graph for each data centre has a giant component and these giant components intersect, then we can overcome this lower bound, computing-t connectivity while exchanging O(log 2 n) bits for a constant number of data centers. We can also test the assumption that the giant components overlap using O(log 2 n) bits provided the (unknown) overlap is sufficiently large. The second result is in the low trust model. We give a secure multi-party computation (MPC) algorithm that 1) does not make cryptographic assumptions when there are 3 or more entities, and 2) is efficient, especially when compared to the usual garbled circuit approach. The entities learn only the yes/no answer. No party learns anything about the others' graph, not even node names. This algorithm does not require any special graph structure. This secure MPC result for s-t connectivity is one of the first that involves a few parties computing on large inputs, instead of many parties computing on a few local values.

More Details

TYPE Conference Poster YEAR 2015

OSTI Scopus

Cooperative computing for autonomous data centers

Berry, Jonathan; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared; Smith, Randy D.

Abstract not provided.

More Details

TYPE Conference Poster YEAR 2015

OSTI

Cyber Graph Queries for Geographically Distributed Data Centers

Berry, Jonathan; Collins, Michael; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared

We present new algorithms for a distributed model for graph computations motivated by limited information sharing we first discussed. Two or more independent entities have collected large social graphs. They wish to compute the result of running graph algorithms on the entire set of relationships. Because the information is sensitive or economically valuable, they do not wish to simply combine the information in a single location. We consider two models for computing the solution to graph algorithms in this setting: 1) limited-sharing: the two entities can share only a polylogarithmic size subgraph; 2) low-trust: the entities must not reveal any information beyond the query answer, assuming they are all honest but curious. We believe this model captures realistic constraints on cooperating autonomous data centers. We have algorithms in both setting for s - t connectivity in both models. We also give an algorithm in the low-communication model for finding a planted clique. This is an anomaly-detection problem, finding a subgraph that is larger and denser than expected. For both the low- communication algorithms, we exploit structural properties of social networks to prove performance bounds better than what is possible for general graphs. For s - t connectivity, we use known properties. For planted clique, we propose a new property: bounded number of triangles per node. This property is based upon evidence from the social science literature. We found that classic examples of social networks do not have the bounded-triangles property. This is because many social networks contain elements that are non-human, such as accounts for a business, or other automated accounts. We describe some initial attempts to distinguish human nodes from automated nodes in social networks based only on topological properties.

More Details

TYPE SAND Report YEAR 2015

DOI OSTI

Why do simple algorithms for triangle enumeration work in the real world?

Internet Mathematics

Berry, Jonathan; Fostvedt, Luke A.; Nordman, Daniel J.; Phillips, Cynthia A.; Comandur, Seshadhri; Wilson, Alyson G.

Listing all triangles is a fundamental graph operation. Triangles can have important interpretations in real-world graphs, especially social and other interaction networks. Despite the lack of provably efficient (linear, or slightly super linear) worst-case algorithms for this problem, practitioners run simple, efficient heuristics to find all triangles in graphs with millions of vertices. How are these heuristics exploiting the structure of these special graphs to provide major speedups in running time? We study one of the most prevalent algorithms used by practitioners. A trivial algorithm enumerates all paths of length 2, and checks if each such path is incident to a triangle. A good heuristic is to enumerate only those paths of length 2 in which the middle vertex has the lowest degree. It is easily implemented and is empirically known to give remarkable speedups over the trivial algorithm. We study the behavior of this algorithm over graphs with heavy-tailed degree distributions, a defining feature of real-world graphs. The erased configuration model (ECM) efficiently generates a graph with asymptotically (almost) any desired degree sequence. We show that the expected running time of this algorithm over the distribution of graphs created by the ECM is controlled by the l4/3-norm of the degree sequence. Norms of the degree sequence are a measure of the heaviness of the tail, and it is precisely this feature that allows non trivial speedups of simple triangle enumeration algorithms. As a corollary of our main theorem, we prove expected linear-time performance for degree sequences following a power law with exponent α ≥ 7/3, and non trivial speedup whenever α ∈ (2, 3).

More Details

TYPE Conference YEAR 2015

Scopus OSTI

Workshop on Streaming Graph Algorithms (WSGA) Overview

Berry, Jonathan; Phillips, Cynthia A.; Hendrickson, Bruce A.

Abstract not provided.

More Details

TYPE Presentation YEAR 2014

OSTI

Finding a planted clique in a distributed social network

Berry, Jonathan; Kearns, Aaron; Phillips, Cynthia A.; Saia, Jared

More Details

TYPE Presentation YEAR 2014

OSTI

Sandia Software for Networks from DARPA GRAPHS Program

Kolda, Tamara G.; Plantenga, Todd; Pinar, Ali P.; Comandur, Seshadhri; Berry, Jonathan; Jha, Madhav; Phillips, Cynthia A.

Abstract not provided.

More Details

TYPE Conference YEAR 2014

OSTI

Graph Exploration: to Linear Algebra (and Beyond?)

Berry, Jonathan; Wolf, Michael; Stark, Dylan T.

Abstract not provided.

More Details

TYPE Conference YEAR 2014

OSTI

Graph Algorithms for Autonomous Distributed Data Centers

Berry, Jonathan; Phillips, Cynthia A.

Abstract not provided.

More Details

TYPE Presentation YEAR 2014

OSTI

Statistically significant relational data mining :

Berry, Jonathan; Leung, Vitus J.; Phillips, Cynthia A.; Pinar, Ali P.; Robinson, David G.

This report summarizes the work performed under the project (3z(BStatitically significant relational data mining.(3y (BThe goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concetrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second are statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.

More Details

TYPE SAND Report YEAR 2014

DOI OSTI

Maintaining Connected Components for Infinite Graph Streams

Berry, Jonathan; Phillips, Cynthia A.; Plimpton, Steven J.; Shead, Timothy M.

Abstract not provided.

More Details

TYPE Presentation YEAR 2014

OSTI DOI

Community Detection: A Bayesian Approach and the Challenge of Evaluation

Berry, Jonathan; Dunlavy, Daniel M.; Phillips, Cynthia A.; Robinson, David G.

Abstract not provided.

More Details

TYPE Conference YEAR 2013

OSTI

Sensor Placement for Municipal Water Networks

Berry, Jonathan; Hart, William E.; Phillips, Cynthia A.; Watson, Jean-Paul

Abstract not provided.

More Details

TYPE Presentation YEAR 2013

OSTI DOI

Maintaining connected components for infinite graph streams

Proc. of 2nd Int. Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, BigMine 2013 - Held in Conj. with SIGKDD 2013 Conf.

Berry, Jonathan; Phillips, Cynthia A.; Plimpton, Steven J.; Shead, Timothy M.

We present an algorithm to maintain the connected components of a graph that arrives as an infinite stream of edges. We formalize the algorithm on X-stream, a new parallel theoretical computational model for infinite streams. Connectivity-related queries, including component spanning trees, are supported with some latency, returning the state of the graph at the time of the query. Because an infinite stream may eventually exceed the storage limits of any number of finite-memory processors, we assume an aging command or daemon where "uninteresting" edges are removed when the system nears capacity. Following an aging command the system will block queries until its data structures are repaired, but edges will continue to be accepted from the stream, never dropped. The algorithm will not fail unless a model-specific constant fraction of the aggregate memory across all processors is full. In normal operation, it will not fail unless aggregate memory is completely full. Unlike previous theoretical streaming models designed for finite graphs that assume a single shared memory machine or require arbitrary-size intemediate files, X-stream distributes a graph over a ring network of finite-memory processors. Though the model is synchronous and reminiscent of systolic algorithms, our implementation uses an asynchronous message-passing system. We argue the correctness of our X-stream connected components algorithm, and give preliminary experimental results on synthetic and real graph streams.

More Details

TYPE Presentation YEAR 2013

OSTI Scopus

Water security toolkit :

Klise, Katherine A.; Hart, William E.; Siirola, John D.; Phillips, Cynthia A.; Mckenna, Sean A.; Hart, David; Berry, Jonathan; Watson, Jean-Paul

Abstract not provided.

More Details

TYPE Conference YEAR 2012

OSTI

Graph Analysis with High-Performance Computing

Berry, Jonathan

Abstract not provided.

More Details

TYPE Presentation YEAR 2012

OSTI

Challenges in Streaming Graph Analysis

Berry, Jonathan; Plimpton, Steven J.; Comandur, Seshadhri

Abstract not provided.

More Details

TYPE Conference YEAR 2012

OSTI

Community Detection: A Bayesian Approach and the Challenge of Evaluation

Berry, Jonathan; Dunlavy, Daniel M.; Phillips, Cynthia A.; Robinson, David G.

Abstract not provided.

More Details

TYPE Conference YEAR 2012

OSTI

Parallel Bayesian Methods for Community Detection

Berry, Jonathan; Dunlavy, Daniel M.; Phillips, Cynthia A.; Robinson, David G.

Abstract not provided.

More Details

TYPE Conference YEAR 2012

OSTI

CHALLENGES IN PARALLEL GRAPH PROCESSING

Parallel Processing Letters

Hendrickson, Bruce A.; Berry, Jonathan

Graph algorithms are becoming increasingly important for solving many problems in scientific computing, data mining and other domains. As these problems grow in scale, parallel computing resources are required to meet their computational and memory requirements. Unfortunately, the algorithms, software, and hardware that have worked well for developing mainstream parallel scientific applications are not necessarily effective for large-scale graph problems. In this paper we present the inter-relationships between graph problems, software, and parallel hardware in the current state of the art and discuss how those issues present inherent challenges in solving large-scale graph problems. The range of these challenges suggests a research agenda for the development of scalable high-performance software for graph problems.

More Details

TYPE Journal Article YEAR 2011

OSTI DOI

Challenges in Streaming Analysis

Berry, Jonathan; Plimpton, Steven J.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Challenges in Streaming Graph Analysis

Berry, Jonathan; Plimpton, Steven J.

Abstract not provided.

More Details

TYPE Conference YEAR 2011

OSTI

Sensor placement for municipal water networks

Phillips, Cynthia A.; Boman, Erik G.; Carr, Robert D.; Hart, William E.; Berry, Jonathan; Watson, Jean-Paul; Hart, David; Mckenna, Sean A.; Riesen, Lee A.

We consider the problem of placing a limited number of sensors in a municipal water distribution network to minimize the impact over a given suite of contamination incidents. In its simplest form, the sensor placement problem is a p-median problem that has structure extremely amenable to exact and heuristic solution methods. We describe the solution of real-world instances using integer programming or local search or a Lagrangian method. The Lagrangian method is necessary for solution of large problems on small PCs. We summarize a number of other heuristic methods for effectively addressing issues such as sensor failures, tuning sensors based on local water quality variability, and problem size/approximation quality tradeoffs. These algorithms are incorporated into the TEVA-SPOT toolkit, a software suite that the US Environmental Protection Agency has used and is using to design contamination warning systems for US municipal water systems.

More Details

TYPE Conference YEAR 2010

OSTI

Accelerating multicore graph algorithms by trading latency for bandwidth

Stark, Dylan T.; Murphy, Richard C.; Barrett, Brian; Berry, Jonathan

Abstract not provided.

More Details

TYPE Conference YEAR 2010

OSTI

Listing triangles in expected linear time on a class of power law graphs

Berry, Jonathan

Enumerating triangles (3-cycles) in graphs is a kernel operation for social network analysis. For example, many community detection methods depend upon finding common neighbors of two related entities. We consider Cohen's simple and elegant solution for listing triangles: give each node a 'bucket.' Place each edge into the bucket of its endpoint of lowest degree, breaking ties consistently. Each node then checks each pair of edges in its bucket, testing for the adjacency that would complete that triangle. Cohen presents an informal argument that his algorithm should run well on real graphs. We formalize this argument by providing an analysis for the expected running time on a class of random graphs, including power law graphs. We consider a rigorously defined method for generating a random simple graph, the erased configuration model (ECM). In the ECM each node draws a degree independently from a marginal degree distribution, endpoints pair randomly, and we erase self loops and multiedges. If the marginal degree distribution has a finite second moment, it follows immediately that Cohen's algorithm runs in expected linear time. Furthermore, it can still run in expected linear time even when the degree distribution has such a heavy tail that the second moment is not finite. We prove that Cohen's algorithm runs in expected linear time when the marginal degree distribution has finite 4/3 moment and no vertex has degree larger than {radical}n. In fact we give the precise asymptotic value of the expected number of edge pairs per bucket. A finite 4/3 moment is required; if it is unbounded, then so is the number of pairs. The marginal degree distribution of a power law graph has bounded 4/3 moment when its exponent {alpha} is more than 7/3. Thus for this class of power law graphs, with degree at most {radical}n, Cohen's algorithm runs in expected linear time. This is precisely the value of {alpha} for which the clustering coefficient tends to zero asymptotically, and it is in the range that is relevant for the degree distribution of the World-Wide Web.

More Details

TYPE Conference YEAR 2010

OSTI

Comparing Programming Paradigms for Graph Algorithms

Devine, Karen; Plimpton, Steven J.; Bayer, Gregory; Barrett, Brian; Berry, Jonathan

Abstract not provided.

More Details

TYPE Conference YEAR 2010

OSTI

Implementing a portable multi-threaded graph library: The mtgl on qthreads

IPDPS 2009 - Proceedings of the 2009 IEEE International Parallel and Distributed Processing Symposium

Barrett, Brian; Berry, Jonathan; Murphy, Richard C.; Wheeler, Kyle B.

Abstract not provided.

More Details

TYPE Conference YEAR 2009

Scopus OSTI

Limited-memory techniques for sensor placement in water distribution networks

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Hart, William E.; Berry, Jonathan; Boman, Erik G.; Phillips, Cynthia A.; Riesen, Lee A.; Watson, Jean-Paul

The practical utility of optimization technologies is often impacted by factors that reflect how these tools are used in practice, including whether various real-world constraints can be adequately modeled, the sophistication of the analysts applying the optimizer, and related environmental factors (e.g. whether a company is willing to trust predictions from computational models). Other features are less appreciated, but of equal importance in terms of dictating the successful use of optimization. These include the scale of problem instances, which in practice drives the development of approximate solution techniques, and constraints imposed by the target computing platforms. End-users often lack state-of-the-art computers, and thus runtime and memory limitations are often a significant, limiting factor in algorithm design. When coupled with large problem scale, the result is a significant technological challenge. We describe our experience developing and deploying both exact and heuristic algorithms for placing sensors in water distribution networks to mitigate against damage due intentional or accidental introduction of contaminants. The target computing platforms for this application have motivated limited-memory techniques that can optimize large-scale sensor placement problems. © 2008 Springer Berlin Heidelberg.

More Details

TYPE Conference YEAR 2008

OSTI Scopus

Low-memory Lagrangian relaxation methods for sensor placement in municipal water networks

World Environmental and Water Resources Congress 2008: Ahupua'a - Proceedings of the World Environmental and Water Resources Congress 2008

Berry, Jonathan; Boman, Erik G.; Phillips, Cynthia A.; Riesen, Lee A.

Placing sensors in municipal water networks to protect against a set of contamination events is a classic p-median problem for most objectives when we assume that sensors are perfect. Many researchers have proposed exact and approximate solution methods for this p-median formulation. For full-scale networks with large contamination event suites, one must generally rely on heuristic methods to generate solutions. These heuristics provide feasible solutions, but give no quality guarantee relative to the optimal placement. In this paper we apply a Lagrangian relaxation method in order to compute lower bounds on the expected impact of suites of contamination events. In all of our experiments with single objectives, these lower bounds establish that the GRASP local search method generates solutions that are provably optimal to to within a fraction of a percentage point. Our Lagrangian heuristic also provides good solutions itself and requires only a fraction of the memory of GRASP. We conclude by describing two variations of the Lagrangian heuristic: an aggregated version that trades off solution quality for further memory savings, and a multi-objective version which balances objectives with additional goals. © 2008 ASCE.

More Details

TYPE Conference YEAR 2008

OSTI Scopus

Low-memory Lagrangian relaxation methods for sensor placement in municipal water networks

World Environmental and Water Resources Congress 2008: Ahupua'a - Proceedings of the World Environmental and Water Resources Congress 2008

Berry, Jonathan; Boman, Erik G.; Phillips, Cynthia A.; Riesen, Lee A.

Placing sensors in municipal water networks to protect against a set of contamination events is a classic p-median problem for most objectives when we assume that sensors are perfect. Many researchers have proposed exact and approximate solution methods for this p-median formulation. For full-scale networks with large contamination event suites, one must generally rely on heuristic methods to generate solutions. These heuristics provide feasible solutions, but give no quality guarantee relative to the optimal placement. In this paper we apply a Lagrangian relaxation method in order to compute lower bounds on the expected impact of suites of contamination events. In all of our experiments with single objectives, these lower bounds establish that the GRASP local search method generates solutions that are provably optimal to to within a fraction of a percentage point. Our Lagrangian heuristic also provides good solutions itself and requires only a fraction of the memory of GRASP. We conclude by describing two variations of the Lagrangian heuristic: an aggregated version that trades off solution quality for further memory savings, and a multi-objective version which balances objectives with additional goals. © 2008 ASCE.

More Details

TYPE Presentation YEAR 2008

DOI OSTI Scopus

The TEVA-SPOT toolkit for drinking water contaminant warning system design

World Environmental and Water Resources Congress 2008: Ahupua'a - Proceedings of the World Environmental and Water Resources Congress 2008

Hart, William E.; Berry, Jonathan; Boman, Erik G.; Murray, Regan; Phillips, Cynthia A.; Riesen, Lee A.; Watson, Jean-Paul

We present the TEVA-SPOT Toolkit, a sensor placement optimization tool developed within the USEPA TEVA program. The TEVA-SPOT Toolkit provides a sensor placement framework that facilitates research in sensor placement optimization and enables the practical application of sensor placement solvers to real-world CWS design applications. This paper provides an overview of its key features, and then illustrates how this tool can be flexibly applied to solve a variety of different types of sensor placement problems. © 2008 ASCE.

More Details

TYPE Conference YEAR 2008

Scopus OSTI

Tolerating the community detection resolution limit with edge weighting

Proposed for publication in the Proceedings of the National Academy of Sciences.

Hendrickson, Bruce A.; Laviolette, Randall A.; Phillips, Cynthia A.; Berry, Jonathan

Communities of vertices within a giant network such as the World-Wide-Web are likely to be vastly smaller than the network itself. However, Fortunato and Barthelemy have proved that modularity maximization algorithms for community detection may fail to resolve communities with fewer than {radical} L/2 edges, where L is the number of edges in the entire network. This resolution limit leads modularity maximization algorithms to have notoriously poor accuracy on many real networks. Fortunato and Barthelemy's argument can be extended to networks with weighted edges as well, and we derive this corollary argument. We conclude that weighted modularity algorithms may fail to resolve communities with fewer than {radical} W{epsilon}/2 total edge weight, where W is the total edge weight in the network and {epsilon} is the maximum weight of an inter-community edge. If {epsilon} is small, then small communities can be resolved. Given a weighted or unweighted network, we describe how to derive new edge weights in order to achieve a low {epsilon}, we modify the 'CNM' community detection algorithm to maximize weighted modularity, and show that the resulting algorithm has greatly improved accuracy. In experiments with an emerging community standard benchmark, we find that our simple CNM variant is competitive with the most accurate community detection methods yet proposed.

More Details

TYPE Journal Article YEAR 2008

OSTI