Publications

Results 26–50 of 57
Skip to search filters

Graph analysis with high-performance computing

Computing in Science and Engineering

Hendrickson, Bruce A.; Berry, Jonathan W.

Large, complex graphs arise in many settings including the Internet, social networks, and communication networks. To study such data sets, the authors explored the use of highperformance computing (HPC) for graph algorithms. They found that the challenges in these applications are quite different from those arising in traditional HPC applications and that massively multithreaded machines are well suited for graph problems. © 2008 IEEE.

More Details

Combinatorial scientific computing: The enabling power of discrete algorithms in computational science

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Hendrickson, Bruce A.; Pothen, Alex

Combinatorial algorithms have long played a crucial, albeit under-recognized role in scientific computing. This impact ranges well beyond the familiar applications of graph algorithms in sparse matrices to include mesh generation, optimization, computational biology and chemistry, data analysis and parallelization. Trends in science and in computing suggest strongly that the importance of discrete algorithms in computational science will continue to grow. This paper reviews some of these many past successes and highlights emerging areas of promise and opportunity. © Springer-Verlag Berlin Heidelberg 2007.

More Details

Algorithms and architectures for high performance analysis of semantic graphs

Hendrickson, Bruce A.

Semantic graphs offer one promising avenue for intelligence analysis in homeland security. They provide a mechanism for describing a wide variety of relationships between entities of potential interest. The vertices are nouns of various types, e.g. people, organizations, events, etc. Edges in the graph represent different types of relationships between entities, e.g. 'is friends with', 'belongs-to', etc. Semantic graphs offer a number of potential advantages as a knowledge representation system. They allow information of different kinds, and collected in differing ways, to be combined in a seamless manner. A semantic graph is a very compressed representation of some of relationship information. It has been reported that the semantic graph can be two orders of magnitude smaller than the processed intelligence data. This allows for much larger portions of the data universe to be resident in computer memory. Many intelligence queries that are relevant to the terrorist threat are naturally expressed in the language of semantic graphs. One example is the search for 'interesting' relationships between two individuals or between an individual and an event, which can be phrased as a search for short paths in the graph. Another example is the search for an analyst-specified threat pattern, which can be cast as an instance of subgraph isomorphism. It is important to note than many kinds of analysis are not relationship based, so these are not good candidates for semantic graphs. Thus, a semantic graph should always be used in conjunction with traditional knowledge representation and interface methods. Operations that involve looking for chains of relationships (e.g. friend of a friend) are not efficiently executable in a traditional relational database. However, the semantic graph can be thought of as a pre-join of the database, and it is ideally suited for these kinds of operations. Researchers at Sandia National Laboratories are working to facilitate semantic graph analysis. Since intelligence datasets can be extremely large, the focus of this work is on the use of parallel computers. We have been working to develop scalable parallel algorithms that will be at the core of a semantic graph analysis infrastructure. Our work has involved two different thrusts, corresponding to two different computer architectures. The first architecture of interest is distributed memory, message passing computers. These machines are ubiquitous and affordable, but they are challenging targets for graph algorithms. Much of our distributed-memory work to date has been collaborative with researchers at Lawrence Livermore National Laboratory and has focused on finding short paths on distributed memory parallel machines. Our implementation on 32K processors of BlueGene/Light finds shortest paths between two specified vertices in just over a second for random graphs with 4 billion vertices.

More Details

Parallel hypergraph partitioning for scientific computing

Boman, Erik G.; Devine, Karen D.; Heaphy, Robert T.; Hendrickson, Bruce A.

Graph partitioning is often used for load balancing in parallel computing, but it is known that hypergraph partitioning has several advantages. First, hypergraphs more accurately model communication volume, and second, they are more expressive and can better represent nonsymmetric problems. Hypergraph partitioning is particularly suited to parallel sparse matrix-vector multiplication, a common kernel in scientific computing. We present a parallel software package for hypergraph (and sparse matrix) partitioning developed at Sandia National Labs. The algorithm is a variation on multilevel partitioning. Our parallel implementation is novel in that it uses a two-dimensional data distribution among processors. We present empirical results that show our parallel implementation achieves good speedup on several large problems (up to 33 million nonzeros) with up to 64 processors on a Linux cluster.

More Details

Combinatorial parallel and scientific computing

Proposed for publication as a book chapter in "Parallel Scientific Computing".

Hendrickson, Bruce A.

Combinatorial algorithms have long played a pivotal enabling role in many applications of parallel computing. Graph algorithms in particular arise in load balancing, scheduling, mapping and many other aspects of the parallelization of irregular applications. These are still active research areas, mostly due to evolving computational techniques and rapidly changing computational platforms. But the relationship between parallel computing and discrete algorithms is much richer than the mere use of graph algorithms to support the parallelization of traditional scientific computations. Important, emerging areas of science are fundamentally discrete, and they are increasingly reliant on the power of parallel computing. Examples include computational biology, scientific data mining, and network analysis. These applications are changing the relationship between discrete algorithms and parallel computing. In addition to their traditional role as enablers of high performance, combinatorial algorithms are now customers for parallel computing. New parallelization techniques for combinatorial algorithms need to be developed to support these nontraditional scientific approaches. This chapter will describe some of the many areas of intersection between discrete algorithms and parallel scientific computing. Due to space limitations, this chapter is not a comprehensive survey, but rather an introduction to a diverse set of techniques and applications with a particular emphasis on work presented at the Eleventh SIAM Conference on Parallel Processing for Scientific Computing. Some topics highly relevant to this chapter (e.g. load balancing) are addressed elsewhere in this book, and so we will not discuss them here.

More Details

A scalable distributed parallel breadth-first search algorithm on BlueGene/L

Proceedings of the ACM/IEEE 2005 Supercomputing Conference, SC'05

Yoo, Andy; Chow, Edmond; Henderson, Keith; McLendon, William; Hendrickson, Bruce A.; Çatalyürek, Ümit

Many emerging large-scale data science applications require searching large graphs distributed across multiple memories and processors. This paper presents a distributed breadth-first search (BFS) scheme that scales for random graphs with up to three billion vertices and 30 billion edges. Scalability was tested on IBM BlueGene/L with 32,768 nodes at the Lawrence Livermore National Laboratory. Scalability was obtained through a series of optimizations, in particular, those that ensure scalable use of memory. We use 2D (edge) partitioning of the graph instead of conventional ID (vertex) partitioning to reduce communication overhead. For Poisson random graphs, we show that the expected size of the messages is scalable for both 2D and ID partitionings. Finally, we have developed efficient collective communication functions for the 3D torus architecture of BlueGene/L that also take advantage of the structure in the problem. The performance and characteristics of the algorithm are measured and reported. © 2005 IEEE.

More Details

Solving elliptic finite element systems in near-linear time with support preconditioners

Proposed for publication in the SIAM Journal on Matrix Analysis.

Boman, Erik G.; Hendrickson, Bruce A.

We consider linear systems arising from the use of the finite element method for solving a certain class of linear elliptic problems. Our main result is that these linear systems, which are symmetric and positive semidefinite, are well approximated by symmetric diagonally dominant matrices. Our framework for defining matrix approximation is support theory. Significant graph theoretic work has already been developed in the support framework for preconditioners in the diagonally dominant case, and in particular it is known that such systems can be solved with iterative methods in nearly linear time. Thus, our approximation result implies that these graph theoretic techniques can also solve a class of finite element problems in nearly linear time. We show that the quality of our approximation, which controls the number of iterations in the preconditioned iterative solver, depends primarily on a mesh quality measure but not on the problem size or shape of the domain.

More Details
Results 26–50 of 57
Results 26–50 of 57