Publications

Results 51–100 of 200

Search results

Jump to search filters

Measuring and Modeling Bipartite Graphs with Community Structure

Kolda, Tamara G.; Pinar, Ali P.; Aksoy, Sinan

Network science is a powerful tool for analyzing complex systems in fields ranging from sociology to engineering to biology. This paper is focused on generative models of bipartite graphs, also known as two-way graphs. We propose two generative models that can be easily tuned to reproduce the characteristics of real-world networks, not just qualitatively, but quantitatively. The measurements we consider are the degree distributions and the bipartite clustering coefficient, which we refer to as the metamorphosis coefficient. We define edge, node, and degreewise metamorphosis coefficients, enabling a more detailed understand of the bipartite community structure. Our proposed bipartite Chung-Lu model is able to reproduce real-world degree distributions, and our proposed bipartite “BTER” model reproduces both the degree distributions as well as the degreewise metamorphosis coefficients. We demonstrate the effectiveness of these models on several real-world data sets.

More Details

Parallel Tensor Compression for Large-Scale Scientific Data

Kolda, Tamara G.; Ballard, Grey B.; Austin, Woody N.

As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed memory parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.

More Details

A brief summary on formalizing parallel tensor distributions redistributions and algorithm derivations

Schatz, Martin D.; Kolda, Tamara G.; Van De Geijn, Robert

Large-scale datasets in computational chemistry typically require distributed-memory parallel methods to perform a special operation known as tensor contraction. Tensors are multidimensional arrays, and a tensor contraction is akin to matrix multiplication with special types of permutations. Creating an efficient algorithm and optimized im- plementation in this domain is complex, tedious, and error-prone. To address this, we develop a notation to express data distributions so that we can apply use automated methods to find optimized implementations for tensor contractions. We consider the spin-adapted coupled cluster singles and doubles method from computational chemistry and use our methodology to produce an efficient implementation. Experiments per- formed on the IBM Blue Gene/Q and Cray XC30 demonstrate impact both improved performance and reduced memory consumption.

More Details

Numerical optimization for symmetric tensor decomposition

Mathematical Programming

Kolda, Tamara G.

We consider the problem of decomposing a real-valued symmetric tensor as the sum of outer products of real-valued vectors. Algebraic methods exist for computing complex-valued decompositions of symmetric tensors, but here we focus on real-valued decompositions, both unconstrained and nonnegative, for problems with low-rank structure. We discuss when solutions exist and how to formulate the mathematical program. Numerical results show the properties of the proposed formulations (including one that ignores symmetry) on a set of test problems and illustrate that these straightforward formulations can be effective even though the problem is nonconvex.

More Details

Newton-based optimization for Kullback-Leibler nonnegative tensor factorizations

Optimization Methods and Software

Plantenga, Todd; Kolda, Tamara G.; Hansen, Samantha

Tensor factorizations with nonnegativity constraints have found application in analysing data from cyber traffic, social networks, and other areas. We consider application data best described as being generated by a Poisson process (e.g. count data), which leads to sparse tensors that can be modelled by sparse factor matrices. In this paper, we investigate efficient techniques for computing an appropriate canonical polyadic tensor factorization based on the Kullback–Leibler divergence function. We propose novel subproblem solvers within the standard alternating block variable approach. Our new methods exploit structure and reformulate the optimization problem as small independent subproblems. We employ bound-constrained Newton and quasi-Newton methods. Finally, we compare our algorithms against other codes, demonstrating superior speed for high accuracy results and the ability to quickly find sparse solutions.

More Details

Wedge sampling for computing clustering coefficients and triangle counts on large graphs

Statistical Analysis and Data Mining

Comandur, Seshadhri; Pinar, Ali P.; Kolda, Tamara G.

Graphs are used to model interactions in a variety of contexts, and there is a growing need to quickly assess the structure of such graphs. Some of the most useful graph metrics are based on triangles, such as those measuring social cohesion. Algorithms to compute them can be extremely expensive, even for moderately sized graphs with only millions of edges. Previous work has considered node and edge sampling; in contrast, we consider wedge sampling, which provides faster and more accurate approximations than competing techniques. Additionally, wedge sampling enables estimating local clustering coefficients, degree-wise clustering coefficients, uniform triangle sampling, and directed triangle counts. Our methods come with provable and practical probabilistic error estimates for all computations. We provide extensive results that show our methods are both more accurate and faster than state-of-the-art alternatives. © 2014 Wiley Periodicals, Inc.

More Details

An adaptive shifted power method for computing generalized tensor eigenpairs

SIAM Journal on Matrix Analysis and Applications

Kolda, Tamara G.; Mayo, Jackson R.

Several tensor eigenpair definitions have been put forth in the past decade, but these can all be unified under generalized tensor eigenpair framework, introduced by Chang, Pearson, and Zhang [J. Math. Anal. Appl., 350 (2009), pp. 416-422]. Given mth-order, n-dimensional realvalued symmetric tensors A and B, the goal is to find λ ε ℝ and x ε ℝn, x ≠= 0 such that Axm-1 = λBxm-1. Different choices for B yield different versions of the tensor eigenvalue problem. We present our generalized eigenproblem adaptive power (GEAP) method for solving the problem, which is an extension of the shifted symmetric higher-order power method (SS-HOPM) for finding Z-eigenpairs. A major drawback of SS-HOPM is that its performance depended on choosing an appropriate shift, but our GEAP method also includes an adaptive method for choosing the shift automatically.

More Details
Results 51–100 of 200
Results 51–100 of 200