Publications

Results 1–25 of 30

Search results

Jump to search filters

Newton-based optimization for Kullback-Leibler nonnegative tensor factorizations

Optimization Methods and Software

Plantenga, Todd P.; Kolda, Tamara G.; Hansen, Samantha

Tensor factorizations with nonnegativity constraints have found application in analysing data from cyber traffic, social networks, and other areas. We consider application data best described as being generated by a Poisson process (e.g. count data), which leads to sparse tensors that can be modelled by sparse factor matrices. In this paper, we investigate efficient techniques for computing an appropriate canonical polyadic tensor factorization based on the Kullback–Leibler divergence function. We propose novel subproblem solvers within the standard alternating block variable approach. Our new methods exploit structure and reformulate the optimization problem as small independent subproblems. We employ bound-constrained Newton and quasi-Newton methods. Finally, we compare our algorithms against other codes, demonstrating superior speed for high accuracy results and the ability to quickly find sparse solutions.

More Details

A scalable generative graph model with community structure

SIAM Journal on Scientific Computing

Kolda, Tamara G.; Pinar, Ali; Plantenga, Todd P.; Comandur, Seshadhri C.

Network data is ubiquitous and growing, yet we lack realistic generative network models that can be calibrated to match real-world data. The recently proposed block two-level Erd?os- Rényi (BTER) model can be tuned to capture two fundamental properties: degree distribution and clustering coefficients. The latter is particularly important for reproducing graphs with community structure, such as social networks. In this paper, we compare BTER to other scalable models and show that it gives a better fit to real data. We provide a scalable implementation that requires only O(dmax) storage, where dmax is the maximum number of neighbors for a single node. The generator is trivially parallelizable, and we show results for a Hadoop MapReduce implementation for modeling a real-worldWeb graph with over 4.6 billion edges. We propose that the BTER model can be used as a graph generator for benchmarking purposes and provide idealized degree distributions and clustering coefficient profiles that can be tuned for user specifications.

More Details

Inexact subgraph isomorphism in MapReduce

Journal of Parallel and Distributed Computing

Plantenga, Todd P.

Inexact subgraph matching based on type-isomorphism was introduced by Berry et al. [J. Berry, B. Hendrickson, S. Kahan, P. Konecny, Software and algorithms for graph queries on multithreaded architectures, in: Proc. IEEE International Parallel and Distributed Computing Symposium, IEEE, 2007, pp. 1-14] as a generalization of the exact subgraph matching problem. Enumerating small subgraph patterns in very large graphs is a core problem in the analysis of social networks, bioinformatics data sets, and other applications. This paper describes a MapReduce algorithm for subgraph type-isomorphism matching. The MapReduce computing framework is designed for distributed computing on massive data sets, and the new algorithm leverages MapReduce techniques to enable processing of graphs with billions of vertices. The paper also introduces a new class of walk-level constraints for narrowing the set of matches. Constraints meeting criteria defined in the paper are useful for specifying more precise patterns and for improving algorithm performance. Results are provided on a variety of graphs, with size ranging up to billions of vertices and edges, including graphs that follow a power law degree distribution. © 2012 Elsevier Inc. All rights reserved.

More Details

C%2B%2B tensor toolbox user manual

Plantenga, Todd P.; Kolda, Tamara G.

The C++ Tensor Toolbox is a software package for computing tensor decompositions. It is based on the Matlab Tensor Toolbox, and is particularly optimized for sparse data sets. This user manual briefly overviews tensor decomposition mathematics, software capabilities, and installation of the package. Tensors (also known as multidimensional arrays or N-way arrays) are used in a variety of applications ranging from chemometrics to network analysis. The Tensor Toolbox provides classes for manipulating dense, sparse, and structured tensors in C++. The Toolbox compiles into libraries and is intended for use with custom applications written by users.

More Details

Analytics for Cyber Network Defense

Plantenga, Todd P.; Kolda, Tamara G.

This report provides a brief survey of analytics tools considered relevant to cyber network defense (CND). Ideas and tools come from elds such as statistics, data mining, and knowledge discovery. Some analytics are considered standard mathematical or statistical techniques, while others re ect current research directions. In all cases the report attempts to explain the relevance to CND with brief examples.

More Details
Results 1–25 of 30
Results 1–25 of 30