Low-Communication Asynchronous Distributed Generalized Canonical Polyadic Tensor Decomposition
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
IEEE Transactions on Parallel and Distributed Systems
MTTKRP is the bottleneck operation in algorithms used to compute the CP tensor decomposition. For sparse tensors, utilizing the compressed sparse fibers (CSF) storage format and the CSF-oriented MTTKRP algorithms is important for both memory and computational efficiency on distributed-memory architectures. Existing intelligent tensor partitioning models assume the computational cost of MTTKRP to be proportional to the total number of nonzeros in the tensor. However, this is not the case for the CSF-oriented MTTKRP on distributed-memory architectures. We outline two deficiencies of nonzero-based intelligent partitioning models when CSF-oriented MTTKRP operations are performed locally: failure to encode processors' computational loads and increase in total computation due to fiber fragmentation. We focus on existing fine-grain hypergraph model and propose a novel vertex weighting scheme that enables this model encode correct computational loads of processors. We also propose to augment the fine-grain model by fiber nets for reducing the increase in total computational load via minimizing fiber fragmentation. In this way, the proposed model encodes minimizing the load of the bottleneck processor. Parallel experiments with real-world sparse tensors on up to 1024 processors prove the validity of the outlined deficiencies and demonstrate the merit of our proposed improvements in terms of parallel runtimes.
Automated vehicles (AV) hold great promise for improving safety, as well as reducing congestion and emissions. In order to make automated vehicles commercially viable, a reliable and highperformance vehicle-based computing platform that meets ever-increasing computational demands will be key. Given the state of existing digital computing technology, designers will face significant challenges in meeting the needs of highly automated vehicles without exceeding thermal constraints or consuming a large portion of the energy available on vehicles, thus reducing range between charges or refills. The accompanying increases in energy for AV use will place increased demand on energy production and distribution infrastructure, which also motivates increasing computational energy efficiency.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Numerical Linear Algebra with Applications
The generalized singular value decomposition (GSVD) is a valuable tool that has many applications in computational science. However, computing the GSVD for large-scale problems is challenging. Motivated by applications in hyper-differential sensitivity analysis (HDSA), we propose new randomized algorithms for computing the GSVD which use randomized subspace iteration and weighted QR factorization. Detailed error analysis is given which provides insight into the accuracy of the algorithms and the choice of the algorithmic parameters. We demonstrate the performance of our algorithms on test matrices and a large-scale model problem where HDSA is used to study subsurface flow.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.