Publications

Results 3201–3225 of 9,998

Search results

Jump to search filters

A distributed-memory hierarchical solver for general sparse linear systems

Parallel Computing

Rajamanickam, Sivasankaran R.; Chen, Chao; Pouransari, Hadi; Boman, Erik G.; Darve, Eric

We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We present various numerical results to demonstrate the versatility and scalability of the parallel algorithm.

More Details

Simple effective conservative treatment of uncertainty from sparse samples of random functions

ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems. Part B. Mechanical Engineering

Romero, Vicente J.; Schroeder, Benjamin B.; Dempsey, James F.; Lewis, John R.; Breivik, Nicole L.; Orient, George E.; Antoun, Bonnie R.; Winokur, Justin W.; Glickman, Matthew R.; Red-Horse, John R.

This paper examines the variability of predicted responses when multiple stress-strain curves (reflecting variability from replicate material tests) are propagated through a finite element model of a ductile steel can being slowly crushed. Over 140 response quantities of interest (including displacements, stresses, strains, and calculated measures of material damage) are tracked in the simulations. Each response quantity’s behavior varies according to the particular stress-strain curves used for the materials in the model. We desire to estimate response variability when only a few stress-strain curve samples are available from material testing. Propagation of just a few samples will usually result in significantly underestimated response uncertainty relative to propagation of a much larger population that adequately samples the presiding random-function source. A simple classical statistical method, Tolerance Intervals, is tested for effectively treating sparse stress-strain curve data. The method is found to perform well on the highly nonlinear input-to-output response mappings and non-standard response distributions in the can-crush problem. The results and discussion in this paper support a proposition that the method will apply similarly well for other sparsely sampled random variable or function data, whether from experiments or models. Finally, the simple Tolerance Interval method is also demonstrated to be very economical.

More Details

Formulation and computation of dynamic, interface-compatible Whitney complexes in three dimensions

Journal of Computational Physics

Siefert, Christopher S.; Kramer, Richard M.; Voth, Thomas E.; Bochev, Pavel B.

A discrete De Rham complex enables compatible, structure-preserving discretizations for a broad range of partial differential equations problems. Such discretizations can correctly reproduce the physics of interface problems, provided the grid conforms to the interface. However, large deformations, complex geometries, and evolving interfaces makes generation of such grids difficult. We develop and demonstrate two formally equivalent approaches that, for a given background mesh, dynamically construct an interface-conforming discrete De Rham complex. Both approaches start by dividing cut elements into interface-conforming subelements but differ in how they build the finite element basis on these subelements. The first approach discards the existing non-conforming basis of the parent element and replaces it by a dynamic set of degrees of freedom of the same kind. The second approach defines the interface-conforming degrees of freedom on the subelements as superpositions of the basis functions of the parent element. These approaches generalize the Conformal Decomposition Finite Element Method (CDFEM) and the extended finite element method with algebraic constraints (XFEM-AC), respectively, across the De Rham complex.

More Details

Sparse Matrix-Matrix Multiplication on Multilevel Memory Architectures: Algorithms and Experiments

Deveci, Mehmet D.; Hammond, Simon D.; Wolf, Michael W.; Rajamanickam, Sivasankaran R.

Architectures with multiple classes of memory media are becoming a common part of mainstream supercomputer deployments. So called multi-level memories offer differing characteristics for each memory component including variation in bandwidth, latency and capacity. This paper investigates the performance of sparse matrix multiplication kernels on two leading highperformance computing architectures — Intel's Knights Landing processor and NVIDIA's Pascal GPU. We describe a data placement method and a chunking-based algorithm for our kernels that exploits the existence of the multiple memory spaces in each hardware platform. We evaluate the performance of these methods w.r.t. standard algorithms using the auto-caching mechanisms Our results show that standard algorithms that exploit cache reuse performed as well as multi-memory-aware algorithms for architectures such as Ki\iLs where the memory subsystems have similar latencies. However, for architectures such as GPUS where memory subsystems differ significantly in both bandwidth and latency, multi-memory-aware methods are crucial for good performance. In addition, our new approaches permit the user to run problems that require larger capacities than the fastest memory of each compute node without depending on the software-managed cache mechanisms.

More Details
Results 3201–3225 of 9,998
Results 3201–3225 of 9,998