Publications

25 Results

Search results

Performance of Gather/Scatter Operations

Pase, Douglas M.; Agelastos, Anthony M.

In this paper we describe the performance of two operations, gather and scatter. The operations we describe are simplified versions of those used in the implementation of sparse matrix libraries such as sparse-blas. Similar operations can also be found in benchmarks such as HPCG. Gather and scatter operations are memory load and store operations that are related to other memory operations such as Stream Triad and DAXPY, but have an additional dependence between memory loads that affects performance. We describe how the operations behave on current technology and identify features that enhance their performance. However, our description of hardware is general rather than specific to a single architecture.
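The abstract does not give the kernels themselves, so the following is only an illustrative sketch of what gather and scatter generally look like. The function names (`gather`, `scatter`, `triad`) and the sample data are hypothetical and not taken from the paper. The point is that each gather load or scatter store uses an address computed from a previously loaded index, whereas Stream Triad touches memory at consecutive addresses with no indirection.

```python
def gather(src, idx):
    """Gather: out[i] = src[idx[i]].

    Each load from src uses an address that depends on an index
    loaded first, which is the extra dependence between loads.
    """
    return [src[j] for j in idx]


def scatter(dst, idx, vals):
    """Scatter: dst[idx[i]] = vals[i].

    Each store address depends on an index that must be loaded first.
    Modifies dst in place and returns it for convenience.
    """
    for j, v in zip(idx, vals):
        dst[j] = v
    return dst


def triad(b, c, scalar):
    """Stream Triad, for contrast: a[i] = b[i] + scalar * c[i].

    All accesses are to consecutive elements; there is no index array.
    """
    return [bi + scalar * ci for bi, ci in zip(b, c)]


if __name__ == "__main__":
    src = [10.0, 20.0, 30.0, 40.0]
    idx = [3, 0, 2, 1]            # hypothetical permutation of 0..3
    packed = gather(src, idx)     # reorders src through idx
    restored = scatter([0.0] * len(src), idx, packed)  # inverts the gather
    print(packed)
    print(restored)
```

Because `idx` here is a permutation, scattering the gathered values through the same index array reproduces the original data. Production kernels, such as those in sparse matrix libraries, operate on large arrays in compiled code, where the access pattern encoded in the index array largely determines how well caches and prefetchers can help.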

Application of Performance Analysis Tools on SNL ASC Codes

Agelastos, Anthony M.; Pase, Douglas M.; Amspaugh, Kathleen A.; Dinge, Dennis D.; Haskell, Karen H.; Ice, Lisa I.; Lamb, Justin M.; Rajan, Mahesh R.; Shaw, Ryan P.; Stevenson, Joel O.; Brunini, Victor B.; Clausen, Jonathan C.; Crawford, Martin J.; Valdez, Greg D.

This milestone 1) exercised a broad set of performance profiling and analysis tools, including tools whose development has been promoted by the ASC program; 2) exercised the tools on two different SNL ASC codes, one Sierra code (Sierra/Aria, a C++ codebase) and one RAMSES code (ITS, a Fortran codebase); and 3) exercised the tools on multiple platforms, including the CTS-1 (e.g., Serrano) and ATS-1 Trinity (e.g., Mutrino) platforms. The milestone generated a large body of strong-scaling, weak-scaling, trend, and profile data for multiple versions and problem cases of each of the two codes. Considerable experience was gained with the various tools, including identification of problems, an improved understanding of feature sets, enhanced usage documentation, and insights for future tool development. Results are provided from a large number and variety of performance analysis runs with the target codes, together with instructions for using the tools with the codes.
