Structural Simulation Toolkit (SST)

Rodrigues, Arun; Moore, Branden J.; Hammond, Simon; Hemmert, Karl S.; Voskuilen, Gwendolyn R.

Abstract not provided.

TYPE Conference Poster YEAR 2017

Performance Analysis for Using Non-Volatile Memory DIMMs: Opportunities and Challenges

Awad, Amro; Hammond, Simon; Hughes, Clayton; Rodrigues, Arun; Hemmert, Karl S.; Hoekstra, Robert J.

Abstract not provided.

TYPE Conference Poster YEAR 2017

Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation

Journal of Parallel and Distributed Computing

Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2015

International Journal of High Performance Computing Applications

The impact of hybrid-core processors on MPI message rate

ACM International Conference Proceeding Series

Barrett, Brian; Brightwell, Ronald B.; Hammond, Simon; Hemmert, Karl S.

Power and energy concerns are motivating chip manufacturers to consider future hybrid-core processor designs that combine a small number of traditional cores optimized for single-thread performance with a large number of simpler cores optimized for throughput performance. This trend is likely to impact the way compute resources for network protocol processing functions are allocated and managed. In particular, the performance of MPI match processing is critical to achieving high message throughput. In this paper, we analyze the ability of simple and more complex cores to perform MPI matching operations for various scenarios in order to gain insight into how MPI implementations for future hybrid-core processors should be designed.

TYPE Conference YEAR 2013

Using the Cray Gemini Performance Counters

Pedretti, Kevin; Vaughan, Courtenay T.; Hemmert, Karl S.; Barrett, Richard F.

Abstract not provided.

TYPE Conference YEAR 2012

The Structural Simulation Toolkit

Proposed for publication in SIGMETRICS Performance Evaluation Review.

Proposed for publication in Future Generation Computer Systems.

Proposed for publication in Advances in Parallel Computing.

Ang, James A.; Brightwell, Ronald B.; Dosanjh, Sudip S.; Hemmert, Karl S.; Rodrigues, Arun

Abstract not provided.

TYPE Journal Article YEAR 2012

Hemmert, Karl S.; Rodrigues, Arun; Underwood, Keith D.

The latency and throughput of MPI messages are critically important to a range of parallel scientific applications. In many modern networks, both of these performance characteristics are largely driven by the performance of a processor on the network interface. Because of the semantics of MPI, this embedded processor is forced to traverse a linked list of posted receives each time a message is received. As this list grows long, the latency of message reception grows and the throughput of MPI messages decreases. This paper presents a novel hardware feature to handle list management functions on a network interface. By moving functions such as list insertion, list traversal, and list deletion to the hardware unit, latencies are decreased by up to 20% in the zero-length queue case with dramatic improvements in the presence of long queues. Similarly, the throughput is increased by up to 10% in the zero-length queue case and by nearly 100% in the presence of queues of 30 messages.

TYPE Conference YEAR 2005

A hardware acceleration unit for MPI queue processing

Hemmert, Karl S.; Brightwell, Ronald B.; Rodrigues, Arun; Murphy, Richard C.; Underwood, Keith D.

Abstract not provided.

TYPE Conference YEAR 2004
