Publications Search

Application performance on the Tri-Lab Linux Capacity Cluster - TLCC

International Journal of Distributed Systems and Technologies

Rajan, Mahesh; Doerfler, Douglas W.; Vaughan, Courtenay T.; Epperson, Marcus

In a recent acquisition by DOE/NNSA, several large capacity computing clusters called TLCC have been installed at the DOE labs: SNL, LANL and LLNL. The TLCC architecture, with ccNUMA, multi-socket, multi-core nodes and an InfiniBand interconnect, is representative of the trend in HPC architectures. This paper examines application performance on TLCC, contrasting it with performance on Red Storm/Cray XT4. TLCC and Red Storm share similar AMD processors and memory DIMMs. Red Storm, however, has single-socket nodes and a custom interconnect. Micro-benchmarks and performance analysis tools help explain the causes of the observed performance differences. Control of processor and memory affinity on TLCC with the numactl utility is shown to result in significant performance gains and is essential to attenuate the detrimental impact of OS interference and cache-coherency overhead. While previous studies have investigated the impact of affinity control mostly in the context of small SMP systems, the focus of this paper is on highly parallel MPI applications.

TYPE Report YEAR 2010

OSTI Scopus

HPC top 10 InfiniBand Machine: a 3D Torus IB interconnect on Red Sky

Naegle, John H.; Monk, Stephen T.; Schutt, James A.; Doerfler, Douglas W.; Rajan, Mahesh

This presentation discusses the following topics: (1) Red Sky background; (2) 3D torus interconnect concepts; (3) difficulties of a torus in IB; (4) new routing code for an IB 3D torus; (5) the Red Sky 3D torus implementation; and (6) managing a large IB machine. Computing at Sandia falls into two categories: (1) capability computing, which is designed for scaling of single large runs and is usually proprietary for maximum performance, with Red Storm as Sandia's current capability machine; and (2) capacity computing, which is computing for the masses, with 100s of jobs and 100s of users, extreme reliability required, and flexibility for a changing workload. Thunderbird will be decommissioned this quarter, Red Sky is the future capacity computing platform, and the Red Mesa machine is for the National Renewable Energy Lab. The main themes of Red Sky are: (1) cheaper: 5X the capacity of Tbird at 2/3 the cost, and substantially cheaper per flop than the last large capacity machine purchase; (2) leaner: lower operational costs, three security environments via a modular fabric, expandable, upgradeable and extensible, and designed for a 6-year life cycle; and (3) greener: 15% less power (1/6th the power per flop), 40% less water (5M gallons saved annually), 10X better cooling efficiency, and a 4X denser footprint.

TYPE Conference YEAR 2010

OSTI

Predicting AMD Magny-Cours Performance for a Suite of NNSA/ASC Applications

Doerfler, Douglas W.; Rajan, Mahesh; Vaughan, Courtenay T.

Abstract not provided.

TYPE Report YEAR 2010

OSTI

Evaluation of the impact chip multiprocessors have on SNL application performance

Doerfler, Douglas W.

This report describes trans-organizational efforts to investigate the impact of chip multiprocessors (CMPs) on the performance of important Sandia application codes. The impact of CMPs on the performance and applicability of Sandia's system software was also investigated. The goal of the investigation was to make algorithmic and architectural recommendations for next generation platform acquisitions.

TYPE SAND Report YEAR 2009

DOI OSTI

Improving performance via mini-applications

Doerfler, Douglas W.; Crozier, Paul; Edwards, Harold C.; Williams, Alan B.; Rajan, Mahesh; Keiter, Eric R.; Thornquist, Heidi K.

Application performance is determined by a combination of many choices: hardware platform, runtime environment, languages and compilers used, algorithm choice and implementation, and more. In this complicated environment, we find that the use of mini-applications - small self-contained proxies for real applications - is an excellent approach for rapidly exploring the parameter space of all these choices. Furthermore, use of mini-applications enriches the interaction between application, library and computer system developers by providing explicit functioning software and concrete performance results that lead to detailed, focused discussions of design trade-offs, algorithm choices and runtime performance issues. In this paper we discuss a collection of mini-applications and demonstrate how we use them to analyze and improve application performance on new and future computer platforms.

TYPE SAND Report YEAR 2009

DOI OSTI

Recent Experiences on Performance and Scalability of SNL Applications on Red Storm and TLCC

Rajan, Mahesh; Doerfler, Douglas W.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Report YEAR 2009

OSTI

Red Storm / Cray XT4: A Superior Architecture for Scalability

Doerfler, Douglas W.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

Red Storm/XT4: A Superior Architecture for Scalability

Doerfler, Douglas W.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2009

OSTI

HPC Architecture Research Presentations for Kansas State University

Doerfler, Douglas W.; Hemmert, Karl S.; Barrett, Brian; Kelly, Suzanne M.

Abstract not provided.

TYPE Report YEAR 2008

OSTI

Investigating the balance between capacity and capability workloads across large scale computing platforms

Rajan, Mahesh; Vaughan, Courtenay T.; Doerfler, Douglas W.; Benner, Robert E.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

MPI Task Placement on Multicores

Doerfler, Douglas W.

Abstract not provided.

TYPE Report YEAR 2008

OSTI

A preliminary evaluation of quad-core processors for Sandia applications

Doerfler, Douglas W.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

Informatics: the rest of the story

Womble, David E.; Morgan, Harold S.; Doerfler, Douglas W.; Giunta, Anthony A.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

A Preliminary Evaluation of Quad-Core Processors for Sandia Applications

Doerfler, Douglas W.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2008

OSTI

Application Performance On Multicores

Doerfler, Douglas W.

Abstract not provided.

TYPE Report YEAR 2008

OSTI

Benchmarking Multicore Processors

Doerfler, Douglas W.; Rajan, Mahesh; Pedretti, Kevin

Abstract not provided.

TYPE Report YEAR 2007

OSTI

Application Performance Analysis and Modeling: An Overview of Recent SNL Initiatives

Doerfler, Douglas W.

Abstract not provided.

TYPE Report YEAR 2007

OSTI

ERDC has chosen to purchase a second machine based on the Red Storm architecture

Doerfler, Douglas W.

Abstract not provided.

TYPE Report YEAR 2007

OSTI

Supercomputer and cluster performance modeling and analysis efforts: 2004-2006

Ang, James A.; Vaughan, Courtenay T.; Barnette, Daniel W.; Benner, Robert E.; Doerfler, Douglas W.; Ganti, Anand; Phelps, Sue C.; Rajan, Mahesh; Stevenson, Joel O.; Scott, Ryan T.

This report describes efforts by the Performance Modeling and Analysis Team to investigate performance characteristics of Sandia's engineering and scientific applications on the ASC capability and advanced architecture supercomputers, and Sandia's capacity Linux clusters. Efforts to model various aspects of these computers are also discussed. The goals of these efforts are to quantify and compare Sandia's supercomputer and cluster performance characteristics; to reveal strengths and weaknesses in such systems; and to predict performance characteristics of, and provide guidelines for, future acquisitions and follow-on systems. Described herein are the results obtained from running benchmarks and applications to extract performance characteristics and comparisons, as well as modeling efforts, obtained during the time period 2004-2006. The format of the report, with hypertext links to numerous additional documents, purposefully minimizes the document size needed to disseminate the extensive results from our research.

TYPE SAND Report YEAR 2007

DOI OSTI

Red Storm: an Update on the Upgrade

Ballance, Robert A.; Doerfler, Douglas W.; Kelly, Suzanne M.; Tomkins, James L.; Stevenson, Joel O.

Abstract not provided.

TYPE Report YEAR 2006

OSTI

Measuring MPI send and receive overhead and application availability in high performance network interfaces

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Doerfler, Douglas W.; Brightwell, Ronald B.

In evaluating new high-speed network interfaces, the usual metrics of latency and bandwidth are commonly measured and reported. Numerous other message-passing characteristics can have a dramatic effect on application performance and should be analyzed when evaluating a new interconnect. One such metric is overhead, which dictates the network's ability to allow the application to perform non-message-passing work while a transfer is taking place. A method for measuring overhead, and hence calculating application availability, is presented. Results for several next-generation network interfaces are also presented. © Springer-Verlag Berlin Heidelberg 2006.

TYPE Conference YEAR 2006

OSTI Scopus

An Analysis of HyperTransport and Seastar Data Rates on Red Storm

Doerfler, Douglas W.

Abstract not provided.

TYPE SAND Report YEAR 2005

DOI OSTI

Characterizing compiler performance for the AMD Opteron processor on a parallel platform

Doerfler, Douglas W.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2005

OSTI

ATR2000 Mercury/MPI Real-Time ATR System User's Guide

Meyer, R.H.; Doerfler, Douglas W.

The Air Force's Electronic Systems Center has funded Sandia National Laboratories to develop an Automatic Target Recognition (ATR) System for the Air Force's Joint STARS platform using Mercury Computer Systems hardware. This report provides general theory on the internal operations of the Real-Time ATR system and describes some basic techniques that can be used to reconfigure the system and monitor its runtime operation. In addition, it provides general information on how to interface an image formation processor and a human machine interface to the ATR. This report is not meant to be a tutorial on the ATR algorithms.

TYPE Report YEAR 2000

DOI OSTI
