Trinity Architecture & Design
Concurrency and Computation: Practice and Experience
Parallel Processing Letters
AIP Conference Proceedings
In this paper, HPC architectural characteristics and their impact on application performance and scaling are investigated. Performance data gathered over several generations of very large HPC systems, including ASC Red Storm, ASC Purple, and a large InfiniBand cluster (Red Sky), are analyzed. As the number of cache-coherent cores and NUMA domains per compute node keeps increasing, we analyze their impact with a few simple benchmarks and several applications. We identify bottlenecks and remedies by examining production applications. We conclude with preliminary early-hardware performance data from ASC Cielo, a petaFLOPS-class future capability system. © 2010 American Institute of Physics.
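The "simple benchmarks" referred to above are typically streaming memory-bandwidth kernels run with different thread counts and placements, so that the growing number of cores and NUMA domains per node shows up directly as a change in delivered bandwidth. The following OpenMP triad kernel is a minimal sketch of such a benchmark, not code from the paper; the array size, repetition count, and first-touch initialization strategy are illustrative assumptions.

/* Illustrative STREAM-style triad micro-benchmark (a sketch, not the paper's code).
 * Arrays are first-touch initialized by the same static thread schedule that
 * later streams through them, so pages land in each thread's local NUMA domain. */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N    (1L << 26)   /* 64M doubles per array: far larger than any cache */
#define REPS 10

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    if (!a || !b || !c) return 1;

    /* Parallel first-touch initialization: under Linux, each page is placed
       in the NUMA domain of the thread that writes it first. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }

    double t0 = omp_get_wtime();
    for (int r = 0; r < REPS; r++) {
        #pragma omp parallel for schedule(static)
        for (long i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i];        /* triad: two loads, one store */
    }
    double t = (omp_get_wtime() - t0) / REPS;

    printf("threads=%d  triad bandwidth = %.2f GB/s\n",
           omp_get_max_threads(), 3.0 * N * sizeof(double) / 1e9 / t);

    free(a); free(b); free(c);
    return 0;
}

Built with something like cc -O2 -fopenmp and run with varying OMP_NUM_THREADS, or launched under numactl with different CPU and memory bindings, the reported bandwidth makes the difference between local and remote NUMA-domain accesses directly visible.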
Cielo, a Cray XE6, is the newest capability machine of the Department of Energy NNSA Advanced Simulation and Computing (ASC) campaign. Rated at 1.37 PFLOPS, it consists of 8,944 dual-socket, eight-core AMD Magny-Cours compute nodes linked by Cray's Gemini interconnect. Its primary mission objective is to enable a suite of ASC applications implemented using MPI to scale to tens of thousands of cores. Cielo is an evolutionary improvement over a successful architecture previously available to many of our codes, which provides a basis for understanding the capabilities of this new architecture. Using three codes strategically important to the ASC campaign, supplemented with micro-benchmarks that expose the fundamental capabilities of the XE6, we report on the performance characteristics and capabilities of Cielo.
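A common form for the micro-benchmarks mentioned here is a point-to-point ping-pong test that measures message latency and bandwidth between two ranks, placed either on the same node or on different nodes to isolate the interconnect. The sketch below is an illustration under those assumptions, not the benchmark used in this work; the message sizes and iteration count are arbitrary.

/* Minimal MPI ping-pong sketch (illustrative only, not this work's benchmark):
 * measures point-to-point latency and bandwidth between ranks 0 and 1. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "run with at least 2 ranks\n");
        MPI_Finalize();
        return 1;
    }

    const int iters = 1000;
    for (int bytes = 8; bytes <= (1 << 20); bytes *= 2) {
        char *buf = malloc(bytes);
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();

        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }

        double half = (MPI_Wtime() - t0) / iters / 2.0;   /* one-way time */
        if (rank == 0)
            printf("%8d bytes  latency %8.2f us  bandwidth %8.2f MB/s\n",
                   bytes, half * 1e6, bytes / half / 1e6);
        free(buf);
    }

    MPI_Finalize();
    return 0;
}

Placing the two ranks on separate nodes exercises the off-node interconnect, while placing both on one node measures intra-node transfers; the contrast between the two is one of the fundamental capabilities such micro-benchmarks expose.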
Los Alamos and Sandia National Laboratories have formed a new high performance computing center, the Alliance for Computing at the Extreme Scale (ACES). The two labs will jointly architect, develop, procure and operate capability systems for DOE's Advanced Simulation and Computing Program. This presentation will discuss a petascale production capability system, Cielo, that will be deployed in late 2010, and a new partnership with Cray on advanced interconnect technologies.
International Journal of Distributed Systems and Technologies
In a recent acquisition by DOE/NNSA, several large capacity-computing clusters, called TLCC, have been installed at the DOE labs SNL, LANL, and LLNL. The TLCC architecture, with ccNUMA, multi-socket, multi-core nodes and an InfiniBand interconnect, is representative of the trend in HPC architectures. This paper examines application performance on TLCC, contrasting it with Red Storm/Cray XT4. TLCC and Red Storm share similar AMD processors and memory DIMMs; Red Storm, however, has single-socket nodes and a custom interconnect. Micro-benchmarks and performance analysis tools help explain the causes of the observed performance differences. Control of processor and memory affinity on TLCC with the numactl utility is shown to yield significant performance gains and is essential to attenuate the detrimental impact of OS interference and cache-coherency overhead. While previous studies have investigated the impact of affinity control mostly in the context of small SMP systems, the focus of this paper is on highly parallel MPI applications.
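Affinity control of this kind is typically applied from the command line by launching each MPI rank under numactl with its CPUs and memory confined to one NUMA domain (numactl's --cpunodebind and --membind options). The sketch below shows a programmatic equivalent using the Linux sched_setaffinity interface; it illustrates the general technique and is not code from the paper, and the simple local-rank-to-core mapping is an assumption.

/* Sketch of programmatic processor affinity for MPI ranks (illustrative;
 * the paper applies affinity with the numactl command-line utility instead).
 * Each rank is pinned to one core chosen from its rank within the node, so
 * memory it touches afterwards is allocated from the local NUMA domain. */
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Rank within the shared-memory node (MPI-3 communicator split). */
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);
    int local_rank;
    MPI_Comm_rank(node_comm, &local_rank);

    /* Bind this process to core 'local_rank'. A production code would map
       local ranks onto the node's socket/NUMA topology rather than linearly. */
    long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(local_rank % (int)ncpus, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        perror("sched_setaffinity");

    /* ... application work: allocations first touched after this point are
       served from the NUMA domain that owns the bound core ... */

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}

Because Linux places pages in the NUMA domain of the CPU that first touches them, binding a process before it touches its large allocations keeps both computation and memory local, which is the effect the numactl-based affinity control achieves externally.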