Sandia LabNews

New Cielo supercomputer 10 times faster than current NNSA platform


Sandia and Los Alamos National Laboratory (LANL) researchers have jointly awarded a contract to Cray Inc. to build a supercomputer that will have more than 10 times the capability of the current platform of the National Nuclear Security Administration (NNSA), the Purple supercomputer at Lawrence Livermore National Laboratory (LLNL).

The machine will support computations at Sandia and LANL, as well as at LLNL.

“Cielo will target extremely large problems that require petascale supercomputing,” says Sudip Dosanjh (1420), codirector of ACES (Alliance for Computing at Extreme Scale), a New Mexico-based partnership between Sandia and LANL. “This is the culmination of a two-year collaborative effort. We look forward to working with Cray to create an order-of-magnitude increase in capability for key NNSA national security applications.”

Because Cielo will be dedicated to running the largest and most demanding workloads involving modeling and simulation, it will support large single jobs capable of utilizing the entire platform.

This increased capability is expected to deepen understanding of complex physics and improve confidence in the predictive capability of stockpile stewardship.

Additional capabilities in 2011

Installation is projected for the third quarter of 2010, with additional capability planned for 2011.

Design of the machine was led by Sandia in cooperation with LANL. The two labs will share day-to-day responsibilities for operation of the platform, which will be housed at LANL’s Strategic Computing Complex facility.

The selection of Cray — the industry partner chosen to build the approximately $54 million machine — was made through a competitive procurement process. The technical evaluation by members of the labs included design, procurement, and deployment.

The ultimate design goal for the machine, part of NNSA’s Advanced Simulation and Computing (ASC) program, “is for Cielo’s increased capability to achieve higher degrees of fidelity in the models and reduce the total time to solution,” says Doug Doerfler (1422), Cielo system architect.

ASC’s modeling & simulation applications “perform extremely well on the Cray XT architecture,” he says. “The XT has demonstrated fast execution times and excellent scaling characteristics while also providing a reliable and robust environment for our users.”

Based on next-gen Cray architecture

Cielo will be based on Cray’s next-generation “Baker” architecture with a new high-speed interconnect named “Gemini” that, says Doug, “will provide a transparent transition for our users and give a significant boost in performance.”

Says NNSA Administrator Thomas D’Agostino, “Cielo will be an invaluable addition to our supercomputing program, which enables NNSA to ensure the safety, security, and effectiveness of the nuclear stockpile.”

The future will produce even greater challenges, says Doug, because Cielo, as good as it’s expected to be, may be the last of its line in providing major improvements in computing capabilities without a major investment in new computing codes.

“Supercomputers are at an inflection point due to the development of massively multicore and heterogeneous processor architectures,” Doug says. “This is a huge issue for our algorithm and application teams, and at this point in time it’s not clear what the right solution is and how the codes should be written to support these future machines.”

NNSA plans to achieve an exascale computer capability by 2018.