Sandia LabNews

Sandia turns on Sky Bridge supercomputer


COOL GUY — Dave Martinez (9324) shows off the new Sky Bridge installation. Dave is the Facilities project lead and is responsible for making sure the supercomputer gets power and cooling. He has been a key innovator for Sandia's corporate data centers and has enabled the Labs to deploy new power and cooling technologies in machines such as Sky Bridge.

A ribbon-cutting ceremony for the 600-teraflop Sky Bridge supercomputer, the most powerful institutional machine ever acquired by Sandia, will be held on Dec. 18.

Sky Bridge’s new home at one time housed ASCI Red, the world’s first teraflop computer. Technical advances have enabled Sky Bridge, with nearly 600 times the computational muscle, to draw only two-thirds the electrical power and require about half the space of its illustrious predecessor.

The efficiently water-cooled machine also should cost about 50 percent less to operate than comparable air-cooled machines, and will execute the newest computer programs well enough to “enable new solutions to difficult national security-related problems,” says John Noe, manager of Scientific Computing Systems (9328).

Sky Bridge will increase Sandia’s mission-computing capacity by nearly 40 percent, providing 259 million processor hours per year across its 1,848 nodes.
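For readers curious where the 259 million figure comes from, here is a minimal back-of-the-envelope sketch. It assumes 16 processor cores per node (a typical dual eight-core configuration of that hardware generation, not stated in the article) and round-the-clock availability for a full year:

    # Rough check of Sky Bridge's yearly processor-hour capacity.
    # Assumptions (not from the article): 16 cores per node,
    # machine available 24 hours a day, 365 days a year.
    nodes = 1_848
    cores_per_node = 16            # assumed dual eight-core nodes
    hours_per_year = 24 * 365      # 8,760 hours

    processor_hours = nodes * cores_per_node * hours_per_year
    print(f"{processor_hours:,} processor-hours per year")
    # 259,015,680 -- roughly the 259 million hours cited above

Under those assumptions the arithmetic lands almost exactly on the published capacity figure.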

Sky Bridge was funded with $10 million through the newly launched Institutional Computing program, which itself was created by an executive leadership decision to support large-scale computing as an ongoing Laboratories capability. 

The machine is considered a capacity cluster, which means it can handle a broad range of small- to medium-size workloads while running multiple problems at the same time, says Steve Monk (9328).

“In dedicated access mode, it can be used to solve problems that require lots of compute capability, but that is not its normal operations model,” Steve says.

One factor in the acquisition decision was the cost savings associated with the liquid-cooled system, says John. “The facilities cost for a hybrid liquid/air-cooled system was 50 percent of the cost of a completely air-cooled system, because the latter would have required many computer-room air conditioners. And it should be cheaper to run.” 

The liquid-cooling option also reduces noise to less-than-hazardous levels, meaning that operations personnel do not require hearing protection to service Sky Bridge.

Lest anyone think that Sandians would jubilate over unproven cost savings, "We have a unique opportunity to measure identical systems, one air-cooled (in another computer) and one hybrid liquid/air-cooled (Sky Bridge), to determine the exact operating cost differences," says John.

Built by Cray Inc., Sky Bridge relies on the same generation of hardware found on the successful (though air-cooled) Tri-Lab Linux capacity cluster supercomputers installed at Sandia, Los Alamos, and Lawrence Livermore national laboratories.

Sandia’s new Institutional Computing program also provides funds to augment traditional scientific computing platforms with specialized systems that perform well on informatics, graph analysis, big data searches, “emulytics,” and other burgeoning problem areas. Emulytics is a Sandia-coined term indicating “the practice of using a powerful computer or network of computers to emulate a highly complex but unmanageable system in an attempt to gain knowledge about the behavior of the larger system,” says John.

Sky Bridge should be available to Sandia HPC users in January.