One million trillion computations per second envisioned by Sandia and Oak Ridge researchers

Ten years ago, people worldwide were astounded at the emergence of a teraflop supercomputer — that would be Sandia’s ASCI Red — able in one second to perform a trillion mathematical operations.

More recently, bloggers seem stunned that a machine capable of petaflop computing — a thousand times faster than a teraflop — could soon break the next barrier of a thousand trillion mathematical operations a second.

Now, almost without taking a breath, and before the world has actually achieved a petaflop supercomputer, a joint Institute for Advanced Architectures newly launched at Sandia and Oak Ridge national laboratories is charged with laying the groundwork for an exascale computer.

A thousand times faster than a petaflop, it would perform a million trillion arithmetic calculations per second.
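To put the three scales side by side, here is a minimal arithmetic sketch. The constant names and the `seconds_to_finish` helper are illustrative assumptions, not anything taken from the institute's work.

```python
# Operations per second for each scale named in the article.
TERAFLOP = 10**12  # a trillion operations per second
PETAFLOP = 10**15  # a thousand trillion operations per second
EXAFLOP = 10**18   # a million trillion operations per second


def seconds_to_finish(operations, rate):
    """Seconds needed to run `operations` at `rate` operations per second."""
    return operations / rate


# Hypothetical workload: the work an exascale machine would finish in one second.
workload = EXAFLOP

for name, rate in [("teraflop", TERAFLOP), ("petaflop", PETAFLOP), ("exaflop", EXAFLOP)]:
    print(f"{name}: {seconds_to_finish(workload, rate):,.0f} seconds")
```

Under these assumptions, a job that takes an exascale machine one second would occupy a teraflop machine for a million seconds, or roughly eleven and a half days.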

A million trillion

What is the need for a machine to do that many calculations that fast?

Says Sandia Center 1400 Director James Peery, “An exascale computer is essential to perform more accurate simulations that, in turn, support solutions for emerging science and engineering challenges in national defense, energy assurance, advanced materials, climate, and medicine.”

Such machines would be better detectives of real-world conditions, able to help researchers more closely examine the interactions of larger numbers of particles over time periods divided into smaller segments.

Supported by NNSA and DOE’s Office of Science, the institute — a DOE Center of Excellence — is funded in FY08 by congressional mandate at $7.4 million.

The idea behind the institute — itself under consideration for a year and a half prior to its opening — is “to close critical gaps between theoretical peak performance and actual performance on current supercomputers,” says Sandia project lead Sudip Dosanjh (1420).

“We believe this can be done by developing novel and innovative computer architectures.”

One aim, he says, is to reduce or eliminate the growing mismatch between data movement and processing speeds.

Processing speed refers to the rapidity with which a processor can manipulate data to solve its part of a larger problem. Data movement refers to the act of getting data from a computer’s memory to its processing chip and then back again. The larger the machine, the farther away from a processor the data may be stored and the slower the movement of data.

“In an exascale computer, data might be tens of thousands of processors away from the processor that wants it,” says Sandia computer architect Doug Doerfler (1422). “But until that processor gets its data, it has nothing useful to do. One key to scalability is to make sure all processors have something to work on at all times.”
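The gap between peak and actual performance can be illustrated with a deliberately simple model. It assumes a processor strictly alternates between computing and waiting for data, with no overlap between the two; real machines hide some of that wait, so this is a rough sketch rather than a description of any actual system. The function names and the sample timings are hypothetical.

```python
def utilization(compute_seconds, wait_seconds):
    """Fraction of time a processor does useful work, assuming it sits
    idle for the whole time it waits on data (no overlap)."""
    return compute_seconds / (compute_seconds + wait_seconds)


def sustained_rate(peak_rate, compute_seconds, wait_seconds):
    """Actual operations per second under the same no-overlap assumption."""
    return peak_rate * utilization(compute_seconds, wait_seconds)


# Hypothetical timings: one unit of computing for every three units of waiting.
peak = 10**18  # theoretical peak of an exascale machine, operations per second
print(utilization(1.0, 3.0))
print(sustained_rate(peak, 1.0, 3.0))
```

With those made-up timings the processor is busy only a quarter of the time, so the machine would sustain a quarter of its theoretical peak. Keeping every processor supplied with data shrinks the wait term.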

Splitting processors, increasing speed

Compounding the problem is new technology that has enabled designers to split a processor into first two, then four, and now eight cores on a single die. Some special-purpose processors have 24 or more cores on a die. Sudip suggests there might eventually be hundreds operating in parallel on a single chip.

“In order to continue to make progress in running scientific applications at these [very large] scales,” says Jeff Nichols, who heads the Oak Ridge branch of the institute, “we need to address our ability to maintain the balance between the hardware and the software. There are huge software and programming challenges and our goal is to do the critical R&D to close some of the gaps.”

Operating in parallel means that each core can work its part of the puzzle simultaneously with other cores on a chip, greatly increasing the speed at which a processor operates on data. The method does not require faster clock speeds, measured in gigahertz, which would generate unmanageable amounts of heat and increase current leakage.

(As a side note, the new method bolsters the continued relevance of Moore’s Law, the observation by Intel cofounder Gordon Moore, first made in 1965 and revised in 1975, that the number of transistors placed on a single computer chip doubles approximately every two years.)

Power considerations

Another problem for the institute is to reduce the amount of power needed to run a future exascale computer.

“The electrical power needed with today’s technologies would be many tens of megawatts — a significant fraction of a power plant. A megawatt can cost as much as a million dollars a year,” says Sudip. “We want to bring that down.”
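The cost implied by those figures is simple multiplication. The sketch below uses the quoted upper figure of a million dollars per megawatt per year; the 30-megawatt machine is a hypothetical stand-in for "many tens of megawatts," not a number from the institute.

```python
# Upper figure quoted in the article: dollars per megawatt per year.
COST_PER_MEGAWATT_YEAR = 1_000_000


def annual_power_cost(megawatts, cost_per_mw_year=COST_PER_MEGAWATT_YEAR):
    """Yearly electricity bill in dollars for a machine drawing `megawatts`."""
    return megawatts * cost_per_mw_year


# Hypothetical machine drawing 30 megawatts.
print(f"${annual_power_cost(30):,} per year")
```

At that rate, each megawatt trimmed from the design would save up to a million dollars a year.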

Sandia and Oak Ridge will work together on these and other problems, he says. “Although all of our efforts will be collaborative, in some areas Sandia will take the lead and Oak Ridge may lead in others, depending on who has the most expertise in a given discipline.” In addition, a key component of the institute will be the involvement of industry and universities.

Wide interest in faster computing was evident in the response to an invitation-only workshop, “Memory Opportunities for High-Performing Computing,” that the institute sponsored in January.

Workshop organizers James Ang, Richard Murphy, and Arun Rodrigues (all 1422) planned for 25 participants but nearly 50 attended. Attendees represented the national labs, DOE, the National Science Foundation, the National Security Agency, the Defense Advanced Research Projects Agency, and leading manufacturers of processors and supercomputing systems. Robert Meisner (NNSA) and Fred Johnson (Office of Science) served on the program committee.

Sandia Lab News
