Massively Parallel Computing at Sandia
Last modified: August 6, 1997
Considered a scientific novelty just a few years ago, massively parallel computing is being applied at Sandia National Laboratories to a broad array of typical engineering and science problems. Massive parallelism involves the use of a thousand or more computer processors working in parallel to solve large and complex problems very rapidly. It requires new forms of mathematics, algorithms, software, software tools, and new parallel computer architectures.
The more than 50 Sandia staff members involved in massively parallel computing are an interdisciplinary team with expertise in the physical and chemical sciences, materials science, engineering, astronomy, mathematics, and computer science.
The staff's work has a strong applications focus. A range of problems is being solved on massively parallel computers, including initiatives in radar imaging, shock physics, structural mechanics, fluid mechanics, heat transfer, the motion of charged particle beams, electronic structure, and quantum chemistry.
Early collaboration with nCUBE Corporation demonstrated the practicality of massively parallel computing, helping generate the broad interest it now enjoys. A new partnership with Intel Corporation is assuring continued U.S. leadership in the field.
Sandia first captured national attention for its work in massively parallel processing in March 1988, when it won two supercomputing prizes: the Karp Award, for demonstrating unprecedented speedups using processors working together compared to processors running separately, and the Gordon Bell Prize, for achieving a thousandfold speedup on three engineering problems analyzed with 1,024 processors working in parallel. Until the breakthrough by Sandia, most computer scientists believed that using even thousands of processors could speed up problem-solving by no more than 50 to 100 times the rate of a single processor. Sandia showed that when the problem size is increased in proportion to the number of processors, as almost always happens in real-world problems, the speed of solution can increase in proportion to the number of processors.
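The scaling argument in the preceding paragraph can be illustrated numerically. The sketch below compares the standard fixed-size speedup formula (commonly called Amdahl's law) with the scaled-size formula (commonly called Gustafson's law). The text above does not name either formula, and the 1 percent serial fraction is an assumed, illustrative value, not a figure reported by Sandia.

```python
def fixed_size_speedup(serial_fraction, processors):
    """Speedup when the problem size stays fixed as processors are added."""
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def scaled_speedup(serial_fraction, processors):
    """Speedup when the parallel work grows in proportion to the processor count."""
    parallel_fraction = 1.0 - serial_fraction
    return serial_fraction + parallel_fraction * processors


if __name__ == "__main__":
    serial_fraction = 0.01   # assumed for illustration only
    processors = 1024        # processor count cited in the text

    print(f"fixed-size speedup: {fixed_size_speedup(serial_fraction, processors):.1f}")
    print(f"scaled speedup:     {scaled_speedup(serial_fraction, processors):.1f}")
```

Under these assumptions the fixed-size estimate falls inside the 50-to-100-times range that was widely expected, while the scaled estimate is close to the processor count, consistent with the thousandfold speedup described above.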
More recently, Sandia has been recognized for even greater achievements in massively parallel computing. In November 1994, a Sandia-led team, which included colleagues from the University of New Mexico and Intel, was awarded a second Gordon Bell Prize. The winning entry, titled "Applications of Boundary Element Methods on the Intel Paragon," was for a parallel dense equation solver. A second Sandia team, which included New Mexico State University researchers, was among four finalists in the Gordon Bell competition.
In December 1994, a team from Sandia and Intel regained the world record for computation speed. The record of 280 billion floating point operations per second (280 Gflops) surpassed the record of 170 Gflops that was established a few months earlier by a Japanese consortium, which in turn broke a record set by the Sandia/Intel team in May 1994. Higher speeds allow scientists to use computers to do such things as model structures in three dimensions, model complicated flows, and predict properties of materials. Sandia's goal is to reach a computer speed of one teraflop, or a trillion floating point operations per second.
The Intel Teraflops "Ultracomputer," which will have a peak performance capability of about 1.8 teraflops, or 1.8 trillion floating point operations per second, will be fully installed at Sandia by June 1997. The computer is a massively parallel system that will consist of 76 large computer cabinets, with 9,072 Pentium Pro processors and nearly 6 billion bytes of memory. It will cover about 1,600 square feet, enough to fill a moderate-sized home.
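As a rough back-of-envelope check on the figures above, the peak rate can be divided by the processor count. This is only a sketch: it assumes the quoted 1.8 teraflops peak is spread evenly across all 9,072 processors, which the text does not state, and the variable and function names are hypothetical.

```python
PEAK_FLOPS = 1.8e12       # quoted peak: about 1.8 teraflops
PROCESSORS = 9072         # quoted Pentium Pro processor count
TERAFLOP = 1.0e12         # one trillion floating point operations per second


def peak_per_processor(peak_flops, processors):
    """Average peak rate per processor, assuming an even split (an assumption)."""
    return peak_flops / processors


def fraction_of_peak_needed(target_flops, peak_flops):
    """Fraction of peak that must be sustained to reach a target rate."""
    return target_flops / peak_flops


if __name__ == "__main__":
    per_proc = peak_per_processor(PEAK_FLOPS, PROCESSORS)
    needed = fraction_of_peak_needed(TERAFLOP, PEAK_FLOPS)
    print(f"peak per processor: {per_proc / 1e6:.0f} Mflops")
    print(f"fraction of peak needed for one teraflop: {needed:.2f}")
```

On these assumptions each processor contributes a little under 200 million floating point operations per second at peak, and sustaining one teraflop would require running at somewhat more than half of the machine's peak rate.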
The $55 million teraflops computer represents the initial goal of the Accelerated Strategic Computing Initiative (ASCI), a 10-year program designed to move nuclear weapons design and maintenance from a test-based to a simulation-based approach. The program could culminate in computers with capabilities of hundreds of teraflops by 2005. Computers that powerful are needed to simulate the complex 3-D physics involved in nuclear-weapon performance, and to accurately predict the degradation of nuclear weapons components as they age in the stockpile.
Sandia has developed more than two dozen applications for massively parallel supercomputers. The Department of Defense has come to depend on Sandia's CTH shock physics code for testing the durability of armor. Sandia scientists applied the code in 1994 to simulate the comet impact on Jupiter. The predictions, although controversial, were made before the impact and were very accurate.
Materials science codes are enabling the design of materials with application-specific properties. Efficient LED materials identified computationally are being grown by Hewlett-Packard for use in lasers and displays. A partnership with BIOSYM focuses on the design of catalysts for energy production and pollution minimization. Sandia has also developed unique computational capabilities and software designed to analyze complex gases and surface chemistry in flowing systems. The software is used, for instance, by General Motors to help understand the chemical behavior of turbulent flames, by DuPont to understand the explosion and flammability limits of reagents in chemical processing, and by Intel to understand semiconductor fabrication chemistry.
Massively parallel computing and the invention of the "paving" algorithm have greatly advanced the science of mesh generation for modeling three-dimensional objects. Meshing tools have resulted in substantial productivity gains in U.S. industry. Ford Motor Co., for example, reported that meshing techniques reduce the time required to mesh a part from 36 hours to 30 minutes.
Interconnected software is being developed to solve large problems and to assure software interoperability. The Sandia-University of New Mexico Operating System (SUNMOS) has doubled the memory and throughput on Intel Paragon computers worldwide, and it forms the basis of a cooperative research and development agreement with Oracle and nCUBE.
Sandia's need to conduct large-scale simulations over long distances (from New Mexico to California) has driven work in high-speed networks and in high-speed encryption and decryption. Sandia is a member of several gigabit testbeds, and has partnered with industry to help form the National Information Infrastructure Testbed. Sandia's technology focus areas include high-performance, high-reliability, secure communications infrastructure and heterogeneous, wide-area environments.
CLERVER, an amalgam of CLient-sERVER, is the core of the Sandia-developed Interactive Collaborative Environment, which allows workers in different locations to work on the same file in the same program. This solution is available in commercial products from Sun Solutions, a division of Sun Microsystems, Inc.
Sandia's Technology Information Environment with Industry (TIE-In) is creating a new mechanism for technology transfer. The program permits easy-to-use remote electronic access to national laboratory technology. DOE has designated TIE-In a user facility.
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy.