A publication of the Office of Advanced Simulation & Computing, NA-114, NNSA Defense Programs

December 2007

NA-ASC-500-07—Issue

Common Tri-Lab Capacity Computing Moves Closer to Goal

Left: Prototype of a new tri-lab Linux capacity cluster. Scalable units (SU) will be aggregated into clusters of two, four, six, or eight SU, with each cluster available for computing across the three defense laboratories.

The NNSA initiative to establish common computational environments on tri-lab computing systems will take major steps forward in FY08. For several years, a tri-lab effort known as Tripod has been under way to promote cost-effective, portable environments on the Linux-based cluster systems at Livermore, Los Alamos, and Sandia national laboratories.

In FY07, Livermore led a tri-lab procurement of common Linux hardware, the Tri-Laboratory Linux Capacity Cluster (TLCC07) systems. NNSA headquarters selected five proposals for FY08 support to establish a common production software stack for the TLCC07 hardware. The largest, submitted by Livermore and Sandia, will deploy a common cluster management software stack based on Red Hat Enterprise Linux (RHEL) on all the TLCC07 clusters. This system software, called the Tripod Operating System Software (TOSS), is a tri-lab packaging of the CHAOS/SLURM environment that has been in production on Livermore systems for several years. In addition to the system software stack, four smaller proposals were selected for support, pending funding.

With the introduction of TOSS, many of the software management processes will be broadened and formalized to embrace and support tri-lab computing. Accomplishments thus far include deploying the TOSS support infrastructure, with a server for the TOSS software repository and for bug reporting by the three laboratories; releasing the Alpha version of TOSS in mid-October; booting TOSS successfully on at least one Linux node at each laboratory; and running all the synthetic workload applications successfully on a test system using LANL’s Gazebo test harness. The initial generally available release of the software will be deployed with the first TLCC07 cluster.

SAND 2007-8167 W

Developed and maintained by Sandia National Laboratories for NA-114