Moab Software to Standardize Workload and Resources Across NNSA
The ASC Program has selected Cluster Resources, Inc.’s Moab workload and resource management software as a standard for use across NNSA’s high-performance computing systems. The contract is the largest cluster and grid management contract in history. “Cluster Resources is honored to be selected,” said David Jackson, CEO of Cluster Resources, Inc. “There is no organization in the world which matches the technical expertise and scope of compute systems found at ASC in terms of scalability and architectural complexity.”
Lawrence Livermore, Sandia, and Los Alamos national laboratories initiated the search for a common resource and workload management solution to improve the usability and manageability of their diverse resources and to attain a better return on their significant computing investment. The laboratories also sought to enhance reporting for managed resources and to optimize resource utilization while maintaining the flexibility required to meet the individual needs of each site and project. All three laboratories have highly heterogeneous environments, with systems that range from large-scale Intel and AMD Opteron-based systems provided by IBM, HP, Dell, and others to more exotic and powerful systems such as Cray’s XT3 and IBM’s BlueGene.
As part of its initial acceptance and deployment, Moab will first be installed on the new unclassified Atlas system at LLNL, obtained through the Peloton procurement. LLNL has started working with Cluster Resources on a migration and acceptance plan. LLNL will also develop a tri-lab support model to help leverage support across the ASC Program.
The Moab solution adds significant manageability and optimization to HPC resources, while providing deployment methods that minimize the risk and cost of adoption. Unique Moab capabilities allow it to be transparently deployed with little or no impact on the end user. These capabilities include system workload, resource, and policy simulation; batch language translation; capacity-planning diagnostics; nonintrusive test facilities; and infrastructure stress testing.