Sandians invited to submit jobs on the new system

Sandia is beefing up its high-performance computing capabilities with a new, AI-forward system that recently became available to members of the workforce.
“Cronus is the largest current-generation NVIDIA-based, AI-capable system at Sandia,” said Steve Monk, manager of the Labs’ high-performance computing team.
Sandia data centers house 18 computing clusters, eight of which ranked among the fastest supercomputers in the world on the November 2025 TOP500 list. Put them all together, and Sandia computing facilities can crunch 160 quadrillion calculations per second, or 160 petaflops.
And while the bulk of these calculations are devoted to high-precision modeling and simulation, researchers across the Labs are increasingly interested in training and using AI models. These tools benefit from different kinds of chips than those found in traditional supercomputing powerhouses like Sandia’s El Dorado, which was the world’s 20th-fastest supercomputer when it launched in 2024.
Cronus balances these demands, expanding what’s possible for the Labs’ AI work and traditional scientific computing on a shared, versatile platform.
New system a response to evolving needs

The team modeled Cronus after another Sandia system called Hops. Both handle AI and HPC workloads, but Cronus has newer graphics processing units, or GPUs, to accelerate computations, and more of them.
“Hops has four GPUs per node,” Steve said. “Cronus has eight.”
But building and running a supercomputer, according to Jeff Ogden, a software stack engineer on Steve’s team, is like getting an orchestra to perform a complicated piece. You don’t get better performance just by adding more instruments. “We’re trying to make them all play together in tune,” he said.
Jeff and the rest of the HPC staff are the conductors, floor managers and repair techs. They integrate hardware, software, networking, storage, schedulers and cybersecurity so the systems run reliably. They also fix the components that inevitably misbehave from time to time.
Cronus has 16 nodes, each like a section of the orchestra made up of many chips. Some workloads need only one node, but others require several, and those nodes must then work together.
One of the design goals was better node-to-node performance than earlier systems. On Hops, two cables send information in and out of each node. But on Cronus, “each GPU has a dedicated connection to the high-speed interconnect,” Steve said, which helps keep performance high when workloads scale across multiple nodes.
How high? Jeff said he has seen data transfer rates hit a terabyte per second.
“Any time you connect things as fast as possible together, it makes them appear closer” for computation, which can dramatically improve data-heavy workflows like AI and modeling and simulation, Jeff said.
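For readers sizing a job, the node and GPU counts quoted above can be turned into a quick back-of-the-envelope check. The Python sketch below is illustrative only: the helper name nodes_needed is hypothetical, and it assumes a job simply fills whole nodes with GPUs, which may not match how the actual scheduler places work.

```python
import math

# Figures quoted in the article.
CRONUS_NODES = 16
CRONUS_GPUS_PER_NODE = 8
HOPS_GPUS_PER_NODE = 4


def nodes_needed(gpus_requested: int, gpus_per_node: int) -> int:
    """Smallest number of nodes that can supply the requested GPUs.

    Hypothetical helper: assumes GPUs are packed node by node,
    which a real scheduler may or may not do.
    """
    if gpus_requested < 1:
        raise ValueError("gpus_requested must be at least 1")
    return math.ceil(gpus_requested / gpus_per_node)


# Total GPUs on Cronus: 16 nodes with eight GPUs each.
total_cronus_gpus = CRONUS_NODES * CRONUS_GPUS_PER_NODE

print("Cronus GPUs in total:", total_cronus_gpus)
print("Nodes for 8 GPUs on Cronus:", nodes_needed(8, CRONUS_GPUS_PER_NODE))
print("Nodes for 8 GPUs on Hops:", nodes_needed(8, HOPS_GPUS_PER_NODE))
print("Nodes for 20 GPUs on Cronus:", nodes_needed(20, CRONUS_GPUS_PER_NODE))
```

Under these assumptions, a job that wants eight GPUs fits on a single Cronus node but would have to span two Hops nodes, which is where node-to-node performance starts to matter.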
Testing and validation recently completed

Cronus was installed in late 2025. After extensive testing, Steve and Jeff began inviting select groups to use the system in February to benchmark and validate real workflows. In early May, it became available to all members of the workforce.
Sandia’s Atlas team, which develops and maintains Sandia’s homegrown, locally hosted generative AI tool of the same name, was one of the first users. Two of the team’s larger AI models needed more than four GPUs per node to run, exceeding the maximum capacity of Hops.
“State-of-the-art GPUs allow for quick model exploration, greatly increase uptime and ultimately limit supply chain risks by keeping our entire tech stack in-house,” said Atlas developer Shane Poldervaart.
Modeling and simulation teams are beginning to use the new nodes as well.
“Though Sierra porting to the new Cronus cluster is in its early stages, we are confident this new machine and its even more powerful H200 GPUs will further accelerate the trend towards real time computational informed decision making across Sandia engineering disciplines,” said Nate Crane from the Computational Simulation center.
For now, Steve said, the message from the HPC team is that if your group has an AI workload, a hybrid AI-simulation workflow, or a compute-heavy problem you’ve been shelving for lack of the right platform, this is your invitation to bring it to the orchestra.
Members of the workforce can click here for details on the system and how to access it.