
Figure 1a. Traditional 2.5D Integration. Traditional 2.5D designs place components side-by-side, resulting in long-distance data transmission that increases power consumption and limits bandwidth scaling. While easier to cool, this horizontal layout creates large physical footprints and energy bottlenecks that challenge post-exascale performance.
Figure 1b. Advanced 3D Integration. Advanced 3D integration stacks memory directly on compute layers to significantly reduce data transmission distance and energy while maximizing available bandwidth. This vertical architecture requires innovative thermal strategies and next-generation hybrid copper bonding solutions to enable post-exascale computing efficiency. (Graphic courtesy of NVIDIA)
Sandia has embarked on a second round of collaboration with NVIDIA as part of the Advanced Simulation and Computing’s Advanced Memory Technology program. The AMT program brings together industry and NNSA laboratories—Sandia, Lawrence Livermore and Los Alamos—to inspire and support technology research and advancement in critical areas that impact ASC mission needs. Building on investments made in the previous round, this phase will focus on developing future multi-layer memory architectures that address the growing energy cost of moving data, while improving bandwidth to meet the requirements of the post-exascale computing era.
“Although much of the attention today focuses on the compute capabilities of chips to support AI, our experience shows we continue to need significant innovation in the memory subsystem,” said Simon Hammond, program director for advanced computing in the Office of Advanced Simulation & Computing at the NNSA. “Efficiently feeding each compute engine with data requires the most advanced memory technologies to ensure we can maximize the performance of each device.”
With DOE’s announcement of the Genesis Mission in the fall of 2025, this collaboration is positioned to respond to growing needs in artificial intelligence, while meeting current modeling and simulation demands. Future systems that combine traditional Mod-Sim and AI workflows to support the ASC mission will require accelerators that use advanced 3D integration of ultra-high bandwidth, low-latency and energy-efficient memory designs to meet aggressive application performance and efficiency targets outlined at the inception of the AMT program.
“The merging of modeling and simulation with AI for science and engineering has reached escape velocity and is now driving scientific progress around the world. Because performance of all parts of these hybrid workflows is directly dependent on memory technologies, accelerating the adoption of 3D stacked memories that can scale bandwidth, while simultaneously reducing power, is essential for extending the NNSA’s supercomputing leadership into the future,” said Dan Ernst, Senior Director of Supercomputing Products at NVIDIA.
The energy cost of moving data in and out of the processor is an industry-wide challenge that this collaboration is working to reduce. The farther the processor must reach to get the data, the more time and energy are required.
In current 2D designs, where components are mostly side-by-side, the distance can be quite large, but cooling solutions can be placed directly against the hot component. In a 3D design, where components such as high-bandwidth memory are stacked in layers on top of each other, the distance data must travel can be reduced significantly, but each layer produces its own heat. That heat can currently be dissipated only through a surface layer, which increases the complexity of the cooling design.
With current technology, the significant thermal complexities and manufacturing bottlenecks introduced by full 3D stacking would limit the performance of both AI and Mod-Sim workloads. This collaboration focuses on overcoming these limitations to unlock future performance and energy efficiency. Clay Hughes, principal member of technical staff, said, “The tighter integration of logic and memory in 3D packages creates significant opportunities, but important challenges remain in thermal management, power delivery and yield.”
By investing in early research and development and collaborating with industry partners like NVIDIA, the AMT program can advance future 2.5D and 3D memory architectures to address current and future performance requirements of ASC mission codes. In addition, researchers can explore advancements in power efficiency and cooling methodologies required for these future architectures. This early engagement is key to mitigating risks and ensuring the successful deployment of these technologies.
Though AMT’s primary focus currently remains on traditional Mod-Sim, better memory technologies will also have a significant impact on AI-driven applications like those in the Genesis Mission. The advancements in memory technology developed through this partnership will play a crucial role in shaping the future of high-performance computing, ensuring that the ASC mission is well-equipped to meet the challenges of tomorrow.
By taking a co-design approach, Sandia, Los Alamos, Lawrence Livermore and NVIDIA can ensure the most important Mod-Sim and AI workloads run at peak performance and energy efficiency on future architectures.
“The Tri-labs have had a long successful relationship with NVIDIA, leveraging multiple generations of accelerator technology for production ASC mission cycles,” said James H. Laros III, Senior Scientist and AMT program lead. “We are very happy to continue our leading-edge research and development efforts with NVIDIA so future technologies can likewise benefit evolving ASC mission requirements.”
May 5, 2026