On-node Resource Management for Supercomputers

Researchers at Sandia National Laboratories are designing an integrated approach to resource management on the supercomputer node. As DOE applications and their associated libraries grow more complex, too many software components compete for shared on-node resources: system memory and the processor cores on which to execute. Our resource management scheme will arbitrate and track these resources so that components coordinate their use, turning competition into cooperation. The approach combines an interface to Sandia’s DARMA asynchronous many-task (AMT) software, for policy decisions, with interfaces to the Kokkos manycore performance portability library and the Data Warehouse, for resource deployment, as clients of the resource manager.
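As a rough illustration of what arbitrating and tracking on-node resources might look like, here is a minimal Python sketch. It is not the Sandia design and does not use any DARMA, Kokkos, or Data Warehouse API: the class `NodeResourceManager`, its methods, and the client names are all hypothetical, and the only policy assumed is "grant a request if enough cores and memory remain, otherwise deny it".

```python
from dataclasses import dataclass, field


@dataclass
class Grant:
    """Resources currently held by one client (hypothetical record type)."""
    cores: int = 0
    memory_bytes: int = 0


@dataclass
class NodeResourceManager:
    """Hypothetical arbiter for one node's cores and memory."""
    total_cores: int
    total_memory_bytes: int
    grants: dict = field(default_factory=dict)  # client name -> Grant

    def available(self):
        """Return (free cores, free memory bytes) not held by any client."""
        used_cores = sum(g.cores for g in self.grants.values())
        used_mem = sum(g.memory_bytes for g in self.grants.values())
        return self.total_cores - used_cores, self.total_memory_bytes - used_mem

    def request(self, client, cores=0, memory_bytes=0):
        """Grant the request only if it fits in what is still free."""
        if cores < 0 or memory_bytes < 0:
            raise ValueError("requested amounts must be non-negative")
        free_cores, free_mem = self.available()
        if cores > free_cores or memory_bytes > free_mem:
            return False  # denied: the client must wait or ask for less
        grant = self.grants.setdefault(client, Grant())
        grant.cores += cores
        grant.memory_bytes += memory_bytes
        return True

    def release(self, client):
        """Return everything the client holds to the shared pool."""
        return self.grants.pop(client, Grant())
```

In this sketch, a component that asks for more cores than remain is denied rather than oversubscribing the node, and it can succeed later once another component releases its share. A real resource manager would presumably apply richer policies than this first-fit check.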

Lifecycle of Supercomputer Node Resources

Stephen Lecler Olivier, slolivi@sandia.gov

September 1, 2016