A Need for Better Management of Heterogeneous HPC Resources

Moreland, Kenneth D.; Atkins, Chuck

Achieving computational rates beyond the petascale requires an increasing amount of heterogeneity in our high-performance computing (HPC) hardware. This heterogeneity, in turn, means that a node of an HPC system can no longer be considered a monolithic resource. Rather, a node has many individual components, such as processors, cores, SMT threads, accelerators, and tiered memories, that must be further allocated and managed by... something. Currently, that something is an ad hoc mix of arguments and environment settings supplied when launching jobs. We follow the same process used on prior HPC systems with nodes of uniform components; unfortunately, the shims introduced to provide the additional specification of node-level components are inconsistent and unwieldy. As we shall see, even at our current moderate level of heterogeneity, our effective utilization of HPC software is hampered by poor resource management. Future systems will continue to grow in heterogeneity, in both the number and the types of resources. Our current approach to resource management cannot scale. We need a more cohesive approach to managing heterogeneous resources in HPC systems.