Evaluating energy and power profiling techniques for HPC workloads
Advanced power measurement capabilities are becoming available on large scale High Performance Computing (HPC) deployments. There exist several approaches to providing power measurements today, primarily through in-band (e.g. RAPL) and out-of-band measurements (e.g. power meters). Both types of measurement can be augmented with application-level profiling, however it can be difficult to assess the type and detail of measurement needed to obtain insight from the application power profile. This paper presents a taxonomy for classifying power profiling techniques on modern HPC platforms. Three HPC mini-applications are analyzed across three production HPC systems to examine the level of detail, scope, and complexity of these power profiles. We demonstrate that a combination of out-of-band measurement with in-band application region profiling can provide an accurate, detailed view of power usage without introducing overhead. This work also provides a set of recommendations for how to best profile HPC workloads.