Profiler Features: CPU time/HW counters
Low overhead: 1-10%, depending on the features enabled
Precise: not based on sampling
Function-level and basic-block-level profiling (R2.7)
-Mprof=func for function level profiling
-Mprof=lines profiles at the 'basic block' level
CPU time and hardware (HW) counters. (R2.7)
- CPU time is always collected for function-level profiling: both elapsed time and self-time (elapsed minus profiled children).
Optionally specify 2 HW counters to monitor at function level. (R2.7)
setenv PROFILE_COUNTERS PP_FLOPS,PP_RESOURCE_STALLS
Get 'self' and elapsed counter values by routine. (R2.7)
Optionally specify that one of the counters be used for line level profiling instead of using cpu time. (R2.7)
setenv PROFILE_COUNTERS PP_FLOPS,LINE,PP_RESOURCE_STALLS
List of HW counters with descriptions is on the web:
- http://www.sandia.gov/ASCI/Red/usage/perfeva.htm
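The "self" value described above (elapsed minus profiled children) can be illustrated with a small sketch. This is not the profiler's own code or output format. The routine names, the timings, and the `self_values` helper are all hypothetical, chosen only to show the arithmetic. The same subtraction would presumably apply to HW counter values reported by routine.

```python
# Hypothetical profile data: each routine has an elapsed value (seconds here,
# but it could equally be a HW counter total) and a list of profiled children.
profile = {
    "main":   {"elapsed": 10.0, "children": ["solve", "output"]},
    "solve":  {"elapsed": 7.5,  "children": ["dot"]},
    "dot":    {"elapsed": 3.0,  "children": []},
    "output": {"elapsed": 1.5,  "children": []},
}


def self_values(profile):
    """Return the self value per routine: elapsed minus the elapsed of its profiled children."""
    result = {}
    for name, entry in profile.items():
        child_total = sum(profile[child]["elapsed"] for child in entry["children"])
        result[name] = entry["elapsed"] - child_total
    return result


if __name__ == "__main__":
    for name, value in self_values(profile).items():
        print(f"{name:8s} elapsed={profile[name]['elapsed']:5.1f}  self={value:5.1f}")
```

In this made-up tree the self values sum to the elapsed value of the top-level routine, which is a quick sanity check when reading a function-level profile.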