Performance and Modeling Tools (SST & PerfMiner)
Abstract not provided.
Abstract not provided.
For the FY15 ASC L2 Trilab Codesign milestone Sandia National Laboratories performed two main studies. The first study investigated three topics (performance, cross-platform portability and programmer productivity) when using OpenMP directives and the RAJA and Kokkos programming models available from LLNL and SNL respectively. The focus of this first study was the LULESH mini-application developed and maintained by LLNL. In the coming sections of the report the reader will find performance comparisons (and a demonstration of portability) for a variety of mini-application implementations produced during this study with varying levels of optimization. Of note is that the implementations utilized including optimizations across a number of programming models to help ensure claims that Kokkos can provide native-class application performance are valid. The second study performed during FY15 is a performance assessment of the MiniAero mini-application developed by Sandia. This mini-application was developed by the SIERRA Thermal-Fluid team at Sandia for the purposes of learning the Kokkos programming model and so is available in only a single implementation. For this report we studied its performance and scaling on a number of machines with the intent of providing insight into potential performance issues that may be experienced when similar algorithms are deployed on the forthcoming Trinity ASC ATS platform.
Proceedings of the 2015 IEEE/ACM International Symposium on Nanoscale Architectures, NANOARCH 2015
We discuss a new approach to computing that retains the possibility of exponential growth while making substantial use of the existing technology. The exponential improvement path of Moore's Law has been the driver behind the computing approach of Turing, von Neumann, and FORTRAN-like languages. Performance growth is slowing at the system level, even though further exponential growth should be possible. We propose two technology shifts as a remedy, the first being the formulation of a scaling rule for scaling into the third dimension. This involves use of circuit-level energy efficiency increases using adiabatic circuits to avoid overheating. However, this scaling rule is incompatible with the von Neumann architecture. The second technology shift is a computer architecture and programming change to an extremely aggressive form of Processor-In-Memory (PIM) architecture, which we call Processor-In-Memory-and-Storage (PIMS). Theoretical analysis shows that the PIMS architecture is compatible with the 3D scaling rule, suggesting both immediate benefit and a long-term improvement path.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.