Daniel A. Reed – Keynote

Keynote: The Challenge of Scale
Daniel A. Reed
Chancellor’s Eminent Professor
University of North Carolina at Chapel Hill
Director, Renaissance Computing Institute
USA
As node counts for terascale systems grow to tens of thousands, with petascale systems likely to contain hundreds of thousands of nodes, we must rethink traditional assumptions about software scaling and manageability, and about hardware reliability. These challenges are exacerbated by the appearance of multicore chips, two-way and four-way today, with hundred-way designs projected. In addition, a tsunami of new experimental and computational data poses equally vexing problems in analysis, transport and visualization. Collectively, these scaling trends create power, cooling, reliability and performance challenges that will require new approaches if we are to realize the potential of petascale systems. Our thesis is that the “two worlds” of software, distributed systems and parallel systems, must meet, combining ideas from each, if we are to build resilient systems. This talk will describe recent experiments on power and environmental monitoring, statistical sampling and reliability that suggest possible solutions.