Dr. Horst Simon – Talk

Capability Machines Panel

Panel Chair: Dr. Horst Simon, Assoc. Lab Director for CS at Berkeley Lab, USA

The recent NRC study on the "Future of Supercomputing" defines capability computing as "enabling the solution of problems that cannot be otherwise solved in reasonable amount of time. … Capability computing also enables the solution of problems with real-time constraints." In my presentation I will describe the workload at NERSC and how NERSC addresses the requirements of the Office of Science user community. Although NERSC serves a large number of users (2400), I will argue that it nevertheless provides capability computing. I will also discuss the metrics that NERSC uses to ensure that we meet capability requirements.

Questions for the Panelists:

Panelists,
(From the NRC report): "Two commonly used measures of the overall productivity of high end computing platforms are capacity and capability. The largest supercomputers are used for capability or turnaround computing where the maximum processing power is applied to a single problem. The goal is to solve a larger problem, or to solve a single problem in a shorter period of time. Capability computing enables the solution of problems that cannot otherwise be solved in a reasonable period of time (for example, by moving from a two-dimensional to a three-dimensional simulation, using finer grids, or using more realistic models). Capability computing also enables the solution of problems with real-time constraints (e.g., intelligence processing and analysis). The main figure of merit is time to solution. Smaller or cheaper systems are used for capacity computing, where smaller problems are solved. Capacity computing can be used to enable parametric studies or to explore design alternatives; it is often needed to prepare for more expensive runs on capability systems. Capacity systems will often run several jobs simultaneously. The main figure of merit is sustained performance per unit cost. There is often a trade-off between the two figures of merit, as further reduction in time to solution is achieved at the expense of increased cost per solution; different platforms exhibit different trade-offs. Capability systems are designed to offer the best possible capability, even at the expense of increased cost per sustained performance, while capacity systems are designed to offer a less aggressive reduction in time to solution but at a lower cost per sustained performance."

With this in mind, please answer the following questions and address the following topics in your 10-minute presentation:

  1. Briefly describe the capability resources at your site.
  2. Describe, by example, one or two applications where your unique capability platform was critical in providing a solution.
  3. Do you agree with the above distinction between capability and capacity? If not, how would you define these terms?
  4. Is the distinction between capability and capacity useful?
  5. What metrics do you use to measure "capability"?
  6. How could we as a community improve these metrics?
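
As a possible starting point for questions 5 and 6, the two figures of merit named in the NRC excerpt (time to solution, and sustained performance per unit cost) can be put side by side in a short Python sketch. Everything in it is an illustrative assumption: the `Platform` class, the function names, the single fixed-size problem, and all numbers are hypothetical and are not taken from the NRC report or from any real system.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    name: str
    sustained_tflops: float  # hypothetical sustained rate on the problem, TFLOP/s
    cost_per_hour: float     # hypothetical operating cost, arbitrary currency units


def time_to_solution_hours(work_tflop: float, p: Platform) -> float:
    """Figure of merit for capability: hours needed to finish one problem."""
    return work_tflop / p.sustained_tflops / 3600.0


def sustained_perf_per_cost(p: Platform) -> float:
    """Figure of merit for capacity: sustained TFLOP/s per unit of hourly cost."""
    return p.sustained_tflops / p.cost_per_hour


def cost_per_solution(work_tflop: float, p: Platform) -> float:
    """Total cost of running one problem to completion on platform p."""
    return time_to_solution_hours(work_tflop, p) * p.cost_per_hour


# Hypothetical platforms and problem size, chosen only to show the trade-off.
capability = Platform("capability", sustained_tflops=10.0, cost_per_hour=1000.0)
capacity = Platform("capacity", sustained_tflops=1.0, cost_per_hour=50.0)
WORK_TFLOP = 360_000.0  # total floating-point work of the single problem

for p in (capability, capacity):
    print(
        p.name,
        time_to_solution_hours(WORK_TFLOP, p),
        sustained_perf_per_cost(p),
        cost_per_solution(WORK_TFLOP, p),
    )
```

With these made-up numbers the capability platform finishes the problem sooner but at a higher cost per solution, while the capacity platform delivers more sustained performance per unit cost. That is the trade-off the excerpt describes; real metrics would also have to account for factors this sketch ignores, such as queue wait time and how well the application scales.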