ORNL-Heterogeneous Distributed Computing Research

May 5th, 1998

Author: Jack Dongarra


Slide 1 – ORNL Heterogeneous Distributed Computing Research

  • Collaboration and “Grid” software
    • Electronic Notebook
    • Cumulvs – remote steering
    • NetSolve – transparent network computing
  • Numerical Linear Algebra
    • ScaLAPACK – parallel NLA library
    • ATLAS – automatic NLA optimization
  • Heterogeneous Distributed Computing
    • PVM 3.4 – Unix and NT support
    • Harness – Next generation beyond PVM

Slide 2 – DOE 2000 Electronic Notebook Project

  • Goal: Design a common (open) Notebook Architecture
    • extensible as technology advances
    • interoperable with other notebook viewers
    • customizable for unique inputs of a given project
    • (Web Browser based)
    • Features: accepts anything a web page can hold and includes a Java applet sketchpad
    • Can be shared w/ remote collaborators
    • Software to set up your own notebook is available: www.epm.ornl.gov/~geist/java/applets/enote/
    • In use by over 100 groups in research, education, and industry

Slide 3 – Cumulvs – Collaborative Computational Steering

  • Supports Multiple Viewers
  • Integrates w/ existing viz/VR interfaces
    • AVS, vtk, EigenVR, etc.
  • Dynamic Attachment and Detachment
  • Collaborative Steering of Simulation
  • Automatic fault detection
  • Checkpoint/Restart of distributed run
  • Works with MPI and PVM applications
    • Environment for easy integration of interactive visualization, computational steering, and fault tolerance into applications
    • Goal is to accelerate the process of Scientific Discovery
    • Remote steering of a Distributed Application www.epm.ornl.gov/cs/cumulvs.html

Slide 4 – NetSolve – Client/Server/Agent-Based Computing

  • Client-Server design
  • Network-enabled solvers
  • Seamless access to resources
  • Non-hierarchical system
  • Load Balancing
  • Fault Tolerance
  • Heterogeneous Environment Supported
    • Easy-to-use tool to provide efficient and uniform access to a variety of scientific packages on UNIX platforms
    • NetSolve Client
    • NetSolve Agent
    • Network Resources
    • Software Repository
    • Software is available www.cs.utk.edu/netsolve/
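
The client/agent/server split listed above can be illustrated with a toy model. This is a minimal sketch in Python, not NetSolve's actual API: the names `Server`, `Agent`, `register`, `candidates`, and `solve` are hypothetical, and "load" is assumed to be nothing more than the number of jobs a server has run so far.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Server:
    """A hypothetical compute server offering named solvers."""
    name: str
    solvers: Dict[str, Callable]
    load: int = 0       # assumption: load = number of jobs run so far
    alive: bool = True

    def run(self, problem, *args):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.load += 1
        return self.solvers[problem](*args)


@dataclass
class Agent:
    """Keeps track of servers and ranks them for each request."""
    servers: List[Server] = field(default_factory=list)

    def register(self, server):
        self.servers.append(server)

    def candidates(self, problem):
        """Servers that offer `problem`, least loaded first."""
        able = [s for s in self.servers if problem in s.solvers]
        return sorted(able, key=lambda s: s.load)


def solve(agent, problem, *args):
    """Client call: try servers in the agent's order, failing over on error."""
    for server in agent.candidates(problem):
        try:
            return server.name, server.run(problem, *args)
        except ConnectionError:
            continue  # fault tolerance: move on to the next candidate
    raise RuntimeError(f"no server could solve {problem!r}")
```

The client never names a server; it asks the agent for a ranked list and takes the first one that answers, which is where the load balancing and fault tolerance bullets meet in this sketch.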

Slide 5 – ScaLAPACK – Numerical Linear Algebra Library

  • Follow-on to LAPACK
  • Designed for MPPs – First math lib to do this
  • Numerical software that will work on a heterogeneous platform
  • Already in use by Cray, IBM, HP-Convex, Fujitsu, NEC, NAG, IMSL
    • Tailor performance & provide support
  • HPF interface available
  • “Out of Core” implementation
  • Still under development:
    • Java version, sparse direct routines

Slide 6 – ATLAS Project – Automatic Generation of Optimized NLA Kernels

  • Automatic generation of BLAS for RISC architectures.
  • Code generator is about 6K lines and takes about 1 hr to run.
    • Done once for a new architecture.
  • Extension of BLAS to Sparse, Parallel and Mixed Precision Operations.
  • Extension of ATLAS to higher level operations.
    • SMPs
    • Pentium
    • SGI/Vector
    • DOD DSP
  • www.cs.utk.edu/atlas
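
The idea of tuning a kernel once per architecture can be illustrated by timing several variants of the same routine and keeping the fastest. The following is a minimal sketch in Python under stated assumptions: it searches only over a block size for a blocked matrix multiply, whereas the real ATLAS generator emits and times compiled code over a much larger parameter space. The function names `blocked_matmul` and `tune_block_size` and the candidate block sizes are hypothetical.

```python
import time


def blocked_matmul(a, b, n, nb):
    """Multiply two n x n matrices (lists of lists) using nb x nb blocks."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, nb):
        for kk in range(0, n, nb):
            for jj in range(0, n, nb):
                for i in range(ii, min(ii + nb, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + nb, n)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + nb, n)):
                            row_c[j] += aik * row_b[j]
    return c


def tune_block_size(n=48, candidates=(4, 8, 16, 48), repeats=2):
    """Time each candidate block size; return (best_nb, timings)."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for nb in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, nb)
            best = min(best, time.perf_counter() - start)
        timings[nb] = best  # keep the fastest of the repeated runs
    return min(timings, key=timings.get), timings
```

Calling `tune_block_size()` returns the block size that ran fastest on the current machine, along with the measured times. The winning value depends on the hardware, which is why such a search is run once for each new architecture.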

Slide 7


Vendor: ATLAS


Slide 8 – TORC – Tennessee / Oak Ridge Cluster

  • Tennessee / Oak Ridge Cluster
  • Combining Wide-Area Pentium clusters with high-speed networking
  • ORNL Myrinet
    • 20x 266 MHz Pentium II
    • Fast Ether Giganet
  • UT
    • 8x 300 MHz Pentium II
    • Giganet Myrinet
    • 18x 200 MHz Pro
  • ATM
  • Mixed Linux/NT

Slide 9 – PVM 3.4 – Latest Version allows NT and Unix Clustering

  • Computing across mixed Unix and NT clusters
  • Communication Context in dynamic environment
    • To allow multiple applications to safely interact
  • Persistent messages using tuple space
    • For tool and app discovery and attachment
  • Message handlers to allow environment extensions
  • User defined tracing to create performance tools
  • www.epm.ornl.gov/pvm
  • New capabilities for application developers
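
How communication contexts and persistent messages fit together can be illustrated with a small in-memory model. This is a minimal sketch in Python and does not use the PVM library: the class `MessageBox` and its methods `new_context`, `put`, `get`, and `delete` are hypothetical names chosen for illustration.

```python
class MessageBox:
    """Toy persistent message store keyed by (context, name)."""

    def __init__(self):
        self._store = {}
        self._next_context = 0

    def new_context(self):
        """Hand out a fresh context id so applications do not collide."""
        self._next_context += 1
        return self._next_context

    def put(self, context, name, message):
        """Store a message; it stays until it is explicitly deleted."""
        self._store[(context, name)] = message

    def get(self, context, name):
        """Read without removing, so later readers still find the message."""
        return self._store.get((context, name))

    def delete(self, context, name):
        """Remove a message and return it, or None if it was absent."""
        return self._store.pop((context, name), None)
```

A tool that starts after an application can still find what the application posted, because reading does not consume the message. Two applications that use the same name under different contexts never see each other's entries.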

Slide 10 – HARNESS – Next Generation beyond PVM

  • Distributed plug-in interface
    • allows user to customize, optimize, and extend the environment's features to match his application's needs.
  • Distributed Symmetric Control
    • Ultimate fault tolerance. Mobile applications.
  • Multiple distributed virtual machines
    • that can collaborate, merge, or split.
  • Dynamic application/tool plug-in
    • allows multiple parallel applications to attach, interact, and detach. Used in our next-generation Cumulvs.
    • Pushing the Frontiers of Parallel Virtual Machines
    • www.epm.ornl.gov/harness
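
The plug-in and merge ideas above can be illustrated with a toy virtual machine whose services are added and removed at run time. This is a minimal sketch in Python, not the Harness interface: the class `VirtualMachine` and its methods `plug_in`, `unplug`, `call`, and `merge` are hypothetical, and the rule that the first machine's plug-ins win on a name clash is an assumption made for the example.

```python
class VirtualMachine:
    """Toy virtual machine with hot-pluggable services."""

    def __init__(self, name, hosts=()):
        self.name = name
        self.hosts = set(hosts)
        self.plugins = {}

    def plug_in(self, service, handler):
        """Add or replace a service while the machine is running."""
        self.plugins[service] = handler

    def unplug(self, service):
        """Remove a service; unknown names are ignored."""
        self.plugins.pop(service, None)

    def call(self, service, *args):
        if service not in self.plugins:
            raise LookupError(f"{service!r} is not plugged in on {self.name}")
        return self.plugins[service](*args)

    def merge(self, other):
        """Return a new machine with the hosts and plug-ins of both."""
        merged = VirtualMachine(f"{self.name}+{other.name}",
                                self.hosts | other.hosts)
        # Assumption: on a name clash, this machine's plug-in wins.
        merged.plugins = {**other.plugins, **self.plugins}
        return merged
```

For example, one machine might carry a hypothetical `"spawn"` service and another a `"viz"` service; after `merge`, the combined machine answers both calls across the union of the hosts.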

Slide 11

  • Generalized
  • Plug-in
  • Machine
  • Multi-level hot-pluggability allows the user to adapt to a dynamic environment
  • Harness
  • Dynamically adaptable and extensible VM