Data Storage/Management & Visualization

March 31st, 1998


Slide 1 – Data Storage / Management & Visualization

Heermann-1

Charleston SOS Workshop


Slide 2 – Major Topics Discussed

  • Parallel File Systems
  • Apps I/O Libraries
  • I/O Tradeoffs
  • Potential I/O Solutions

Slide 3 – Why Generate Output?

  • Internal Uses
    • Checkpoint
    • Postprocessing
    • Interprocess/Partition Communication
  • External Uses
    • Reuse of data by other applications/users
    • Visualization
    • Archival Storage
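The checkpoint use above can be sketched as a periodic state dump with restart. This is a minimal sketch, not the workshop's implementation; the state layout and the `checkpoint`/`restart` names are invented for illustration:

```python
import json, os

# Hypothetical sketch of the checkpoint use: periodically dump application
# state so a run can restart after a failure.  Write-then-rename keeps the
# previous checkpoint valid if the job dies mid-write.  State layout and
# function names are illustrative, not from the slides.
def checkpoint(state: dict, path: str) -> None:
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)   # atomic on POSIX: old checkpoint survives a crash

def restart(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```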

Slide 4 – Parallel File Systems

  • PMF – Parallel Media Files
    • One file spread across many disks
  • PNF – Parallel Node Files
    • Many nodes access one file
  • MNMF – Many Nodes to Many Files
    • Many nodes access many files
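A PMF-style layout (one logical file spread over many disks) can be sketched as round-robin block striping. In this sketch, directories stand in for separate media, and the block size and `stripe_write`/`stripe_read` names are invented for illustration:

```python
import os

# Hypothetical sketch: stripe one logical file across several "disks"
# (directories standing in for separate media), as in a parallel media
# file system.  Block size is tiny purely for demonstration.
BLOCK = 4  # bytes per stripe block

def stripe_write(data: bytes, disks: list[str], name: str) -> None:
    """Round-robin the file's blocks across the disks."""
    for disk in disks:
        os.makedirs(disk, exist_ok=True)
    for i in range(0, len(data), BLOCK):
        block_no = i // BLOCK
        disk = disks[block_no % len(disks)]
        with open(os.path.join(disk, f"{name}.{block_no}"), "wb") as f:
            f.write(data[i:i + BLOCK])

def stripe_read(disks: list[str], name: str) -> bytes:
    """Reassemble the blocks in order to recover the logical file."""
    out, block_no = b"", 0
    while True:
        path = os.path.join(disks[block_no % len(disks)], f"{name}.{block_no}")
        if not os.path.exists(path):
            return out
        with open(path, "rb") as f:
            out += f.read()
        block_no += 1
```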

Slide 5 – Alphabet Soup of File Systems

  • PFS (Intel)
  • PIOFS/GPFS (IBM)
  • XFS/XLV (SGI)
  • AdvFS (DEC)
  • DFS, Galley, Passion, Veritas, “Sun FS”, Unicos, UFS, . . .
  • Merits and Demerits of DFS, NFS, HDF
    • All must be tailored to parallel SOS needs

Slide 6 – Role of Library Linked to Application

  • (e.g. Silo, Exodus, HDF,. . . )
  • Collective I/O
  • Ties to the application-specific view of data
  • Buffering
    • Async I/O
    • Blocking to match hardware needs
  • Data Format Conversion
    • Must consider performance
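The buffering role above, coalescing small application writes into blocks that match the hardware, can be sketched as follows; `BlockingWriter` and its block size are invented for illustration:

```python
import io

# Hypothetical sketch of library-side buffering: small application writes
# are coalesced and flushed in fixed-size blocks matched to the hardware,
# one role of an I/O library linked to the application.
class BlockingWriter:
    def __init__(self, sink, block_size=8):
        self.sink = sink              # any object with a .write(bytes) method
        self.block_size = block_size  # would match the hardware block size
        self.buf = b""

    def write(self, data: bytes) -> None:
        self.buf += data
        while len(self.buf) >= self.block_size:
            self.sink.write(self.buf[:self.block_size])  # one full block out
            self.buf = self.buf[self.block_size:]

    def flush(self) -> None:
        if self.buf:                  # final partial block
            self.sink.write(self.buf)
            self.buf = b""
```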

Slide 7 – Role of Library Linked to Application cont'd

  • Restructuring of Data
    • M to N mapping
    • Subsetting
  • Compression/Decompression
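The restructuring roles above can be sketched as an M-to-N remapping of partitions plus per-partition compression. The function names, partition counts, and data are illustrative assumptions, not from the slides:

```python
import zlib

# Hypothetical sketch of two restructuring roles: remapping data written
# by M producers into N consumer partitions, and compressing each output
# partition for transport or storage.
def m_to_n(parts_m: list[list[int]], n: int) -> list[list[int]]:
    """Flatten M input partitions and re-split evenly into N partitions."""
    flat = [x for part in parts_m for x in part]
    size = -(-len(flat) // n)  # ceiling division
    return [flat[i * size:(i + 1) * size] for i in range(n)]

def pack(partition: list[int]) -> bytes:
    """Compress one partition of small integer values."""
    return zlib.compress(bytes(partition))
```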

Slide 8 – Speed vs. Flexible I/O

  • Intermediate machine to convert
  • Use the most capable machine to restructure data
  • Different usage model (experiment)
    • Careful planning of runs
  • Use of funds
    • $Cnodes + $Cnet + $IOsys
    • What is the balance on current MPPs?
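The balance question can be made concrete with a small arithmetic sketch: for a fixed budget split among compute nodes, network, and I/O system, what fraction goes to I/O? The dollar figures below are invented, not workshop data:

```python
# Hypothetical sketch of the cost-balance question.  All figures are
# illustrative; the point is only the split of a fixed budget among
# compute nodes ($Cnodes), network ($Cnet), and I/O system ($IOsys).
def io_fraction(c_nodes: float, c_net: float, c_iosys: float) -> float:
    """Fraction of total machine cost devoted to the I/O system."""
    return c_iosys / (c_nodes + c_net + c_iosys)
```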

Slide 9 – I/O Tradeoffs

  • Data volumes – unloading in real time
  • Closeness to real-time visualization
  • Can you predict users' data requests?
    • How to structure output
  • How many user requests must be satisfied?

Slide 10 – I/O Tradeoffs cont'd

  • How many users?
    • Where does the data reside?
    • How many copies (and where, physically)?
  • Funds
  • How many retrievals?
    • How much effort should be directed to structuring the data?
  • Different structures for different applications

Slide 11 – Potential Solutions to I/O Problems

  • “Nominal C-plant” I/O
    • 10 compute nodes per I/O node
    • Parallel Media File System
      • Block Server
      • Directory Server
      • Optimized for internal use, but can also be accessed externally
    • Control & Data Separate
    • Desire no local disk on compute nodes
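The block-server/directory-server split above, with control (name-to-block metadata) kept separate from data, can be sketched as two in-memory services; all class and method names here are invented:

```python
# Hypothetical sketch of the split described above: a directory server
# maps file names to block lists (control path), while a block server
# holds the data itself (data path).  Names are invented for illustration.
class BlockServer:
    def __init__(self):
        self.blocks = {}          # block id -> bytes
        self.next_id = 0

    def put(self, data: bytes) -> int:
        self.blocks[self.next_id] = data
        self.next_id += 1
        return self.next_id - 1

    def get(self, block_id: int) -> bytes:
        return self.blocks[block_id]

class DirectoryServer:
    def __init__(self):
        self.files = {}           # file name -> list of block ids

    def record(self, name: str, block_ids: list[int]) -> None:
        self.files[name] = block_ids

    def lookup(self, name: str) -> list[int]:
        return self.files[name]
```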

Slide 12

  • “Nominal C-plant” I/O cont'd
    • “Read & Broadcast” capability
    • Checkpoint mode
    • Users have private data or environment to be attached
    • Work still in progress
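The read-and-broadcast capability, where one node reads from storage a single time and fans the data out to all compute nodes instead of each node issuing its own read, can be sketched as follows; the function and node representation are invented for illustration:

```python
# Hypothetical sketch of "read & broadcast": a designated reader pulls
# the data from the I/O system once and distributes it to every node
# over the interconnect, avoiding N redundant storage reads.
def read_and_broadcast(storage_read, nodes: list[dict]) -> None:
    data = storage_read()        # single read from the I/O system
    for node in nodes:
        node["input"] = data     # fan-out to each compute node
```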

Slide 13 – Potential Solutions to I/O Problems cont'd

  • Third I/O Interconnect
  • control – boot, diagnostics
  • “backplane” – data net for computation
  • I/O network
    • Data path between Compute and I/O nodes
    • I/O coordination & metadata
    • To effect “double-headedness” (i.e., non-compute-node access to storage media)
    • Increase compute node determinism

Slide 14 – Potential Solutions to I/O Problems cont'd

  • Integrated Archival Storage System
  • Specialized I/O nodes
    • One DIGITAL Server 8000 per 4-10 scalable units
  • I/O nodes integrated at the scalable unit level

Slide 15 – Visualization

  [Diagram: ASCI machine(s) → data server (buffer) → seamless transmission to visualization]

Slide 16 – Scalable Visualization



Slide 17 – Sandia ASCI RED

