Ron A. Oldfield

Scientific Machine Learning

raoldfi@sandia.gov

(505) 284-9153

Sandia National Laboratories, New Mexico
P.O. Box 5800
Albuquerque, NM 87185-1327

Biography

I manage the Scalable Analysis and Visualization department at Sandia National Laboratories. I joined Sandia in 1992 as a student intern from the University of New Mexico. After receiving my B.S. from UNM in 1993, I joined the Computational Sciences department at Sandia and worked on a number of interesting projects, writing programs for molecular visualization, data clustering, parallel acoustic wave propagation, and seismic imaging. In 1997, I left Sandia to attend graduate school at Dartmouth College, where I received a Ph.D. from the computer science department in May 2003. Later in 2003, I returned to Sandia to join the scalable system software department. Since then, I have been actively engaged in a number of research areas, including system software, parallel file systems, scalable I/O libraries and middleware, resilience, performance modeling, and advanced architectures for analytics. In addition to managing the Scalable Analysis and Visualization department, I manage the ASC/CSSE projects in I/O and storage and in scalable data analysis, and I am the project lead for the ASC/ATDM data warehouse component.

Education

  • Ph.D., Computer Science, Dartmouth College, 2003
  • B.S., Computer Science, University of New Mexico, 1993

Publications

Ron A. Oldfield, (2022). The ASC Advanced Machine Learning Initiative at Sandia National Laboratories ASC/CEA Collaboration Meeting Document ID: 1539755

Ron A. Oldfield, Sharlotte Lorraine Bolyard Kramer, Ahmad Rushdi, Erin Acquesta, John M Emery, Paul Allen Kuberry, Jaideep Ray, Sarah Ackerman, Eric Christopher Cyr, Gary Joseph Saavedra, Clayton Hughes, Suma George Cardwell, John Darby Smith, (2021). The ASC Advanced Machine Learning Initiative at Sandia National Laboratories: FY21 Accomplishments and FY22 Plans ASC AMLI Program Review Document ID: 1381086

Ron A. Oldfield, Steven J. Plimpton, James H. Laros, David Zoeller Poliakoff, Andrew (LANL) Sornborger, (2021). ASC L2 Milestone 2840 Review Memo https://www.osti.gov/search/identifier:1825628 Document ID: 1355876

Ron A. Oldfield, (2021). The ASC Advanced Machine Learning Initiative at Sandia National Laboratories ASC PI Meeting https://www.osti.gov/search/identifier:1869391 Document ID: 1307089

Ron A. Oldfield, John T. Feddema, Justin T Newcomer, (2021). The ASC Advanced Machine Learning Initiative at Sandia National Laboratories ASC Advanced Machine Learning Initiative Workshop Document ID: 1294713

Craig D. Ulmer, Nathan D. Fabian, Todd Henry Kordenbrock, Shyamali Mukherjee, Ron A. Oldfield, Gary J. Templet, (2021). ATDM Data Management FY2015: Data Warehouse Progress Report https://www.osti.gov/search/identifier:1770714 Document ID: 342992

Ron A. Oldfield, Michael Wolf, Ronald B. Brightwell, (2020). ECP Capability Assessment Report for Software Technologies — Kokkos, Kokkos Kernels, VTK-m, Operating Systems and On-Node Runtimes https://www.osti.gov/search/identifier:1717885 Document ID: 1230336

Gary J. Templet Jr., Matthew R. Glickman, Todd Henry Kordenbrock, Scott Larson Nicoll Levy, Gerald Fredrick Lofstead, Jeff Mauldin, Thomas Jay Otahal, Craig D. Ulmer, Patrick Widener, Ron A. Oldfield, (2020). FY20 CSSE L2 Milestone 7186 Completion of L2 Milestone 7186 https://www.osti.gov/search/identifier:1820290 Document ID: 1196144

Gary J. Templet Jr., Matthew R. Glickman, Todd Henry Kordenbrock, Scott Larson Nicoll Levy, Gerald Fredrick Lofstead, Jeff Mauldin, Thomas Jay Otahal, Craig D. Ulmer, Patrick Widener, Ron A. Oldfield, (2020). Data Services for Visualization and Analysis – ASC Level II Milestone (7186) https://www.osti.gov/search/identifier:1663267 Document ID: 1196150

Douglas Michael Pase, Anthony Michael Agelastos, Gary Lawson, Ron A. Oldfield, Joel O. Stevenson, Scott A. Warnock, (2020). Characterizing Resource Utilization on Production HPC Platforms to Inform Sustainable Policy, Procurements, and Development Tri-lab Advanced Simulation & Computing Sustainable Scientific Software Conference Document ID: 1103363

Tom Peterka, Deborah Bard, Janine Camille Bennett, E. Wes Bethel, Ron A. Oldfield, Line Pouchard, Christine Sweeney, Matthew Wolf, (2019). ASCR Workshop on In Situ Data Management https://www.osti.gov/search/identifier:1571714 Document ID: 1031326

Gabrielle Trujillo, Daniel Z. Turner, Ronald B. Brightwell, Ron A. Oldfield, Robert L. Clay, (2019). September 2019 ECP ST Project Review ECP-ST Review https://www.osti.gov/search/identifier:1646043 Document ID: 1032128

Ron A. Oldfield, (2019). Data Science and Computer Science Research at Sandia National Laboratories Georgia Tech College of Computing Graduate Student Orientation: Sandia Information Session https://www.osti.gov/search/identifier:1645816 Document ID: 998566

Ron A. Oldfield, (2019). Data Science and Computer Science R&D at SNL Sandia Academic Alliance Event — Sandia Day https://www.osti.gov/search/identifier:1643815 Document ID: 902429

Karla Weaver, Ron A. Oldfield, Huei Eliot Fang, Michael Gehl, Richard P. Muller, (2019). Sandia Day @ Georgia Tech Technical Breakout Sessions Sandia Day @ Georgia Tech https://www.osti.gov/search/identifier:1643814 Document ID: 902350

Steven J. Owen, Christopher Siefert, Craig Michael Vineyard, Ron A. Oldfield, (2019). SNL Data and Visualization: ML Projects at Sandia Exascale Computing Annual Meeting https://www.osti.gov/search/identifier:1591994 Document ID: 902257

Hemanth Kolla, Ron A. Oldfield, Thomas Jay Otahal, Gavin Matthew Baker, Jeffrey A. Mauldin, Tamara G. Kolda, Kenneth D. Moreland, (2018). SNL ATDM: In-situ Compression with ParaView/TuckerMPI ECP Annual Meeting https://www.osti.gov/search/identifier:1807003 Document ID: 901084

Ron A. Oldfield, Craig D. Ulmer, Gregory D. Sjaardema, Hemanth Kolla, Kenneth D. Moreland, (2018). October 2018 ECP ST Project Review: STDV04-SNL ATDM Data and Visualization Projects WBS 2.3.4.04 ECP Software Technologies Review Document ID: 876488

Ron A. Oldfield, (2018). Machine Learning for High-Performance Computing at Sandia ASC/CEA Collaboration Meeting Document ID: 807644

Ron A. Oldfield, Craig D. Ulmer, Gregory D. Sjaardema, (2018). ECP ST Capability Assessment Report SNL ATDM Data https://www.osti.gov/search/identifier:1528813 Document ID: 784918

Ron A. Oldfield, Vitus J. Leung, Warren Leon Davis, Craig Michael Vineyard, (2018). Machine Learning for High-Performance Computing at Sandia Applied Computer Science Technical Exchange Meeting Document ID: 760505

Steven J. Owen, Timothy Malcolm Shead, Shawn Martin, Teddy D. Blacker, Ron A. Oldfield, (2018). Machine Learning for Automated Simulation Model Preparation SciML2018: DOE ASCR Workshop on Scientific Machine Learning Document ID: 738595

Ron A. Oldfield, Craig D. Ulmer, Patrick Widener, Harry Lee Ward, (2018). SPARC: Demonstrate burst-buffer-based checkpoint/restart on ATS-1 https://www.osti.gov/search/identifier:1417577 Document ID: 739064

Ron A. Oldfield, Craig D. Ulmer, Kenneth D. Moreland, (2017). December 2017 ECP ST Project Review: SNL ATDM Data and Visualization Projects ECP Project Review Document ID: 738217

Robert J. Hoekstra, Simon David Hammond, Karl Scott Hemmert, Ann C. Gentile, Ron A. Oldfield, Mike Lang, Steve Martin, (2017). Final Review of FY17 ASC CSSE L2 Milestone #6018 entitled “Analyzing Power Usage Characteristics of Workloads Running on Trinity” https://www.osti.gov/search/identifier:1395433 Document ID: 682713

Craig D. Ulmer, Ron A. Oldfield, Todd Henry Kordenbrock, Scott Larson Nicoll Levy, Gerald Fredrick Lofstead, Shyamali Mukherjee, Gary J. Templet, Patrick Widener, (2017). ATDM Data Warehouse: Data Management Services for Exascale Computing Sandia CIS ERB https://www.osti.gov/search/identifier:1466487 Document ID: 670434

Ron A. Oldfield, (2017). Data Science R&D at Sandia National Laboratories Academic Alliance Early Career Faculty Day Document ID: 659103

Ron A. Oldfield, (2017). NNSA/ASC and CEA/DAM System Software Collaboration – Spring 2017 Update NNSA/ASC – CEA/DAM Computing Sciences Collaboration Workshop Document ID: 612908

Seung Woo (UMass Lowell) Son, Saba (Fermi National Accelerator Laboratory) Sehrish, Wei-keng (Northwestern University) Liao, Ron A. Oldfield, Alok (Northwestern University) Choudhary, (2017). Reducing I/O variability using dynamic I/O path characterization in petascale storage systems Journal of Supercomputing https://www.osti.gov/search/identifier:1356839 Document ID: 612303

Craig D. Ulmer, Todd Henry Kordenbrock, Scott Larson Nicoll Levy, Gerald Fredrick Lofstead, Shyamali Mukherjee, Gregory D. Sjaardema, Gary J. Templet, Patrick Widener, Ron A. Oldfield, (2017). ATDM Data Warehouse Exascale Computing Project Annual Meeting https://www.osti.gov/search/identifier:1427407 Document ID: 577888

Ron A. Oldfield, (2016). Summary of Integrated HPC Visualization and Analysis Capabilities Briefing to Cloudera Document ID: 532224

Ron A. Oldfield, Patricia J. Crossno, Thomas Jay Otahal, Nathan D. Fabian, (2016). Demonstrate and Evaluate Advanced Analysis, Visualization, and I/O Capabilities for the SIERRA Toolkit ASC/CSSE Milestone Review https://www.osti.gov/search/identifier:1377712 Document ID: 507100

Gerald Fredrick Lofstead, Gregory Jean-Baptiste, Ron A. Oldfield, (2015). Delta: Data Reduction for Integrated Application Workflows https://www.osti.gov/search/identifier:1193147 Document ID: 307419

Kenneth D. Moreland, Ron A. Oldfield, (2015). Formal Metrics for Large-Scale Parallel Performance (slides) ISC 2015 https://www.osti.gov/search/identifier:1257157 Document ID: 286769

Robert (ANL) Ross, Gary (LLNL) Grider, Evan (PNNL) Felix, Mark (LLNL) Gary, Scott (ORNL) Klasky, Ron A. Oldfield, Galen (LANL) Shipman, John (LBNL) Wu, (2015). Storage Systems and Input/Output to Support Extreme Scale Science https://www.osti.gov/search/identifier:1459089 Document ID: 264886

Ron A. Oldfield, (2015). You Got Visualization in my Simulation: Integration of Simulation, Analysis, and Visualization at Extreme Scales DOE Computer Graphics Forum https://www.osti.gov/search/identifier:1249471 Document ID: 264708

Kenneth D. Moreland, Ron A. Oldfield, (2015). Formal Metrics for Large-Scale Parallel Performance ISC High Performance https://www.osti.gov/search/identifier:1248714 Document ID: 254298

Gerald Fredrick Lofstead, Matthew Leon Curry, Nathan D. Fabian, Todd Henry Kordenbrock, Shyamali Mukherjee, Ron A. Oldfield, Gregory D. Sjaardema, Gary J. Templet, Craig D. Ulmer, Patrick Widener, (2015). Enabling Capabilities for Integrated Application Workflows CIS ERB https://www.osti.gov/search/identifier:1248699 Document ID: 243489

Ron A. Oldfield, (2015). Extreme-Scale Challenges for Integrating Simulation, Analysis, and Visualization SOS 19 Workshop https://www.osti.gov/search/identifier:1241106 Document ID: 232117

Ron A. Oldfield, (2015). Enabling Capabilities for Analysis at Extreme Scale ACS JOWOG 34 Applied Computer Science Meeting https://www.osti.gov/search/identifier:1504205 Document ID: 220837

Ron A. Oldfield, (2014). NVRAM Use Cases for HPC at Sandia SSIO Burst Buffer Workshop Document ID: 219286

Ron A. Oldfield, (2014). Addressing Scientific I/O Needs for Current and Future Architectures Storage Systems and I/O (SSIO) Summit https://www.osti.gov/search/identifier:1502778 Document ID: 155496

Ron A. Oldfield, (2014). Priorities for Exascale Systems that Support Integrated Application Workflows DOE Data Council Meeting Document ID: 155497

Gregory Jean-Baptiste, Gerald Fredrick Lofstead, Ron A. Oldfield, (2014). Delta: Data Reduction for Integrated Application Workflows Parallel Data Storage Workshop at Supercomputing 2014 https://www.osti.gov/search/identifier:1563116 Document ID: 154642

Kenneth D. Moreland, Ron A. Oldfield, (2014). Formal Metrics for Large-Scale Parallel Performance Workshop on Visual Performance Analysis (VPA) https://www.osti.gov/search/identifier:1315135 Document ID: 132890

Nathan D. Fabian, Ron A. Oldfield, Kenneth D. Moreland, David Rogers, (2014). Data Co-Processing for Extreme Scale Analysis Conference on Data Analysis https://www.osti.gov/search/identifier:1731163 Document ID: 5333054

Kenneth D. Moreland, Ron A. Oldfield, Nathan D. Fabian, Berk Geveci, Andrew Bauer, David Lonie, (2014). Approaching Production In Situ Visualization for Extreme Scale Analysis (SIAM PP Minisymposium) SIAM PP https://www.osti.gov/search/identifier:1684890 Document ID: 5332846

David Rogers, Ron A. Oldfield, Nathan D. Fabian, Kenneth D. Moreland, (2013). Extreme-Scale CoProcessing: An Evaluation of In Situ and In Transit Analysis International Parallel & Distributed Processing Symposium (IPDPS) https://www.osti.gov/search/identifier:1115092 Document ID: 5329284

Gerald Fredrick Lofstead, Ron A. Oldfield, Jai Dayal, Karsten Schwan, (2013). D2T: Doubly Distributed Transactions for High Performance and Distributed Computing The 22nd International ACM Symposium on High Performance Parallel and Distributed Computing https://www.osti.gov/search/identifier:1661350 Document ID: 5323274

Deepesh K. Kholwadwala, Ron A. Oldfield, Patrick Widener, Adam Crume, Carlos Maltzahn, Matthew Leon Curry, (2013). Behavior-Based Simulation of Storage Devices Workshop on Modeling and Simulation of Exascale Systems https://www.osti.gov/search/identifier:1082225 Document ID: 5323239

David Rogers, Kenneth D. Moreland, Ron A. Oldfield, Nathan D. Fabian, (2013). Data Co-Processing for Extreme Scale Analysis Level II ASC Milestone (4745) https://www.osti.gov/search/identifier:1093707 Document ID: 5318603

Ronald B. Brightwell, Ron A. Oldfield, Arthur B. Maccabe, David E. Bernholdt, (2013). Composition and Virtualization as the Foundations of an Extreme-scale OS/R Workshop on Runtime and Operating Systems for Supercomputers https://www.osti.gov/search/identifier:1072602 Document ID: 5320765

Andrew T. Wilson, George W. Davidson, Craig D. Ulmer, Todd Kordenbrock, Ron A. Oldfield, (2013). Access to External Resources Using Service-Node Proxies CUG 2009 https://www.osti.gov/search/identifier:1503455 Document ID: 5271870

David R. Bronowski, Karl Scott Hemmert, Brian Barrett, Chad Kersey, Ron A. Oldfield, Marlo Weston, Rolf Riesen, Jeanine Cook, Paul Rosenfeld, Elliott Cooper-Balis, Bruce Jacob, (2012). The Structural Simulation Toolkit SIGMETRICS Performance Evaluation Review https://www.osti.gov/search/identifier:1063445 Document ID: 5316759

Kenneth D. Moreland, David Rogers, Nathan D. Fabian, Ron A. Oldfield, (2012). CSSE L2 Milestone – Data Co-Processing for Extreme Scale Analysis – Midterm Review ASC Level II Milestone Committee meeting (informal) https://www.osti.gov/search/identifier:1649753 Document ID: 5316699

Gerald Fredrick Lofstead, Ron A. Oldfield, Todd Kordenbrock, (2012). Experiences Applying Data Staging Technology in Unconventional Ways IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing https://www.osti.gov/search/identifier:1063344 Document ID: 5316134

Gerald Fredrick Lofstead, Qing Liu, Jeremy Logan, Yuan Tian, Hasan Abbasi, Norbert Podhorszki, Jong Youl Choi, Scott Klasky, Roselyne Tchoua, Ron A. Oldfield, Manish Parashar, Nagiza Samatova, Karsten Schwan, Arie Shoshani, Matthew Wolf, Kesheng Wu, Weikuan Yu, (2012). Hello ADIOS: The Challenges and Lessons of Developing Leadership Class I/O Frameworks Concurrency and Computation: Practice and Experience https://www.osti.gov/search/identifier:1061201 Document ID: 5313914

Nathan D. Fabian, Ron A. Oldfield, Andrew C. Bauer, Utkarsh Ayachit, Norbert Podhorszki, (2012). In-Situ Visualization with Catalyst Super Computing 2012 https://www.osti.gov/search/identifier:1116421 Document ID: 5312778

Gerald Fredrick Lofstead, Ron A. Oldfield, Jai Dayal, Karsten Schwan, (2012). D2T: Doubly Distributed Transactions for High Performance and Distributed Computing IEEE Cluster 2012 https://www.osti.gov/search/identifier:1061224 Document ID: 5313014

Gerald Fredrick Lofstead, Ron A. Oldfield, Jai Dayal, Karsten Schwan, (2012). D2T: Doubly Distributed Transactions for High Performance and Distributed Computing IEEE Cluster 2012 https://www.osti.gov/search/identifier:1061046 Document ID: 5313016

Gerald Fredrick Lofstead, Ron A. Oldfield, Jai Dayal, Karsten Schwan, (2012). D2T: Doubly Distributed Transactions for High Performance and Distributed Computing HPDC 2012 https://www.osti.gov/search/identifier:1067554 Document ID: 5308955

Todd H. Kordenbrock, Ron A. Oldfield, (2012). Developing Integrated Data Services for Cray Systems with a Gemini Interconnect CUG 2012 https://www.osti.gov/search/identifier:1067597 Document ID: 5306727

Brian Barrett, Richard Frederick Barrett, James M. Brandt, Ronald B. Brightwell, Matthew Leon Curry, Nathan D. Fabian, Kurt Brian Ferreira, Ann C. Gentile, Karl Scott Hemmert, Suzanne M. Kelly, Ruth Ann Klundt, James H. Laros, Vitus J. Leung, Michael J. Levenhagen, Gerald Fredrick Lofstead, Kenneth D. Moreland, Ron A. Oldfield, Kevin Pedretti, Arun F. Rodrigues, David Thompson, Harry Lee Ward, John P. Vandyke, Courtenay T. Vaughan, Kyle Bruce Wheeler, Tom Tucker, (2012). Report of Experiments and Evidence for ASC L2 Milestone 4467 – Demonstration of a Legacy Applications Path to Exascale https://www.osti.gov/search/identifier:1039013 Document ID: 5305233

Brian Barrett, Richard Frederick Barrett, James M. Brandt, Ronald B. Brightwell, Matthew Leon Curry, Nathan D. Fabian, Kurt Brian Ferreira, Ann C. Gentile, Karl Scott Hemmert, Suzanne M. Kelly, Ruth Ann Klundt, James H. Laros, Vitus J. Leung, Michael J. Levenhagen, Gerald Fredrick Lofstead, Kenneth D. Moreland, Ron A. Oldfield, Kevin Pedretti, Arun F. Rodrigues, David Thompson, Harry Lee Ward, John P. Vandyke, Courtenay T. Vaughan, Kyle Bruce Wheeler, Tom Tucker, (2012). Demonstration of a Legacy Applications Path to Exascale – ASC L2 Milestone 4467 Presentation to L2 Milestone Review Panel https://www.osti.gov/search/identifier:1688616 Document ID: 5305236

Gerald Fredrick Lofstead, Ron A. Oldfield, Kenneth D. Moreland, Karsten Schwan, Greg Eisenhauer, Matthew Wolf, Hasan Abbasi, Scott Klasky, Nagi Rao, (2012). Extreme-Scale Analytics: Controlling and Provisioning Online Analytics for Dynamic End User Requirements XStack Grant Proposal https://www.osti.gov/search/identifier:1657433 Document ID: 5304233

Gerald Fredrick Lofstead, Ron A. Oldfield, Matthew Leon Curry, James H. Laros, Carlos Maltzahn, (2012). Valuing and Managing Data Based on Embodied Energy X-Stack Grant Proposal https://www.osti.gov/search/identifier:1657434 Document ID: 5304235

Gerald Fredrick Lofstead, Ron A. Oldfield, Jai Dayal, Karsten Schwan, (2011). Resilient Data Staging Through MxN Distributed Transactions https://www.osti.gov/search/identifier:1031296 Document ID: 5301897

Kenneth D. Moreland, Ron A. Oldfield, Nathan D. Fabian, Pat Marion, Sebastien Jourdain, Norbert Podhorszki, Venkatram Vishwanath, Ciprian Docan, Manish Parashar, Mark Hereld, Michael Papka, Scott Klasky, (2011). Examples of In Transit Visualization (Presentation Slides) Supercomputing https://www.osti.gov/search/identifier:1661484 Document ID: 5301902

Kirk A. Rackow, Ron A. Oldfield, Jon Stearley, James H. Laros, Kevin Pedretti, Ronald B. Brightwell, Rolf Riesen, (2011). Keeping Checkpoint/Restart Viable for Exascale Systems https://www.osti.gov/search/identifier:1029780 Document ID: 5299850

Kenneth D. Moreland, Ron A. Oldfield, Pat Marion, Sebastien Jourdain, Norbert Podhorszki, Venkatram Vishwanath, Nathan D. Fabian, Ciprian Docan, Manish Parashar, Mark Hereld, Michael Papka, Scott Klasky, (2011). Examples of In Transit Visualization Petascale Data Analytics: Challenges and Opportunities (PDAC-11) https://www.osti.gov/search/identifier:1106221 Document ID: 5299312

Kirk A. Rackow, Ron A. Oldfield, Jon Stearley, James H. Laros, Kevin Pedretti, Ronald B. Brightwell, Rolf Riesen, (2011). rMPI: Increasing Fault Resiliency in a Message-Passing Environment https://www.osti.gov/search/identifier:1012733 Document ID: 5293699

Kirk A. Rackow, Jon Stearley, James H. Laros, Ron A. Oldfield, Kevin Pedretti, Ronald B. Brightwell, Rolf Riesen, Patrick G Bridges, Dorian Arnold, (2011). Evaluating the Viability of Process Replication Reliability for Exascale Systems The International Conference for High Performance Computing, Networking, Storage and Analysis https://www.osti.gov/search/identifier:1108309 Document ID: 5293967

David Alan Schoenwald, Jason E. Stamp, Joshua Stein, Robert J. Hoekstra, Jeffrey S. Nelson, Karina Munoz-Ramos, William C. McLendon, Thomas V. Russo, Laurence R. Phillips, Bryan T. Richardson, Andrew Charles Riehm, Paul Wolfenbarger, Brian M. Adams, Matthew J. Reno, Clifford Hansen, Ron A. Oldfield, (2011). Final Report for High Performance Computing for Advanced National Electric Power Grid Modeling and Integration of Solar Generation Resources, LDRD Project No. 149016 https://www.osti.gov/search/identifier:1011206 Document ID: 5291562

Kirk A. Rackow, Jon Stearley, Ron A. Oldfield, James H. Laros, Kevin Pedretti, Ronald B. Brightwell, (2010). Redundant Computing for Exascale Systems https://www.osti.gov/search/identifier:1011662 Document ID: 5290105

Kirk A. Rackow, Ron A. Oldfield, Jon Stearley, James H. Laros, Kevin Pedretti, Ronald B. Brightwell, Todd Kordenbrock, (2010). Increasing Fault Resiliency in a Message-Passing Environment https://www.osti.gov/search/identifier:1001015 Document ID: 5276698

Kenneth F. Alvin, Brian Barrett, Ronald B. Brightwell, Sudip S. Dosanjh, Karl Scott Hemmert, Richard C. Murphy, Ron A. Oldfield, Arun F. Rodrigues, Al Geist, Doug Kothe, Jeff Nichols, Jeffrey S. Vetter, (2010). On the Path to Exascale International Journal of Distributed Systems and Technologies (IJDST) https://www.osti.gov/search/identifier:1123730 Document ID: 5282712

Kirk A. Rackow, Rolf E. Riesen, Ron A. Oldfield, James H. Laros, Kevin Pedretti, Jon Stearley, Ronald B. Brightwell, (2010). rMPI: Increasing Fault Resiliency in a Message-Passing Environment International Conference for High Performance Computing, Networking, Storage, and Analysis https://www.osti.gov/search/identifier:1002112 Document ID: 5281778

Kirk A. Rackow, Rolf E. Riesen, Ron A. Oldfield, Ronald B. Brightwell, James H. Laros, Kevin Pedretti, (2009). HPC Application Fault-Tolerance Using Transparent Redundant Computation International Conference for High Performance Computing, Networking, Storage, and Analysis https://www.osti.gov/search/identifier:971418 Document ID: 5274791

Arthur B. Maccabe, Sarala Arunagiri, Ron A. Oldfield, Rolf E. Riesen, Harry Lee Ward, William Lawry, (2007). A lightweight approach to file system development File and Storage Technologies https://www.osti.gov/search/identifier:969119 Document ID: 5234002

Curated Publications

Ph.D. Dissertation
  • Ron Oldfield. Efficient I/O for Computational Grid Applications. PhD thesis, Dept. of Computer Science, Dartmouth College, May 2003. Available as Dartmouth Computer Science Technical Report TR2003-459. [bibtex]
Book Chapters
  • Ron A. Oldfield, Todd Kordenbrock, and Patrick Widener. Data-movement approaches for HPC storage systems. In Ada Gavrilovska, editor, Attaining High Performance Communication: A Vertical Approach, chapter 17, pages 329-351. CRC Press, 2009. [bibtex]
  • Ron Oldfield and David Kotz. Scientific applications using parallel I/O. In Hai Jin, Toni Cortes, and Rajkumar Buyya, editors, High Performance Mass Storage and Parallel I/O: Technologies and Applications, chapter 45, pages 655-666. IEEE Computer Society Press and John Wiley & Sons, 2001. [bibtex]
Journal Articles
  • Ron A. Oldfield, Gregory D. Sjaardema, Gerald F. Lofstead II, and Todd Kordenbrock. Trilinos I/O Support (Trios). Scientific Programming, 20(2):181-196, August 2012. [bibtex]
  • Ken Alvin, Brian Barrett, Ron Brightwell, Sudip Dosanjh, Al Geist, Scott Hemmert, Michael Heroux, Doug Kothe, Richard Murphy, Jeff Nichols, Ron Oldfield, Arun Rodrigues, and Jeffrey S. Vetter. On the path to exascale. International Journal of Distributed Systems and Technologies, 1(2):1-22, 2010. [bibtex]
  • Ron Oldfield and David Kotz. Improving data access for computational grid applications. Cluster Computing, The Journal of Networks, Software Tools and Applications, 9(1):79-99, January 2006. [bibtex]
  • B Bode, R Bradshaw, E DeBenedictus, N Desai, J Duell, G A Geist, P Hargrove, D Jackson, S Jackson, J Laros, C Lowe, E Lusk, W McLendon, J Mugler, T Naughton, J P Navarro, R Oldfield, N Pundit, S L Scott, M Showerman, C Steffen, and K Walker. Scalable system software: a component-based approach. Journal of Physics: Conference Series, 16:546-550, April 2005. [bibtex]
  • Ron Oldfield and David Kotz. Armada: a parallel I/O framework for computational grids. Future Generation Computing Systems (FGCS), 18(4):501-523, March 2002. [bibtex]
  • Ron A. Oldfield, David E. Womble, and Curtis C. Ober. Efficient parallel I/O in seismic imaging. The International Journal of High Performance Computing Applications, 12(3):333-344, Fall 1998. [bibtex]
  • Curtis C. Ober, Ron A. Oldfield, David E. Womble, and John Van Dyke. Seismic imaging on the Intel Paragon. Computers & Mathematics with Applications, 35(7):65-72, 1998. Advanced Computing on Intel Architectures. [bibtex]
Conference and Workshop Papers
  • Ron A. Oldfield, George Davidson, Craig Ulmer, and Andrew Wilson. Investigating the integration of supercomputers and data-warehouse appliances. In Proceedings of the 6th Workshop on UnConventional High Performance Computing, UCHPC 2013, Aachen, Germany, August 2013. [bibtex]
  • Ron B. Brightwell, Ron A. Oldfield, Arthur B. Maccabe, and David E. Bernholdt. Hobbes: Composition and virtualization as the foundations of an extreme-scale OS/R. In Proceedings of the International Workshop on Runtime and Operating Systems for Supercomputers, ROSS ’13, pages 2:1-2:8, Eugene, OR, June 2013. ACM Press. [bibtex]
  • Jay Lofstead, Ron A. Oldfield, and Todd H. Kordenbrock. Experiences applying data staging technology in unconventional ways. In 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Delft, The Netherlands, May 2013. IEEE/ACM. [bibtex]
  • Jay Lofstead, Jai Dayal, Karsten Schwan, and Ron Oldfield. D2T: Doubly distributed transactions for high performance and distributed computing. In Proceedings of the IEEE International Conference on Cluster Computing, Cluster 2012, pages 90-98. IEEE Press, September 2012. [bibtex]
  • Ron A. Oldfield, Todd Kordenbrock, and Jay Lofstead. Developing integrated data services for Cray systems with a Gemini interconnect. In Cray User Group Meeting, April 2012. [bibtex]
  • Jay Lofstead, Ron Oldfield, Todd Kordenbrock, and Charles Reiss. Extending scalability of collective I/O through Nessie and staging. In Proceedings of the 6th Parallel Data Storage Workshop, PDSW ’11, pages 7-12, Seattle, WA, November 2011. [bibtex]
  • Kenneth Moreland, Ron Oldfield, Pat Marion, Sebastien Jourdain, Norbert Podhorszki, Venkatram Vishwanath, Nathan Fabian, Ciprian Docan, Manish Parashar, Mark Hereld, Michael E. Papka, and Scott Klasky. Examples of in transit visualization. In Proceedings of the 2nd International Workshop on Petascale Data Analytics: Challenges and Opportunities, PDAC ’11, pages 1-6, Seattle, WA, November 2011. [bibtex]
  • Kurt Ferreira, Jon Stearley, James H. Laros III, Ron Oldfield, Kevin Pedretti, Ron Brightwell, Rolf Riesen, Patrick Bridges, and Dorian Arnold. Evaluating the viability of process replication reliability for exascale systems. In Proceedings of SC2011: High Performance Networking and Computing, Seattle, WA, November 2011. ACM Press. [bibtex]
  • Jay Lofstead, Milo Polte, Garth Gibson, Scott A. Klasky, Karsten Schwan, Ron Oldfield, and Matthew Wolf. Six degrees of scientific data: Reading patterns for extreme scale IO. In Proceedings of the Twentieth IEEE International Symposium on High Performance Distributed Computing, San Jose, CA, June 2011. IEEE Computer Society Press. [bibtex]
  • Jay Lofstead, Fang Zheng, Qing Liu, Scott Klasky, Ron Oldfield, Todd Kordenbrock, Karsten Schwan, and Matthew Wolf. Managing variability in the IO performance of petascale storage systems. In Proceedings of SC2010: High Performance Networking and Computing, November 2010. [bibtex]
  • Ron A. Oldfield, Brett W. Bader, and Peter Chew. Supporting multilingual document clustering on the Cray XT3. In SIAM Conference on Parallel Processing and Scientific Computing, February 2010. [bibtex]
  • Ron A. Oldfield, Andrew Wilson, George Davidson, and Craig Ulmer. Access to external resources using service-node proxies. In Proceedings of the Cray User Group Meeting, Atlanta, GA, May 2009. [bibtex]
  • Ron A. Oldfield, Rolf Riesen, Sarala Arunagiri, Patricia J. Teller, Maria Ruiz Varela, Seetharami Seelam, and Philip C. Roth. Impact of checkpoints on next-generation systems. In Cray User Group Technical Conference, May 2008. [bibtex]
  • Jay Lofstead, Chen Jin, Scott Klasky, Stephen Hodson, Weikuan Yu, Hasan Abbasi, Karsten Schwan, Matthew Wolf, Wei-keng Liao, Alok Choudhary, Manish Parashar, Ciprian Docan, and Ron Oldfield. Adaptive I/O system (ADIOS). In Proceedings of the Cray User Group Meeting, Helsinki, Finland, May 2008. [bibtex]
  • Ron A. Oldfield, Sarala Arunagiri, Patricia J. Teller, Seetharami Seelam, Rolf Riesen, Maria Ruiz Varela, and Philip C. Roth. Modeling the impact of checkpoints on next-generation systems. In Proceedings of the 24th IEEE Conference on Mass Storage Systems and Technologies, San Diego, CA, September 2007. [bibtex]
  • Ron A. Oldfield, Lee Ward, Arthur B. Maccabe, and Patrick Widener. Scalable security for MPP storage systems. In International Conference on Security and Management: Special Session on Security in Supercomputing Clusters, Las Vegas, NV, July 2007. Invited talk. [bibtex]
  • Ron A. Oldfield. Investigating lightweight storage and overlay networks for fault tolerance. In Proceedings of the High Availability and Performance Computing Workshop, Santa Fe, NM, October 2006. [bibtex]
  • Ron A. Oldfield, Arthur B. Maccabe, Sarala Arunagiri, Todd Kordenbrock, Rolf Riesen, Lee Ward, and Patrick Widener. Lightweight I/O for scientific applications. In Proceedings of the IEEE International Conference on Cluster Computing, Barcelona, Spain, September 2006. [bibtex]
  • Ron A. Oldfield, Patrick Widener, Arthur B. Maccabe, Lee Ward, and Todd Kordenbrock. Efficient data-movement for lightweight I/O. In Proceedings of the 2006 International Workshop on High Performance I/O Techniques and Deployment of Very Large Scale I/O Systems, Barcelona, Spain, September 2006. [bibtex]
  • Ron Oldfield and David Kotz. Armada: A parallel file system for computational grids. In Proceedings of the First IEEE/ACM International Symposium on Cluster Computing and the Grid, pages 194-201, Brisbane, Australia, May 2001. IEEE Computer Society Press. Best paper award. [bibtex]
  • Curtis Ober, Ron Oldfield, David Womble, L. Romero, and Charles Burch. Practical aspects of prestack depth migration with finite differences. In Proceedings of the 67th Annual International Meeting of the Society of Exploration Geophysicists, pages 1758-1761, Dallas, Texas, November 1997. Expanded Abstracts. [bibtex]
  • Curtis Ober, Ron Oldfield, David Womble, John VanDyke, and Sudip Dosanjh. Seismic imaging on massively parallel computers. In Proceedings of the 1996 Simulations Multiconference, April 1996. [bibtex]
  • D.E. Womble, S.S. Dosanjh, J.P. VanDyke, R.A. Oldfield, and D.S. Greenberg. 3-d seismic imaging of complex geologies. In High Performance Computing Symposium 1995 “Grand Challenges in Computer Simulation”. Proceedings of the 1995 Simulation Multiconference, pages 405-410, Phoenix, AZ, April 1995. [bibtex]
  • Ron A. Oldfield, B. D. Semeraro, and J. P. VanDyke. Parallel acoustic wave propagation and generation of a seismic dataset. In Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, pages 243-244, San Francisco, CA, February 1995. [bibtex]
Technical Reports
  • David Rogers, Kenneth Moreland, Ron Oldfield, and Nathan Fabian. Data co-processing for extreme scale analysis. Technical Report, Sandia National Laboratories, March 2013. Level II ASC Milestone 4745. [bibtex]
  • Brian Barrett, Richard Barrett, James Brandt, Ron Brightwell, Matthew Curry, Nathan Fabian, Kurt Ferreira, Ann Gentile, Scott Hemmert, Suzanne Kelly, Ruth Klundt, James Laros III, Vitus Leung, Michael Levenhagen, Gerald Lofstead, Ken Moreland, Ron Oldfield, Kevin Pedretti, Arun Rodrigues, David Thompson, Tom Tucker, Lee Ward, John Van Dyke, Courtenay Vaughan, and Kyle Wheeler. Demonstration of a legacy application’s path to exascale. Technical Report, Sandia National Laboratories, March 2012. [bibtex]
  • Jai Dayal, Gerald Lofstead, Karsten Schwan, and Ron Oldfield. Resilient data staging through MxN distributed transactions. Technical Report, Sandia National Laboratories, November 2011. [bibtex]
  • Kurt Ferreira, Rolf Riesen, Ron Oldfield, Jon Stearley, James Laros, Kevin Pedretti, and Ron Brightwell. Keeping checkpoint/restart viable for exascale systems. Technical Report, Sandia National Laboratories, September 2011. [bibtex]
  • Kurt Ferreira, Ron Oldfield, Jon Stearley, James Laros, Kevin Pedretti, Ron Brightwell, and Rolf Riesen. rMPI: Increasing fault resiliency in a message-passing environment. Technical Report, Sandia National Laboratories, Albuquerque, NM, April 2011. [bibtex]
  • David A. Schoenwald, Jason E. Stamp, Joshua S. Stein, Robert J. Hoekstra, Jeffrey S. Nelson, Karina Munoz, William C. McLendon, Thomas V. Russo, Laurence R. Phillips, Bryan T. Richardson, Andrew C. Riehm, Paul R. Wolfenbarger, Brian M. Adams, Matthew J. Reno, Clifford W. Hansen, and Ron A. Oldfield. High performance computing for advanced national electric power grid modeling and integration of solar generation resources. Technical Report, Sandia National Laboratories, Albuquerque, New Mexico 87185 and Livermore, California 94550, February 2011. Final Report for LDRD Project No. 149016. [bibtex]
  • Jon R. Stearley, Rolf Riesen, James H. Laros III, Kurt B. Ferreira, Kevin Pedretti, Ron A. Oldfield, and Ron Brightwell. Redundant computing for exascale systems. Technical Report, Sandia National Laboratories, December 2010. [bibtex]
  • Ron A. Oldfield. Lightweight storage and overlay networks for fault tolerance. Technical Report, Sandia National Laboratories, Albuquerque, NM, January 2010. LDRD Final Report. [bibtex]
  • Gerald F. Lofstead II, Karsten Schwan, Scott Klasky, and Ron A. Oldfield. Advanced I/O for large-scale scientific applications. Technical Report, Sandia National Laboratories, Albuquerque, NM, December 2009. [bibtex]
  • Kurt Ferreira, Rolf Riesen, Ron Oldfield, Jon Stearley, James Laros, Kevin Pedretti, Ron Brightwell, and Todd Kordenbrock. Increasing fault resiliency in a message-passing environment. Technical Report, Sandia National Laboratories, October 2009. [bibtex]
  • Sarala Arunagiri, John Daly, Patricia J. Teller, Seetharami Seelam, Ron A. Oldfield, Maria Ruiz Varela, and Rolf Riesen. Opportunistic checkpoint intervals to improve system performance. Technical Report UTEP-CS-08-24, El Paso, TX, June 2008. [bibtex]
  • Sarala Arunagiri, Seetharami Seelam, Ron A. Oldfield, Maria Ruiz Varela, Patricia J. Teller, and Rolf Riesen. The impact of checkpoint latency on checkpoint interval and execution times. Technical Report TR07-55, University of Texas at El Paso, El Paso, TX, August 2007. [bibtex]
  • Ron A. Oldfield, Arthur B. Maccabe, Sarala Arunagiri, Todd Kordenbrock, Rolf Riesen, Lee Ward, and Patrick Widener. Lightweight I/O for scientific applications. Technical Report, Sandia National Laboratories, Albuquerque, NM, May 2006. [bibtex]
  • William C. McLendon and Ron A. Oldfield. APItest user guide. Technical Report, Sandia National Laboratories, April 2005. [bibtex]
  • Ron Oldfield. Efficient I/O for computational grid applications. Technical Report TR2003-459, Dept. of Computer Science, Dartmouth College, May 2003. [bibtex]
  • Ron Oldfield and David Kotz. Using the Emulab network testbed to evaluate the Armada I/O framework for computational grids. Technical Report TR2002-433, Dept. of Computer Science, Dartmouth College, Hanover, NH, September 2002. [bibtex]
  • Ron Oldfield and David Kotz. Applications of parallel I/O. Technical Report PCS-TR98-337, Dept. of Computer Science, Dartmouth College, August 1998. Supplement to PCS-TR96-297. [bibtex]
  • Curtis Ober, Ron Oldfield, John VanDyke, and David Womble. Seismic imaging on massively parallel computers. Technical Report, Sandia National Laboratories, April 1996. [bibtex]

Research Projects

Current
  • Hobbes [FY13-FY16]: I am one of many collaborators on the ASCR-funded Hobbes project to develop an operating system and runtime framework for extreme-scale systems. Hobbes is intended to provide OS/R support for traditional HPC applications as well as an emerging class of "big-data" analytics codes.
  • Scalable I/O Research [PM, FY08–]: I am the project manager for Sandia’s ASC/CSSE Scalable I/O Research project. This project provides support for I/O libraries and file systems on existing petascale ASC platforms as well as critical R&D to provide I/O capabilities on future exascale platforms. The research performed in this project directly addresses two vital concerns for I/O on exascale platforms: scalable parallel file systems (Sirocco), and technologies for integration of computation and analysis (see the Trios page for more details and links to software products).
  • Scalable Data Analysis [PM, FY13–]: I am the project manager for Sandia’s ASC/CSSE Scalable Data Analysis project. This project provides data analysis tools, R&D, and support for ASC customers from analysts and code developers to algorithm designers and hardware architects.
  • Behavioral Disk Simulation [PI, FY12–]: This is a small seed project (collaboration with UCSC) to explore a behavioral approach to storage device simulation. Matthew Curry has taken over PI duties for this project and is preparing for subsequent funding opportunities.
Past
  • HPC Informatics [PI, FY10-11]: This ASC/CSRF-funded project explored issues around leveraging HPC systems to address informatics problems. The primary contributions included demonstrations of software and hardware integration of the Cray Red Storm supercomputer and a Netezza data-warehouse appliance.
  • Network Grand Challenge [FY08-10]: The networks grand challenge LDRD performed R&D to evaluate analysis capabilities that address adversarial networks. I joined this project in its second year to provide an "HPC perspective", that is, to investigate ways to apply high-performance computing to these capabilities. Our primary contribution was the development of a scalable multilingual document clustering application. We performed scaling studies that clustered topic-related documents from a dataset of more than 10 million documents in 16 languages on 64K cores of the Cray Jaguar system at ORNL.
  • System-Directed Resilience LDRD [PI, FY08-11]: The goal of this project was to explore unconventional (non-disk-based) methods for resilience in large-scale applications. The primary contribution was the theory, demonstration, and evaluation of redundant computation as a viable approach for exascale application resilience.
  • Lightweight Storage and Overlay Networks for Fault Tolerance [PI, FY07-09]: This project explored the use of compute nodes to stage checkpoint data for parallel applications. This was the first example of what is today called a "burst buffer" approach. We demonstrated, through analytic models and a reference implementation of a data-staging PnetCDF library, that it is possible to get order-of-magnitude "effective" bandwidth improvements for staged checkpoint operations.
  • Lightweight File System [PI, FY04-06]: The lightweight file system project investigated the applicability of lightweight solutions for storage systems. In LWFS, traditional filesystem semantics such as atomicity and naming are not provided by the storage system. Instead, LWFS emphasizes secure and direct access to storage devices, and is extensible to allow the use of additional services to match the specific needs of the application. The code developed for LWFS is the basis for the Nessie data-services software and the Sirocco file system, two codes being developed for the ASC SIO research project.
  • Scalable System Software [FY03-05]: Scalable System Software (SSS) was a multi-institution SciDAC project to define interfaces and develop prototype system software for tera-scale MPP systems. I was the Sandia PI responsible for testing and evaluating the software. As part of this project, we developed the APITest testing framework.
  • Armada [PI, FY01-03]: My Ph.D. dissertation explored complex ways to compose and optimize large-scale geographically distributed applications that combined the use of disparate data sources, HPC codes, and intermediate processing components (e.g., filters, permutators, etc.). The novelty of the work was in how we represented the application workflow as a series-parallel graph that could be expanded, compressed, and mapped to system resources based on resource availability and network performance.
  • 3D Seismic Imaging [FY96-99]: This project was a collaboration between SNL and a number of oil-and-gas companies to develop a scalable code for seismic imaging called Salvo. In 1999, Salvo won the R&D 100 award as one of the 100 most technologically significant products of the year. I developed a parallel I/O framework that used an "I/O partition" to offload pre/post processing functionality for performing FFTs, interpolation, and staging output results. This was a prelude to much of the in-situ and in-transit work currently being explored in the data-analysis and I/O community.
  • Generation of a Synthetic Seismic Dataset [PI, FY94-95]: The goal of this project was to generate a large (multi-terabyte) synthetic seismic dataset that represents data gathered from the acquisition phase of oil-and-gas exploration. SNL, ORNL, LLNL, and LANL were each given the same sequential 10th-order finite-difference acoustic wave propagation code and asked to parallelize the code for their respective HPC systems, then generate and publish the data sets for the SEG overthrust and salt models. This data was later used to validate seismic imaging codes (like Salvo).
  • Molecular Visualization [FY92-94]: As an undergraduate intern at Sandia, I was given a number of interesting programming projects. One of these was to develop a code to visualize molecules. Since OpenGL did not exist at the time, I had to write code using SGI’s GL library to display and transform (rotate, zoom, etc.) molecules. For example, Steve Plimpton used my code to generate this animation of liquid-crystal conformations.

Awards & Recognition

1999

Curtis Ober, David E. Womble, Louis Romero, Ron A. Oldfield, Robert Gjertsen, IBM, R&D 100 Award, Salvo - Seismic Imaging Software, R&D Magazine, September 1, 1999