Center for Computing Research (CCR)

As scientific simulations scale to use petascale machines and beyond, the data volumes generated pose a dual problem. First, with increasing machine sizes, the careful tuning of IO routines becomes more and more important to keep the time spent in IO acceptable. It is not uncommon, for instance, to have 20% of an application's runtime spent performing IO in a 'tuned' system. Careful management of the IO routines can move that to 5% or even less in some cases. Second, the data volumes are so large, on the order of 10s to 100s of TB, that trying to discover the scientifically valid contributions requires assistance at runtime to both organize and annotate the data. Waiting for offline processing is not feasible because of both the impact on the IO system and the time required. To reduce this load and improve the ability of scientists to use the large amounts of data being produced, new techniques for data management are required. First, there is a need for techniques for efficient movement of data from the compute space to storage. These techniques should understand the underlying system infrastructure and adapt to changing system conditions. Technologies include aggregation networks, data staging nodes for a closer parity to the IO subsystem, and autonomic IO routines that can detect system bottlenecks and choose different approaches, such as splitting the output into multiple targets or staggering output processes. Such methods must be end-to-end, meaning that even with properly managed asynchronous techniques, it is still essential to properly manage the later synchronous interaction with the storage system to maintain acceptable performance. Second, for the data being generated, annotations and other metadata must be incorporated to help the scientist understand output data for the simulation run as a whole and to select data and data features without concern for what files or other storage technologies were employed. All of these features should be attained while maintaining a simple deployment for the science code and eliminating the need for allocation of additional computational resources.

TYPE SAND Report YEAR 2010

OSTI DOI

Application-Level Data Services

Oldfield, Ron A.

Abstract not provided.

TYPE Presentation YEAR 2010

OSTI

Approaching Production In Situ Visualization for Extreme Scale Analysis (SIAM PP Minisymposium)

Moreland, Kenneth D.; Oldfield, Ron A.; Fabian, Nathan D.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

ASCR Workshop on In Situ Data Management

Peterka, Tom P.; Bard, Deborah B.; Bennett, Janine C.; Bethel, E.W.; Oldfield, Ron A.; Pouchard, Line P.; Sweeney, Christine S.; Wolf, Matthew W.

Abstract not provided.

TYPE Other Report YEAR 2019

OSTI DOI

ATDM Data Warehouse

Ulmer, Craig D.; Kordenbrock, Todd H.; Levy, Scott L.; Lofstead, Gerald F.; Mukherjee, Shyamali M.; Sjaardema, Gregory D.; Templet, Gary J.; Widener, Patrick W.; Oldfield, Ron A.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

ATDM Data Warehouse: Data Management Services for Exascale Computing

Ulmer, Craig D.; Oldfield, Ron A.; Kordenbrock, Todd H.; Levy, Scott L.; Lofstead, Gerald F.; Mukherjee, Shyamali M.; Templet, Gary J.; Widener, Patrick W.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

Behavior-Based Simulation of Storage Devices

Ward, Harry L.; Oldfield, Ron A.; Widener, Patrick W.; Curry, Matthew L.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Benefits and Challenges of Integration Simulation and Analysis

Oldfield, Ron A.; Moreland, Kenneth D.; Fabian, Nathan D.; Rogers, David R.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

Composition and Virtualization as the Foundations of an Extreme-scale OS/R

Brightwell, Ronald B.; Oldfield, Ron A.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

CSSE L2 Milestone - Data Co-Processing for Extreme Scale Analysis - Midterm Review

Moreland, Kenneth D.; Rogers, David R.; Fabian, Nathan D.; Oldfield, Ron A.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

D2T: Doubly Distributed Transactions for High Performance and Distributed Computing

Lofstead, Gerald F.; Oldfield, Ron A.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

D2T: Doubly Distributed Transactions for High Performance and Distributed Computing

Lofstead, Gerald F.; Oldfield, Ron A.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

D2T: Doubly Distributed Transactions for High Performance and Distributed Computing

Lofstead, Gerald F.; Oldfield, Ron A.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

D2T: Doubly Distributed Transactions for High Performance and Distributed Computing

Lofstead, Gerald F.; Oldfield, Ron A.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

Data Co-Processing for Extreme Scale Analysis

Fabian, Nathan D.; Oldfield, Ron A.; Moreland, Kenneth D.

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Data co-processing for extreme scale analysis level II ASC milestone (4745)

Rogers, David R.; Moreland, Kenneth D.; Oldfield, Ron A.; Fabian, Nathan D.

Exascale supercomputing will embody many revolutionary changes in the hardware and software of high-performance computing. A particularly pressing issue is gaining insight into the science behind the exascale computations. Power and I/O speed constraints will fundamentally change current visualization and analysis workflows. A traditional post-processing workflow involves storing simulation results to disk and later retrieving them for visualization and data analysis. However, at exascale, scientists and analysts will need a range of options for moving data to persistent storage, as the current offline or post-processing pipelines will not be able to capture the data necessary for data analysis of these extreme scale simulations. This Milestone explores two alternate workflows, characterized as in situ and in transit, and compares them. We find each to have its own merits and faults, and we provide information to help pick the best option for a particular use.

TYPE SAND Report YEAR 2013

OSTI DOI

Data Science and Computer Science R&D at SNL

Oldfield, Ron A.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Data Science and Computer Science Research at Sandia National Laboratories

Oldfield, Ron A.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Data Services and Trilinos: A Brief Introduction to Trios Data Services

Oldfield, Ron A.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

Data Services and Trilinos: Addressing I/O Challenges for Exascale Applications

Oldfield, Ron A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Delta: Data Reduction for Integrated Application Workflows

Lofstead, Gerald F.; Jean-Baptiste, Gregory J.; Oldfield, Ron A.

Integrated Application Workflows (IAWs) run multiple simulation workflow components concurrently on an HPC resource connecting these components using compute area resources and compensating for any performance or data processing rate mismatches. These IAWs require high frequency and high volume data transfers between compute nodes and staging area nodes during the lifetime of a large parallel computation. The available network bandwidth between the two areas may not be enough to efficiently support the data movement. As the processing power available to compute resources increases, the requirements for this data transfer will become more difficult to satisfy and perhaps will not be satisfiable at all since network capabilities are not expanding at a comparable rate. Furthermore, energy consumption in HPC environments is expected to grow by an order of magnitude as exascale systems become a reality. The energy cost of moving large amounts of data frequently will contribute to this issue. It is necessary to reduce the volume of data without reducing the quality of data when it is being processed and analyzed. Delta resolves the issue by addressing the lifetime data transfer operations. Delta removes subsequent identical copies of already transmitted data during transfers and restores those copies once the data has reached the destination. Delta is able to identify duplicated information and determine the most space efficient way to represent it. Initial tests show about 50% reduction in data movement while maintaining the same data quality and transmission frequency.

TYPE SAND Report YEAR 2015

OSTI DOI

Publications