Publications

14 Results
Skip to search filters

Sensitivity and Uncertainty Workflow of Full System SIERRA Models Supporting High Consequence Applications

Orient, George E.; Clay, Robert L.; Friedman-Hill, Ernest J.; Pebay, Philippe P.; Ridgway, Elliott M.

Credibility of end-to-end CompSim (Computational Simulation) models and their agile execution requires an expressive framework to describe, communicate and execute complex computational tool chains representing the model. All stakeholders from system engineering and customers through model developers and V&V partners need views and functionalities of the workflow representing the model in a manner that is natural to their discipline. In the milestone and in this report we define workflow as a network of computation simulation activities executed autonomously on a distributed set of computational platforms. The FY19 ASC L2 Milestone (6802) for the Integrated Workflow (IWF) project was designed to integrate and improve existing capabilities or develop new functionalities to provide a wide range of stakeholders a coherent and intuitive platform capable of defining and executing CompSim modeling from analysis workflow definition to complex ensemble calculations. The main goal of the milestone was to advance the integrated workflow capabilities to support the weapon system analysts with a production deployment in FY20. Ensemble calculations supporting program decisions include sensitivity analysis, optimization and uncertainty quantification. The goal of the L2 milestone aligned with the ultimate goal of the IWF project is to foster cultural and technical shift toward and integrated CompSim capability based on automated workflows. Specific deliverables were defined in five broad categories: 1) Infrastructure, including development of distributed-computing workflow capability, 2) integration of Dakota (Sandia's sensitivity, optimization and UQ engine) with SAW (Sandia Analysis Workbench), 3) ARG (Automatic Report Generator introspecting analysis artifacts and generating human-readable extensible and archivable reports), 4) Libraries and Repositories aiding capability reuse, and 5) Exemplars to support training, capturing best practices and stress testing of the platform. A set of exemplars was defined to represent typical weapon system qualification CompSim projects. Analyzing the required capabilities and using the findings to plan implementation of required capabilities ensured optimal allocation of development resources focused on production deployment after the L2 is completed. It was recognized early that the end-to-end modeling applications pose a considerable number of diverse risks, and a formal risk tracking process was implemented. The project leveraged products, capabilities and development tasks of IWF partners. SAW, Dakota, Cubit, Sierra, Slycat, and NGA (NexGen Analytics, a small business) contributed to the integrated platform developed during this milestone effort. New products delivered include: a) NGW (Next Generation Workflow) for robust workflow definition and execution, b) Dakota wizards, editor and results visualization, and c) the automatic report generator ARG. User engagement was initiated early in the development process eliciting concrete requirements and actionable feedback to assure that the integrated CompSim capability will have high user acceptance and impact. The current integrated capabilities have been demonstrated and are continually being tested by a set of exemplars ranging from training scenarios to computationally demanding uncertainty analyses. The integrated workflow platform has been deployed on both SRN (Sandia Restricted Network) and SCN (Sandia Classified Network). Computational platforms where the system has been demonstrated span from Windows (Creo the CAD platform chosen by Sandia) to Trinity HPC (Sierra and CTH solvers). Follow up work will focus on deployment at SNL and other sites in the nuclear enterprise (LLNL, KCNSC), training and consulting support to democratize the analysis agility, process health and knowledge management benefits the NGW platform provides. ACKNOWLEDGEMENTS The IWF team would like to acknowledge the consistent support from the ASC sponsors: Scott Hutchinson, Walt Witkowski, Ken Alvin, Tom Klitsner, Jeremy Templeton, Erik Strack, and Amanda Dodd. Without their support this integrated effort would not have been possible. We would also like to thank the milestone review panel for their insightful feedback and guidance throughout the year: Martin Heinstein, Patty Hough, Jay Dike, Dan Laney (LLNL), and Jay Billings (ORNL). And of course, without the hard work of the IWF team none of this would have happened.

More Details

ASC ATDM Level 2 Milestone #5325: Asynchronous Many-Task Runtime System Analysis and Assessment for Next Generation Platforms

Baker, Gavin M.; Bettencourt, Matthew T.; Bova, S.W.; franko, ken f.; Gamell, Marc G.; Grant, Ryan E.; Hammond, Simon D.; Hollman, David S.; Knight, Samuel K.; Kolla, Hemanth K.; Lin, Paul L.; Olivier, Stephen O.; Sjaardema, Gregory D.; Slattengren, Nicole L.; Teranishi, Keita T.; Wilke, Jeremiah J.; Bennett, Janine C.; Clay, Robert L.; kale, laxkimant k.; Jain, Nikhil J.; Mikida, Eric M.; Aiken, Alex A.; Bauer, Michael B.; Lee, Wonchan L.; Slaughter, Elliott S.; Treichler, Sean T.; Berzins, Martin B.; Harman, Todd H.; humphreys, alan h.; schmidt, john s.; sunderland, dan s.; Mccormick, Pat M.; gutierrez, samuel g.; shulz, martin s.; Gamblin, Todd G.; Bremer, Peer-Timo B.

Abstract not provided.

ASC ATDM Level 2 Milestone #5325: Asynchronous Many-Task Runtime System Analysis and Assessment for Next Generation Platforms

Baker, Gavin M.; Bettencourt, Matthew T.; Bova, S.W.; franko, ken f.; Gamell, Marc G.; Grant, Ryan E.; Hammond, Simon D.; Hollman, David S.; Knight, Samuel K.; Kolla, Hemanth K.; Lin, Paul L.; Olivier, Stephen O.; Sjaardema, Gregory D.; Slattengren, Nicole L.; Teranishi, Keita T.; Wilke, Jeremiah J.; Bennett, Janine C.; Clay, Robert L.; kale, laxkimant k.; Jain, Nikhil J.; Mikida, Eric M.; Aiken, Alex A.; Bauer, Michael B.; Lee, Wonchan L.; Slaughter, Elliott S.; Treichler, Sean T.; Berzins, Martin B.; Harman, Todd H.; humphreys, alan h.; schmidt, john s.; sunderland, dan s.; Mccormick, Pat M.; gutierrez, samuel g.; shulz, martin s.; Gamblin, Todd G.; Bremer, Peer-Timo B.

This report provides in-depth information and analysis to help create a technical road map for developing next-generation programming models and runtime systems that support Advanced Simulation and Computing (ASC) work- load requirements. The focus herein is on asynchronous many-task (AMT) model and runtime systems, which are of great interest in the context of "Oriascale7 computing, as they hold the promise to address key issues associated with future extreme-scale computer architectures. This report includes a thorough qualitative and quantitative examination of three best-of-class AIM] runtime systems – Charm-++, Legion, and Uintah, all of which are in use as part of the Centers. The studies focus on each of the runtimes' programmability, performance, and mutability. Through the experiments and analysis presented, several overarching Predictive Science Academic Alliance Program II (PSAAP-II) Asc findings emerge. From a performance perspective, AIV runtimes show tremendous potential for addressing extreme- scale challenges. Empirical studies show an AM runtime can mitigate performance heterogeneity inherent to the machine itself and that Message Passing Interface (MP1) and AM11runtimes perform comparably under balanced conditions. From a programmability and mutability perspective however, none of the runtimes in this study are currently ready for use in developing production-ready Sandia ASC applications. The report concludes by recommending a co- design path forward, wherein application, programming model, and runtime system developers work together to define requirements and solutions. Such a requirements-driven co-design approach benefits the community as a whole, with widespread community engagement mitigating risk for both application developers developers. and high-performance computing runtime systein

More Details

Modeling and simulation technology readiness levels

Clay, Robert L.; Marburger, Scot J.; Shneider, Max S.; Trucano, Timothy G.

This report summarizes the results of an effort to establish a framework for assigning and communicating technology readiness levels (TRLs) for the modeling and simulation (ModSim) capabilities at Sandia National Laboratories. This effort was undertaken as a special assignment for the Weapon Simulation and Computing (WSC) program office led by Art Hale, and lasted from January to September 2006. This report summarizes the results, conclusions, and recommendations, and is intended to help guide the program office in their decisions about the future direction of this work. The work was broken out into several distinct phases, starting with establishing the scope and definition of the assignment. These are characterized in a set of key assertions provided in the body of this report. Fundamentally, the assignment involved establishing an intellectual framework for TRL assignments to Sandia's modeling and simulation capabilities, including the development and testing of a process to conduct the assignments. To that end, we proposed a methodology for both assigning and understanding the TRLs, and outlined some of the restrictions that need to be placed on this process and the expected use of the result. One of the first assumptions we overturned was the notion of a ''static'' TRL--rather we concluded that problem context was essential in any TRL assignment, and that leads to dynamic results (i.e., a ModSim tool's readiness level depends on how it is used, and by whom). While we leveraged the classic TRL results from NASA, DoD, and Sandia's NW program, we came up with a substantially revised version of the TRL definitions, maintaining consistency with the classic level definitions and the Predictive Capability Maturity Model (PCMM) approach. In fact, we substantially leveraged the foundation the PCMM team provided, and augmented that as needed. Given the modeling and simulation TRL definitions and our proposed assignment methodology, we conducted four ''field trials'' to examine how this would work in practice. The results varied substantially, but did indicate that establishing the capability dependencies and making the TRL assignments was manageable and not particularly time consuming. The key differences arose in perceptions of how this information might be used, and what value it would have (opinions ranged from negative to positive value). The use cases and field trial results are included in this report. Taken together, the results suggest that we can make reasonably reliable TRL assignments, but that using those without the context of the information that led to those results (i.e., examining the measures suggested by the PCMM table, and extended for ModSim TRL purposes) produces an oversimplified result--that is, you cannot really boil things down to just a scalar value without losing critical information.

More Details
14 Results
14 Results