Publications Search

Leveraging abstraction to establish out-of-nominal safety properties

Communications in Computer and Information Science

Mayo, Jackson R.; Armstrong, Robert C.; Hulette, Geoffrey C.

Digital systems in an out-of-nominal environment (e.g., one causing hardware bit flips) may not be expected to function correctly in all respects but may be required to fail safely. We present an approach for understanding and verifying a system’s out-of-nominal behavior as an abstraction of nominal behavior that preserves designated critical safety requirements. Because abstraction and refinement are already widely used for improved tractability in formal design and proof techniques, this additional way of viewing an abstraction can potentially verify a system’s out-of-nominal safety with little additional work. We illustrate the approach with a simple model of a turnstile controller with possible logic faults (formalized in the temporal logic of actions and NuSMV), noting how design choices can be guided by the desired out-of-nominal abstraction. Principles of robustness in complex systems (specifically, Boolean networks) are found to be compatible with the formal abstraction approach. This work indicates a direction for broader use of formal methods in safety-critical systems.

TYPE Conference Poster YEAR 2016

DOI OSTI Scopus

A robust technique to make a 2D advection solver tolerant to soft faults

Procedia Computer Science

Strazdins, Peter; Harding, Brendan; Lee, Chung; Mayo, Jackson R.; Ray, Jaideep; Armstrong, Robert C.

We present a general technique, called robust stencils, that makes Partial Differential Equation solvers tolerant to soft faults, i.e., bit flips arising in memory or CPU calculations. We show how it can be applied to a two-dimensional Lax-Wendroff solver. The resulting 2D robust stencils are derived using an orthogonal application of their 1D counterparts. Combinations of 3 to 5 base stencils can then be created. We describe how these are implemented in a parallel advection solver. Various robust stencil combinations are explored, representing tradeoffs between performance and robustness. The results indicate that the 3-stencil robust combinations are slightly faster on large parallel workloads than Triple Modular Redundancy (TMR). They also have one third of the memory footprint. We expect the improvement to be significant if suitable optimizations are performed. Because faults are avoided each time new points are computed, the proposed stencils are also about as robust to faults as TMR over a large range of error rates. The technique can be generalized to 3D (or higher dimensions) with similar benefits.

TYPE Conference Poster YEAR 2016

DOI OSTI Scopus
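
The abstract above compares robust stencils against Triple Modular Redundancy (TMR). As a point of reference, the following is a minimal sketch of the TMR baseline only, not of the authors' robust stencils: a periodic 1D Lax-Wendroff update is computed three times and the replicas are voted point-wise with the median. The function names, the periodic boundary, and the fault-injection tuple are assumptions made for illustration.

```python
import struct
from statistics import median


def lax_wendroff_step(u, c):
    """One periodic 1D Lax-Wendroff update for Courant number c."""
    n = len(u)
    return [
        u[i]
        - 0.5 * c * (u[(i + 1) % n] - u[i - 1])
        + 0.5 * c * c * (u[(i + 1) % n] - 2.0 * u[i] + u[i - 1])
        for i in range(n)
    ]


def flip_bit(value, bit):
    """Flip one bit of a float64 to emulate a soft fault."""
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped


def tmr_step(u, c, fault=None):
    """Compute the update three times and vote point-wise with the median.

    fault is a hypothetical (replica_index, point_index, bit) tuple that
    corrupts a single replica after it has been computed.
    """
    replicas = [lax_wendroff_step(u, c) for _ in range(3)]
    if fault is not None:
        k, i, bit = fault
        replicas[k][i] = flip_bit(replicas[k][i], bit)
    # With at most one corrupted replica per point, the median is a clean value.
    return [median(values) for values in zip(*replicas)]
```

The sketch stores three full copies of the updated field, which is the memory cost the abstract contrasts with the one-third footprint of the robust stencils. Median voting assumes the corrupted value is not a NaN.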

Analyzing and Mitigating Unexpected Behavior in Digital Systems [PowerPoint]

Mayo, Jackson R.

Abstract not provided.

TYPE Presentation YEAR 2015

OSTI

Local recovery and failure masking for stencil-based applications at extreme scales

International Conference for High Performance Computing, Networking, Storage and Analysis, SC

Gamell, Marc; Teranishi, Keita; Heroux, Michael A.; Mayo, Jackson R.; Kolla, Hemanth; Chen, Jacqueline H.; Parashar, Manish

Application resilience is a key challenge that has to be addressed to realize the exascale vision. Online recovery, even when it involves all processes, can dramatically reduce the overhead of failures as compared to the more traditional approach where the job is terminated and restarted from the last checkpoint. In this paper we explore how local recovery can be used for certain classes of applications to further reduce overheads due to resilience. Specifically we develop programming support and scalable runtime mechanisms to enable online and transparent local recovery for stencil-based parallel applications on current leadership class systems. We also show how multiple independent failures can be masked to effectively reduce the impact on the total time to solution. We integrate these mechanisms with the S3D combustion simulation, and experimentally demonstrate (using the Titan Cray-XK7 system at ORNL) the ability to tolerate high failure rates (i.e., node failures every 5 seconds) with low overhead while sustaining performance, at scales up to 262144 cores.

TYPE Conference Poster YEAR 2015

DOI OSTI Scopus

Leveraging Abstraction to Establish Out-of-Nominal Safety Properties

Mayo, Jackson R.; Armstrong, Robert C.; Hulette, Geoffrey C.

Abstract not provided.

TYPE Conference Poster YEAR 2015

DOI OSTI

Local Recovery and Failure Masking for Stencil-based Applications at Extreme Scales

Gamell Balmana, Marc; Teranishi, Keita; Heroux, Michael A.; Mayo, Jackson R.; Kolla, Hemanth; Chen, Jacqueline H.; Parashar, Manish

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Exploring failure recovery for stencil-based applications at extreme scales

HPDC 2015 - Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing

Gamell Balmana, Marc; Teranishi, Keita; Heroux, Michael A.; Mayo, Jackson R.; Kolla, Hemanth; Chen, Jacqueline H.; Parashar, Manish

Application resilience is a key challenge that must be addressed in order to realize the exascale vision. Previous work has shown that online recovery, even when done in a global manner (i.e., involving all processes), can dramatically reduce the overhead of failures when compared to the more traditional approach of terminating the job and restarting it from the last stored checkpoint. In this paper we suggest going one step further, and explore how local recovery can be used for certain classes of applications to reduce the overheads due to failures. Specifically we study the feasibility of local recovery for stencil-based parallel applications and we show how multiple independent failures can be masked to effectively reduce the impact on the total time to solution.

TYPE Conference Poster YEAR 2015

OSTI Scopus

Digital system robustness via design constraints: The lesson of formal methods

9th Annual IEEE International Systems Conference, SysCon 2015 - Proceedings

Mayo, Jackson R.; Armstrong, Robert C.; Hulette, Geoffrey C.

Current programming languages and programming models make it easy to create software and hardware systems that fulfill an intended function but also leave such systems open to unintended function and vulnerabilities. Software engineering and code hygiene may make systems incrementally safer, but do not produce the wholesale change necessary for secure systems from the outset. Yet there exists an approach with impressive results: We cite recent examples showing that formal methods, coupled with formally informed digital design, have produced objectively more robust code even beyond the properties directly proven. Though discovery of zero-day vulnerabilities is almost always a surprise and powerful tools like semantic fuzzers can cover a larger search space of vulnerabilities than a developer can conceive of, formal models seem to produce robustness of a higher qualitative order than traditionally developed digital systems. Because the claim is necessarily a qualitative one, we illustrate similar results with an idealized programming language in the form of Boolean networks where we have control of parameters related to stability and adaptability. We argue that verifiability with formal methods is an instance of broader design constraints that promote robustness. We draw analogies to real-world programming models and languages that can be mathematically reasoned about in contrast to ones that are essentially undecidable.

TYPE Conference Poster YEAR 2015

OSTI Scopus

Assured Trust through RTL-to-GDS Formal Equivalence

Mannos, Tom J.; Michnovicz, Jason; Land, Matthew; Eames, Brandon K.; Templin, Joshua R.; Armstrong, Robert C.; Mayo, Jackson R.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Cybersecurity Design Techniques Targeting and Leveraging HPC

Letchford, Joshua; Mayo, Jackson R.; Armstrong, Robert C.

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Exploring Failure Recovery for Stencil-based Applications at Extreme Scales

Gamell, Marc; Teranishi, Keita; Heroux, Michael A.; Mayo, Jackson R.; Kolla, Hemanth; Chen, Jacqueline H.; Parashar, Manish

Abstract not provided.

TYPE Conference Poster YEAR 2015

OSTI

Failure Masking and Local Recovery for Stencil-based Applications at Extreme Scales

Gamell, Marc; Teranishi, Keita; Heroux, Michael A.; Mayo, Jackson R.; Kolla, Hemanth; Chen, Jacqueline H.; Parashar, Manish

Abstract not provided.

TYPE Conference Poster YEAR 2014

OSTI

Digital System Robustness via Design Constraints: The Lesson of Formal Methods

Mayo, Jackson R.; Armstrong, Robert C.; Hulette, Geoffrey C.

Abstract not provided.

TYPE Conference Poster YEAR 2014

DOI OSTI

Theorem-Proving Analysis of Digital Control Logic Interacting with Continuous Dynamics

Hulette, Geoffrey C.; Armstrong, Robert C.; Mayo, Jackson R.; Ruthruff, Joseph

Abstract not provided.

TYPE Conference Poster YEAR 2014

DOI OSTI

An Adaptive Power Method for the Generalized Tensor Eigenproblem

Kolda, Tamara G.; Mayo, Jackson R.

Abstract not provided.

TYPE Conference YEAR 2014

OSTI

Digital System Analysis

Mayo, Jackson R.

Abstract not provided.

TYPE Conference YEAR 2014

OSTI

Digital Design Verification using Formal and Complexity Analysis

Armstrong, Robert C.; Mayo, Jackson R.; Punnoose, Ratish J.; Ruthruff, Joseph

Abstract not provided.

TYPE Presentation YEAR 2014

OSTI

Theorem-Proving Analysis of Digital Control Logic Interacting with Continuous Dynamics

Hulette, Geoffrey C.; Armstrong, Robert C.; Mayo, Jackson R.; Ruthruff, Joseph

Abstract not provided.

TYPE Conference YEAR 2014

DOI OSTI

Finite Difference Stencils Robust to Silent Data Corruption

Ray, Jaideep; Mayo, Jackson R.; Armstrong, Robert C.

Abstract not provided.

TYPE Conference YEAR 2014

OSTI

An adaptive shifted power method for computing generalized tensor eigenpairs

SIAM Journal on Matrix Analysis and Applications

Kolda, Tamara G.; Mayo, Jackson R.

Several tensor eigenpair definitions have been put forth in the past decade, but these can all be unified under the generalized tensor eigenpair framework introduced by Chang, Pearson, and Zhang [J. Math. Anal. Appl., 350 (2009), pp. 416-422]. Given mth-order, n-dimensional real-valued symmetric tensors $\mathcal{A}$ and $\mathcal{B}$, the goal is to find $\lambda \in \mathbb{R}$ and $x \in \mathbb{R}^n$, $x \neq 0$, such that $\mathcal{A}x^{m-1} = \lambda \mathcal{B}x^{m-1}$. Different choices for $\mathcal{B}$ yield different versions of the tensor eigenvalue problem. We present our generalized eigenproblem adaptive power (GEAP) method for solving the problem, which is an extension of the shifted symmetric higher-order power method (SS-HOPM) for finding Z-eigenpairs. A major drawback of SS-HOPM is that its performance depends on choosing an appropriate shift; our GEAP method includes an adaptive method for choosing the shift automatically.

TYPE Journal Article YEAR 2014

DOI OSTI Scopus
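
To make the eigenpair condition in the abstract above concrete, the following is a minimal sketch of a fixed-shift power iteration in the spirit of SS-HOPM for the Z-eigenpair case (Ax^{m-1} = λx with unit-norm x), restricted to a third-order tensor. It is not the GEAP method: the shift is supplied by hand rather than chosen adaptively, and the function names, the stopping rule, and the example tensor are assumptions made for illustration.

```python
import numpy as np


def rank_one_sum(vectors, weights):
    """Build a symmetric 3rd-order tensor sum_r w_r * v_r (x) v_r (x) v_r."""
    n = vectors.shape[1]
    tensor = np.zeros((n, n, n))
    for w, v in zip(weights, vectors):
        tensor += w * np.einsum("i,j,k->ijk", v, v, v)
    return tensor


def shifted_power_method(A, x0, shift, tol=1e-12, max_iter=10000):
    """Fixed-shift power iteration for a Z-eigenpair of a symmetric 3rd-order tensor.

    Each step maps x to (A x^2 + shift * x), normalized to unit length.
    A sufficiently large positive shift is assumed so the iteration settles.
    """
    x = x0 / np.linalg.norm(x0)
    for _ in range(max_iter):
        y = np.einsum("ijk,j,k->i", A, x, x) + shift * x
        x_new = y / np.linalg.norm(y)
        converged = np.linalg.norm(x_new - x) < tol
        x = x_new
        if converged:
            break
    # The eigenvalue is the Rayleigh-quotient analogue A x^3 at the unit vector x.
    lam = float(np.einsum("ijk,i,j,k->", A, x, x, x))
    return lam, x


# Example: tensor with orthogonal components e1 (weight 2) and e2 (weight 1).
A = rank_one_sum(np.eye(3)[:2], [2.0, 1.0])
lam, x = shifted_power_method(A, np.array([1.0, 0.2, 0.1]), shift=1.0)
residual = np.einsum("ijk,j,k->i", A, x, x) - lam * x
```

In this example the iteration starts near the first component, so the residual of the eigenpair condition should be close to zero for the pair it returns. The rate of convergence depends on the shift, which is the sensitivity the abstract says GEAP is designed to remove.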

SC13 HPC Resilience Poster

Clay, Robert L.; Mayo, Jackson R.; Teranishi, Keita; Slattengren, Nicole L.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Fourier Analysis of Boolean Networks: Explaining Resilience of Structured Families

Physical Review Letters

Comandur, Seshadhri; Smith, Andrew M.; Mayo, Jackson R.; Armstrong, Robert C.

Abstract not provided.

TYPE Journal Article YEAR 2013

OSTI

An Adaptive Power Method for the Generalized Tensor Eigenproblem

Kolda, Tamara G.; Mayo, Jackson R.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

Comparison of statistical properties of advected diffusive scalars and advected propagating fronts and implications for turbulent premixed flame propagation

Mayo, Jackson R.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Fundamental analysis and prediction of turbulent premixed combustion: Status and prospects

Mayo, Jackson R.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Applications of Complexity Science to Digital Systems

Mayo, Jackson R.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Fundamental analysis and prediction of turbulent premixed combustion: Status and prospects

Mayo, Jackson R.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Leveraging Formal Methods and Fuzzing to Verify Security and Reliability Properties of Large-Scale High-Consequence Systems

Ruthruff, Joseph; Armstrong, Robert C.; Davis, Benjamin G.; Mayo, Jackson R.; Punnoose, Ratish J.

Formal methods describe a class of system analysis techniques that seek to prove specific properties about analyzed designs, or locate flaws compromising those properties. As an analysis capability, these techniques are the subject of increased interest from both internal and external customers of Sandia National Laboratories. Given this lab's other areas of expertise, Sandia is uniquely positioned to advance the state-of-the-art with respect to several research and application areas within formal methods. This research project was a one-year effort funded by Sandia's CyberSecurity S&T Investment Area in its Laboratory Directed Research & Development program to investigate the opportunities for formal methods to impact Sandia's present mission areas, more fully understand the needs of the research community in the area of formal methods and where Sandia can contribute, and clarify from those potential research paths those that would best advance the mission-area interests of Sandia. The accomplishments from this project reinforce the utility of formal methods in Sandia, particularly in areas relevant to Cyber Security, and set the stage for continued Sandia investments to ensure this capability is utilized and advanced within this laboratory to serve the national interest.

TYPE SAND Report YEAR 2012

DOI OSTI

Symmetric Tensor Decomposition via a Power Method for the Generalized Tensor Eigenproblem

Mayo, Jackson R.; Kolda, Tamara G.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Toward Idealizations for Fundamental Understanding of Parallel Scalability amid Failure

Thompson, David; Mayo, Jackson R.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Fritz: Characterizing Programming Models amid Failures

Clay, Robert L.; Thompson, David; Mayo, Jackson R.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Modeling Failures in Large-Scale Computer Systems

Thompson, David; Mayo, Jackson R.; Brandt, James M.; Gentile, Ann C.; Wong, Matthew H.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Proposed Prototype Demonstrations of Formal Methods

Ruthruff, Joseph; Mayo, Jackson R.; Armstrong, Robert C.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Community-Based Resistance to Intrusion in Information Technology Systems

Armstrong, Robert C.; Sale, Kenneth L.; Schoeniger, Joseph S.; Solis, John H.; Mayo, Jackson R.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Leveraging Complexity for Unpredictable yet Robust Cyber Systems

Mayo, Jackson R.; Armstrong, Robert C.; Allan, Benjamin A.; Hulette, Geoffrey C.; Schwartz, Moses; Bauer, Todd M.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

What Then Do We Do About Computer Security?

Berg, Michael J.; Davis, Christopher E.; Mayo, Jackson R.; Suppona, Roger A.; Wyss, Gregory D.

This report presents the answers that an informal and unfunded group at SNL provided for questions concerning computer security posed by Jim Gosler, Sandia Fellow (00002). The primary purpose of this report is to record our current answers; hopefully those answers will turn out to be answers indeed. The group was formed in November 2010. In November 2010 Jim Gosler, Sandia Fellow, asked several of us several pointed questions about computer security metrics. Never mind that some of the best minds in the field have been trying to crack this nut without success for decades. Jim asked Campbell to lead an informal and unfunded group to answer the questions. With time Jim invited several more Sandians to join in. We met a number of times both with Jim and without him. At Jim's direction we contacted a number of people outside Sandia who Jim thought could help. For example, we interacted with IBM's T.J. Watson Research Center and held a one-day, videoconference workshop with them on the questions.

TYPE SAND Report YEAR 2011

DOI OSTI

Tradeoffs in targeted fuzzing of cyber systems by defenders and attackers

ACM International Conference Proceeding Series

Mayo, Jackson R.; Armstrong, Robert C.

Automated randomized testing, known as fuzzing, is an effective and widely used technique for detecting faults and vulnerabilities in digital systems, and is a key tool for security assessment of smart-grid devices and protocols. It has been observed that the effectiveness of fuzzing can be improved by sampling test inputs in a targeted way that reflects likely fault conditions. We propose a systematic prescription for such targeting, which favors test inputs that are "simple" in an appropriate sense. The notion of Kolmogorov complexity provides a rigorous foundation for this approach. Under certain assumptions, an optimal fuzzing procedure is derived for statistically evaluating a system's security against a realistic attacker who also uses fuzzing. Copyright © 2011 Association for Computing Machinery.

TYPE Conference YEAR 2011

OSTI Scopus

Fault Oblivious HPC with Dynamic Task Replication and Substitution

Vorobeychik, Yevgeniy; Mayo, Jackson R.; Armstrong, Robert C.; Minnich, Ronald G.; Rudish, Donald W.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Tradeoffs in Targeted Fuzzing of Cyber Systems by Defenders and Attackers

Mayo, Jackson R.; Armstrong, Robert C.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

A Simplified Version of 'Complex System Modeling and Science-Based Cybersecurity'

Mayo, Jackson R.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

Computing Tensor Eigenvalues: Theory and Practice

Kolda, Tamara G.; Mayo, Jackson R.; Ballard, Grey B.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Leveraging Formal Methods and Confronting Complexity for Digital System Reliability and Security

Armstrong, Robert C.; Hu, Yalin; Mayo, Jackson R.; Ruthruff, Joseph

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

Leveraging Formal Methods and Confronting Complexity for Digital System Reliability and Security

Mayo, Jackson R.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Framework for Enabling System Understanding

Brandt, James M.; Chen, Frank X.; Gentile, Ann C.; Mayo, Jackson R.; Pebay, Philippe P.; Roe, Diana C.; Thompson, David; Wong, Matthew H.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Influence and Dynamic Behavior in Random Boolean Networks

Physical Review Letters

Comandur, Seshadhri; Armstrong, Robert C.; Mayo, Jackson R.; Ruthruff, Joseph

Abstract not provided.

TYPE Journal Article YEAR 2011

OSTI

Shifted Power Method for Computing Tensor Eigenvalues

Kolda, Tamara G.; Mayo, Jackson R.; Ballard, Grey B.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

Complex System Modeling and Science-Based Cybersecurity

Mayo, Jackson R.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

Oblivious High Performance Computing with Dynamic Task Replication and Substitution

Mayo, Jackson R.; Vorobeychik, Yevgeniy; Rudish, Donald W.; Minnich, Ronald G.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Fault oblivious high performance computing with dynamic task replication and substitution

Computer Science - Research and Development

Mayo, Jackson R.; Armstrong, Robert C.; Minnich, Ronald G.; Rudish, Donald W.

Traditional parallel programming techniques will suffer rapid deterioration of performance scaling with growing platform size, as the work of coping with increasingly frequent failures dominates over useful computation. To address this challenge, we introduce and simulate a novel software architecture that combines a task dependency graph with a substitution graph. The role of the dependency graph is to limit communication and checkpointing and enhance fault tolerance by allowing graph neighbors to exchange data, while the substitution graph promotes fault oblivious computing by allowing a failed task to be substituted on-the-fly by another task, incurring a quantifiable error. We present optimization formulations for trading off substitution errors and other factors such as available system capacity and low-overlap task partitioning among processors, and demonstrate that these can be approximately solved in real time after some simplifications. Simulation studies of our proposed approach indicate that a substitution network adds considerable resilience and simple enhancements can limit the aggregate substitution errors. © Springer-Verlag 2011.

TYPE Conference YEAR 2011

OSTI Scopus

Scalable HPC monitoring and analysis for understanding and automated response

Brandt, James M.; Chen, Frank X.; De Sapio, Vincent; Gentile, Ann C.; Mayo, Jackson R.; Pebay, Philippe P.; Roe, Diana C.; Wong, Matthew H.

Abstract not provided.

TYPE Conference YEAR 2010

OSTI
