Surety Solutions for the 21st Century
Surety Science and Engineering Workshop Proceedings
Dr. Pace VanDevender, Sandia Surety Leadership Team and Chief Information Officer, Sandia National Laboratories

Thank you and good morning. This is a workshop in the true sense of the word. It's also an experiment in the true sense of the word. I'm delighted to take you through some of the methodology of Surety Science and Engineering.

I'd like to remind you that surety has three components: reliability in normal circumstances, safety in abnormal circumstances, and security and use control in malevolent circumstances. All of these need to be treated as a system. As we talked to people who are experts in many disciplines, we found that we were talking about the same thing, and in fact surety could have a common methodology if we were smart enough to deduce it.

In that deduction we realized that what we needed in surety was something that is the equivalent of the simple machines in mechanical design. If you have the concepts of the screw, the inclined plane, the lever, the pulley, etc., firmly in your mind, it lets you invent more complex machines. The same discoveries occur in physics. Once you have mastered the concepts of mass, distance, velocity, momentum, and energy, then you have the conceptual framework for solving much more complex kinetics problems. Our goal in this workshop is to define the equivalent of those concepts for the complexities of Surety Science and Engineering. Those equivalent concepts are the four levels and eight approaches. My purpose is to discuss those eight approaches and how they help define and illuminate the four levels for application later today.

This discipline is applicable at various levels of aggregation of complexity. For instance, in the defense establishment, there's a hierarchy starting with the Armed Services. Below that are weapon platforms, weapon systems, subsystems, and components. Surety can be applied to each of those levels. Determining where the opportunity for change and improvement lies is the first real step.
That requires judgement on the next best target for improving overall surety. It's also generally true that the higher the degree of aggregation, the more things have to come together to make a change. If you want to make a fundamental change to the defense establishment, take a deep breath. If you want to make a fundamental change to a component, you have a much better chance. Determining the level of the opportunity is the first task for discretion and judgement. Joan has already mentioned that surety comes in four levels:
Now, in fact, everything relies on the laws of nature and mathematics. Everything also has some human element in it. The value added by this four-level view is to ask: Where is the centroid of concern, or the centroid of vulnerability or opportunity? Does it lie mostly with the humans controlling the system? Or, when it really matters, does it rely on built-in science and engineering, with perhaps no humans? Have we eliminated the ambiguities of engineered positive measures and instead relied in our design principles on the fundamental laws of nature?

I would like you to take out this sheet to make some notes on it. This is an interactive workshop. As I go through the levels and the approaches to them, think of your own work area. If the level or approach stimulates a new idea or an option, write it down. These notes will be part of the substance that you'll be contributing later on this afternoon. The worksheet has the levels of surety on the left-hand side, the eight approaches that are matched to those levels and help define them in the middle column, and then your own example to be noted on the right-hand side.

Level 1. Working sufficiently as expected and buying insurance to cover the upsets. These are systems designed for reliable operation. There are no special considerations for off-normal conditions. You insure as a normal business cost against most of the upsets. It's a fairly reactive response to mitigate the consequences that may occur, and you rely on foresight to ensure safety. The usual examples are most industrial applications. Now that doesn't demean them. Level 1 surety is not Level 0 surety, as it takes a lot of effort to do this right. Highway safety is an example, in the sense that we license teenagers (and I used to be one) essentially for life, with just an intermediate eye check at later times in life. Fix that level of surety in your mind.

There are two approaches within Level 1.
The first is reliance on the foresight of designers and the good practices of people. This is the world of warranties and insurance. The design and manufacturing of consumer products is appropriate for this regime. It was somewhat to our surprise to find that nuclear nonproliferation in India and Pakistan was de facto Level 1 surety when we thought it was much higher, and inappropriately so. That illustrates mistakes at Level 1.

Level 1 has another approach, and now I want you to start registering some views. Please reach under your chair and pull out a Newton [computer]. This is a dialog, you see. I'm going to say something, you're going to say something through your Newton, and we'll all see how it came out. This way I can see if we're on track. In this part I'll be showing you two recommendations and, under recommendation 1 and 2, you will tell me the degree to which that recommendation illustrates the particular approach in the title.

[Discussion of recommendations for Level 1, Approach 2.]

Approach 2 is mitigation after the fact by coordinated emergency response and correcting what went wrong. This is the world of response teams, of investigations, of lessons learned, and of corrective action. Within that framework there are two recommendations. The first pertains to investigations of airline and nuclear reactor incidents and accidents, and the retrofit of units or systems to correct failures. The second is school security in the sense of kids killing kids and the reaction to that in this last year.

[(Slide 11) Interactive voting, see Results]

Level 2 surety is fundamentally different. At Level 2 surety we're talking about surety by proactive human intervention. That means a system is designed with continual human actions to help ensure safety. It requires people cognizant and especially adapted for safety purposes. A plan is in place relying upon human actions to control the environment for operations.
The plan is to perform operations reliably, and to respond in the case of emergency. Most aircraft safety and most military operations fall in this regime. The consequences are too high for Level 1 surety. The attributes of Level 2 surety have been differentiated in the same sense as, say, the Malcolm Baldrige Quality Award attributes. There are defined attributes (Slide 13) to make progress, and there are ways (Slide 14) to improve with rigor within Level 2 and increase the level of surety without changing levels.

Approach No. 3. Surety is maintained by proper operation with thorough science-based understanding, independent assessment, and continuous improvement. It is a much more focused, dedicated effort than Level 1. This is the world of validated databases of computer simulations, of extensive, continual training in simulations, of systemic analysis and predictive understanding.

[Discussion of recommendations for Level 2, Approach 3.]

Recommendation 1. The design-deploy-fix style of debugging of software.

[(Slide 16) Interactive voting, see Results]

The vote on recommendation 2 is much more positive than the vote on recommendation 1, for indeed the continual simulator training and flight requalification of airline pilots is the correct one.

Approach 4.

[Discussion of recommendations for Level 2, Approach 4.]

Here administrative controls reduce the probability of deleterious environments occurring. This is the world of the person-in-the-loop, of preventive action, control systems, and diagnostics, all aimed at prevention of the occurrence. To what degree are these based on Approach 4?

[(Slide 18) Interactive voting, see Results]

I see we've got quite a few airline travelers who recognize that recommendation No.
2, x-ray screening and metal screening at the airports, does rely on administrative control of personnel to ensure that the system works.

Approach 4, that is, preventing the occurrence from happening, actually spans two levels, because Level 2 often has problems: a process can sometimes take too long, response times are too long, and people make errors. Now I need your help here. I need some kind of fast response, and this is a verbal response, so will you give me a fast response? [Audience: Yes.] There are three questions in this. [Pure white slide.] What color do you see? [Audience: White.] What do cows drink? [Audience: Milk.] Is that right? [Audience: No.] Why in the world would we say cows drink milk? It's because the brain takes about 3 seconds to go from being miscued into giving the right answer. Smart people who are quick don't let the brain process for that 3 seconds to find they're being miscued, and that's part of the problem with Level 2: when it counts, people make errors.

In that sense, then, Level 3 is surety by positive measures from science and engineering, which is partly why we're at the National Academy of Engineering. At Level 3, engineering and scientific measures are in place to control the environment for the operation, to ensure reliable performance, and to respond in case of emergency. We're moving into higher consequence endeavors. That is not to say that airline crashes are low consequence events but, from a social impact, nuclear reactors, our ballistic missile defense, self-healing communication routers to ensure reliability of our telecommunication infrastructure, and nuclear weapons without modern safety features have huge consequences. Level 3 surety, science and positive measures, characteristically handles them. As described in the White Paper, there are, as in Level 2, these attributes in Level 3:
Here an analogy with the quality award is important. Surety is in its infancy compared to quality in its deployment throughout industry. You may remember when we talked about "quality costs." After a while, "quality is free." And then finally, "quality pays." We hope, in ten years, we will see that there was a time when "surety costs." And then, "surety is free." And then finally, "surety pays." A Surety Award is something that we might well consider.

(Slide 22: Varying manifestations of Level III attributes define three sublevels)

The approach at 4.5 is aimed at reducing the probability of a deleterious environment occurring. This includes engineered controls similar to lower levels, but these are automated, autonomous, preventive action control systems, and diagnostics aimed at prevention.

[(Slide 23) Discussion of recommendations for Level 3, Approach 4.5.]

[(Slide 24) Interactive voting on Level 3, Approach 5.]

[(Slide 25) Discussion of recommendations for Level 3, Approach 5.]

[(Slide 26) Discussion of recommendations for Level 3, Approach 5.]

As you see, we have a great diversity of opinion on this because the measures do have something to do with both. On the one hand, in the automated breathalyzer and blood alcohol monitors that enable someone to start a car, it is engineering controls that reduce the probability of a deleterious environment occurring, that is, a drunk on the highway. On the other hand, you can also see that, in order to keep a drunk off the highway, all relevant positive measures must work: the blood alcohol monitor has to work. In the case of one intercept in a ballistic missile defense system, it is an engineered approach to reduce the probability of a deleterious environment occurring. Since you have one shot at it, everything must work. It is a one-layer defense. You'll find that many of the surety approaches can satisfy both.
In this case we see the better example is "one intercept ballistic missile defense," as opposed to the subsequent approach, where only one of many positive measures is necessary for success.

Approach 6.

[Discussion of voting for "Only one of many positive measures is necessary for success. Gas, air, compression, and spark in internal combustion engine" versus "Rec. 2. Multi-tier ballistic missile defense"]

Of course, Approach 5 and Approach 6 represent the additional surety acquired by having a multiple capability so that a breach of one does not constitute a move in the direction of deleterious consequences. Not in all cases can you afford the multiple independent actions.

[Discussion of voting results]

Approach No. 6. Let's do it graphically. This represents Internet connectivity or coolant and loss-of-coolant systems for nuclear reactors. There are multiple paths from start to Mission Success, and only one has to work at a time. Thus, in addition to the barrier kind of model that we had before, the multiple parallel paths to independent success are conceptually the same model.

Approach 7 is substantively different from previous approaches in that you have cumulative comparative adaptive positive measures. It is cognitively different. There is an event or process that has an input and output, with a comparator that monitors what happens and then intervenes to ensure that the output is what should happen given the input, to some extent regardless of what happens internally to the process.

[Discussion of voting for "Space shuttle computers voting to assure 2 of the 3 give same answer before acting" versus "Redundant components for reliability"]

[(Slide 31) Discussion of Results]

Level 3 can also have problems. Designs age or are flawed; software has bugs; hardware fails; sequences unfold in unexpected and escalating ways. Therefore it's a great advantage if you can minimize to the extent possible your reliance on these details of the science and engineering.
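
[Editorial illustration, not part of the talk.] The contrast drawn above between measures that must all work (the one-intercept case), parallel paths where only one has to work (Approach 6), and a comparator that votes before acting (Approach 7) can be sketched numerically. The sketch below is only a minimal illustration: the function names are hypothetical, the 0.9 success probabilities are made-up inputs, and the arithmetic assumes the measures fail independently of one another, which real systems often do not.

```python
from collections import Counter
from math import prod


def all_must_work(p_each):
    """Chance of success when every positive measure has to work
    (a one-layer defense). Assumes independent measures."""
    return prod(p_each)


def one_of_many(p_each):
    """Chance of success when any one of several parallel paths is enough.
    Assumes independent paths: fail only if every path fails."""
    return 1.0 - prod(1.0 - p for p in p_each)


def two_of_three_vote(a, b, c):
    """Comparator in the spirit of Approach 7: return the answer that at
    least two of three channels agree on, or None if there is no majority."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    return value if count >= 2 else None


# Hypothetical inputs: three measures, each assumed 90% reliable.
measures = [0.9, 0.9, 0.9]
print(f"all must work: {all_must_work(measures):.3f}")
print(f"one of many:   {one_of_many(measures):.3f}")
print("vote(7, 7, 8):", two_of_three_vote(7, 7, 8))
print("vote(7, 8, 9):", two_of_three_vote(7, 8, 9))
```

With these assumed numbers, requiring all three measures to work gives a lower chance of success than any single measure, while needing only one of three gives a higher one, which is the trade the speaker describes when noting that multiple independent actions are not always affordable.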
Level 4 surety is reliance to the extent possible on the laws of nature and mathematics in your design. You tailor the parameter space to minimize any ambiguities. The long-term goal of Surety Science and Engineering is indeed to rely to the extent possible only on the laws of nature and mathematics. But it's hard. Flawless foresight is difficult to achieve. There are sublevels, from a first deployment in which the intent is to shape the allowed parameter space bounded by laws of nature and mathematics, to an ideal of absolute surety. Level 4 is not absolute surety: it is the reliance to the extent possible on nature and mathematics and, because times change, a periodic surety assessment is performed to uncover any new vulnerabilities.

The attributes of Level 4 surety are reliance upon the laws of nature and mathematics as your centroid. It's a principles-based design to approach the physical impossibility of undesired consequences. Then continuous assessment strives for absolute surety; continuous assessment must be in place.

Approach 8 relies as much as possible on the laws of nature to approach physical impossibility in high consequence systems. The parameter space of Approach 8 can be diagrammed by using physics, chemistry, and materials science to bound the permitted operation so that the untoward or high consequences are precluded because they are outside of that bound.

[Voting on "Anti-lock brakes" versus "Hang glider air foil that becomes a parachute instead of stalling"]

In this case the overwhelming majority, but not everyone, saw that recommendation 2, the hang glider airfoil becoming a parachute instead of stalling, in which the design is changed so that the stall is precluded from the area of operation, would be an example of Approach 8. The antilock brakes are intended to be predictable, a cumulative comparative adaptive positive measure in which the braking system senses that it's about to fail and lock and then pumps the brake automatically.
We've talked about hit-and-miss all the way through this presentation. Now, without trying to bias the work of reactor operations, I and a few of my friends, none of whom are reactor operators, took a look at what the public would see as reactor safety. Our intent was to show how these approaches and the corresponding level of surety would play against reactor safety. For instance:
We are in fact going beyond our best practices to create Surety Science and Engineering through this workshop. This slide summarizes the levels and approaches that we have been discussing. When I talked to people from reliability, from security, and from safety, I found we were all talking about the same things, and that there could be a common set of approaches for a single strategy to address them all as a system. That is the challenge of this workshop.
Let me introduce Jim Rice. I've had the pleasure of working with the 60 or so people on the Sandia Surety Leadership Team, whose work I've had the pleasure to present to you today. I have a new assignment as the Chief Information Officer of Sandia and therefore I'm pleased to introduce my successor. We both have the same hairline. This is Jim Rice, who will be taking over. I hope you enjoy working with him as much as I have. Thank you.