Publications

Results 1–25 of 75

SAGE Intrusion Detection System: Sensitivity Analysis Guided Explainability for Machine Learning

Smith, Michael R.; Acquesta, Erin A.; Ames, Arlo L.; Carey, Alycia N.; Cueller, Christopher R.; Field, Richard V.; Maxfield, Trevor M.; Mitchell, Scott A.; Morris, Elizabeth S.; Moss, Blake C.; Nyre-Yu, Megan N.; Rushdi, Ahmad R.; Stites, Mallory C.; Smutz, Charles S.; Zhou, Xin Z.

This report details the results of a three-fold investigation of sensitivity analysis (SA) for machine learning (ML) explainability (MLE): (1) the mathematical assessment of the fidelity of an explanation with respect to a learned ML model, (2) quantifying the trustworthiness of a prediction, and (3) the impact of MLE on the efficiency of end users through multiple user studies. We focused on the cybersecurity domain because the data is inherently non-intuitive. As ML is being used in an increasing number of domains, including domains where being wrong can have high consequences, MLE has been proposed as a means of generating trust in learned ML models by end users. However, little analysis has been performed to determine whether the explanations accurately represent the target model and whether they themselves should be trusted beyond subjective inspection. Current state-of-the-art MLE techniques only provide a list of important features based on heuristic measures and/or make assumptions about the data and the model that are not representative of real-world data and models. Further, most are designed without considering their usefulness to an end user in a broader context. To address these issues, we present a notion of explanation fidelity based on Shapley values from cooperative game theory. We find that all of the investigated MLE methods produce explanations that are incongruent with the ML model being explained. This is because they make critical assumptions about feature independence and linear feature interactions for computational reasons. We also find that in deployment, explanations are rarely used for a variety of reasons, including that several other tools are trusted more than the explanations and that there is little incentive to use the explanations.
In the cases when the explanations are used, we found that there is a danger that explanations persuade end users to wrongly accept false positives and false negatives. However, ML model developers and maintainers find the explanations more useful, because they help ensure that the ML model does not have obvious biases. In light of these findings, we suggest a number of future directions, including developing MLE methods that directly model non-linear model interactions and including design principles that take into account the usefulness of explanations to the end user. We also augment explanations with a set of trustworthiness measures that measure geometric aspects of the data to determine if the model output should be trusted.
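
To make the Shapley-value idea mentioned in this abstract concrete, the following is a minimal sketch of exact Shapley attributions for a tiny model. It is not the report's fidelity measure. The function name `shapley_values`, the toy model, and the choice to represent an "absent" feature by a baseline value are all assumptions made for illustration. The toy model contains a product term, the kind of feature interaction that the abstract says linear, independence-based explanations fail to capture.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature.

    Assumption: a feature that is "absent" from a coalition takes its
    baseline value. The cost grows as 2^n, so this is only for tiny n.
    """
    n = len(x)

    def value(coalition):
        # Evaluate the model with coalition features from x, the rest from baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(rest, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model with one interaction term and one additive term.
    return z[0] * z[1] + z[2]


attributions = shapley_values(toy_model, [2.0, 3.0, 1.0], [0.0, 0.0, 0.0])
```

In this sketch the interaction term is split evenly between the two interacting features, and the attributions sum to the difference between the model output at `x` and at the baseline (the efficiency property of Shapley values).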

An Agile Design-to-Simulation Workflow Using a New Conforming Moving Least Squares Method

Koester, Jacob K.; Tupek, Michael R.; Mitchell, Scott A.

This report summarizes the accomplishments and challenges of a two-year LDRD effort focused on improving design-to-simulation agility. The central bottleneck in most solid mechanics simulations is the process of taking CAD geometry and creating a discretization of suitable quality, i.e., the "meshing" effort. This report revisits meshfree methods and documents some key advancements that allow their use on problems with complex geometries, low quality meshes, or nearly incompressible materials, or that involve fracture. The resulting capability was demonstrated to be an effective part of an agile simulation process by enabling rapid discretization techniques without increasing the time to obtain a solution of a given accuracy. The first enhancement addressed boundary-related challenges associated with meshfree methods. When using point clouds and Euclidean metrics to construct approximation spaces, the boundary information is lost, which results in low accuracy solutions for non-convex geometries and material interfaces. This also complicates the application of essential boundary conditions. The solution involved the development of conforming window functions, which use graph and boundary information to directly incorporate boundaries into the approximation space. The next enhancement was a procedure for producing a quality approximation with a low quality mesh. Unlike the finite element method, meshfree approximation spaces do not require a mesh. However, meshes can be useful in providing domain boundary information and performing domain integration. A process was developed which aggregates low quality elements to create polyhedra of agreeable quality for domain integration. Stable time increments for transient dynamic simulations were observed to be up to 1000x larger than those of finite element simulations, and solution quality and robustness were vastly superior.
Obtaining a solution which is free of nonphysical displacement or pressure oscillations is a challenge for many methods when simulating nearly incompressible materials. Existing nodally integrated meshfree methods suffer from this limitation as well. New techniques were developed that combine $\bar{B}$ / $\bar{F}$ methods and the strain smoothing technique used in nodal integration to provide agreeable solutions for problems with nearly incompressible materials. The last major contribution enabled efficient simulations of material fracture with mass conservation. An inter-particle connectivity degradation approach was developed using ideas from peridynamics and cohesive zone modeling to disassociate nodes when fracture conditions are met. The method can, in principle, be applied to any material model with a specified failure criterion. For a mode-I ductile crack propagation problem, the method demonstrates mesh-size independent behavior without the particle instabilities near the fracture surface that are common to other particle methods. Addressing the aforementioned challenges of meshfree methods opens the approach to a broader class of problems and enables an agile simulation development process for problems of interest to Sandia.
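
For readers unfamiliar with moving least squares (MLS), the sketch below shows a textbook one-dimensional MLS approximation with a linear basis and a plain Euclidean window function. It is not the conforming window construction described in this abstract, which additionally uses graph and boundary information. The names `window`, `mls_shape_functions`, and `mls_approximate`, and the particular quartic window, are assumptions chosen for illustration.

```python
def window(s):
    """Compactly supported weight: positive for |s| < 1, zero otherwise."""
    return (1.0 - s * s) ** 2 if abs(s) < 1.0 else 0.0


def mls_shape_functions(x, nodes, support):
    """Linear-basis MLS shape functions phi_i(x) for a list of 1D nodes."""
    w = [window((x - xi) / support) for xi in nodes]
    d = [xi - x for xi in nodes]  # node coordinates shifted to the evaluation point
    # Entries of the 2x2 moment matrix for the basis [1, xi - x].
    m0 = sum(w)
    m1 = sum(wi * di for wi, di in zip(w, d))
    m2 = sum(wi * di * di for wi, di in zip(w, d))
    det = m0 * m2 - m1 * m1
    if det <= 0.0:
        raise ValueError("need at least two distinct nodes inside the support")
    # First row of the inverse moment matrix applied to each weighted basis vector.
    return [wi * (m2 - m1 * di) / det for wi, di in zip(w, d)]


def mls_approximate(x, nodes, values, support):
    """Evaluate the MLS approximation of nodal values at x."""
    phi = mls_shape_functions(x, nodes, support)
    return sum(p * u for p, u in zip(phi, values))


nodes = [0.1 * k for k in range(11)]
values = [2.0 * xi + 1.0 for xi in nodes]
u_mid = mls_approximate(0.37, nodes, values, support=0.25)
```

With a linear basis, the shape functions sum to one and reproduce any linear field, so `u_mid` matches `2 * 0.37 + 1` up to rounding. Because the window depends only on Euclidean distance, it has no knowledge of domain boundaries, which is the limitation the conforming window functions are meant to address.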

Fast Approximate Union Volume in High Dimensions with Line Samples

Mitchell, Scott A.; Awad, Muhammad A.; Ebeida, Mohamed S.; Swiler, Laura P.

The classical problem of calculating the volume of the union of d-dimensional balls is known as "Union Volume." We present line-sampling approximation algorithms for Union Volume. Our methods may be extended to other Boolean operations, such as set difference, or to other shapes, such as hyper-rectangles. The deterministic, exact approaches for Union Volume do not scale well to high dimensions. However, we adapt several of these exact approaches to approximation algorithms based on sampling. We perform local sampling within each ball using lines. We have several variations, depending on how the overlapping volume is partitioned, and depending on whether radial, axis-aligned, or other line patterns are used. Our variations fall within the family of Monte Carlo sampling, and hence have about the same theoretical convergence rate, $1/\sqrt{M}$, where $M$ is the number of samples. In our limited experiments, line-sampling proved more accurate per unit work than point samples, because a line sample provides more information, and the analytic equation for a sphere makes the calculation almost as fast. We performed a limited empirical study of the efficiency of these variations. We suggest a more extensive study for future work. We speculate that different ball arrangements, differentiated by the distribution of overlaps in terms of volume and degree, will benefit the most from patterns of line samples that preferentially capture those overlaps. Acknowledgement: We thank Karl Bringman for explaining his BF-ApproxUnion (ApproxUnion) algorithm [3] to us. We thank Josiah Manson for pointing out that spoke darts oversample the center and we might get a better answer by uniform sampling. We thank Vijay Natarajan for suggesting random chord sampling. The authors are grateful to Brian Adams, Keith Dalbey, and Vicente Romero for useful technical discussions. This work was sponsored by the Laboratory Directed Research and Development (LDRD) Program at Sandia National Laboratories. 
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research (ASCR), Applied Mathematics Program. Sandia National Laboratories is a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.
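
The line-sampling idea in this abstract can be sketched as follows, under stated assumptions. This is one plausible axis-aligned variant, not the authors' implementation: each ball is sampled with lines parallel to axis 0, and overlap is partitioned by crediting it to the lowest-indexed ball that contains it. The function names, the partition rule, and the fixed number of samples per ball are all choices made for this illustration. The analytic chord of a line through a sphere is what makes each line sample cheap.

```python
import math
import random


def ball_volume(dim, radius):
    """Volume of a dim-dimensional ball."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim


def sample_in_ball(rng, center, radius):
    """Uniform random point inside the ball with the given center and radius."""
    k = len(center)
    g = [rng.gauss(0.0, 1.0) for _ in range(k)]
    norm = math.sqrt(sum(v * v for v in g))
    scale = radius * rng.random() ** (1.0 / k) / norm
    return [c + scale * v for c, v in zip(center, g)]


def chord(center, radius, y):
    """Interval where the axis-0 line through offset y crosses the ball, or None."""
    q = sum((a - b) ** 2 for a, b in zip(y, center[1:]))
    if q >= radius * radius:
        return None
    h = math.sqrt(radius * radius - q)
    return (center[0] - h, center[0] + h)


def uncovered_length(lo, hi, intervals):
    """Length of [lo, hi] that is not covered by the union of the intervals."""
    covered, reach = 0.0, lo
    for a, b in sorted(intervals):
        a, b = max(a, reach), min(b, hi)
        if b > a:
            covered += b - a
            reach = b
    return (hi - lo) - covered


def union_volume(balls, samples_per_ball=2000, seed=0):
    """Estimate the union volume of balls given as (center, radius) pairs."""
    rng = random.Random(seed)
    total = 0.0
    for i, (center, radius) in enumerate(balls):
        dim = len(center)
        if dim < 2:
            raise ValueError("this sketch needs dimension >= 2")
        cross_section = ball_volume(dim - 1, radius)
        length_sum = 0.0
        for _ in range(samples_per_ball):
            # Pick a line by sampling its offset in the (dim-1)-dimensional cross-section.
            y = sample_in_ball(rng, center[1:], radius)
            own = chord(center, radius, y)
            if own is None:
                continue
            # Parts of the chord inside earlier balls are already counted there.
            earlier = (chord(cj, rj, y) for cj, rj in balls[:i])
            others = [c for c in earlier if c is not None]
            length_sum += uncovered_length(own[0], own[1], others)
        total += cross_section * length_sum / samples_per_ball
    return total


two_unit_balls = [((0.0, 0.0, 0.0), 1.0), ((1.0, 0.0, 0.0), 1.0)]
estimate = union_volume(two_unit_balls, samples_per_ball=20000)
```

For the two unit balls above, whose centers are one unit apart, the analytic union volume is $8\pi/3 - 5\pi/12 \approx 7.07$, which gives a reference value to compare the estimate against. As a Monte Carlo method, the error of this sketch shrinks at the $1/\sqrt{M}$ rate quoted in the abstract.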

VoroCrust illustrated: Theory and challenges

Leibniz International Proceedings in Informatics, LIPIcs

Abdelkader, Ahmed; Bajaj, Chandrajit L.; Ebeida, Mohamed S.; Mahmoud, Ahmed H.; Mitchell, Scott A.; Owens, John D.; Rushdi, Ahmad A.

Over the past decade, polyhedral meshing has been gaining popularity as a better alternative to tetrahedral meshing in certain applications. Within the class of polyhedral elements, Voronoi cells are particularly attractive thanks to their special geometric structure. What has been missing so far is a Voronoi mesher that is sufficiently robust to run automatically on complex models. In this video, we illustrate the main ideas behind the VoroCrust algorithm, highlighting both the theoretical guarantees and the practical challenges imposed by realistic inputs.

Sampling conditions for conforming Voronoi meshing by the VoroCrust algorithm

Leibniz International Proceedings in Informatics, LIPIcs

Abdelkader, Ahmed; Bajaj, Chandrajit L.; Ebeida, Mohamed S.; Mahmoud, Ahmed H.; Mitchell, Scott A.; Owens, John D.; Rushdi, Ahmad A.

We study the problem of decomposing a volume bounded by a smooth surface into a collection of Voronoi cells. Unlike the dual problem of conforming Delaunay meshing, a principled solution to this problem for generic smooth surfaces remained elusive. VoroCrust leverages ideas from α-shapes and the power crust algorithm to produce unweighted Voronoi cells conforming to the surface, yielding the first provably-correct algorithm for this problem. Given an ϵ-sample on the bounding surface, with a weak σ-sparsity condition, we work with the balls of radius δ times the local feature size centered at each sample. The corners of this union of balls are the Voronoi sites, on both sides of the surface. The facets common to cells on opposite sides reconstruct the surface. For appropriate values of ϵ, σ and δ, we prove that the surface reconstruction is isotopic to the bounding surface. With the surface protected, the enclosed volume can be further decomposed into an isotopic volume mesh of fat Voronoi cells by generating a bounded number of sites in its interior. Compared to state-of-the-art methods based on clipping, VoroCrust cells are full Voronoi cells, with convexity and fatness guarantees. Compared to the power crust algorithm, VoroCrust cells are not filtered, are unweighted, and offer greater flexibility in meshing the enclosed volume by either structured grids or random samples.
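
A two-dimensional analogue may help illustrate why the corners of the union of balls make good Voronoi sites. This is a simplified sketch, not the VoroCrust implementation: in 2D the "corners" are the pairwise intersection points of circles centered at neighboring samples, whereas the 3D algorithm works with intersections of balls on a surface. The function name `circle_intersections` and the directly supplied radii (rather than δ times the local feature size) are assumptions for illustration.

```python
import math


def circle_intersections(p0, r0, p1, r1):
    """Return the two intersection points of two circles, or None if they do not cross."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    d = math.hypot(dx, dy)
    if d == 0.0 or d > r0 + r1 or d < abs(r0 - r1):
        return None
    # Distance from p0 to the foot of the common chord along the center line.
    a = (r0 * r0 - r1 * r1 + d * d) / (2.0 * d)
    h = math.sqrt(max(r0 * r0 - a * a, 0.0))
    mx, my = p0[0] + a * dx / d, p0[1] + a * dy / d
    # The two seeds are mirror images of each other across the center line.
    return ((mx - h * dy / d, my + h * dx / d),
            (mx + h * dy / d, my - h * dx / d))


sample_a, sample_b = (0.0, 0.0), (1.0, 0.0)
seed_up, seed_down = circle_intersections(sample_a, 1.0, sample_b, 1.0)
```

Each sample is equidistant from `seed_up` and `seed_down` (both lie on that sample's circle), so both samples lie on the Voronoi edge separating the two seeds. That edge therefore passes through the samples and reconstructs the curve between them, with one seed on each side.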

Sampling Conditions for Conforming Voronoi Meshing by the VoroCrust Algorithm

LIPIcs-Leibniz International Proceedings in Informatics

Abdelkader, Ahmed A.; Bajaj, Chandrajit L.; Ebeida, Mohamed S.; Mahmoud, Ahmed H.; Mitchell, Scott A.; Owens, John D.; Rushdi, Ahmad A.

© Ahmed Abdelkader, Chandrajit L. Bajaj, Mohamed S. Ebeida, Ahmed H. Mahmoud, Scott A. Mitchell, John D. Owens and Ahmad A. Rushdi; licensed under Creative Commons License CC-BY. 34th Symposium on Computational Geometry (SoCG 2018). We study the problem of decomposing a volume bounded by a smooth surface into a collection of Voronoi cells. Unlike the dual problem of conforming Delaunay meshing, a principled solution to this problem for generic smooth surfaces remained elusive. VoroCrust leverages ideas from α-shapes and the power crust algorithm to produce unweighted Voronoi cells conforming to the surface, yielding the first provably-correct algorithm for this problem. Given an ϵ-sample on the bounding surface, with a weak σ-sparsity condition, we work with the balls of radius δ times the local feature size centered at each sample. The corners of this union of balls are the Voronoi sites, on both sides of the surface. The facets common to cells on opposite sides reconstruct the surface. For appropriate values of ϵ, σ and δ, we prove that the surface reconstruction is isotopic to the bounding surface. With the surface protected, the enclosed volume can be further decomposed into an isotopic volume mesh of fat Voronoi cells by generating a bounded number of sites in its interior. Compared to state-of-the-art methods based on clipping, VoroCrust cells are full Voronoi cells, with convexity and fatness guarantees. Compared to the power crust algorithm, VoroCrust cells are not filtered, are unweighted, and offer greater flexibility in meshing the enclosed volume by either structured grids or random samples.

Footprint placement for mosaic imaging by sampling and optimization

Proceedings International Conference on Automated Planning and Scheduling, ICAPS

Mitchell, Scott A.; Valicka, Christopher G.; Rowe, Stephen R.; Zou, Simon Z.

We consider the problem of selecting a small set (mosaic) of sensor images (footprints) whose union covers a two-dimensional Region Of Interest (ROI) on Earth. We take the approach of modeling the mosaic problem as a Mixed-Integer Linear Program (MILP). This allows solutions to this subproblem to feed into a larger remote-sensor collection-scheduling MILP, which enables the scheduler to dynamically consider alternative mosaics without having to perform any new geometric computations. Our approach to setting up the optimization problem uses maximal disk sampling and point-in-polygon geometric calculations. Footprints may be of any shape, even non-convex, and we show examples using a variety of shapes that may occur in practice. The general integer optimization problem can become computationally expensive for large problems. In practice, the number of placed footprints is within an order of magnitude of ten, making the time to solve to optimality on the order of minutes. This is fast enough to make the approach relevant for near real-time mission applications. We provide open source software for all our methods, "GeoPlace."
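
The covering formulation in this abstract can be illustrated with a small sketch. It is not the GeoPlace software: the sample points are hard-coded rather than produced by maximal disk sampling, and a brute-force search over subsets stands in for a MILP solver so that the example needs only the standard library. All names (`point_in_polygon`, `coverage_sets`, `smallest_mosaic`) and the example footprints are hypothetical.

```python
from itertools import combinations


def point_in_polygon(pt, poly):
    """Ray-casting test; works for non-convex simple polygons."""
    x, y = pt
    inside = False
    n = len(poly)
    for k in range(n):
        x1, y1 = poly[k]
        x2, y2 = poly[(k + 1) % n]
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses the horizontal line through pt.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def coverage_sets(points, footprints):
    """For each footprint, the set of sample-point indices it covers."""
    return [{k for k, pt in enumerate(points) if point_in_polygon(pt, poly)}
            for poly in footprints]


def smallest_mosaic(points, footprints):
    """Smallest set of footprint indices covering every sample point, or None."""
    covers = coverage_sets(points, footprints)
    everything = set(range(len(points)))
    for size in range(1, len(footprints) + 1):
        for chosen in combinations(range(len(footprints)), size):
            if set().union(*(covers[j] for j in chosen)) == everything:
                return chosen
    return None


# Hypothetical ROI sample points and candidate footprints.
roi_points = [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5), (0.5, 1.5)]
candidate_footprints = [
    [(0, 0), (2, 0), (2, 1), (0, 1)],                   # rectangle
    [(2, 0), (3, 0), (3, 1), (2, 1)],                   # rectangle
    [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)],   # non-convex L shape
]
mosaic = smallest_mosaic(roi_points, candidate_footprints)
```

The coverage sets are the only geometric input the optimization needs. In a MILP version of this sketch, each footprint would get a binary variable and each sample point a constraint requiring at least one covering footprint to be selected, which is consistent with the abstract's point that a scheduler can consider alternative mosaics without new geometric computations.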
