Publications Details

Publications / SAND Report

Defending Against Adversarial Examples

Short, Austin; La Pay, Trevor; Gandhi, Apurva

Adversarial machine learning is an active field of research that seeks to investigate the security of machine learning methods against cyber-attacks. An important branch of this field is adversarial examples, which seek to trick machine learning models into misclassifying inputs by maliciously tampering with input data. As a result of the pervasiveness of machine learning models in diverse areas such as computer vision, health care, and national security, this vulnerability is a rapidly growing threat. With the increasing use of AI solutions, threats against AI must be considered before deploying systems in a contested space. Adversarial machine learning is a problem strongly tied to software security, and just like other more common software vulnerabilities, it exploits a weakness in software, like components of machine learning models. During this project, we attempted to survey and replicate several adversarial machine learning techniques with the goal of developing capabilities for Sandia to advise and defend against these threats. To accomplish this, we scanned state of the art research for robust defenses against adversarial examples and applied them to a machine learning problem.