Publications Details
Test and Evaluation of Reinforcement Learning via Robustness Testing and Explainable AI for High-Speed Aerospace Vehicles
Raz, Ali K.; Nolan, Sean M.; Levin, Winston; Mall, Kshitij; Mia, Ahmad; Mockus, Linas; Ezra, Kris; Williams, Kyle R.
Reinforcement Learning (RL) makes it possible to train an artificially intelligent agent in dynamic and uncertain environments. RL has demonstrated an impressive ability to learn nearly optimal policies in various application domains, including aerospace. Despite these demonstrated outcomes, characterizing performance boundaries, explaining the logic behind RL decisions, and quantifying the resulting uncertainties in RL outputs remain major challenges that slow the adoption of RL in real-time systems. This is particularly true for aerospace systems, where the risk of failure is high and the performance envelopes of the systems of interest may be small. To facilitate the adoption of learning agents in real-time systems, this paper presents a three-part Test and Evaluation (T&E) framework for RL built from a systems engineering for artificial intelligence (SE4AI) perspective. The framework introduces robustness testing approaches to characterize performance bounds on RL, employs Explainable AI techniques, namely SHapley Additive exPlanations (SHAP), to examine RL decision-making, and incorporates validation of RL outputs against known and accepted solutions. The framework is applied to a high-speed aerospace vehicle emergency descent problem in which RL is trained to provide an angle of attack command, and it is used to comprehensively examine the impact of uncertainties in the vehicle's altitude, velocity, and flight path angle. The robustness testing characterizes acceptable ranges of disturbances in flight parameters, while SHAP exposes the most significant feature affecting the RL selection of angle of attack, which in this case is the vehicle altitude. Finally, RL outputs are compared to a trajectory generated by indirect optimal control methods for validation.
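To illustrate the kind of feature attribution that SHAP provides, the following is a minimal sketch of an exact Shapley value calculation over the three flight parameters named above. It is not the authors' implementation and does not use the SHAP library. The `toy_policy` function, its coefficients, the normalized input values, and the zero baseline are all hypothetical stand-ins chosen only to make the calculation concrete.

```python
from itertools import combinations
from math import factorial

FEATURES = ("altitude", "velocity", "flight_path_angle")


def toy_policy(altitude, velocity, flight_path_angle):
    """Hypothetical stand-in for a trained RL policy (NOT from the paper).

    Maps normalized flight parameters to an angle of attack command.
    The coefficients are invented purely for illustration.
    """
    return (0.5 * altitude
            + 0.2 * velocity
            - 0.1 * flight_path_angle
            + 0.05 * altitude * velocity)


def shapley_values(policy, x, baseline):
    """Exact Shapley attribution of policy(x) - policy(baseline).

    Features outside a coalition are replaced by their baseline values.
    """
    n = len(FEATURES)

    def value(coalition):
        # Evaluate the policy with coalition features taken from x
        # and all remaining features taken from the baseline.
        args = {name: (x[name] if name in coalition else baseline[name])
                for name in FEATURES}
        return policy(**args)

    phi = {}
    for name in FEATURES:
        others = [f for f in FEATURES if f != name]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = set(subset)
                # Marginal contribution of adding `name` to this coalition.
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi


# Hypothetical normalized state and an all-zero reference state.
x = {"altitude": 2.0, "velocity": 1.0, "flight_path_angle": -1.0}
baseline = {name: 0.0 for name in FEATURES}

phi = shapley_values(toy_policy, x, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the policy output at `x` and at the baseline, which is the property that lets them be read as a decomposition of a single angle of attack command. Exact enumeration is only practical for a handful of features; SHAP tooling approximates the same quantity for larger models.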