Commercial-Off-The-Shelf (COTS) Hardware Implementation Method for Real-Time Control of Mobile Robotics and UAVs
Abstract not provided.
Journal of Optimization Theory and Applications
Control of nonlinear dynamical systems is a complex and multifaceted process. Essential elements of many engineering systems include high-fidelity physics-based modeling, offline trajectory planning, feedback control design, and data acquisition strategies to reduce uncertainties. This article proposes an optimization-centric perspective that couples these elements in a cohesive framework. We introduce a novel use of hyper-differential sensitivity analysis to understand the sensitivity of feedback controllers to parametric uncertainty in physics-based models used for trajectory planning. These sensitivities provide a foundation for defining an optimal experimental design that seeks to acquire the data most relevant to reducing demand on the feedback controller. Our proposed framework is illustrated on the Zermelo navigation problem and a hypersonic trajectory control problem using data from NASA's X-43 hypersonic flight tests.
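For context, the Zermelo navigation problem referenced above is the classical minimum-time steering problem through a drift field; the statement below is the standard textbook formulation, not necessarily the exact variant used in the paper.

\[
\min_{\theta(\cdot)} \; t_f
\quad \text{s.t.} \quad
\dot{x} = V\cos\theta + w_x(x, y), \qquad
\dot{y} = V\sin\theta + w_y(x, y),
\]
\[
(x(0), y(0)) = (x_0, y_0), \qquad (x(t_f), y(t_f)) = (x_f, y_f),
\]

where $V$ is the constant vehicle speed, $\theta(t)$ is the heading control, and $w_x$, $w_y$ define the current (drift) field. Parameters of this field would play the role of the uncertain physics-model parameters whose influence on the feedback controller the sensitivity analysis quantifies.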
Proceedings of the American Control Conference
Reinforcement learning (RL) may enable fixed-wing unmanned aerial vehicle (UAV) guidance to achieve more agile and complex objectives than typical methods. However, RL has thus far struggled to achieve even minimal success on this problem; fixed-wing flight with RL-based guidance has only been demonstrated in the literature with reduced state and/or action spaces. In order to achieve full 6-DOF RL-based guidance, this study begins training with imitation learning from classical guidance, a method known as warm-starting (WS), before further training using Proximal Policy Optimization (PPO). We show that warm-starting is critical to successful RL performance on this problem. PPO alone achieved a 2% success rate in our experiments. Warm-starting alone achieved 32% success. Warm-starting plus PPO achieved 57% success over all policies, with 40% of policies achieving 94% success.
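As a concrete illustration of the warm-starting step, the sketch below pretrains a small actor network to imitate a placeholder classical guidance law before PPO fine-tuning. This is a minimal sketch, not the paper's implementation: the observation/action dimensions, network sizes, expert_policy(), and the logged observations are all assumed placeholders.

```python
# Minimal sketch (not the paper's code) of the warm-start idea: supervised
# "behavioral cloning" of a classical guidance law before PPO fine-tuning.
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, ACT_DIM = 12, 4          # assumed observation/action sizes, not the paper's

actor = nn.Sequential(            # policy mean network (actor)
    nn.Linear(OBS_DIM, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, ACT_DIM),
)

def expert_policy(obs: torch.Tensor) -> torch.Tensor:
    """Placeholder for the classical guidance law being imitated."""
    return torch.zeros(obs.shape[0], ACT_DIM)

# Warm start: regress the actor onto expert actions over logged observations.
opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
obs_batch = torch.randn(4096, OBS_DIM)   # stand-in for recorded flight states
for _ in range(200):
    loss = F.mse_loss(actor(obs_batch), expert_policy(obs_batch))
    opt.zero_grad()
    loss.backward()
    opt.step()

# The warm-started weights would then initialize the actor of a PPO learner
# (e.g., by copying them into the policy network before on-policy training).
```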
AIAA SciTech Forum and Exposition, 2023
Motion primitives (MPs) provide a fundamental abstraction of movement templates that can be used to guide and navigate a complex environment while simplifying the movement actions. When utilized as an action space in reinforcement learning (RL), these MPs allow an agent to learn to select a sequence of simple actions that guide a vehicle toward desired complex mission outcomes. This is particularly useful for missions involving high-speed aerospace vehicles (HSAVs) (i.e., Mach 1 to 30), where near-real-time trajectory generation is needed but the computational cost and timeliness of trajectory generation remain prohibitive. This paper demonstrates that when MPs are employed in conjunction with RL, the agent can learn to solve a wider range of problems for HSAV missions. To this end, using both an MP and a non-MP approach, RL is employed to solve the problem of an HSAV arriving at a non-maneuvering moving target at a constant altitude and with an arbitrary, but constant, velocity and heading angle. The MPs for the HSAV consist of multiple pull (flight path angle) and turn (heading angle) commands that are defined for a specific duration based on mission phases, whereas the non-MP approach uses angle of attack and bank angle as the action space for RL. The paper details the HSAV problem formulation, including the equations of motion, observation space, telescopic reward function, RL algorithm and hyperparameters, RL curriculum, formation of the MPs, and calculation of the time to execute each MP. Our results demonstrate that the non-MP approach is unable to train an agent that succeeds even in the base case of the RL curriculum. The MP approach, however, can train an agent with a success rate of 76.6% in arriving at a target moving with any heading angle at a velocity between 0 and 500 m/s.
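To make the MP action space concrete, the sketch below enumerates a small library of pull/turn primitives and maps a discrete RL action index onto one of them. The specific pull and turn magnitudes and the hold duration are illustrative placeholders, not the values used in the paper.

```python
# Illustrative sketch of a motion-primitive action space for RL. The specific
# pull/turn magnitudes and durations here are placeholders, not the paper's values.
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class MotionPrimitive:
    d_gamma_deg: float   # commanded change in flight path angle ("pull")
    d_psi_deg: float     # commanded change in heading angle ("turn")
    duration_s: float    # how long the command is held

PULLS = (-5.0, 0.0, 5.0)      # example pull commands, deg
TURNS = (-10.0, 0.0, 10.0)    # example turn commands, deg
DURATION = 10.0               # example hold time, s

# Discrete RL action space: one index per (pull, turn) combination.
PRIMITIVE_LIBRARY = [MotionPrimitive(p, t, DURATION) for p, t in product(PULLS, TURNS)]

def action_to_primitive(action_index: int) -> MotionPrimitive:
    """Map a discrete RL action to the motion primitive the vehicle executes."""
    return PRIMITIVE_LIBRARY[action_index]

print(len(PRIMITIVE_LIBRARY), action_to_primitive(0))
```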
Abstract not provided.
Journal of Aerospace Information Systems
This paper describes how the performance of motion primitive-based planning algorithms can be improved using reinforcement learning. Specifically, we describe and evaluate a framework that autonomously improves the performance of a primitive-based motion planner. The improvement process consists of three phases: exploration, extraction, and reward updates, and it can be iterated continuously to provide successive improvement. The exploration step generates new trajectories, and the extraction step identifies new primitives from these trajectories. These primitives are then used to update rewards for continued exploration. The framework required novel shaping rewards, development of a primitive extraction algorithm, and modification of the Hybrid A* algorithm. It is tested on a navigation task using a nonlinear F-16 model. The framework autonomously added 91 motion primitives to the primitive library and reduced average path cost by 21.6 s, or 35.75% of the original cost. The learned primitives were also applied to an obstacle-field navigation task not used in training, reducing path cost by 16.3 s, or 24.1%. Additionally, two heuristics for the modified Hybrid A* algorithm are designed to improve the effective branching factor.
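The improvement cycle described above can be summarized in a few lines of control flow. The stub functions below only mark where exploration, primitive extraction, and the reward update would plug in; they are placeholders rather than the paper's implementation.

```python
# Structural sketch of one improvement cycle (exploration, extraction, reward
# update). Every component here is a trivial stub, intended only to show how
# the three phases feed one another.
def explore(library):
    """Stub for RL exploration: would return new closed-loop trajectories."""
    return [["state_0", "state_1", "state_2"]]

def extract_primitives(trajectories):
    """Stub for primitive extraction: would segment trajectories into primitives."""
    return [f"primitive_from_{len(traj)}_states" for traj in trajectories]

def update_rewards(library):
    """Stub for the reward update: would re-shape exploration rewards."""
    return {primitive: 0.0 for primitive in library}

library = ["straight", "level_turn"]   # initial primitive library
for iteration in range(3):             # the cycle can be iterated continuously
    trajectories = explore(library)
    library += extract_primitives(trajectories)
    rewards = update_rewards(library)
print(library)
```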
Abstract not provided.
Proceedings of the IEEE Conference on Decision and Control
Pyomo and Dakota are openly available software packages developed by Sandia National Laboratories. In this tutorial, methods for automating the optimization of controller parameters for a nonlinear cart-pole system are presented. Two approaches are described and demonstrated on the cart-pole example problem for tuning a linear quadratic regulator and a partial feedback linearization controller. First, the problem is formulated as a pseudospectral optimization problem under an open-box methodology utilizing Pyomo, where the plant model is fully known to the optimizer. The second is a black-box approach utilizing Dakota in concert with a MATLAB or Simulink plant model, where the plant model is unknown to the optimizer. A comparison of the two approaches gives the end user the advantages and shortcomings of each method in order to pick the right tool for their problem. We find that complex system models and objectives are easily incorporated in the Dakota-based approach with minimal setup time, while the Pyomo-based approach provides rapid solutions once the system model has been developed.
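The black-box route can be illustrated with a generic derivative-free optimizer in place of Dakota and a crude linearized cart-pole in place of the MATLAB/Simulink plant. The plant matrices, cost weights, and initial gain guess below are illustrative assumptions, not values from the tutorial.

```python
# Sketch of the black-box tuning idea, using scipy's Nelder-Mead as a Python
# stand-in for Dakota and a crude Euler-integrated, linearized cart-pole in
# place of the MATLAB/Simulink plant. Parameter values are illustrative.
import numpy as np
from scipy.optimize import minimize

def simulate_cost(gains, dt=0.01, t_final=5.0):
    """Closed-loop quadratic cost for a candidate full-state feedback gain K."""
    K = np.asarray(gains)
    # Linearized cart-pole about the upright equilibrium (illustrative values).
    A = np.array([[0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, -1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0],
                  [0.0, 0.0, 11.0, 0.0]])
    B = np.array([0.0, 1.0, 0.0, -1.0])
    x = np.array([0.0, 0.0, 0.2, 0.0])   # initial pole offset of 0.2 rad
    cost = 0.0
    for _ in range(int(t_final / dt)):
        u = -K @ x                        # u = -Kx, the gain under tuning
        cost += (x @ x + 0.1 * u * u) * dt
        x = x + dt * (A @ x + B * u)      # forward-Euler plant update
    return cost

# The derivative-free optimizer only sees the scalar cost, as Dakota would.
result = minimize(simulate_cost, x0=np.array([1.0, 1.0, 10.0, 1.0]),
                  method="Nelder-Mead")
print(result.x, result.fun)
```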
AIAA Scitech 2021 Forum
Multi-phase, pseudospectral optimization is employed in a variety of applications, but many of the world-class optimization libraries are closed-source. In this paper, we formulate an open-source, object-oriented framework for dynamic optimization using the Pyomo modeling language. This strategy supports the reuse of common code for rapid, error-free model development. The flexibility of our framework is demonstrated on a series of dynamic optimization problems, including multi-phase trajectory optimization using highly accurate pseudospectral methods and controller gain optimization in the presence of stability margin constraints. We employ numerical procedures to improve convergence rates and solution accuracy. We validate our framework against GPOPS-II, a commercial, MATLAB-based optimization program, on a vehicle ascent problem; the trajectory results show close alignment with this state-of-the-art optimization suite.
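The flavor of a Pyomo-based dynamic optimization can be seen on a toy minimum-effort double-integrator transfer, discretized with Radau collocation. This is a sketch of the general Pyomo.DAE workflow (it assumes IPOPT is available), not the paper's multi-phase framework.

```python
# Minimal Pyomo.DAE sketch of collocation-based dynamic optimization: drive a
# double integrator to rest at x = 1 while minimizing control effort.
from pyomo.environ import (ConcreteModel, Var, Objective, Constraint,
                           TransformationFactory, SolverFactory, minimize)
from pyomo.dae import ContinuousSet, DerivativeVar, Integral

m = ConcreteModel()
m.t = ContinuousSet(bounds=(0, 1))        # normalized time
m.x = Var(m.t)                            # position
m.v = Var(m.t)                            # velocity
m.u = Var(m.t, bounds=(-10, 10))          # control
m.dx = DerivativeVar(m.x, wrt=m.t)
m.dv = DerivativeVar(m.v, wrt=m.t)

m.ode_x = Constraint(m.t, rule=lambda m, t: m.dx[t] == m.v[t])
m.ode_v = Constraint(m.t, rule=lambda m, t: m.dv[t] == m.u[t])

# Boundary conditions: start at rest at the origin, end at rest at x = 1.
m.x[0].fix(0); m.v[0].fix(0); m.x[1].fix(1); m.v[1].fix(0)

m.effort = Integral(m.t, wrt=m.t, rule=lambda m, t: m.u[t] ** 2)
m.obj = Objective(expr=m.effort, sense=minimize)

# Radau collocation (a pseudospectral-style discretization), solved with IPOPT.
TransformationFactory('dae.collocation').apply_to(m, nfe=20, ncp=3,
                                                  scheme='LAGRANGE-RADAU')
SolverFactory('ipopt').solve(m, tee=False)
print('optimal effort:', m.obj())
```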
AIAA Scitech 2021 Forum
This paper describes how the performance of motion primitive-based planning algorithms can be improved using reinforcement learning. Specifically, we describe and evaluate a framework for policy improvement via the discovery of new motion primitives. Our approach combines the predictable behavior of deterministic planning methods with the exploration capability of reinforcement learning. The framework consists of three phases: evaluation, exploration, and extraction. This framework can be iterated continuously to provide successive improvement. The evaluation step scores the performance of a motion primitive library using value iteration to create a cost map. A local difference metric is then used to identify regions that need improvement. The exploration step utilizes reinforcement learning to examine new trajectories in the regions of greatest need. The extraction step encodes the agent's experiences into new primitives. The framework is tested on a point-to-point navigation task using a 6-DOF nonlinear F-16 model. One iteration of the framework discovered 17 new primitives and provided a maximum planning-time reduction of 96.91%. After three full iterations, 123 primitives were added, with a maximum time reduction of 97.39%. The proposed framework is easily extensible to a range of vehicles, environments, and cost functions.
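The evaluation step can be illustrated with a toy value iteration that turns a primitive library into a cost-to-go map. The grid, displacement primitives, and costs below are placeholders; the paper performs this evaluation with a 6-DOF F-16 model rather than a grid.

```python
# Sketch of the evaluation step: value iteration over a coarse grid to build a
# cost-to-go map for a primitive library. Grid, moves, and costs are toy values.
import numpy as np

N = 20
goal = (N - 1, N - 1)
# Each "primitive" is a grid displacement with an execution cost (e.g., time).
primitives = {(0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.5, (-1, 0): 1.0, (0, -1): 1.0}

cost_to_go = np.full((N, N), np.inf)
cost_to_go[goal] = 0.0
for _ in range(200):                      # iterate until the map converges
    updated = cost_to_go.copy()
    for i in range(N):
        for j in range(N):
            if (i, j) == goal:
                continue
            best = np.inf
            for (di, dj), c in primitives.items():
                ni, nj = i + di, j + dj
                if 0 <= ni < N and 0 <= nj < N:
                    best = min(best, c + cost_to_go[ni, nj])
            updated[i, j] = best
    if np.allclose(updated, cost_to_go):
        break
    cost_to_go = updated
print(cost_to_go[0, 0])                   # cost from the start corner to the goal
```

A local difference between this cost map and the planner's achieved costs would then flag the regions where exploration is most needed.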
Abstract not provided.
AIAA Scitech 2019 Forum
The generation of optimal trajectories for test flights of hypersonic vehicles with highly nonlinear dynamics and complicated physical and path constraints is often time consuming and sometimes intractable for high-fidelity, software-in-the-loop vehicle models. Practical use of hypersonic vehicles requires the ability to rapidly generate a feasible and robust optimal trajectory. We propose a solution that involves interaction between an optimizer using a low-fidelity 3-DOF vehicle model and feedback from vehicle simulations of varying fidelities, with the goal of rapidly converging to a solution trajectory for a hypersonic vehicle mission. Further computational efficiency is sought using aerodynamic surrogate models in place of aerodynamic coefficient lookup tables. We address the need for rapidly converging optimization by analyzing how the choice of model fidelity impacts the quality and speed of the resulting guidance solution.
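The surrogate idea can be sketched by fitting a smooth interpolant to a coefficient table. The tabulated lift-coefficient trend below is synthetic and purely illustrative; it is not data for any real vehicle.

```python
# Sketch of replacing an aerodynamic coefficient lookup table with a smooth
# surrogate that an optimizer can query cheaply. The "table" is synthetic.
import numpy as np
from scipy.interpolate import RBFInterpolator

# Synthetic lookup table: lift coefficient over angle of attack (deg) and Mach.
alpha = np.linspace(0, 10, 11)
mach = np.linspace(5, 10, 6)
A, M = np.meshgrid(alpha, mach, indexing="ij")
CL_table = 0.03 * A - 0.005 * (M - 5) * A      # illustrative trend only

# Fit a radial-basis-function surrogate to the tabulated points.
points = np.column_stack([A.ravel(), M.ravel()])
surrogate = RBFInterpolator(points, CL_table.ravel(), smoothing=1e-8)

# The optimizer can now evaluate the smooth surrogate instead of the table.
query = np.array([[4.3, 6.7], [7.9, 9.2]])     # (alpha, Mach) pairs
print(surrogate(query))
```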