Neuroevolutionary Reinforcement Learning for Generalized Control of Simulated Helicopters
Rogier Koppejan and Shimon Whiteson
Abstract
This article presents an extended case study in the application of neuroevolution to generalized simulated helicopter hovering, an important challenge problem for reinforcement learning. While neuroevolution is well suited to coping with the domain's complex transition dynamics and high-dimensional state and action spaces, the need to explore efficiently and learn on-line poses unusual challenges. We propose and evaluate several methods for three increasingly challenging variations of the task, including the method that won first place in the 2008 Reinforcement Learning Competition. The results demonstrate that 1) neuroevolution can be effective for complex on-line reinforcement learning tasks such as generalized helicopter hovering, 2) neuroevolution excels at finding effective helicopter hovering policies but not at learning helicopter models, 3) due to the difficulty of learning reliable models, model-based approaches to helicopter hovering are feasible only when domain expertise is available to aid the design of a suitable model representation, and 4) recent advances in efficient resampling can enable neuroevolution to tackle more aggressively generalized reinforcement learning tasks.

NOTE: This article contains a minor error: action features a1 and a2 have been swapped in Table 2. The accompanying source code to which the article refers is correct.
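The core idea summarized above, searching directly in the space of neural network weights to find a control policy, can be illustrated with a minimal sketch. The code below is not the authors' implementation: it evolves a tiny feedforward policy to keep a one-dimensional point mass near the origin, a toy stand-in for helicopter hovering, and all network sizes, dynamics, and evolution parameters are illustrative assumptions.

```python
import math
import random

# Minimal neuroevolution sketch (illustrative, not the article's method):
# evolve the weights of a small tanh network so its output stabilizes a
# 1-D point mass at the origin. All constants below are assumptions.

def policy(weights, state):
    """Two-input, two-hidden-unit tanh network mapping state -> action."""
    w1, w2, w3, w4, b1, b2, v1, v2, bo = weights
    h1 = math.tanh(w1 * state[0] + w2 * state[1] + b1)
    h2 = math.tanh(w3 * state[0] + w4 * state[1] + b2)
    return math.tanh(v1 * h1 + v2 * h2 + bo)

def episode_return(weights, steps=200, dt=0.05):
    """Negative squared deviation from the hover point (higher is better)."""
    pos, vel, total = 1.0, 0.0, 0.0
    for _ in range(steps):
        action = policy(weights, (pos, vel))
        vel += (action - 0.1 * vel) * dt       # toy dynamics with drag
        pos += vel * dt
        total -= pos * pos + 0.1 * vel * vel   # penalize deviation
    return total

def evolve(generations=40, pop_size=20, sigma=0.3, seed=0):
    """Simple hill-climbing neuroevolution over flat weight vectors."""
    rng = random.Random(seed)
    best = [rng.gauss(0.0, 0.5) for _ in range(9)]
    best_fit = episode_return(best)
    for _ in range(generations):
        for _ in range(pop_size):
            child = [w + rng.gauss(0.0, sigma) for w in best]
            fit = episode_return(child)
            if fit > best_fit:                 # keep only improvements
                best, best_fit = child, fit
    return best, best_fit

if __name__ == "__main__":
    weights, fitness = evolve()
    print("best fitness:", fitness)
```

A population-based hill climber like this is the simplest form of neuroevolution; the article's methods additionally address on-line learning and efficient exploration, which this sketch deliberately omits.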