Efficient Reinforcement Learning using Gaussian Processes


This book examines Gaussian processes (GPs) in model-based reinforcement learning (RL) and inference in nonlinear dynamic systems. First, we introduce PILCO, a fully Bayesian approach for efficient RL in continuous-valued state and action spaces when no expert knowledge is available. PILCO learns fast because it consistently accounts for model uncertainty during long-term planning and decision making, thereby reducing model bias, a common problem in model-based RL. Due to its generality and efficiency, PILCO is a conceptual and practical approach to jointly learning models and controllers fully automatically. Across all tasks, we report an unprecedented degree of automation and an unprecedented speed of learning. Second, we propose principled algorithms for robust filtering and smoothing in GP dynamic systems. Our methods are based on analytic moment matching and clearly advance the state of the art.
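To give a flavor of the analytic moment matching underlying both PILCO's long-term predictions and the filtering and smoothing algorithms, the sketch below computes the exact predictive mean of a one-dimensional GP (squared-exponential kernel) when the test input is itself Gaussian, by integrating the kernel against the input distribution in closed form. All data and hyperparameters here are illustrative toy choices, not values from the book, and the full method also propagates the predictive variance, which is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D training set for a GP with a squared-exponential (SE) kernel.
# Hyperparameters are illustrative assumptions, not values from the book.
X = np.linspace(-3.0, 3.0, 20)
y = np.sin(X) + 0.05 * rng.standard_normal(20)
s2, ell, noise = 1.0, 1.0, 1e-2   # signal variance, length-scale, noise variance

# Standard GP regression weights: beta = (K + noise*I)^{-1} y.
K = s2 * np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2 / ell**2)
beta = np.linalg.solve(K + noise * np.eye(len(X)), y)

def moment_matched_mean(m, v):
    """Exact mean of the GP output when the input is uncertain, x* ~ N(m, v).

    The SE kernel is an unnormalized Gaussian in x*, so its expectation under
    the input Gaussian is available in closed form (a Gaussian convolution).
    """
    q = s2 * np.sqrt(ell**2 / (ell**2 + v)) \
        * np.exp(-0.5 * (m - X) ** 2 / (ell**2 + v))
    return beta @ q

def gp_mean(xs):
    """Plain GP predictive mean at deterministic inputs xs."""
    k = s2 * np.exp(-0.5 * (xs[:, None] - X[None, :]) ** 2 / ell**2)
    return k @ beta

# Monte Carlo sanity check: sample the uncertain input and average the
# GP mean over the samples; this should agree with the analytic result.
m, v = 0.5, 0.3
samples = m + np.sqrt(v) * rng.standard_normal(200_000)
print(moment_matched_mean(m, v), gp_mean(samples).mean())
```

The closed-form expression is what makes cascading one-step GP predictions over a whole trajectory tractable without sampling, which is central to the efficiency claims above.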