We are a research group at UCL’s Centre for Artificial Intelligence. Our research expertise is in data-efficient machine learning, probabilistic modeling, and autonomous decision making. Applications focus on robotics, climate science, nuclear fusion, and sustainable development.
If you are interested in joining the team, please check out our openings.
SML Group in November 2022
Machine learning, Gaussian processes, Reinforcement learning, Robotics, Meta learning
Machine learning, Robotics
Machine learning, Climate science, Fluid mechanics, Geometric mechanics
Meta-learning, Probabilistic programming, Reinforcement learning, Deep generative models
Machine learning, Optimal transport, Gaussian processes
Generative models, Reinforcement learning, Natural language processing, Scalable and safe machine learning
Machine learning, Robotics, Transfer learning, Reinforcement learning
Machine learning, Gaussian processes
Sociotechnical AI, Robotics
Machine learning, Generative models, Large-scale deep learning, Variational inference, Information theory, Sparsity
Machine learning, Gaussian processes, Earth systems modelling
Machine learning, Graph neural networks, Diffusion models, PAC-Bayes
Computer vision, Uncertainty estimation
Probabilistic modeling, Approximate inference, Machine learning, Climate science
Machine learning, Nuclear fusion, Bayesian optimization
Machine learning, Bayesian theory, Geometric machine learning
Machine learning, Gaussian processes, Meta learning, Structural priors, Variational inference
Machine learning, Discrete optimization, Differential privacy, Submodularity
Machine learning, Gaussian processes, Bayesian optimization
Machine learning, Meta learning, Differential geometry, Reinforcement learning
Machine learning, Deep probabilistic models, Approximate inference
Machine learning, Gaussian processes, Bayesian optimization, Practical approximate inference
Machine learning, Reinforcement learning, Optimal control, Copulas
Machine learning, Community detection, Representation of graphs, Hyperbolic embeddings
Machine learning, Bayesian optimization, Mechanistic models, Model discrimination
With the advent of large datasets, offline reinforcement learning is a promising framework for learning good decision-making policies without the need to interact with the real environment. However, offline RL requires the dataset to be reward-annotated, which presents practical challenges when reward engineering is difficult or when obtaining reward annotations is labor-intensive. In this paper, we introduce Optimal Transport Relabeling (OTR), an imitation learning algorithm that can automatically relabel offline data of mixed and unknown quality with rewards from a few good demonstrations. OTR’s key idea is to use optimal transport to compute an optimal alignment between an unlabeled trajectory in the dataset and an expert demonstration to obtain a similarity measure that can be interpreted as a reward, which can then be used by an offline RL algorithm to learn the policy. OTR is easy to implement and computationally efficient. On D4RL benchmarks, we demonstrate that OTR with a single demonstration can consistently match the performance of offline RL with ground-truth rewards.
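The core relabeling idea can be sketched in a few lines. The snippet below is an illustrative toy, not the authors' implementation: it aligns an unlabeled trajectory with an expert demonstration via entropy-regularised optimal transport (a plain Sinkhorn iteration) and turns each step's share of the transport cost into a reward. The helper names `sinkhorn_plan` and `ot_rewards`, the squared-Euclidean cost, and the regularisation strength are all assumptions for illustration.

```python
import numpy as np

def sinkhorn_plan(cost, eps=0.5, n_iters=200):
    """Entropy-regularised optimal transport between two uniform discrete
    measures, via Sinkhorn iterations on the pairwise cost matrix."""
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-cost / eps)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]  # transport plan

def ot_rewards(traj, demo, eps=0.5):
    """Relabel a trajectory (T x d array of states) with rewards from its
    optimal-transport alignment to an expert demo (hypothetical helper in
    the spirit of OTR, not the paper's code)."""
    # squared Euclidean cost between trajectory states and demo states
    cost = ((traj[:, None, :] - demo[None, :, :]) ** 2).sum(-1)
    plan = sinkhorn_plan(cost, eps)
    # reward for step t: negative transport cost carried by that step,
    # rescaled by T so each reward behaves like an average, not a total
    return -(plan * cost).sum(axis=1) * len(traj)
```

A trajectory that closely shadows the demonstration receives rewards near zero, while a distant one is penalised, which is the signal an off-the-shelf offline RL algorithm can then consume.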
A typical criticism of Gaussian processes is their unfavourable scaling in both compute and memory requirements. Sparse variational Gaussian processes based on inducing variables are commonly used to scale Gaussian processes to large datasets; their inherent compute and memory requirements are dominated by the number of inducing variables used. However, in practice sparse GPs are still limited by the number of datapoints and inducing points for which one can afford the matrix operations, making it again challenging to model large, complex datasets. In this work we propose a new class of inter-domain variational GPs, constructed by projecting the GP onto a set of compactly supported B-spline basis functions. The key benefit of our approach is that the compact support of the B-spline basis admits the use of sparse linear algebra to significantly speed up matrix operations and drastically reduce the memory footprint.
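The structural property being exploited can be seen in a toy example. The sketch below (illustrative only, not the paper's implementation) builds a cubic B-spline basis on a uniform knot grid with SciPy: each basis function is non-zero on only a few adjacent knot intervals, so the Gram matrix of the features is banded, which is what admits sparse linear algebra. The knot grid and evaluation points are arbitrary choices for the demo.

```python
import numpy as np
from scipy.interpolate import BSpline

degree = 3
knots = np.arange(-3.0, 14.0)          # uniform knots covering [0, 10]
n_basis = len(knots) - degree - 1      # 13 basis functions
x = np.linspace(0.1, 9.9, 200)         # evaluation points in the interior

# Each basis element lives on degree + 2 consecutive knots only.
Phi = np.stack([
    BSpline.basis_element(knots[i:i + degree + 2], extrapolate=False)(x)
    for i in range(n_basis)
], axis=1)
Phi = np.nan_to_num(Phi)               # outside a function's support -> 0

# Banded Gram matrix: G[i, j] = 0 whenever |i - j| > degree,
# because the supports of basis functions i and j no longer overlap.
G = Phi.T @ Phi
```

Because the bandwidth is fixed by the spline degree rather than the number of basis functions, the cost of the relevant matrix operations grows linearly rather than cubically in the number of features.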
Bayesian inference in non-linear dynamical systems seeks to find good posterior approximations of a latent state given a sequence of observations. Gaussian filters and smoothers, including the (extended/unscented) Kalman filter/smoother, which are commonly used in engineering applications, yield Gaussian posteriors on the latent state. While they are computationally efficient, they are often criticised for their crude approximation of the posterior state distribution. In this paper, we address this criticism by proposing a message passing scheme for iterative state estimation in non-linear dynamical systems, which yields more informative (Gaussian) posteriors on the latent states. Our message passing scheme is based on expectation propagation (EP). We prove that classical Rauch–Tung–Striebel (RTS) smoothers, such as the extended Kalman smoother (EKS) or the unscented Kalman smoother (UKS), are special cases of our message passing scheme. Running the message passing scheme more than once can lead to significant improvements over the classical RTS smoothers, so that more informative state estimates can be obtained. We address potential convergence issues of EP by generalising our state estimation framework to damped updates and to general alpha-divergences.
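For reference, the single-pass Gaussian approximation that the message passing scheme refines can be sketched as a scalar extended Kalman filter. The model, function names, and noise levels below are illustrative assumptions, not the paper's experiments: the EKF linearises the transition `f` and observation `g` at the current mean to keep the posterior Gaussian.

```python
import numpy as np

def ekf(ys, m0, P0, f, df, g, dg, Q, R):
    """Extended Kalman filter for a scalar non-linear system
    x_t = f(x_{t-1}) + q_t,  y_t = g(x_t) + r_t  (toy sketch)."""
    m, P = m0, P0
    means, covs = [], []
    for y in ys:
        # predict: propagate the mean through f, linearise for the variance
        m_pred, F = f(m), df(m)
        P_pred = F * P * F + Q
        # update: linearise g at the predicted mean, apply the Kalman gain
        H = dg(m_pred)
        S = H * P_pred * H + R
        K = P_pred * H / S
        m = m_pred + K * (y - g(m_pred))
        P = (1.0 - K * H) * P_pred
        means.append(m)
        covs.append(P)
    return np.array(means), np.array(covs)
```

An RTS smoother adds a backward pass over these filtered estimates; the EP view recovers exactly this as the first iteration of its message passing, with later iterations re-linearising at improved posteriors.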
Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels
As Gaussian processes are used to answer increasingly complex questions, analytic solutions become scarcer and scarcer. Monte Carlo methods act as a convenient bridge for connecting intractable mathematical expressions with actionable estimates via sampling. Conventional approaches for simulating Gaussian process posteriors view samples as draws from marginal distributions of process values at finite sets of input locations. This distribution-centric characterization leads to generative strategies that scale cubically in the size of the desired random vector. These methods are prohibitively expensive in cases where we would, ideally, like to draw high-dimensional vectors or even continuous sample paths. In this work, we investigate a different line of reasoning: rather than focusing on distributions, we articulate Gaussian conditionals at the level of random variables. We show how this pathwise interpretation of conditioning gives rise to a general family of approximations that lend themselves to efficiently sampling Gaussian process posteriors. Starting from first principles, we derive these methods and analyze the approximation errors they introduce. We then ground these results by exploring the practical implications of pathwise conditioning in various applied settings, such as global optimization and reinforcement learning.
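The random-variable-level view of conditioning described above can be illustrated with Matheron's rule: a posterior sample at test points is a joint prior sample corrected by the data residual, (f* | y) = f*_prior + K(X*, X) K(X, X)^{-1} (y − f_prior(X)). The sketch below is a toy demonstration under an RBF kernel with noiseless observations; the kernel, jitter level, and function names are assumptions, not the paper's code.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D point sets."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

def pathwise_posterior_sample(X, y, Xs, rng, jitter=1e-8):
    """Draw one GP posterior sample at test points Xs via Matheron's rule:
    sample the prior jointly at (X, Xs), then shift by the residual."""
    n = len(X)
    Z = np.concatenate([X, Xs])
    K = rbf(Z, Z) + jitter * np.eye(len(Z))
    prior = np.linalg.cholesky(K) @ rng.standard_normal(len(Z))  # joint prior draw
    f_X, f_Xs = prior[:n], prior[n:]
    # Matheron update: correct the prior draw by the interpolated residual
    alpha = np.linalg.solve(rbf(X, X) + jitter * np.eye(n), y - f_X)
    return f_Xs + rbf(Xs, X) @ alpha
```

In this naive form the joint prior draw still costs a Cholesky factorisation; the approximations in the paper replace it with cheap (e.g. Fourier-feature) prior samples so that only the update term touches the training data, which is where the scalability comes from.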