Conformal Calibration (UCL MSc Thesis), Sept 2025
A post-training method that makes value-based RL algorithms more robust to distribution shift. It observes agents in the training environment and uses conformal prediction to lower-bound the action-value function.
Supervised by Mirco Musolesi and Lorenz Wolf.
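
To illustrate the general idea of lower-bounding the action-value function with conformal prediction, here is a minimal sketch of a one-sided split conformal bound. It is not the thesis's implementation. It assumes a calibration set of predicted Q-values paired with observed returns from the training environment, and the names `conformal_offset`, `lower_bound`, and `alpha` are hypothetical.

```python
import math
import numpy as np

def conformal_offset(q_pred, returns, alpha=0.1):
    """Return a calibration offset from held-out (prediction, return) pairs.

    Assumption: the nonconformity score is the overestimation q_pred - return.
    """
    scores = np.asarray(q_pred, dtype=float) - np.asarray(returns, dtype=float)
    n = scores.size
    # Rank of the finite-sample-corrected (1 - alpha) quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the bound is vacuous.
        return math.inf
    return float(np.sort(scores)[k - 1])

def lower_bound(q_new, offset):
    """Subtract the calibrated offset from new Q-value predictions."""
    return np.asarray(q_new, dtype=float) - offset

# Toy calibration data: predictions overestimate returns by 1..9.
offset = conformal_offset(np.arange(1, 10), np.zeros(9), alpha=0.5)
bounded = lower_bound([10.0, 12.0], offset)
```

Under exchangeability of calibration and test points, this kind of bound holds with probability at least 1 - alpha; under distribution shift that guarantee no longer holds automatically.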