2025
LiveBench: A Challenging, Contamination-Limited LLM Benchmark.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
vTune: Verifiable Fine-Tuning for LLMs Through Backdooring.
CoRR, 2024

LiveBench: A Challenging, Contamination-Free LLM Benchmark.
CoRR, 2024

Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive.
CoRR, 2024

Large Language Models Must Be Taught to Know What They Don't Know.
Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

A Soft Robotic System Automatically Learns Precise Agile Motions Without Model Information.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

2023
Giraffe: Adventures in Expanding Context Lengths in LLMs.
CoRR, 2023

2018
Understanding disentangling in β-VAE.
CoRR, 2018

SCAN: Learning Hierarchical Compositional Visual Concepts.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
SCAN: Learning Abstract Hierarchical Compositional Visual Concepts.
CoRR, 2017

DARLA: Improving Zero-Shot Transfer in Reinforcement Learning.
Proceedings of the 34th International Conference on Machine Learning, 2017

beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework.
Proceedings of the 5th International Conference on Learning Representations, 2017

2016
Early Visual Concept Learning with Unsupervised Deep Learning.
CoRR, 2016