2025
SwiLTra-Bench: The Swiss Legal Translation Benchmark.
CoRR, March 2025

2024
Can Large Language Models Infer Causation from Correlation?
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage.
CoRR, 2022

2021
Not All Memories are Created Equal: Learning to Forget by Expiring.
Proceedings of the 38th International Conference on Machine Learning, 2021

Retrieval Augmentation Reduces Hallucination in Conversation.
Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

2020
Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions.
CoRR, 2020