LIMA: Less Is More for Alignment.
Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding.
Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
LMentry: A Language Model Benchmark of Elementary Language Tasks.
Findings of the Association for Computational Linguistics: ACL 2023, 2023
SCROLLS: Standardized CompaRison Over Long Language Sequences.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
How Optimal is Greedy Decoding for Extractive Question Answering?
Proceedings of the 4th Conference on Automated Knowledge Base Construction, 2022
Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
The Turking Test: Can Language Models Understand Instructions?
CoRR, 2020
A Simple and Effective Model for Answering Multi-span Questions.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
Tag-based Multi-Span Extraction in Reading Comprehension.
CoRR, 2019