Shayne Longpre

According to our database, Shayne Longpre authored at least 45 papers between 2016 and 2024.

Bibliography

2024
A Survey on Data Selection for Language Models.
Trans. Mach. Learn. Res., 2024

Scaling Instruction-Finetuned Language Models.
J. Mach. Learn. Res., 2024

Future and AI-Ready Data Strategies: Response to DOC RFI on AI and Open Government Data Assets.
CoRR, 2024

Consent in Crisis: The Rapid Decline of the AI Data Commons.
CoRR, 2024

The Foundation Model Transparency Index v1.1: May 2024.
CoRR, 2024

The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources.
CoRR, 2024

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models.
CoRR, 2024

AI-Powered Autonomous Weapons Risk Geopolitical Instability and Threaten AI Research.
CoRR, 2024

Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?
CoRR, 2024

On the Societal Impact of Open Foundation Models.
CoRR, 2024

A Safe Harbor for AI Evaluation and Red Teaming.
CoRR, 2024

Foundation Model Transparency Reports.
CoRR, 2024

A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Position: AI-Powered Autonomous Weapons Risk Geopolitical Instability and Threaten AI Research.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Position: Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?
Proceedings of the Forty-first International Conference on Machine Learning, 2024



Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

OctoPack: Instruction Tuning Code Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Prometheus: Inducing Fine-Grained Evaluation Capability in Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI.
CoRR, 2023

The Foundation Model Transparency Index.
CoRR, 2023

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models.
CoRR, 2023

Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts.
CoRR, 2023

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning.
Proceedings of the International Conference on Machine Learning, 2023

2022
Scaling Instruction-Finetuned Language Models.
CoRR, 2022

MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question Answering for 16 Diverse Languages.
CoRR, 2022

Active Learning Over Multiple Domains in Natural Language Tasks.
CoRR, 2022


Combining Compressions for Multiplicative Size Scaling on Natural Language Tasks.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering.
Trans. Assoc. Comput. Linguistics, 2021

Question Rewriting for Conversational Question Answering.
Proceedings of WSDM '21, 2021

On the Transferability of Minimal Prediction Preserving Inputs in Question Answering.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Open-Domain Question Answering Goes Conversational via Question Rewriting.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Entity-Based Knowledge Conflicts in Question Answering.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

A Comparison of Question Rewriting Methods for Conversational Passage Retrieval.
Proceedings of the Advances in Information Retrieval, 2021

Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Pivot Through English: Reliably Answering Multilingual Questions without Document Retrieval.
CoRR, 2020

A Wrong Answer or a Wrong Question? An Intricate Relationship between Question Reformulation and Answer Selection in Conversational Question Answering.
CoRR, 2020

Leveraging Query Resolution and Reading Comprehension for Conversational Passage Retrieval.
Proceedings of the Twenty-Ninth Text REtrieval Conference, 2020

How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?
Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

2019
An Exploration of Data Augmentation and Sampling Techniques for Domain-Agnostic Question Answering.
Proceedings of the 2nd Workshop on Machine Reading for Question Answering, 2019

2016
A Way out of the Odyssey: Analyzing and Combining Recent Insights for LSTMs.
CoRR, 2016

