2025
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning.
CoRR, May, 2025
SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks.
CoRR, March, 2025
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions.
CoRR, February, 2025
LLM Pretraining with Continuous Concepts.
CoRR, February, 2025
Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge.
CoRR, January, 2025
2024
Training Large Language Models to Reason in a Continuous Latent Space.
CoRR, 2024
Towards Full Delegation: Designing Ideal Agentic Behaviors for Travel Planning.
CoRR, 2024
To the Globe (TTG): Towards Language-Driven Guaranteed Travel Planning.
CoRR, 2024
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM.
CoRR, 2024
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
Self-Rewarding Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Self-Alignment with Instruction Backtranslation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Better Alignment with Instruction Back-and-Forth Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Chain-of-Verification Reduces Hallucination in Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2024, 2024
2023
Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model.
CoRR, 2023
Large Language Model Programs.
CoRR, 2023
Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Methods for Measuring, Updating, and Visualizing Factual Beliefs in Language Models.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023
2022
OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization.
CoRR, 2022
OPT: Open Pre-trained Transformer Language Models.
CoRR, 2022
Efficient Language Modeling with Sparse all-MLP.
CoRR, 2022
Lifting the Curse of Multilinguality by Pre-training Modular Transformers.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
Few-shot Learning with Multilingual Generative Language Models.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Efficient Large Scale Language Modeling with Mixtures of Experts.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Unified Speech-Text Pre-training for Speech Translation and Recognition.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
2021
Efficient Large Scale Language Modeling with Mixtures of Experts.
CoRR, 2021
Few-shot Learning with Multilingual Language Models.
CoRR, 2021
Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs.
CoRR, 2021
Adaptive Sparse Transformer for Multilingual Translation.
CoRR, 2021
Towards Understanding the Optimal Behaviors of Deep Active Learning Algorithms.
CoRR, 2021
Robust Optimization for Multilingual Translation with Imbalanced Data.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task.
Proceedings of the 18th International Conference on Spoken Language Translation, 2021
Distributionally Robust Multilingual Machine Translation.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
Recipes for Adapting Pre-trained Monolingual and Multilingual Models to Machine Translation.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021
Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021
Towards Understanding the Behaviors of Optimal Deep Active Learning Algorithms.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021
Multilingual Translation from Denoising Pre-Training.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021
Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
Gender bias amplification during Speed-Quality optimization in Neural Machine Translation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
Improving Zero-Shot Translation by Disentangling Positional Information.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
Multilingual Speech Translation from Efficient Finetuning of Pretrained Models.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
2020
Multilingual Denoising Pre-training for Neural Machine Translation.
Trans. Assoc. Comput. Linguistics, 2020
Cross-Modal Transfer Learning for Multilingual Speech-to-Text Translation.
CoRR, 2020
Multilingual Translation with Extensible Multilingual Pretraining and Finetuning.
CoRR, 2020
Findings of the WMT 2020 Shared Task on Machine Translation Robustness.
Proceedings of the Fifth Conference on Machine Translation, 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Deep Transformers with Latent Depth.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Findings of the Fourth Workshop on Neural Generation and Translation.
Proceedings of the Fourth Workshop on Neural Generation and Translation, 2020
Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
2019
Improved Variational Neural Machine Translation by Promoting Mutual Information.
CoRR, 2019
Findings of the First Shared Task on Machine Translation Robustness.
Proceedings of the Fourth Conference on Machine Translation, 2019
On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019
FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
Design and Evaluation of a Social Media Writing Support Tool for People with Dyslexia.
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019
2018
A Corpus for Multilingual Document Classification in Eight Languages.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018
2013
Dynamics of investor attention on the social web.
PhD thesis, 2013
2012
Financial and economic data management using Semantic Web technologies.
Proceedings of the 2012 IEEE Conference on Computational Intelligence for Financial Engineering & Economics, 2012
2011
TWC LOGD: A portal for linked open government data ecosystems.
J. Web Semant., 2011
Fundamental analysis powered by Semantic Web.
Proceedings of the 2011 IEEE Symposium on Computational Intelligence for Financial Engineering and Economics, 2011
2010
TWC data-gov corpus: incrementally generating linked government data from data.gov.
Proceedings of the 19th International Conference on World Wide Web, 2010
Representing Financial Reports on the Semantic Web: A Faithful Translation from XBRL to OWL.
Proceedings of the Semantic Web Rules - International Symposium, 2010
Provenance-Based Strategies to Develop Trust in Semantic Web Applications.
Proceedings of the Provenance and Annotation of Data and Processes, 2010
Data-gov Wiki: Towards Linking Government Data.
Proceedings of the Linked Data Meets Artificial Intelligence, 2010