AutoJudge: Judge Decoding Without Manual Annotation.
CoRR, April, 2025
Multilingual Language Model Pretraining using Machine-translated Data.
CoRR, February, 2025
Label Privacy in Split Learning for Large Models with Parameter-Efficient Training.
CoRR, 2024
Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language.
CoRR, 2024
The Hallucinations Leaderboard - An Open Effort to Measure Hallucinations in Large Language Models.
CoRR, 2024
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding.
CoRR, 2024
RedPajama: an Open Dataset for Training Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Sequoia: Scalable and Robust Speculative Decoding.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Hypernymy Understanding Evaluation of Text-to-Image Models via WordNet Hierarchy.
CoRR, 2023
High-throughput Generative Inference of Large Language Models with a Single GPU.
CoRR, 2023
Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective Evaluation.
CoRR, 2023
Is This Loss Informative? Faster Text-to-Image Customization by Tracking Objective Dynamics.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Distributed Inference and Fine-tuning of Large Language Models Over The Internet.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient.
Proceedings of the International Conference on Machine Learning, 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU.
Proceedings of the International Conference on Machine Learning, 2023
Petals: Collaborative Inference and Fine-tuning of Large Models.
CoRR, 2022
Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Secure Distributed Training at Scale.
Proceedings of the International Conference on Machine Learning, 2022
RuCoLA: Russian Corpus of Linguistic Acceptability.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Distributed Deep Learning In Open Collaborations.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Training Transformers Together.
Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, 2021
It's All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Embedding Words in Non-Vector Space with Unsupervised Graph Learning.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020