2025

AutoJudge: Judge Decoding Without Manual Annotation.

Roman Garipov

Fedor Velikonivtsev

Ruslan Svirschevski

Vage Egiazarian

Max Ryabinin

CoRR, April, 2025

Multilingual Language Model Pretraining using Machine-translated Data.

David Ifeoluwa Adelani

Yihong Chen

Raphael Tang

Pontus Stenetorp

CoRR, February, 2025

Towards Best Practices for Open Datasets for LLM Training.

CoRR, January, 2025

2024

Label Privacy in Split Learning for Large Models with Parameter-Efficient Training.

CoRR, 2024

INTELLECT-1 Technical Report.

CoRR, 2024

Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language.

CoRR, 2024

The Hallucinations Leaderboard - An Open Effort to Measure Hallucinations in Large Language Models.

Laura Perez-Beltrachini

CoRR, 2024

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding.

CoRR, 2024

RedPajama: an Open Dataset for Training Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Sequoia: Scalable and Robust Speculative Decoding.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements.

Anton Voronov

Lena Wolf

Max Ryabinin

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Hypernymy Understanding Evaluation of Text-to-Image Models via WordNet Hierarchy.

Anton Baryshnikov

Max Ryabinin

CoRR, 2023

High-throughput Generative Inference of Large Language Models with a Single GPU.

CoRR, 2023

Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective Evaluation.

CoRR, 2023

Is This Loss Informative? Faster Text-to-Image Customization by Tracking Objective Dynamics.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Distributed Inference and Fine-tuning of Large Language Models Over The Internet.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient.

Proceedings of the International Conference on Machine Learning, 2023

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU.

Proceedings of the International Conference on Machine Learning, 2023

2022

Petals: Collaborative Inference and Fine-tuning of Large Models.

CoRR, 2022

Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees.

Alexander V. Gasnikov

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Secure Distributed Training at Scale.

Proceedings of the International Conference on Machine Learning, 2022

RuCoLA: Russian Corpus of Linguistic Acceptability.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021

Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets.

Max Ryabinin

Andrey Malinin

Mark J. F. Gales

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Distributed Deep Learning In Open Collaborations.

Albert Villanova del Moral

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Training Transformers Together.

Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, 2021

It's All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning.

Alexey Tikhonov

Max Ryabinin

Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020

Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts.

Max Ryabinin

Anton Gusev

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Embedding Words in Non-Vector Space with Unsupervised Graph Learning.

Max Ryabinin

Sergei Popov

Liudmila Prokhorenkova

Elena Voita

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020