Don't Compress Gradients in Random Reshuffling: Compress Gradient Differences.
Advances in Neural Information Processing Systems 38 (NeurIPS 2024), 2024
Directional Smoothness and Gradient Methods: Convergence and Adaptivity.
Advances in Neural Information Processing Systems 38 (NeurIPS 2024), 2024
Tuning-Free Stochastic Optimization.
Proceedings of the 41st International Conference on Machine Learning (ICML 2024), 2024
Unified Analysis of Stochastic Gradient Methods for Composite Convex and Smooth Optimization.
Journal of Optimization Theory and Applications, 2023
Better Theory for SGD in the Nonconvex World.
Transactions on Machine Learning Research, 2023
DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method.
Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023
Faster Federated Optimization Under Second-Order Similarity.
Proceedings of the 11th International Conference on Learning Representations (ICLR 2023), 2023
Federated Optimization Algorithms with Random Reshuffling and Gradient Compression.
CoRR, 2022
Proximal and Federated Random Reshuffling.
Proceedings of the 39th International Conference on Machine Learning (ICML 2022), 2022
FLIX: A Simple and Communication-Efficient Alternative to Local Methods in Federated Learning.
Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022), 2022
Applying Fast Matrix Multiplication to Neural Networks.
Proceedings of the 35th ACM/SIGAPP Symposium on Applied Computing (SAC '20), 2020
Random Reshuffling: Simple Analysis with Vast Improvements.
Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020
Tighter Theory for Local SGD on Identical and Heterogeneous Data.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020), 2020
Distributed Fixed Point Methods with Compressed Iterates.
CoRR, 2019
Better Communication Complexity for Local SGD.
CoRR, 2019
Gradient Descent with Compressed Iterates.
CoRR, 2019
First Analysis of Local GD on Heterogeneous Data.
CoRR, 2019