Peter Richtárik

CoRR, January, 2025

Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity.

[BibT_eX]

[DOI]

CoRR, January, 2025

2024

Faster Rates for Compressed Federated Learning with Client-Variance Reduction.

[BibT_eX]

[DOI]

Haoyu Zhao

SIAM J. Math. Data Sci., March, 2024

Federated Sampling with Langevin Algorithm under Isoperimetry.

[BibT_eX]

[DOI]

Trans. Mach. Learn. Res., 2024

On the Convergence of DP-SGD with Adaptive Clipping.

[BibT_eX]

[DOI]

CoRR, 2024

MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes.

[BibT_eX]

[DOI]

Igor Sokolov

CoRR, 2024

Differentially Private Random Block Coordinate Descent.

[BibT_eX]

[DOI]

CoRR, 2024

Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization.

[BibT_eX]

[DOI]

CoRR, 2024

Pushing the Limits of Large Language Model Quantization via the Linearity Theorem.

[BibT_eX]

[DOI]

CoRR, 2024

Error Feedback under (L0,L1)-Smoothness: Normalization and Momentum.

[BibT_eX]

[DOI]

CoRR, 2024

Tighter Performance Theory of FedExProx.

[BibT_eX]

[DOI]

CoRR, 2024

Unlocking FedNL: Self-Contained Compute-Optimized Implementation.

[BibT_eX]

[DOI]

CoRR, 2024

Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation.

[BibT_eX]

[DOI]

Hasan Abed Al Kader Hammoud

Umberto Michieli

CoRR, 2024

MindFlayer: Efficient Asynchronous Parallel SGD in the Presence of Heterogeneous and Random Worker Compute Times.

[BibT_eX]

[DOI]

Omar Shaikh Omar

CoRR, 2024

On the Convergence of FedProx with Extrapolation and Inexact Prox.

[BibT_eX]

[DOI]

Hanmin Li

CoRR, 2024

Methods for Convex (L0,L1)-Smooth Optimization: Clipping, Acceleration, and Adaptivity.

[BibT_eX]

[DOI]

CoRR, 2024

Cohort Squeeze: Beyond a Single Communication Round per Cohort in Cross-Device Federated Learning.

[BibT_eX]

[DOI]

CoRR, 2024

Prune at the Clients, Not the Server: Accelerated Sparse Training in Federated Learning.

[BibT_eX]

[DOI]

CoRR, 2024

SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Non-convex Cross-Device Federated Learning.

[BibT_eX]

[DOI]

CoRR, 2024

A Unified Theory of Stochastic Proximal Point Methods without Smoothness.

[BibT_eX]

[DOI]

Yury Demidovich

CoRR, 2024

FedComLoc: Communication-Efficient Distributed Training of Sparse and Quantized Models.

[BibT_eX]

[DOI]

CoRR, 2024

Streamlining in the Riemannian Realm: Efficient Riemannian Optimization with Loopless Variance Reduction.

[BibT_eX]

[DOI]

Yury Demidovich

CoRR, 2024

LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression.

[BibT_eX]

[DOI]

CoRR, 2024

Correlated Quantization for Faster Nonconvex Distributed Optimization.

[BibT_eX]

[DOI]

CoRR, 2024

On the Optimal Time Complexities in Decentralized Stochastic Asynchronous Optimization.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Shadowheart SGD: Distributed Asynchronous SGD with Optimal Time Complexity Under Arbitrary Computation and Communication Heterogeneity.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Freya PAGE: First Optimal Time Complexity for Large-Scale Nonconvex Finite-Sum Optimization with Heterogeneous Asynchronous Computations.

[BibT_eX]

[DOI]

Kaja Gruntkowska

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Don't Compress Gradients in Random Reshuffling: Compress Gradient Differences.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Byzantine Robustness and Partial Participation Can Be Achieved at Once: Just Clip Gradient Differences.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression.

[BibT_eX]

[DOI]

Dan Alistarh

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

The Power of Extrapolation in Federated Learning.

[BibT_eX]

[DOI]

Hanmin Li

Kirill Acharya

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Improving the Worst-Case Bidirectional Communication Complexity for Nonconvex Distributed Optimization under Function Similarity.

[BibT_eX]

[DOI]

Kaja Gruntkowska

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Towards a Better Theoretical Understanding of Independent Subnetwork Training.

[BibT_eX]

[DOI]

Proceedings of the Forty-first International Conference on Machine Learning, 2024

High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise.

[BibT_eX]

[DOI]

Pavel E. Dvurechensky

Proceedings of the Forty-first International Conference on Machine Learning, 2024

FedP3: Federated Personalized and Privacy-friendly Network Pruning under Model Heterogeneity.

[BibT_eX]

[DOI]

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Error Feedback Reloaded: From Quadratic to Arithmetic Mean of Smoothness Constants.

[BibT_eX]

[DOI]

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Det-CGD: Compressed Gradient Descent with Matrix Stepsizes for Non-Convex Optimization.

[BibT_eX]

[DOI]

Hanmin Li

Avetik G. Karagulyan

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Understanding Progressive Training Through the Framework of Randomized Coordinate Descent.

[BibT_eX]

[DOI]

Rafal Szlendak

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

Minibatch Stochastic Three Points Method for Unconstrained Smooth Minimization.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Unified Analysis of Stochastic Gradient Methods for Composite Convex and Smooth Optimization.

[BibT_eX]

[DOI]

J. Optim. Theory Appl., November, 2023

Stochastic distributed learning with gradient quantization and double-variance reduction.

[BibT_eX]

[DOI]

Sebastian U. Stich

Optim. Methods Softw., January, 2023

Sharper Rates and Flexible Framework for Nonconvex SGD with Client and Data Sampling.

[BibT_eX]

[DOI]

Trans. Mach. Learn. Res., 2023

AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods.

[BibT_eX]

[DOI]

Trans. Mach. Learn. Res., 2023

Adaptive Compression for Communication-Efficient Distributed Training.

[BibT_eX]

[DOI]

Trans. Mach. Learn. Res., 2023

Better Theory for SGD in the Nonconvex World.

[BibT_eX]

[DOI]

Trans. Mach. Learn. Res., 2023

Distributed Newton-Type Methods with Communication Compression and Bernoulli Aggregation.

[BibT_eX]

[DOI]

Trans. Mach. Learn. Res., 2023

Personalized Federated Learning with Communication Compression.

[BibT_eX]

[DOI]

El Houcine Bergou

Trans. Mach. Learn. Res., 2023

On Biased Compression for Distributed Learning.

[BibT_eX]

[DOI]

J. Mach. Learn. Res., 2023

MAST: Model-Agnostic Sparsified Training.

[BibT_eX]

[DOI]

CoRR, 2023

Byzantine Robustness and Partial Participation Can Be Achieved Simultaneously: Just Clip Gradient Differences.

[BibT_eX]

[DOI]

CoRR, 2023

Improving Accelerated Federated Learning with Compression and Importance Sampling.

[BibT_eX]

[DOI]

Michal Grudzien

CoRR, 2023

Clip21: Error Feedback for Gradient Clipping.

[BibT_eX]

[DOI]

CoRR, 2023

Global-QSGD: Practical Floatless Quantization for Distributed Learning with Theoretical Guarantees.

[BibT_eX]

[DOI]

CoRR, 2023

Error Feedback Shines when Features are Rare.

[BibT_eX]

[DOI]

CoRR, 2023

Explicit Personalization and Local Training: Double Communication Acceleration in Federated Learning.

[BibT_eX]

[DOI]

CoRR, 2023

ELF: Federated Langevin Algorithms with Primal, Dual and Bidirectional Compression.

[BibT_eX]

[DOI]

Avetik G. Karagulyan

CoRR, 2023

TAMUNA: Accelerated Federated Learning with Local Training and Partial Participation.

[BibT_eX]

[DOI]

CoRR, 2023

Federated Learning with Regularized Client Participation.

[BibT_eX]

[DOI]

CoRR, 2023

Convergence of First-Order Algorithms for Meta-Learning with Moreau Envelopes.

[BibT_eX]

[DOI]

Slavomír Hanzely

CoRR, 2023

Random Reshuffling with Variance Reduction: New Analysis and Better Rates.

[BibT_eX]

[DOI]

Alibek Sailanbayev

Proceedings of the Uncertainty in Artificial Intelligence, 2023

A Computation and Communication Efficient Method for Distributed Nonconvex Problems in the Partial Participation Setting.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Optimal Time Complexities of Parallel Stochastic Optimization Methods Under a Fixed Computation Model.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2Direction: Theoretically Faster Distributed Training with Bidirectional Communication Compression.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Momentum Provably Improves Error Feedback!

[BibT_eX]

[DOI]

Ilyas Fatkhullin

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Guide Through the Zoo of Biased SGD.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance.

[BibT_eX]

[DOI]

Pavel E. Dvurechensky

Proceedings of the International Conference on Machine Learning, 2023

EF21-P and Friends: Improved Theoretical Communication Complexity for Distributed Optimization with Bidirectional Compression.

[BibT_eX]

[DOI]

Kaja Gruntkowska

Proceedings of the International Conference on Machine Learning, 2023

DASHA: Distributed Nonconvex Optimization with Communication Compression and Optimal Oracle Complexity.

[BibT_eX]

[DOI]

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top.

[BibT_eX]

[DOI]

Proceedings of the Eleventh International Conference on Learning Representations, 2023

RandProx: Primal-Dual Optimization Algorithms with Randomized Proximal Updates.

[BibT_eX]

[DOI]

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Kimad: Adaptive Gradient Compression with Bandwidth Awareness.

[BibT_eX]

[DOI]

Proceedings of the 4th International Workshop on Distributed Machine Learning, 2023

Server-Side Stepsizes and Sampling Without Replacement Provably Help in Federated Optimization.

[BibT_eX]

[DOI]

Proceedings of the 4th International Workshop on Distributed Machine Learning, 2023

Federated Learning is Better with Non-Homomorphic Encryption.

[BibT_eX]

[DOI]

Abdulmajeed Alrowithi

Fahad Ali Albalawi

Proceedings of the 4th International Workshop on Distributed Machine Learning, 2023

Convergence of Stein Variational Gradient Descent under a Weaker Smoothness Condition.

[BibT_eX]

[DOI]

Avetik G. Karagulyan

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Catalyst Acceleration of Error Compensated Methods Leads to Better Communication Complexity.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Can 5th Generation Local Training Methods Support Client Sampling? Yes!

[BibT_eX]

[DOI]

Michal Grudzien

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022

FedShuffle: Recipes for Better Use of Local Work in Federated Learning.

[BibT_eX]

[DOI]

Trans. Mach. Learn. Res., 2022

Optimal Client Sampling for Federated Learning.

[BibT_eX]

[DOI]

Wenlin Chen

Trans. Mach. Learn. Res., 2022

Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization.

[BibT_eX]

[DOI]

SIAM J. Math. Data Sci., 2022

Quasi-Newton methods for machine learning: forget the past, just sample.

[BibT_eX]

[DOI]

Optim. Methods Softw., 2022

Dualize, Split, Randomize: Toward Fast Nonsmooth Optimization Algorithms.

[BibT_eX]

[DOI]

J. Optim. Theory Appl., 2022

Direct nonlinear acceleration.

[BibT_eX]

[DOI]

EURO J. Comput. Optim., 2022

Can 5th Generation Local Training Methods Support Client Sampling? Yes!

[BibT_eX]

[DOI]

Michal Grudzien

CoRR, 2022

GradSkip: Communication-Accelerated Local Gradient Methods with Better Computational Complexity.

[BibT_eX]

[DOI]

CoRR, 2022

Provably Doubly Accelerated Federated Learning: The First Theoretically Successful Combination of Local Training and Compressed Communication.

[BibT_eX]

[DOI]

Ivan Agarský

CoRR, 2022

Improved Stein Variational Gradient Descent with Importance Weights.

[BibT_eX]

[DOI]

CoRR, 2022

Adaptive Learning Rates for Faster Stochastic Gradient Methods.

[BibT_eX]

[DOI]

CoRR, 2022

Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with Inexact Prox.

[BibT_eX]

[DOI]

CoRR, 2022

A Note on the Convergence of Mirrored Stein Variational Gradient Descent under (L0, L1)-Smoothness Condition.

[BibT_eX]

[DOI]

CoRR, 2022

Federated Optimization Algorithms with Random Reshuffling and Gradient Compression.

[BibT_eX]

[DOI]

CoRR, 2022

Certified Robustness in Federated Learning.

[BibT_eX]

[DOI]

CoRR, 2022

Federated Learning with a Sampling Algorithm under Isoperimetry.

[BibT_eX]

[DOI]

CoRR, 2022

Federated Random Reshuffling with Compression and Variance Reduction.

[BibT_eX]

[DOI]

CoRR, 2022

DASHA: Distributed Nonconvex Optimization with Communication Compression, Optimal Oracle Complexity, and No Client Synchronization.

[BibT_eX]

[DOI]

CoRR, 2022

Shifted compression framework: generalizations and improvements.

[BibT_eX]

[DOI]

Proceedings of the Uncertainty in Artificial Intelligence, 2022

BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Theoretically Better and Numerically Faster Distributed Optimization with Smoothness-Aware Quantization Techniques.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with an Inexact Prox.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Variance Reduced ProxSkip: Algorithm, Theory and Application to Federated Learning.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Optimal Algorithms for Decentralized Stochastic Variational Inequalities.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Damped Newton Method Achieves Global $\mathcal O \left(\frac{1}{k^2}\right)$ and Local Quadratic Convergence Rate.

[BibT_eX]

[DOI]

Slavomír Hanzely

Dmitry Kamzolov

Dmitry Pasechnyuk

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

EF-BV: A Unified Theory of Error Feedback and Variance Reduction Mechanisms for Biased and Unbiased Compression in Distributed Optimization.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Natural Compression for Distributed Deep Learning.

[BibT_eX]

[DOI]

Proceedings of the Mathematical and Scientific Machine Learning, 2022

MURANA: A Generic Framework for Stochastic Variance-Reduced Optimization.

[BibT_eX]

[DOI]

Proceedings of the Mathematical and Scientific Machine Learning, 2022

A Convergence Theory for SVGD in the Population Limit under Talagrand's Inequality T1.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Machine Learning, 2022

FedNL: Making Newton-Type Methods Applicable to Federated Learning.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Machine Learning, 2022

3PC: Three Point Compressors for Communication-Efficient Distributed Training and a Better Theory for Lazy Aggregation.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Machine Learning, 2022

ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!

[BibT_eX]

[DOI]

Sebastian U. Stich

Proceedings of the International Conference on Machine Learning, 2022

Proximal and Federated Random Reshuffling.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Machine Learning, 2022

Permutation Compressors for Provably Faster Distributed Nonconvex Optimization.

[BibT_eX]

[DOI]

Rafal Szlendak

Proceedings of the Tenth International Conference on Learning Representations, 2022

IntSGD: Adaptive Floatless Compression of Stochastic Gradients.

[BibT_eX]

[DOI]

Proceedings of the Tenth International Conference on Learning Representations, 2022

Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information.

[BibT_eX]

[DOI]

Proceedings of the Tenth International Conference on Learning Representations, 2022

An Optimal Algorithm for Strongly Convex Minimization under Affine Constraints.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Basis Matters: Better Communication-Efficient Second Order Methods for Federated Learning.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

FLIX: A Simple and Communication-Efficient Alternative to Local Methods in Federated Learning.

[BibT_eX]

[DOI]

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Revisiting Randomized Gossip Algorithms: General Framework, Convergence Rates and Novel Block and Accelerated Protocols.

[BibT_eX]

[DOI]

IEEE Trans. Inf. Theory, 2021

Stochastic quasi-gradient methods: variance reduction via Jacobian sketching.

[BibT_eX]

[DOI]

Robert M. Gower

Francis R. Bach

Math. Program., 2021

L-SVRG and L-Katyusha with Arbitrary Sampling.

[BibT_eX]

[DOI]

J. Mach. Learn. Res., 2021

EF21 with Bells & Whistles: Practical Algorithmic Extensions of Modern Error Feedback.

[BibT_eX]

[DOI]

CoRR, 2021

FedPAGE: A Fast Local Stochastic Gradient Method for Communication-Efficient Federated Learning.

[BibT_eX]

[DOI]

Haoyu Zhao

CoRR, 2021

A Field Guide to Federated Optimization.

[BibT_eX]

[DOI]

CoRR, 2021

Smoothness-Aware Quantization Techniques.

[BibT_eX]

[DOI]

CoRR, 2021

Complexity Analysis of Stein Variational Gradient Descent Under Talagrand's Inequality T1.

[BibT_eX]

[DOI]

CoRR, 2021

ZeroSARAH: Efficient Nonconvex Finite-Sum Optimization with Zero Full Gradient Computation.

[BibT_eX]

[DOI]

CoRR, 2021

AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods.

[BibT_eX]

[DOI]

CoRR, 2021

IntSGD: Floatless Compression of Stochastic Gradients.

[BibT_eX]

[DOI]

CoRR, 2021

Accelerated Bregman proximal gradient methods for relatively smooth convex optimization.

[BibT_eX]

[DOI]

Lin Xiao

Comput. Optim. Appl., 2021

Fastest rates for stochastic mirror descent methods.

[BibT_eX]

[DOI]

Comput. Optim. Appl., 2021

Scaling Distributed Machine Learning with In-Network Aggregation.

[BibT_eX]

[DOI]

Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation, 2021

Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback.

[BibT_eX]

[DOI]

Igor Sokolov

Ilyas Fatkhullin

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Error Compensated Distributed SGD Can Be Accelerated.

[BibT_eX]

[DOI]

Tong Zhang

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Stochastic Sign Descent Methods: New Algorithms and Better Theory.

[BibT_eX]

[DOI]

Proceedings of the 38th International Conference on Machine Learning, 2021

PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization.

[BibT_eX]

[DOI]

Proceedings of the 38th International Conference on Machine Learning, 2021

ADOM: Accelerated Decentralized Optimization Method for Time-Varying Networks.

[BibT_eX]

[DOI]

Proceedings of the 38th International Conference on Machine Learning, 2021

Distributed Second Order Methods with Fast Rates and Compressed Communication.

[BibT_eX]

[DOI]

Rustem Islamov

Proceedings of the 38th International Conference on Machine Learning, 2021

MARINA: Faster Non-Convex Distributed Learning with Compression.

[BibT_eX]

[DOI]

Proceedings of the 38th International Conference on Machine Learning, 2021

A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning.

[BibT_eX]

[DOI]

Proceedings of the 9th International Conference on Learning Representations, 2021

FL_PyTorch: optimization research simulator for federated learning.

[BibT_eX]

[DOI]

Proceedings of the DistributedML '21: Proceedings of the 2nd ACM International Workshop on Distributed Machine Learning, 2021

A Linearly Convergent Algorithm for Decentralized Optimization: Sending Less Bits for Free!

[BibT_eX]

[DOI]

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Hyperparameter Transfer Learning with Adaptive Complexity.

[BibT_eX]

[DOI]

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Local SGD: Unified Theory and New Efficient Methods.

[BibT_eX]

[DOI]

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

Best Pair Formulation & Accelerated Scheme for Non-Convex Principal Component Pursuit.

[BibT_eX]

[DOI]

IEEE Trans. Signal Process., 2020

Convergence Analysis of Inexact Randomized Iterative Methods.

[BibT_eX]

[DOI]

SIAM J. Sci. Comput., 2020

Stochastic Reformulations of Linear Systems: Algorithms and Convergence Theory.

[BibT_eX]

[DOI]

SIAM J. Matrix Anal. Appl., 2020

Stochastic Three Points Method for Unconstrained Smooth Minimization.

[BibT_eX]

[DOI]

El Houcine Bergou

SIAM J. Optim., 2020

Variance-Reduced Methods for Machine Learning.

[BibT_eX]

[DOI]

Proc. IEEE, 2020

Optimal Gradient Compression for Distributed and Federated Learning.

[BibT_eX]

[DOI]

CoRR, 2020

Distributed Proximal Splitting Algorithms with Rates and Acceleration.

[BibT_eX]

[DOI]

CoRR, 2020

A Unified Analysis of Stochastic Gradient Methods for Nonconvex Federated Optimization.

[BibT_eX]

[DOI]

CoRR, 2020

Adaptive Learning of the Optimal Mini-Batch Size of SGD.

[BibT_eX]

[DOI]

CoRR, 2020

Dualize, Split, Randomize: Fast Nonsmooth Optimization Algorithms.

[BibT_eX]

[DOI]

CoRR, 2020

On the Convergence Analysis of Asynchronous SGD for Solving Consistent Linear Systems.

[BibT_eX]

[DOI]

CoRR, 2020

Fast Linear Convergence of Randomized BFGS.

[BibT_eX]

[DOI]

CoRR, 2020

Uncertainty Principle for Communication Compression in Distributed and Federated Learning and the Search for an Optimal Compressor.

[BibT_eX]

[DOI]

CoRR, 2020

Federated Learning of a Mixture of Global and Local Models.

[BibT_eX]

[DOI]

CoRR, 2020

Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods.

[BibT_eX]

[DOI]

Comput. Optim. Appl., 2020

99% of Worker-Master Communication in Distributed Optimization Is Not Needed.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, 2020

Primal Dual Interpretation of the Proximal Stochastic Gradient Langevin Algorithm.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Random Reshuffling: Simple Analysis with Vast Improvements.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Lower Bounds and Optimal Algorithms for Personalized Federated Learning.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Linearly Converging Error Compensated SGD.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

From Local SGD to Local Fixed-Point Methods for Federated Learning.

[BibT_eX]

[DOI]

Proceedings of the 37th International Conference on Machine Learning, 2020

Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization.

[BibT_eX]

[DOI]

Proceedings of the 37th International Conference on Machine Learning, 2020

Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems.

[BibT_eX]

[DOI]

Proceedings of the 37th International Conference on Machine Learning, 2020

Stochastic Subspace Cubic Newton Method.

[BibT_eX]

[DOI]

Proceedings of the 37th International Conference on Machine Learning, 2020

A Stochastic Derivative Free Optimization Method with Momentum.

[BibT_eX]

[DOI]

Proceedings of the 8th International Conference on Learning Representations, 2020

Don't Jump Through Hoops and Remove Those Loops: SVRG and Katyusha are Better Without the Outer Loop.

[BibT_eX]

[DOI]

Proceedings of the Algorithmic Learning Theory, 2020

Revisiting Stochastic Extragradient.

[BibT_eX]

[DOI]

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent.

[BibT_eX]

[DOI]

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Tighter Theory for Local SGD on Identical and Heterogeneous Data.

[BibT_eX]

[DOI]

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A Stochastic Derivative-Free Optimization Method with Importance Sampling: Theory and Learning to Control.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Randomized Projection Methods for Convex Feasibility: Conditioning and Convergence Rates.

[BibT_eX]

[DOI]

Ion Necoara

Andrei Patrascu

SIAM J. Optim., 2019

New Convergence Aspects of Stochastic Gradient Algorithms.

[BibT_eX]

[DOI]

J. Mach. Learn. Res., 2019

Distributed Fixed Point Methods with Compressed Iterates.

[BibT_eX]

[DOI]

CoRR, 2019

Stochastic Newton and Cubic Newton Methods with Simple Local Linear-Quadratic Rates.

[BibT_eX]

[DOI]

CoRR, 2019

Better Communication Complexity for Local SGD.

[BibT_eX]

[DOI]

CoRR, 2019

Gradient Descent with Compressed Iterates.

[BibT_eX]

[DOI]

CoRR, 2019

First Analysis of Local GD on Heterogeneous Data.

[BibT_eX]

[DOI]

CoRR, 2019

One Method to Rule Them All: Variance Reduction for Data, Parameters and Many New Methods.

[BibT_eX]

[DOI]

CoRR, 2019

99% of Parallel Optimization is Inevitably a Waste of Time.

[BibT_eX]

[DOI]

CoRR, 2019

SGD: General Analysis and Improved Rates.

[BibT_eX]

[DOI]

CoRR, 2019

A Privacy Preserving Randomized Gossip Algorithm via Controlled Noise Insertion.

[BibT_eX]

[DOI]

CoRR, 2019

Distributed Learning with Compressed Gradient Differences.

[BibT_eX]

[DOI]

CoRR, 2019

Online and Batch Supervised Background Estimation Via L1 Regression.

[BibT_eX]

[DOI]

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Stochastic Convolutional Sparse Coding.

[BibT_eX]

[DOI]

Jinhui Xiong

Wolfgang Heidrich

Proceedings of the 24th International Symposium on Vision, Modeling, and Visualization, 2019

Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

RSN: Randomized Subspace Newton.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

SGD with Arbitrary Sampling: General Analysis and Improved Rates.

[BibT_eX]

[DOI]

Proceedings of the 36th International Conference on Machine Learning, 2019

SAGA with Arbitrary Sampling.

[BibT_eX]

[DOI]

Proceedings of the 36th International Conference on Machine Learning, 2019

Nonconvex Variance Reduced Optimization with Arbitrary Sampling.

[BibT_eX]

[DOI]

Proceedings of the 36th International Conference on Machine Learning, 2019

Provably Accelerated Randomized Gossip Algorithms.

[BibT_eX]

[DOI]

Michael G. Rabbat

Proceedings of the IEEE International Conference on Acoustics, 2019

Accelerated Coordinate Descent with Arbitrary Sampling and Best Rates for Minibatches.

[BibT_eX]

[DOI]

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

A Nonconvex Projection Method for Robust PCA.

[BibT_eX]

[DOI]

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Stochastic Primal-Dual Hybrid Gradient Algorithm with Arbitrary Sampling and Imaging Applications.

[BibT_eX]

[DOI]

Antonin Chambolle

Matthias J. Ehrhardt

Carola-Bibiane Schönlieb

SIAM J. Optim., 2018

On the complexity of parallel coordinate descent.

[BibT_eX]

[DOI]

Rachael Tappenden

Optim. Methods Softw., 2018

Importance Sampling for Minibatches.

[BibT_eX]

[DOI]

J. Mach. Learn. Res., 2018

Randomized Distributed Mean Estimation: Accuracy vs. Communication.

[BibT_eX]

[DOI]

Frontiers Appl. Math. Stat., 2018

Weighted Low-Rank Approximation of Matrices and Background Modeling.

[BibT_eX]

[DOI]

Xin Li

CoRR, 2018

Matrix Completion Under Interval Uncertainty: Highlights.

[BibT_eX]

[DOI]

Jakub Marecek

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2018

Stochastic Spectral and Conjugate Descent Methods.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

SEGA: Variance Reduction via Gradient Sketching.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

SGD and Hogwild! Convergence Without the Bounded Gradients Assumption.

[BibT_eX]

[DOI]

Proceedings of the 35th International Conference on Machine Learning, 2018

Randomized Block Cubic Newton Method.

[BibT_eX]

[DOI]

Nikita Doikov

Proceedings of the 35th International Conference on Machine Learning, 2018

Coordinate Descent Faceoff: Primal or Dual?

[BibT_eX]

[DOI]

Proceedings of the Algorithmic Learning Theory, 2018

Accelerated Gossip via Stochastic Heavy Ball Method.

[BibT_eX]

[DOI]

Proceedings of the 56th Annual Allerton Conference on Communication, 2018

2017

Randomized Quasi-Newton Updates Are Linearly Convergent Matrix Inversion Algorithms.

[BibT_eX]

[DOI]

Robert M. Gower

SIAM J. Matrix Anal. Appl., 2017

Distributed optimization with arbitrary local solvers.

[BibT_eX]

[DOI]

Optim. Methods Softw., 2017

Semi-stochastic coordinate descent.

[BibT_eX]

[DOI]

Optim. Methods Softw., 2017

Semi-Stochastic Gradient Descent Methods.

[BibT_eX]

[DOI]

Frontiers Appl. Math. Stat., 2017

Matrix completion under interval uncertainty.

[BibT_eX]

[DOI]

Jakub Marecek

Eur. J. Oper. Res., 2017

Linearly convergent stochastic heavy ball method for minimizing generalization error.

[BibT_eX]

[DOI]

CoRR, 2017

A Batch-Incremental Video Background Estimation Model Using Weighted Low-Rank Approximation of Matrices.

[BibT_eX]

[DOI]

Xin Li

Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

2016

Optimization in High Dimensions via Accelerated, Parallel, and Proximal Coordinate Descent.

[BibT_eX]

[DOI]

Olivier Fercoq

SIAM Rev., 2016

Coordinate descent with arbitrary sampling II: expected separable overapproximation.

[BibT_eX]

[DOI]

Optim. Methods Softw., 2016

Coordinate descent with arbitrary sampling I: algorithms and complexity.

[BibT_eX]

[DOI]

Optim. Methods Softw., 2016

On optimal probabilities in stochastic coordinate descent methods.

[BibT_eX]

[DOI]

Optim. Lett., 2016

Parallel coordinate descent methods for big data optimization.

[BibT_eX]

[DOI]

Math. Program., 2016

Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting.

[BibT_eX]

[DOI]

IEEE J. Sel. Top. Signal Process., 2016

Inexact Coordinate Descent: Complexity and Preconditioning.

[BibT_eX]

[DOI]

Rachael Tappenden

Jacek Gondzio

J. Optim. Theory Appl., 2016

Distributed Coordinate Descent Method for Learning with Big Data.

[BibT_eX]

[DOI]

J. Mach. Learn. Res., 2016

AIDE: Fast and Communication Efficient Distributed Optimization.

[BibT_eX]

[DOI]

CoRR, 2016

Federated Learning: Strategies for Improving Communication Efficiency.

[BibT_eX]

[DOI]

Ananda Theertha Suresh

Dave Bacon

CoRR, 2016

Federated Optimization: Distributed Machine Learning for On-Device Intelligence.

[BibT_eX]

[DOI]

CoRR, 2016

Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling.

[BibT_eX]

[DOI]

Proceedings of the 33nd International Conference on Machine Learning, 2016

SDNA: Stochastic Dual Newton Ascent for Empirical Risk Minimization.

[BibT_eX]

[DOI]

Proceedings of the 33nd International Conference on Machine Learning, 2016

Stochastic Block BFGS: Squeezing More Curvature out of Data.

[BibT_eX]

[DOI]

Robert M. Gower

Donald Goldfarb

Proceedings of the 33nd International Conference on Machine Learning, 2016

A new perspective on randomized gossip algorithms.

[BibT_eX]

[DOI]

Proceedings of the 2016 IEEE Global Conference on Signal and Information Processing, 2016

2015

Randomized Iterative Methods for Linear Systems.

[BibT_eX]

[DOI]

Robert Mansel Gower

SIAM J. Matrix Anal. Appl., 2015

Accelerated, Parallel, and Proximal Coordinate Descent.

[BibT_eX]

[DOI]

Olivier Fercoq

SIAM J. Optim., 2015

Separable approximations and decomposition methods for the augmented Lagrangian.

[BibT_eX]

[DOI]

Rachael Tappenden

Burak Büke

Optim. Methods Softw., 2015

Distributed Mini-Batch SDCA.

[BibT_eX]

[DOI]

Nathan Srebro

CoRR, 2015

Stochastic Dual Ascent for Solving Linear Systems.

[BibT_eX]

[DOI]

Robert Mansel Gower

CoRR, 2015

Primal Method for ERM with Flexible Mini-batching Schemes and Non-convex Losses.

[BibT_eX]

[DOI]

CoRR, 2015

Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling.

[BibT_eX]

[DOI]

Tong Zhang

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Adding vs. Averaging in Distributed Primal-Dual Optimization.

[BibT_eX]

[DOI]

Proceedings of the 32nd International Conference on Machine Learning, 2015

Stochastic Dual Coordinate Ascent with Adaptive Probabilities.

[BibT_eX]

[DOI]

Proceedings of the 32nd International Conference on Machine Learning, 2015

2014

Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function.

[BibT_eX]

[DOI]

Math. Program., 2014

Inequality-Constrained Matrix Completion: Adding the Obvious Helps!

[BibT_eX]

[DOI]

Jakub Marecek

CoRR, 2014

Randomized Dual Coordinate Ascent with Arbitrary Sampling.

[BibT_eX]

[DOI]

Tong Zhang

CoRR, 2014

Simple Complexity Analysis of Direct Search.

[BibT_eX]

[DOI]

CoRR, 2014

mS2GD: Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting.

[BibT_eX]

[DOI]

CoRR, 2014

Fast distributed coordinate descent for non-strongly convex losses.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, 2014

2013

TOP-SPIN: TOPic discovery via Sparse Principal component INterference.

[BibT_eX]

[DOI]

Selin Damla Ahipasaoglu

Ngai-Man Cheung

CoRR, 2013

Smooth minimization of nonsmooth functions with parallel coordinate descent methods.

[BibT_eX]

[DOI]

Olivier Fercoq

CoRR, 2013

Mini-Batch Primal and Dual Methods for SVMs.

[BibT_eX]

[DOI]

Proceedings of the 30th International Conference on Machine Learning, 2013

2012

Approximate Level Method for Nonsmooth Convex Minimization.

[BibT_eX]

[DOI]

J. Optim. Theory Appl., 2012

Alternating Maximization: Unifying Framework for 8 Sparse PCA Formulations and Efficient Parallel Codes

[BibT_eX]

[DOI]

Selin Damla Ahipasaoglu

CoRR, 2012

Optimal diagnostic tests for sporadic Creutzfeldt-Jakob disease based on support vector machine classification of RT-QuIC data

[BibT_eX]

[DOI]

CoRR, 2012

2011

Improved Algorithms for Convex Minimization in Relative Scale.

[BibT_eX]

[DOI]

SIAM J. Optim., 2011

Efficient Serial and Parallel Coordinate Descent Methods for Huge-Scale Truss Topology Design.

[BibT_eX]

[DOI]