Mikhail Belkin
According to our database, Mikhail Belkin authored at least 117 papers between 2001 and 2024.
Awards
ACM Fellow 2023, "For contributions to modern machine learning theory and algorithms".
Bibliography
2024
Emergence in non-neural models: grokking modular arithmetic via average gradient outer product.
CoRR, 2024
Unmemorization in Large Language Models via Self-Distillation and Deliberate Imagination.
CoRR, 2024
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
More is Better: when Infinite Overparameterization is Optimal and Overfitting is Obligatory.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024
2023
A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors.
SIAM J. Math. Data Sci., December, 2023
SIAM J. Math. Data Sci., December, 2023
More is Better in Modern Machine Learning: when Infinite Overparameterization is Optimal and Overfitting is Obligatory.
CoRR, 2023
Aiming towards the minimizers: fast convergence of SGD for overparametrized problems.
CoRR, 2023
Proceedings of the Uncertainty in Artificial Intelligence, 2023
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the International Conference on Machine Learning, 2023
2022
Feature learning in neural networks and kernel machines that recursively learn features.
CoRR, 2022
CoRR, 2022
CoRR, 2022
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture.
CoRR, 2022
Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models.
CoRR, 2022
CoRR, 2022
2021
Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation.
Acta Numer., May, 2021
Classification vs regression in overparameterized regimes: Does the loss function matter?
J. Mach. Learn. Res., 2021
CoRR, 2021
Simple, Fast, and Flexible Framework for Matrix Completion with Infinite Width Neural Networks.
CoRR, 2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Evaluation of Neural Architectures trained with square Loss vs Cross-Entropy in Classification Tasks.
Proceedings of the 9th International Conference on Learning Representations, 2021
2020
Proc. Natl. Acad. Sci. USA, 2020
IEEE Trans. Pattern Anal. Mach. Intell., 2020
Linear Convergence and Implicit Regularization of Generalized Mirror Descent with Time-Dependent Mirrors.
CoRR, 2020
Toward a theory of optimization for over-parameterized systems of non-linear equations: the lessons of deep learning.
CoRR, 2020
On the linearity of large non-linear models: when and why the tangent kernel is constant.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Proceedings of the 8th International Conference on Learning Representations, 2020
2019
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019
Kernel Machines Beat Deep Neural Networks on Mask-Based Single-Channel Speech Enhancement.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019
2018
CoRR, 2018
Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018
Proceedings of the 35th International Conference on Machine Learning, 2018
Approximation beats concentration? An approximation view on inference with smooth radial kernels.
Proceedings of the Conference On Learning Theory, 2018
Proceedings of the Algorithmic Learning Theory, 2018
2017
Diving into the shallows: a computational perspective on large-scale shallow learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017
2016
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
Proceedings of the 33rd International Conference on Machine Learning, 2016
Proceedings of the 29th Conference on Learning Theory, 2016
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016
2015
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015
Proceedings of the 35th IEEE International Conference on Distributed Computing Systems, 2015
Proceedings of the 2015 IEEE European Modelling Symposium, 2015
Proceedings of The 28th Conference on Learning Theory, 2015
2014
CoRR, 2014
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014
The More, the Merrier: the Blessing of Dimensionality for Learning Large Gaussian Mixtures.
Proceedings of The 27th Conference on Learning Theory, 2014
2013
Random Struct. Algorithms, 2013
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013
Proceedings of the COLT 2013, 2013
2012
Toward Understanding Complex Spaces: Graph Laplacians on Manifolds with Singularities and Boundaries.
Proceedings of the COLT 2012, 2012
Graph Laplacians on Singular Manifolds: Toward understanding complex spaces: graph Laplacians on manifolds with singularities and boundaries.
CoRR, 2012
Proceedings of the Mobile Computing, Applications, and Services, 2012
Proceedings of the Mobile Computing, Applications, and Services, 2012
2011
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011
Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2011
2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the COLT 2010, 2010
2009
Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, 2009
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009
2008
J. Comput. Syst. Sci., 2008
Proceedings of the 1st ACM SIGMM International Conference on Multimedia Information Retrieval, 2008
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008
Data spectroscopy: learning mixture models using eigenspaces of convolution operators.
Proceedings of the Machine Learning, 2008
Proceedings of the 24th ACM Symposium on Computational Geometry, 2008
2007
Proceedings of the Advances in Neural Information Processing Systems 20, 2007
2006
Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples.
J. Mach. Learn. Res., 2006
Proceedings of the Advances in Neural Information Processing Systems 19, 2006
Proceedings of the Advances in Neural Information Processing Systems 19, 2006
2005
Proceedings of the Advances in Neural Information Processing Systems 18, 2005
Proceedings of the Machine Learning, 2005
2004
Proceedings of the Advances in Neural Information Processing Systems 17, 2004
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
Proceedings of the Learning Theory, 17th Annual Conference on Learning Theory, 2004
Proceedings of the Learning Theory, 17th Annual Conference on Learning Theory, 2004
2003
Neural Comput., 2003
2002
Proceedings of the ACL-02 Workshop on Morphological and Phonological Learning, 2002
Proceedings of the Advances in Neural Information Processing Systems 15, 2002
2001
Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001