Matus Telgarsky

Affiliations:
  • University of Illinois at Urbana-Champaign, Department of Computer Science, IL, USA
  • Carnegie Mellon University, Machine Learning Department, Pittsburgh, PA, USA


According to our database, Matus Telgarsky authored at least 53 papers between 2007 and 2024.


Bibliography

2024
One-layer transformers fail to solve the induction heads task.
CoRR, 2024

Transformers, parallel computation, and logarithmic depth.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency.
Proceedings of the Thirty-Seventh Annual Conference on Learning Theory, 2024

Spectrum Extraction and Clipping for Implicitly Linear Layers.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023
Representational Strengths and Limitations of Transformers.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Feature selection and low test error in shallow low-rotation ReLU networks.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

On Achieving Optimal Adversarial Test Error.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Feature selection with gradient descent on two-layer networks in low-rotation regimes.
CoRR, 2022

Convex Analysis at Infinity: An Introduction to Astral Space.
CoRR, 2022

Actor-critic is implicitly biased towards high entropy optimal policies.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Stochastic linear optimization never overfits with quadratically-bounded losses on general data.
Proceedings of the Conference on Learning Theory, 2022

2021
Early-stopped neural networks are consistent.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Fast margin maximization via dual acceleration.
Proceedings of the 38th International Conference on Machine Learning, 2021

Generalization bounds via distillation.
Proceedings of the 9th International Conference on Learning Representations, 2021

Characterizing the implicit bias via a primal-dual analysis.
Proceedings of the Algorithmic Learning Theory, 2021

2020
Directional convergence and alignment in deep learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Neural tangent kernels, transportation mappings, and universal approximation.
Proceedings of the 8th International Conference on Learning Representations, 2020

Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks.
Proceedings of the 8th International Conference on Learning Representations, 2020

Gradient descent follows the regularization path for general losses.
Proceedings of the Conference on Learning Theory, 2020

2019
A refined primal-dual analysis of the implicit bias.
CoRR, 2019

A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization.
Proceedings of the 36th International Conference on Machine Learning, 2019

Gradient descent aligns the layers of deep linear networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

The implicit bias of gradient descent on nonseparable data.
Proceedings of the Conference on Learning Theory, 2019

2018
Risk and parameter convergence of logistic regression.
CoRR, 2018

Social Welfare and Profit Maximization from Revealed Preferences.
Proceedings of the Web and Internet Economics - 14th International Conference, 2018

Size-Noise Tradeoffs in Generative Networks.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

2017
Spectrally-normalized margin bounds for neural networks.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Neural Networks and Rational Functions.
Proceedings of the 34th International Conference on Machine Learning, 2017

Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis.
Proceedings of the 30th Conference on Learning Theory, 2017

2016
Greedy bi-criteria approximations for k-medians and k-means.
CoRR, 2016

Rate of Price Discovery in Iterative Combinatorial Auctions.
Proceedings of the 2016 ACM Conference on Economics and Computation, 2016

Benefits of depth in neural networks.
Proceedings of the 29th Conference on Learning Theory, 2016

2015
Convex Risk Minimization and Conditional Probability Estimation.
CoRR, 2015

Representation Benefits of Deep Feedforward Networks.
CoRR, 2015

Convex Risk Minimization and Conditional Probability Estimation.
Proceedings of The 28th Conference on Learning Theory, 2015

Tensor Decompositions for Learning Latent Variable Models (A Survey for ALT).
Proceedings of the Algorithmic Learning Theory - 26th International Conference, 2015

2014
Tensor decompositions for learning latent variable models.
J. Mach. Learn. Res., 2014

Scalable Nonlinear Learning with Adaptive Polynomial Expansions.
CoRR, 2014

Scalable Non-linear Learning with Adaptive Polynomial Expansions.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013
Duality and Data Dependence in Boosting.
PhD thesis, 2013

Dirichlet draws are sparse with high probability.
CoRR, 2013

Moment-based Uniform Deviation Bounds for k-means and Friends.
CoRR, 2013

Moment-based Uniform Deviation Bounds for k-means and Friends.
Proceedings of the Advances in Neural Information Processing Systems 26: Annual Conference on Neural Information Processing Systems 2013, 2013

Margins, Shrinkage, and Boosting.
Proceedings of the 30th International Conference on Machine Learning, 2013

Boosting with the Logistic Loss is Consistent.
Proceedings of the COLT 2013, 2013

2012
A Primal-Dual Convergence Analysis of Boosting.
J. Mach. Learn. Res., 2012

Statistical Consistency of Finite-dimensional Unregularized Linear Classification.
CoRR, 2012

Agglomerative Bregman Clustering.
Proceedings of the 29th International Conference on Machine Learning, 2012

2011
Blackwell Approachability and Minimax Theory.
CoRR, 2011

The Convergence Rate of AdaBoost and Friends.
CoRR, 2011

The Fast Convergence of Boosting.
Proceedings of the Advances in Neural Information Processing Systems 24: Annual Conference on Neural Information Processing Systems 2011, 2011

2010
Hartigan's Method: k-means Clustering without Voronoi.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

2007
Signal Decomposition using Multiscale Admixture Models.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2007
