Antonio Orvieto

CoRR, 2024

NIMBA: Towards Robust and Principled Processing of Point Clouds With SSMs.

Nursena Köprücü

Destiny Okpekpe

CoRR, 2024

Loss Landscape Characterization of Neural Networks without Over-Parametrization.

CoRR, 2024

Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture.

Sajad Movahedi

Seyed-Mohsen Moosavi-Dezfooli

CoRR, 2024

An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes.

Lin Xiao

CoRR, 2024

Gradient Descent on Logistic Regression with Non-Separable Data and Large Step Sizes.

CoRR, 2024

Recurrent neural networks: vanishing and exploding gradients are not the end of the story.

Nicolas Zucchet

CoRR, 2024

Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks.

CoRR, 2024

On the low-shot transferability of [V]-Mamba.

Diganta Misra

Jay Gala

CoRR, 2024

Theoretical Foundations of Deep Selective State-Space Models.

CoRR, 2024

Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning.

CoRR, 2024

Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Recurrent Distance Filtering for Graph Representation Learning.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

SDEs for Minimax Optimization.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023

Recurrent Distance-Encoding Neural Networks for Graph Representation Learning.

CoRR, 2023

On the Universality of Linear Recurrences Followed by Nonlinear Projections.

CoRR, 2023

On the effectiveness of Randomized Signatures as Reservoir for Learning Rough Dynamics.

Proceedings of the International Joint Conference on Neural Networks, 2023

Resurrecting Recurrent Neural Networks for Long Sequences.

Proceedings of the International Conference on Machine Learning, 2023

An SDE for Modeling SAM: Theory and Insights.

Proceedings of the International Conference on Machine Learning, 2023

Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Explicit Regularization in Overparametrized Models via Noise Injection.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022

Randomized Signature Layers for Signal Extraction in Time Series Data.

CoRR, 2022

Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution.

Simon Lacoste-Julien

Nicolas Loizou

Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse.

Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

On the Theoretical Properties of Noise Correlation in Stochastic Optimization.

Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Anticorrelated Noise Injection for Improved Generalization.

Proceedings of the International Conference on Machine Learning, 2022

Faster Single-loop Algorithms for Minimax Optimization without Strong Concavity.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Vanishing Curvature in Randomly Initialized Deep ReLU Networks.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Vanishing Curvature and the Power of Adaptive Methods in Randomly Initialized Deep Networks.

CoRR, 2021

Rethinking the Variational Interpretation of Accelerated Optimization Methods.

Peiyuan Zhang

Hadi Daneshmand

Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On the Second-order Convergence Properties of Random Search Methods.

Giambattista Parascandolo

Adamos Solomou

Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning explanations that are hard to vary.

Proceedings of the 9th International Conference on Learning Representations, 2021

Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization.

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Momentum Improves Optimization on Riemannian Manifolds.

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

Two-Level K-FAC Preconditioning for Deep Learning.

Nikolaos Tselepidis

Jonas Kohler

CoRR, 2020

An Accelerated DFO Algorithm for Finite-sum Convex Functions.

Yuwen Chen

Proceedings of the 37th International Conference on Machine Learning, 2020

A Continuous-time Perspective for Modeling Acceleration in Riemannian Optimization.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019

The Role of Memory in Stochastic Optimization.

Jonas Kohler

Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence, 2019

Shadowing Properties of Optimization Algorithms.

Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Continuous-time Models for Stochastic Optimization Algorithms.
