A Simple Model of Inference Scaling Laws.
CoRR, 2024
Grokking at the Edge of Linear Separability.
CoRR, 2024
Classifying Overlapping Gaussian Mixtures in High Dimensions: From Optimal Classifiers to Neural Nets.
CoRR, 2024
Decoupled Weight Decay for Any p Norm.
CoRR, 2024
Measuring Sharpness in Grokking.
CoRR, 2024
Grokking in Linear Estimators - A Solvable Model that Groks without Understanding.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
The Underlying Scaling Laws and Universal Statistical Structure of Complex Datasets.
CoRR, 2023
Charting the Topography of the Neural Network Landscape with Thermal-Like Noise.
CoRR, 2023
Noise Injection Node Regularization for Robust Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Noise Injection as a Probe of Deep Learning Dynamics.
CoRR, 2022
Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), 2003