Behnam Neyshabur

According to our database, Behnam Neyshabur authored at least 60 papers between 2013 and 2024.

Bibliography

2024
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models.
Trans. Mach. Learn. Res., 2024

Gemma 2: Improving Open Language Models at a Practical Size.
CoRR, 2024

2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.
Trans. Mach. Learn. Res., 2023

Long Range Language Modeling via Gated State Spaces.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

REPAIR: REnormalizing Permuted Activations for Interpolation Repair.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning.
Trans. Mach. Learn. Res., 2022

Convexifying Transformers: Improving optimization and understanding of transformer networks.
CoRR, 2022

Layer-Stack Temperature Scaling.
CoRR, 2022

Teaching Algorithmic Reasoning via In-context Learning.
CoRR, 2022

Understanding the effect of sparsity on neural networks robustness.
CoRR, 2022

Data Scaling Laws in NMT: The Effect of Noise and Architecture.
CoRR, 2022

Solving Quantitative Reasoning Problems with Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Block-Recurrent Transformers.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Exploring Length Generalization in Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Revisiting Neural Scaling Laws in Language and Vision.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Data Scaling Laws in NMT: The Effect of Noise and Architecture.
Proceedings of the International Conference on Machine Learning, 2022

A Loss Curvature Perspective on Training Instabilities of Deep Learning Models.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Leveraging unlabeled data to predict out-of-distribution performance.
Proceedings of the Tenth International Conference on Learning Representations, 2022

The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Exploring the Limits of Large Scale Pre-training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
A Loss Curvature Perspective on Training Instability in Deep Learning.
CoRR, 2021

Deep Learning Through the Lens of Example Difficulty.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

When Do Curricula Work?
Proceedings of the 9th International Conference on Learning Representations, 2021

The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers.
Proceedings of the 9th International Conference on Learning Representations, 2021

Understanding the failure modes of out-of-distribution generalization.
Proceedings of the 9th International Conference on Learning Representations, 2021

Extreme Memorization via Scale of Initialization.
Proceedings of the 9th International Conference on Learning Representations, 2021

Are wider nets better given the same number of parameters?
Proceedings of the 9th International Conference on Learning Representations, 2021

Sharpness-aware Minimization for Efficiently Improving Generalization.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
NeurIPS 2020 Competition: Predicting Generalization in Deep Learning.
CoRR, 2020

The Deep Bootstrap: Good Online Learners are Good Offline Generalizers.
CoRR, 2020

What is being transferred in transfer learning?
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Towards Learning Convolutions from Scratch.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Methods and Analysis of The First Competition in Predicting Generalization of Deep Learning.
Proceedings of the NeurIPS 2020 Competition and Demonstration Track, 2020

Observational Overfitting in Reinforcement Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

Fantastic Generalization Measures and Where to Find Them.
Proceedings of the 8th International Conference on Learning Representations, 2020

The intriguing role of module criticality in the generalization of deep networks.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
The role of over-parametrization in generalization of neural networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

2018
Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks.
CoRR, 2018

Predicting protein-protein interactions through sequence-based deep learning.
Bioinform., 2018

Stronger Generalization Bounds for Deep Nets via a Compression Approach.
Proceedings of the 35th International Conference on Machine Learning, 2018

A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Implicit Regularization in Deep Learning.
CoRR, 2017

Geometry of Optimization and Implicit Regularization in Deep Learning.
CoRR, 2017

A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks.
CoRR, 2017

Stabilizing GAN Training with Multiple Random Projections.
CoRR, 2017

Exploring Generalization in Deep Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Implicit Regularization in Matrix Factorization.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Corralling a Band of Bandit Algorithms.
Proceedings of the 30th Conference on Learning Theory, 2017

2016
Data-Dependent Path Normalization in Neural Networks.
Proceedings of the 4th International Conference on Learning Representations, 2016

Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Global Optimality of Local Search for Low Rank Matrix Recovery.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

2015
In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Path-SGD: Path-Normalized Optimization in Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

On Symmetric and Asymmetric LSHs for Inner Product Search.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Norm-Based Capacity Control in Neural Networks.
Proceedings of The 28th Conference on Learning Theory, 2015

Joint inference of tissue-specific networks with a scale free topology.
Proceedings of the 2015 IEEE International Conference on Bioinformatics and Biomedicine, 2015

2014
Clustering, Hamming Embedding, Generalized LSH and the Max Norm.
Proceedings of the Algorithmic Learning Theory - 25th International Conference, 2014

2013
Sparse Matrix Factorization.
CoRR, 2013

NETAL: a new graph-based method for global alignment of protein-protein interaction networks.
Bioinform., 2013

The Power of Asymmetry in Binary Hashing.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

