Daniel Soudry

According to our database, Daniel Soudry authored at least 81 papers between 2010 and 2024.

Bibliography

2024
The Implicit Bias of Gradient Descent on Separable Multiclass Data.
CoRR, 2024

Provable Tempered Overfitting of Minimal Nets and Typical Nets.
CoRR, 2024

Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks.
CoRR, 2024

Scaling FP8 training to trillion-token LLMs.
CoRR, 2024

Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes.
CoRR, 2024

How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

The Joint Effect of Task Similarity and Overparameterization on Catastrophic Forgetting - An Analytical Model.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Explore to Generalize in Zero-Shot RL.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

How do Minimum-Norm Shallow Denoisers Look in Function Space?
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DropCompute: simple and more robust distributed synchronous training via compute variance reduction.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond.
Proceedings of the International Conference on Machine Learning, 2023

Continual Learning in Linear Classification on Separable Data.
Proceedings of the International Conference on Machine Learning, 2023

The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Minimum Variance Unbiased N:M Sparsity for the Neural Gradients.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Accurate Neural Training with 4-bit Matrix Multiplications at Standard Formats.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

The Role of Codeword-to-Class Assignments in Error-Correcting Codes: An Empirical Study.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
Optimal Fine-Grained N:M sparsity for Activations and Neural Gradients.
CoRR, 2022

Implicit Bias of the Step Size in Linear Diagonal Neural Networks.
Proceedings of the International Conference on Machine Learning, 2022

A Statistical Framework for Efficient Out of Distribution Detection in Deep Neural Networks.
Proceedings of the Tenth International Conference on Learning Representations, 2022

How catastrophic can catastrophic forgetting be in linear regression?
Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

Regularization Guarantees Generalization in Bayesian Reinforcement Learning through Algorithmic Stability.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Task-Agnostic Continual Learning Using Online Variational Bayes With Fixed-Point Updates.
Neural Comput., 2021

Logarithmic Unbiased Quantization: Practical 4-bit Training in Deep Learning.
CoRR, 2021

Statistical Testing for Efficient Out of Distribution Detection in Deep Neural Networks.
CoRR, 2021

The Implicit Bias of Minima Stability: A View from Function Space.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Physics-Aware Downsampling with Deep Learning for Scalable Flood Modeling.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Accurate Post Training Quantization With Small Calibration Sets.
Proceedings of the 38th International Conference on Machine Learning, 2021

On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent.
Proceedings of the 38th International Conference on Machine Learning, 2021

Neural gradients are near-lognormal: improved quantized and sparse training.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
The Global Optimization Geometry of Shallow Linear Neural Networks.
J. Math. Imaging Vis., 2020

Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming.
CoRR, 2020

Neural gradients are lognormally distributed: understanding sparse and quantized training.
CoRR, 2020

Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?
Proceedings of the 37th International Conference on Machine Learning, 2020

A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case.
Proceedings of the 8th International Conference on Learning Representations, 2020

At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?
Proceedings of the 8th International Conference on Learning Representations, 2020

Augment Your Batch: Improving Generalization Through Instance Repetition.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

The Knowledge Within: Methods for Data-Free Model Compression.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Kernel and Rich Regimes in Overparametrized Models.
Proceedings of the Conference on Learning Theory, 2020

2019
MTJ-Based Hardware Synapse Design for Quantized Deep Neural Networks.
CoRR, 2019

Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency.
CoRR, 2019

Augment your batch: better training with larger batches.
CoRR, 2019

A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Post training 4-bit quantization of convolutional networks for rapid-deployment.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models.
Proceedings of the 36th International Conference on Machine Learning, 2019

How do infinite width bounded norm networks look in function space?
Proceedings of the Conference on Learning Theory, 2019

Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Convergence of Gradient Descent on Separable Data.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Seizure pathways: A model-based investigation.
PLoS Comput. Biol., 2018

The Implicit Bias of Gradient Descent on Separable Data.
J. Mach. Learn. Res., 2018

ACIQ: Analytical Clipping for Integer Quantization of neural networks.
CoRR, 2018

Bayesian Gradient Descent: Online Variational Bayes Learning with Increased Robustness to Catastrophic Forgetting and Weight Pruning.
CoRR, 2018

Convergence of Gradient Descent on Separable Data.
CoRR, 2018

On the Blindspots of Convolutional Networks.
CoRR, 2018

Norm matters: efficient and accurate normalization schemes in deep networks.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Implicit Bias of Gradient Descent on Linear Convolutional Networks.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Scalable methods for 8-bit training of neural networks.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Characterizing Implicit Bias in Terms of Optimization Geometry.
Proceedings of the 35th International Conference on Machine Learning, 2018

The Implicit Bias of Gradient Descent on Separable Data.
Proceedings of the 6th International Conference on Learning Representations, 2018

Exponentially vanishing sub-optimal local minima in multilayer neural networks.
Proceedings of the 6th International Conference on Learning Representations, 2018

Fix your classifier: the marginal value of training the last weight layer.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Multi-scale approaches for high-speed imaging and analysis of large neural populations.
PLoS Comput. Biol., 2017

Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations.
J. Mach. Learn. Res., 2017

The Implicit Bias of Gradient Descent on Separable Data.
CoRR, 2017

Train longer, generalize better: closing the generalization gap in large batch training of neural networks.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

2016
No bad local minima: Data independent training error guarantees for multilayer neural networks.
CoRR, 2016

Binarized Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

A fully analog memristor-based neural network with online gradient training.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2016

2015
Memristor-Based Multilayer Neural Networks With Online Gradient Descent Training.
IEEE Trans. Neural Networks Learn. Syst., 2015

Efficient "Shotgun" Inference of Neural Connectivity from Highly Sub-sampled Activity Data.
PLoS Comput. Biol., 2015

Training Binary Multilayer Neural Networks for Image Classification using Expectation Backpropagation.
CoRR, 2015

2014
The neuronal response at extended timescales: a linearized spiking input-output relation.
Frontiers Comput. Neurosci., 2014

The neuronal response at extended timescales: long-term correlations without long-term memory.
Frontiers Comput. Neurosci., 2014

Diffusion approximation-based simulation of stochastic ion channels: which method to use?
Frontiers Comput. Neurosci., 2014

Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2012
Conductance-Based Neuron Models and the Slow Dynamics of Excitability.
Frontiers Comput. Neurosci., 2012

Neuronal spike generation mechanism as an oversampling, noise-shaping A-to-D converter.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

2010
History-Dependent Dynamics in a Generic Model of Ion Channels - An Analytic Study.
Frontiers Comput. Neurosci., 2010

