Hadi Daneshmand
According to our database, Hadi Daneshmand authored at least 26 papers between 2014 and 2024.
Bibliography
2024
Provable optimal transport with transformers: The essence of depth and prompt engineering.
CoRR, 2024
Transformers Learn Temporal Difference Methods for In-Context Reinforcement Learning.
CoRR, 2024
Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
2023
On the impact of activation and normalization in obtaining isometric embeddings at initialization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Transformers learn to implement preconditioned gradient descent for in-context learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization.
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the International Conference on Machine Learning, 2023
2022
CoRR, 2022
2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021
2020
PhD thesis, 2020
CoRR, 2020
Batch normalization provably avoids ranks collapse for randomly initialised deep networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
2019
Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019
2018
Proceedings of the 35th International Conference on Machine Learning, 2018
2017
2016
Estimating Diffusion Networks: Recovery Conditions, Sample Complexity and Soft-thresholding Algorithm.
J. Mach. Learn. Res., 2016
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
Proceedings of the 33rd International Conference on Machine Learning, 2016
2014
Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-thresholding Algorithm.
Proceedings of the 31st International Conference on Machine Learning, 2014