Chulhee Yun

According to our database, Chulhee Yun authored at least 36 papers between 2013 and 2024.

Bibliography

2024
Position Coupling: Leveraging Task Structure for Improved Length Generalization of Transformers.
CoRR, 2024

Does SGD really happen in tiny subspaces?
CoRR, 2024

Fundamental Benefit of Alternating Updates in Minimax Optimization.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Linear attention is (maybe) all you need (to understand Transformer optimization).
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Large Catapults in Momentum Gradient Descent with Warmup: An Empirical Study.
CoRR, 2023

Enhancing Generalization and Plasticity for Sample Efficient Reinforcement Learning.
CoRR, 2023

Trajectory Alignment: Understanding the Edge of Stability Phenomenon via Bifurcation Theory.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Fair Streaming Principal Component Analysis: Statistical and Algorithmic Viewpoint.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

On the Training Instability of Shuffling SGD with Batch Normalization.
Proceedings of the International Conference on Machine Learning, 2023

Provable Benefit of Mixup for Finding Optimal Decision Boundaries.
Proceedings of the International Conference on Machine Learning, 2023

Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond.
Proceedings of the International Conference on Machine Learning, 2023

SGDA with shuffling: faster convergence for nonconvex-PŁ minimax optimization.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Optimization for Deep Learning: Bridging the Theory-Practice Gap.
PhD thesis, 2021

Can Single-Shuffle SGD be Better than Reshuffling SGD and GD?
CoRR, 2021

A unifying view on implicit bias in training linear neural networks.
Proceedings of the 9th International Conference on Learning Representations, 2021

Minimum Width for Universal Approximation.
Proceedings of the 9th International Conference on Learning Representations, 2021

Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD?
Proceedings of the Conference on Learning Theory, 2021

Provable Memorization via Deep Neural Networks using Sub-linear Parameters.
Proceedings of the Conference on Learning Theory, 2021

2020
O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

SGD with shuffling: optimal rates without component convexity and large epoch requirements.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Low-Rank Bottleneck in Multi-head Attention Models.
Proceedings of the 37th International Conference on Machine Learning, 2020

Are Transformers universal approximators of sequence-to-sequence functions?
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Are deep ResNets provably better than linear predictors?
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Small nonlinearities in activation functions create bad local minima in neural networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

Efficiently testing local optimality and escaping saddles for ReLU networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

2018
Finite sample expressive power of small-width ReLU networks.
CoRR, 2018

A Critical View of Global Optimality in Deep Learning.
CoRR, 2018

Global Optimality Conditions for Deep Neural Networks.
Proceedings of the 6th International Conference on Learning Representations, 2018

Minimax Bounds on Stochastic Batched Convex Optimization.
Proceedings of the Conference on Learning Theory, 2018

2015
Face detection using Local Hybrid Patterns.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2013
An implementation of computer vision technique for an edutainment robot with a visual programming language.
Proceedings of the 10th International Conference on Ubiquitous Robots and Ambient Intelligence, 2013

A fusion of computer vision technique and a visual programming language for edutainment robots.
Proceedings of the 44th International Symposium on Robotics, 2013

