Difan Zou

According to our database, Difan Zou authored at least 68 papers between 2014 and 2024.

Bibliography

2024
Challenges of COVID-19 Case Forecasting in the US, 2020-2021.
PLoS Comput. Biol., 2024

How Does Critical Batch Size Scale in Pre-training?
CoRR, 2024

Initialization Matters: On the Benign Overfitting of Two-Layer ReLU CNN with Fully Trainable Layers.
CoRR, 2024

Towards a Theoretical Understanding of Memorization in Diffusion Models.
CoRR, 2024

How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression.
CoRR, 2024

The Implicit Bias of Adam on Separable Data.
CoRR, 2024

Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller.
CoRR, 2024

Slight Corruption in Pre-training Data Makes Better Diffusion Models.
CoRR, 2024

A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models.
CoRR, 2024

Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference.
CoRR, 2024

The Dog Walking Theory: Rethinking Convergence in Federated Learning.
CoRR, 2024

On the Benefits of Over-parameterization for Out-of-Distribution Generalization.
CoRR, 2024

Improving Implicit Regularization of SGD with Preconditioning for Least Square Problems.
CoRR, 2024

An Improved Analysis of Langevin Algorithms with Prior Diffusion for Non-Log-Concave Sampling.
CoRR, 2024

Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Faster Sampling via Stochastic Gradient Proximal Sampler.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
Proceedings of the Twelfth International Conference on Learning Representations, 2024

PRES: Toward Scalable Memory-Based Dynamic Graph Neural Networks.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Benign Oscillation of Stochastic Gradient Descent with Large Learning Rate.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo.
Proceedings of the Thirty-Seventh Annual Conference on Learning Theory, 2024

2023
Benign Overfitting of Constant-Stepsize SGD for Linear Regression.
J. Mach. Learn. Res., 2023

Benign Oscillation of Stochastic Gradient Descent with Large Learning Rates.
CoRR, 2023

Less is More: On the Feature Redundancy of Pretrained Models When Transferring to Few-shot Tasks.
CoRR, 2023

Per-Example Gradient Regularization Improves Learning Signals from Noisy Data.
CoRR, 2023

Learning High-Dimensional Single-Neuron ReLU Networks with Finite Samples.
CoRR, 2023

The Benefits of Mixup for Feature Learning.
Proceedings of the International Conference on Machine Learning, 2023

Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron.
Proceedings of the International Conference on Machine Learning, 2023

Towards Robust Graph Incremental Learning on Evolving Graphs.
Proceedings of the International Conference on Machine Learning, 2023

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks.
Proceedings of the Thirty-Sixth Annual Conference on Learning Theory, 2023

2022
Understanding the Role of Optimization Algorithms in Learning Over-parameterized Models
PhD thesis, 2022

Two-Dimensional Intensity Distribution and Adaptive Power Allocation for Ultraviolet Ad-Hoc Network.
IEEE Trans. Green Commun. Netw., 2022

Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression.
Proceedings of the International Conference on Machine Learning, 2022

Self-training Converts Weak Learners to Strong Learners in Mixture Models.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Laplacian Smoothing Stochastic Gradient Markov Chain Monte Carlo.
SIAM J. Sci. Comput., 2021

Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

The Benefits of Implicit Regularization from SGD in Least Squares Problems.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On the Convergence of Hamiltonian Monte Carlo with Stochastic Gradients.
Proceedings of the 38th International Conference on Machine Learning, 2021

Provable Robustness of Adversarial Training for Learning Halfspaces with Noise.
Proceedings of the 38th International Conference on Machine Learning, 2021

Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate.
Proceedings of the 9th International Conference on Learning Representations, 2021

How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
Gradient descent optimizes over-parameterized deep ReLU networks.
Mach. Learn., 2020

Direction Matters: On the Implicit Regularization Effect of Stochastic Gradient Descent with Moderate Learning Rate.
CoRR, 2020

On the Global Convergence of Training Deep Linear ResNets.
Proceedings of the 8th International Conference on Learning Representations, 2020

Improving Adversarial Robustness Requires Revisiting Misclassified Examples.
Proceedings of the 8th International Conference on Learning Representations, 2020

Two-dimensional Intensity Distribution and Connectivity in Ultraviolet Ad-Hoc Network.
Proceedings of the 2020 IEEE International Conference on Communications, 2020

2019
Characterization on Practical Photon Counting Receiver in Optical Scattering Communication.
IEEE Trans. Commun., 2019

Signal Characterization and Achievable Transmission Rate of VLC Under Receiver Nonlinearity.
IEEE Access, 2019

Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

An Improved Analysis of Training Over-parameterized Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Sampling from Non-Log-Concave Distributions via Variance-Reduced Gradient Langevin Dynamics.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Signal Detection Under Short-Interval Sampling of Continuous Waveforms for Optical Wireless Scattering Communication.
IEEE Trans. Wirel. Commun., 2018

Secrecy Rate of MISO Optical Wireless Scattering Communications.
IEEE Trans. Commun., 2018

Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks.
CoRR, 2018

Subsampled Stochastic Variance-Reduced Gradient Langevin Dynamics.
Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, 2018

Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Stochastic Variance-Reduced Hamilton Monte Carlo Methods.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Saving Gradient and Negative Curvature Computations: Finding Local Minima More Efficiently.
CoRR, 2017

Analysis on Practical Photon Counting Receiver in Optical Scattering Communication.
CoRR, 2017

2016
Turbulence channel modeling and non-parametric estimation for optical wireless scattering communication.
Proceedings of the 2016 IEEE International Conference on Communication Systems, 2016

Performance of non-line-of-sight ultraviolet scattering communication under different altitudes.
Proceedings of the 2016 IEEE/CIC International Conference on Communications in China, 2016

Optical wireless scattering communication system with a non-ideal photon-counting receiver.
Proceedings of the 2016 IEEE Global Conference on Signal and Information Processing, 2016

2014
Improving the NLOS optical scattering channel via beam reshaping.
Proceedings of the 48th Asilomar Conference on Signals, Systems and Computers, 2014

