Ruoyu Sun

Orcid: 0000-0003-2487-5322

Affiliations:
  • Chinese University of Hong Kong-Shenzhen (CUHK-SZ), Shenzhen, China
  • University of Illinois Urbana-Champaign, IL, USA (former)
  • University of Minnesota, Department of ECE, MN, USA (PhD)


According to our database, Ruoyu Sun authored at least 82 papers between 2012 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Archilles' Heel in Semi-open LLMs: Hiding Bottom against Recovery Attacks.
CoRR, 2024

Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity.
CoRR, 2024

MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning.
CoRR, 2024

Adam-mini: Use Fewer Learning Rates To Gain More.
CoRR, 2024

On the Convergence of Adam under Non-uniform Smoothness: Separability from SGDM and Beyond.
CoRR, 2024

Why Transformers Need Adam: A Hessian Perspective.
CoRR, 2024

AceGPT, Localizing Large Language Models in Arabic.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Provable Adaptivity of Adam under Non-uniform Smoothness.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

How Graph Neural Networks Learn: Lessons from Training Dynamics.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

PDHG-Unrolled Learning-to-Optimize Method for Large-Scale Linear Programming.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

LEMON: Lossless model expansion.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Bridging the Gap: Rademacher Complexity in Robust and Standard Generalization.
Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

2023
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models.
CoRR, 2023

How Graph Neural Networks Learn: Lessons from Training Dynamics in Function Space.
CoRR, 2023

AceGPT, Localizing Large Language Models in Arabic.
CoRR, 2023

Restricted Generative Projection for One-Class Classification and Anomaly Detection.
CoRR, 2023

Double Dynamic Sparse Training for GANs.
CoRR, 2023

Invariant Layers for Graphs with Nodes of Different Types.
CoRR, 2023

PAC-Bayesian Spectrally-Normalized Bounds for Adversarially Robust Generalization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Balanced Training for Sparse GANs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

NTK-SAP: Improving neural network pruning by aligning training dynamics.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

A GNN-Guided Predict-and-Search Framework for Mixed-Integer Linear Programming.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity.
SIAM J. Optim., December, 2022

Suboptimal Local Minima Exist for Wide Neural Networks with Smooth Activations.
Math. Oper. Res., November, 2022

On the Benefit of Width for Neural Networks: Disappearance of Basins.
SIAM J. Optim., September, 2022

Adversarial Rademacher Complexity of Deep Neural Networks.
CoRR, 2022

On the landscape of one-hidden-layer sparse networks and beyond.
Artif. Intell., 2022

Adam Can Converge Without Any Modification On Update Rules.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Stability Analysis and Generalization Bounds of Adversarial Training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Does Momentum Change the Implicit Regularization on Separable Data?
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

DigGAN: Discriminator gradIent Gap Regularization for GAN Training with Limited Data.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Separation of Metabolites and Macromolecules for Short-TE $^1$H-MRSI Using Learned Component-Specific Representations.
IEEE Trans. Medical Imaging, 2021

Two Symmetrized Coordinate Descent Methods Can Be $O(n^2)$ Times Slower Than the Randomized Version.
SIAM J. Optim., 2021

Worst-case complexity of cyclic coordinate descent: $O(n^2)$ gap with randomized version.
Math. Program., 2021

Towards Understanding the Impact of Model Size on Differential Private Classification.
CoRR, 2021

Federated Semi-Supervised Learning with Class Distribution Mismatch.
CoRR, 2021

Momentum Doesn't Change the Implicit Bias.
CoRR, 2021

Achieving Small Test Error in Mildly Overparameterized Neural Networks.
CoRR, 2021

On a Faster R-Linear Convergence Rate of the Barzilai-Borwein Method.
CoRR, 2021

When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

RMSprop converges with proper hyper-parameter.
Proceedings of the 9th International Conference on Learning Representations, 2021

PenDer: Incorporating Shape Constraints via Penalized Derivatives.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
The Global Landscape of Neural Networks: An Overview.
IEEE Signal Process. Mag., 2020

On the Efficiency of Random Permutation for ADMM and Coordinate Descent.
Math. Oper. Res., 2020

Landscape of Sparse Linear Network: A Brief Investigation.
CoRR, 2020

Global Convergence and Induced Kernels of Gradient-Based Meta-Learning with Neural Nets.
CoRR, 2020

DEED: A General Quantization Scheme for Communication Efficiency in Bits.
CoRR, 2020

A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Towards a Better Global Loss Landscape of GANs.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
Globally Optimal Joint Uplink Base Station Association and Beamforming.
IEEE Trans. Commun., 2019

Optimization for deep learning: theory and algorithms.
CoRR, 2019

Sub-Optimal Local Minima Exist for Almost All Over-parameterized Neural Networks.
CoRR, 2019

Understanding Limitation of Two Symmetrized Orders by Worst-case Complexity.
CoRR, 2019

On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization.
Proceedings of the 7th International Conference on Learning Representations, 2019

Max-Sliced Wasserstein Distance and Its Use for GANs.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Over-Parameterized Deep Neural Networks Have No Strict Local Minima For Any Continuous Activations.
CoRR, 2018

Adding One Neuron Can Eliminate All Bad Local Minima.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Understanding the Loss Surface of Neural Networks for Binary Classification.
Proceedings of the 35th International Conference on Machine Learning, 2018

Understanding the Loss Surface of Single-Layered Neural Networks for Binary Classification.
Proceedings of the 6th International Conference on Learning Representations, 2018

2016
Guaranteed Matrix Completion via Non-Convex Factorization.
IEEE Trans. Inf. Theory, 2016

Worst-case Complexity of Cyclic Coordinate Descent: $O(n^2)$ Gap with Randomized Version.
CoRR, 2016

Optimization algorithms for big data with application in wireless networks.
Proceedings of the Big Data over Networks, 2016

2015
Interference Alignment Using Finite and Dependent Channel Extensions: The Single Beam Case.
IEEE Trans. Inf. Theory, 2015

Joint Downlink Base Station Association and Power Control for Max-Min Fairness: Computation and Complexity.
IEEE J. Sel. Areas Commun., 2015

Globally Optimal Joint Uplink Base Station Association and Beamforming.
CoRR, 2015

Interference alignment via Feasible Point Pursuit.
Proceedings of the 16th IEEE International Workshop on Signal Processing Advances in Wireless Communications, 2015

Improved Iteration Complexity Bounds of Cyclic Block Coordinate Descent for Convex Problems.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Guaranteed Matrix Completion via Nonconvex Factorization.
Proceedings of the IEEE 56th Annual Symposium on Foundations of Computer Science, 2015

2014
Cross-Layer Provision of Future Cellular Networks: A WMMSE-based approach.
IEEE Signal Process. Mag., 2014

Cross Layer Provision of Future Cellular Networks.
CoRR, 2014

Globally optimal joint uplink base station association and power control for max-min fairness.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Joint Base Station Clustering and Beamformer Design for Partial Coordinated Transmission in Heterogeneous Networks.
IEEE J. Sel. Areas Commun., 2013

Two Performance-limiting Factors for Interference Alignment: Channel Diversity Order and the Number of Data Streams Per User.
CoRR, 2013

Long-term transmit point association for coordinated multipoint transmission by stochastic optimization.
Proceedings of the 14th IEEE Workshop on Signal Processing Advances in Wireless Communications, 2013

2012
Robust SINR-Constrained MISO Downlink Beamforming: When is Semidefinite Programming Relaxation Tight?
EURASIP J. Wirel. Commun. Netw., 2012

Joint Base Station Clustering and Beamformer Design for Partial Coordinated Transmission in Heterogenous Networks
CoRR, 2012

Optimal joint base station assignment and power allocation in a cellular network.
Proceedings of the 13th IEEE International Workshop on Signal Processing Advances in Wireless Communications, 2012

Joint transceiver design and base station clustering for heterogeneous networks.
Proceedings of the Conference Record of the Forty Sixth Asilomar Conference on Signals, 2012

