Huishuai Zhang

Dongyan Zhao

CoRR, 2024

Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules.

CoRR, 2024

Efficient Continual Pre-training by Mitigating the Stability Gap.

CoRR, 2024

Automatic Jailbreaking of the Text-to-Image Generative AI Systems.

CoRR, 2024

xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token.

CoRR, 2024

©Plug-in Authorization for Human Content Copyright Protection in Text-to-Image Model.

CoRR, 2024

On the Convergence of Adam under Non-uniform Smoothness: Separability from SGDM and Beyond.

CoRR, 2024

Provable Adaptivity of Adam under Non-uniform Smoothness.

Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

Differentially Private Synthetic Data via Foundation Model APIs 2: Text.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023

Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent.

Trans. Mach. Learn. Res., 2023

Exploring Transferability for Randomized Smoothing.

CoRR, 2023

Large Catapults in Momentum Gradient Descent with Warmup: An Empirical Study.

CoRR, 2023

FILM: How can Few-Shot Image Classification Benefit from Pre-Trained Language Models?

CoRR, 2023

When and Why Momentum Accelerates SGD: An Empirical Study.

CoRR, 2023

ResiDual: Transformer with Dual Residual Connections.

CoRR, 2023

DiffKendall: A Novel Approach for Few-Shot Learning with Differentiable Kendall's Rank Correlation.

Kaipeng Zheng

Weiran Huang

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Closing the gap between the upper bound and lower bound of Adam's iteration complexity.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

On the Generalization Properties of Diffusion Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Denoising Masked Autoencoders Help Robust Classification.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

UADB: Unsupervised Anomaly Detection Booster.

Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Convergence of AdaGrad for Non-convex Objectives: Simple Proofs and Relaxed Assumptions.

Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Similarity Distribution Based Membership Inference Attack on Person Re-identification.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Understanding generalization error of SGD in nonconvex optimization.

Mach. Learn., 2022

Stabilize deep ResNet with a sharp scaling factor τ.

Mach. Learn., 2022

Denoising Masked AutoEncoders are Certifiable Robust Vision Learners.

CoRR, 2022

Provable Adaptivity in Adam.

CoRR, 2022

Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization.

CoRR, 2022

Per-Instance Privacy Accounting for Differentially Private Stochastic Gradient Descent.

CoRR, 2022

Robust Quantity-Aware Aggregation for Federated Learning.

CoRR, 2022

Does Momentum Change the Implicit Regularization on Separable Data?

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Availability Attacks Create Shortcuts.

Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum.

Proceedings of the International Conference on Machine Learning, 2022

Two Coupled Rejection Metrics Can Tell Adversarial Examples Apart.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Indiscriminate Poisoning Attacks Are Shortcuts.

CoRR, 2021

Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD.

CoRR, 2021

Momentum Doesn't Change the Implicit Bias.

CoRR, 2021

Regularized OFU: an Efficient UCB Estimator for Non-linear Contextual Bandit.

CoRR, 2021

Adversarial Training with Rectified Rejection.

CoRR, 2021

Optimizing Information-theoretical Generalization Bound via Anisotropic Noise of SGLD.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Large Scale Private Learning via Low-rank Reparametrization.

Proceedings of the 38th International Conference on Machine Learning, 2021

Do not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning.

Proceedings of the 9th International Conference on Learning Representations, 2021

How Does Data Augmentation Affect Privacy in Machine Learning?

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Convergence of Distributed Stochastic Variance Reduced Methods Without Sampling Extra Data.

IEEE Trans. Signal Process., 2020

Well-Conditioned Methods for Ill-Conditioned Systems: Linear Regression with Semi-Random Noise.

CoRR, 2020

Membership Inference with Privately Augmented Data Endorses the Benign while Suppresses the Adversary.

CoRR, 2020

Adai: Separating the Effects of Adaptive Learning Rate and Momentum Inertia.

CoRR, 2020

Gradient Perturbation is Underrated for Differentially Private Convex Optimization.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

On Layer Normalization in the Transformer Architecture.

Proceedings of the 37th International Conference on Machine Learning, 2020

2019

Training Over-parameterized Deep ResNet Is almost as Easy as Training a Two-layer Network.

CoRR, 2019

BN-invariant Sharpness Regularizes the Training Model to Better Generalization.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

SGD Converges to Global Minimum in Deep Learning via Star-convex Path.

Proceedings of the 7th International Conference on Learning Representations, 2019

G-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space.

Proceedings of the 7th International Conference on Learning Representations, 2019

Capacity Control of ReLU Neural Networks by Basis-Path Norm.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Median-Truncated Nonconvex Approach for Phase Retrieval With Outliers.

Yuejie Chi

IEEE Trans. Inf. Theory, 2018

Train Feedfoward Neural Network with Layer-wise Adaptive Rate via Approximating Back-matching Propagation.

Wei Chen

Tie-Yan Liu

CoRR, 2018

Generalization Error Bounds with Probabilistic Guarantee for SGD in Nonconvex Optimization.

CoRR, 2018

On the Local Hessian in Back-propagation.

Wei Chen

Tie-Yan Liu

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

2017

Multi-Key Generation Over a Cellular Model With a Helper.

IEEE Trans. Inf. Theory, 2017

A Nonconvex Approach for Phase Retrieval: Reshaped Wirtinger Flow and Incremental Algorithms.

Yuejie Chi

J. Mach. Learn. Res., 2017

Block-diagonal Hessian-free Optimization for Training Neural Networks.

CoRR, 2017

Nonconvex Low-Rank Matrix Recovery with Arbitrary Outliers via Median-Truncated Gradient Descent.

CoRR, 2017

2016

Reshaped Wirtinger Flow for Solving Quadratic Systems of Equations.

CoRR, 2016

Reshaped Wirtinger Flow for Solving Quadratic System of Equations.

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Provable Non-convex Phase Retrieval with Outliers: Median Truncated Wirtinger Flow.

Yuejie Chi

Proceedings of the 33rd International Conference on Machine Learning, 2016

Geometrical properties and accelerated gradient solvers of non-convex phase retrieval.

Proceedings of the 54th Annual Allerton Conference on Communication, 2016

On Compressive orthonormal Sensing.

Proceedings of the 54th Annual Allerton Conference on Communication, 2016

2015

Analysis of Robust PCA via Local Incoherence.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Two-key generation for a cellular model with a helper.

Proceedings of the IEEE International Symposium on Information Theory, 2015

Secret key capacity: Talk or keep silent?

Lifeng Lai

Proceedings of the IEEE International Symposium on Information Theory, 2015

2014

The Capacity Region of the Source-Type Model for Secret Key and Private Key Generation.

IEEE Trans. Inf. Theory, 2014

Key capacity region for a cellular source model.

Lifeng Lai

Proceedings of the 2014 IEEE Information Theory Workshop, 2014

Secret key-private key generation over three terminals: Capacity region.

Proceedings of the 2014 IEEE International Symposium on Information Theory, Honolulu, HI, USA, June 29, 2014

Helper-assisted asymmetric two key generation.
