2025
Unnatural Languages Are Not Bugs but Features for LLMs.
CoRR, March, 2025

MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations.
CoRR, February, 2025

Training-Free Activation Sparsity in Large Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models.
CoRR, 2024

Incremental Comprehension of Garden-Path Sentences by Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention.
CoRR, 2024

JetMoE: Reaching Llama2 Performance with 0.1M Dollars.
CoRR, 2024

Accelerating Greedy Coordinate Gradient via Probe Sampling.
CoRR, 2024

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models.
CoRR, 2024

BitDelta: Your Fine-Tune May Only Be Worth One Bit.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

SnapKV: LLM Knows What You are Looking for Before Generation.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

REST: Retrieval-Based Speculative Decoding.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Large Language Models as Tool Makers.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

FlexAttention for Efficient High-Resolution Vision-Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Scaling In-Context Demonstrations with Structured Attention.
CoRR, 2023

Reward Collapse in Aligning Large Language Models.
CoRR, 2023

What Makes Convolutional Models Great on Long Sequence Modeling?
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive Privacy Analysis and Beyond.
CoRR, 2022

2021
First Place Solution of KDD Cup 2021 & OGB Large-Scale Challenge Graph Prediction Track.
CoRR, 2021

Do Transformers Really Perform Bad for Graph Representation?
CoRR, 2021

Towards Certifying ℓ∞ Robustness using Neural Networks with ℓ∞-dist Neurons.
CoRR, 2021

Do Transformers Really Perform Badly for Graph Representation?
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Towards a Theoretical Framework of Out-of-Distribution Generalization.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Towards Certifying L-infinity Robustness using Neural Networks with L-inf-dist Neurons.
Proceedings of the 38th International Conference on Machine Learning, 2021

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training.
Proceedings of the 38th International Conference on Machine Learning, 2021

A Theory of Label Propagation for Subpopulation Shift.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
RANDOM MASK: Towards Robust Convolutional Neural Networks.
CoRR, 2020

Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Locally Differentially Private (Contextual) Bandits Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
Defective Convolutional Layers Learn Robust CNNs.
CoRR, 2019

Convergence of Adversarial Training in Overparametrized Networks.
CoRR, 2019

Adversarially Robust Generalization Just Requires More Unlabeled Data.
CoRR, 2019

A Gram-Gauss-Newton Method Learning Overparameterized Deep Neural Networks for Regression Problems.
CoRR, 2019

Convergence of Adversarial Training in Overparametrized Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019