Zhenyu Zhang

Affiliations:
  • University of Texas at Austin, TX, USA
  • University of Science and Technology of China


According to our database, Zhenyu Zhang authored at least 36 papers between 2021 and 2024.

Bibliography

2024
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients.
CoRR, 2024

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
CoRR, 2024

Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding.
CoRR, 2024

Q-Hitter: A Better Token Oracle for Efficient LLM Inference via Sparse-Quantized KV Cache.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

CaM: Cache Merging for Memory-efficient LLMs Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity.
CoRR, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
CoRR, 2023

QuantumSEA: In-Time Sparse Exploration for Noise Adaptive Quantum Circuits.
Proceedings of the IEEE International Conference on Quantum Computing and Engineering, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Are Large Kernels Better Teachers than Transformers for ConvNets?
Proceedings of the International Conference on Machine Learning, 2023

Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?
Proceedings of the International Conference on Machine Learning, 2023

Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Accelerable Lottery Tickets with the Mixed-Precision Quantization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Can You Win Everything with A Lottery Ticket?
Trans. Mach. Learn. Res., 2022

QuanGCN: Noise-Adaptive Training for Robust Quantum Graph Convolutional Networks.
CoRR, 2022

Sparse Winning Tickets are Data-Efficient Image Recognizers.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness.
Proceedings of the International Conference on Machine Learning, 2022

Data-Efficient Double-Win Lottery Tickets from Robust Pre-training.
Proceedings of the International Conference on Machine Learning, 2022

Sparsity Winning Twice: Better Robust Generalization from More Efficient Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Efficient Lottery Ticket Finding: Less Data is More.
Proceedings of the 38th International Conference on Machine Learning, 2021

GANs Can Play Lottery Tickets Too.
Proceedings of the 9th International Conference on Learning Representations, 2021

Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

Robust Overfitting may be mitigated by properly learned smoothening.
Proceedings of the 9th International Conference on Learning Representations, 2021

"BNN - BN = ?": Training Binary Neural Networks Without Batch Normalization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

