Qingru Zhang

According to our database, Qingru Zhang authored at least 14 papers between 2019 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Model Tells Itself Where to Attend: Faithfulness Meets Automatic Attention Steering.

CoRR, 2024

Robust Reinforcement Learning from Corrupted Human Feedback.

CoRR, 2024

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM.

CoRR, 2024

Robust Reinforcement Learning from Corrupted Human Feedback.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Less is More: Task-aware Layer-wise Distillation for Language Model Compression.

Proceedings of the International Conference on Machine Learning, 2023

LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation.

Proceedings of the International Conference on Machine Learning, 2023

Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022

MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance.

Proceedings of the International Conference on Machine Learning, 2022

2021

A Biased Graph Neural Network Sampler with Near-Optimal Regret.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2019

AdaShift: Decorrelation and Convergence of Adaptive Learning Rate Methods.

Proceedings of the 7th International Conference on Learning Representations, 2019
