Guangxuan Xiao

According to our database, Guangxuan Xiao authored at least 17 papers between 2021 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training.
Proc. VLDB Endow., February, 2024

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads.
CoRR, 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving.
CoRR, 2024

Retrieval Head Mechanistically Explains Long-Context Factuality.
CoRR, 2024

BitDelta: Your Fine-Tune May Only Be Worth One Bit.
CoRR, 2024

InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory.
CoRR, 2024

AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Efficient Streaming Language Models with Attention Sinks.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-level Backdoor Attacks.
Mach. Intell. Res., April, 2023

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention.
CoRR, 2023

Offsite-Tuning: Transfer Learning without Full Model.
CoRR, 2023

ReFresh: Reducing Memory Access from Exploiting Stable Historical Embeddings for Graph Neural Network Training.
CoRR, 2023

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models.
Proceedings of the International Conference on Machine Learning, 2023

2022
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models.
CoRR, 2022

Sparse and Local Networks for Hypergraph Reasoning.
Proceedings of the Learning on Graphs Conference, 2022

2021
Red Alarm for Pre-trained Models: Universal Vulnerabilities by Neuron-Level Backdoor Attacks.
CoRR, 2021

