Xingkun Yang
According to our database, Xingkun Yang authored at least 2 papers in 2024.
Bibliography
2024
AttentionStore: Cost-effective Attention Reuse across Multi-turn Conversations in Large Language Model Serving.
CoRR, 2024
Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024