Puru Sharma
Orcid: 0009-0000-5854-1086
According to our database, Puru Sharma authored at least 6 papers between 2022 and 2024.
Bibliography
2024
AttentionStore: Cost-effective Attention Reuse across Multi-turn Conversations in Large Language Model Serving.
CoRR, 2024
Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024
2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
2022
Proceedings of the International IEEE Symposium on Performance Analysis of Systems and Software, 2022