2024
Context Compression and Extraction: Efficiency Inference of Large Language Models.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2024