2025

Efficient LLM Inference with Activation Checkpointing and Hybrid Caching.

Sanghyeon Lee

Hongbeen Kim

CoRR, January, 2025

16.2 RNGD: A 5nm Tensor-Contraction Processor for Power-Efficient Inference on Large Language Models.

Proceedings of the IEEE International Solid-State Circuits Conference, 2025

2024

TCP: A Tensor Contraction Processor for AI Workloads Industrial Product.

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024