16.2 RNGD: A 5nm Tensor-Contraction Processor for Power-Efficient Inference on Large Language Models.
Proceedings of the IEEE International Solid-State Circuits Conference, 2025
Squeezed Attention: Accelerating Long Context Length LLM Inference.
CoRR, 2024
TCP: A Tensor Contraction Processor for AI Workloads Industrial Product.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
RNGD - Tensor Contraction Processor for Sustainable AI Computing.
Proceedings of the 36th IEEE Hot Chips Symposium, 2024