2024
Optimizing Speculative Decoding for Serving Large Language Models Using Goodput.
CoRR, 2024