[ACL’25] A Drop-In Solution for On-the-Fly Adaptation of Speculative Decoding in Large Language Models
Published in the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025
We propose a drop-in solution that adapts speculative decoding in large language models (LLMs) on the fly, improving inference efficiency while maintaining accuracy.
Recommended citation: Jiesong Liu, Brian Park, Xipeng Shen. (2025). "A Drop-In Solution for On-the-Fly Adaptation of Speculative Decoding in Large Language Models." In the 63rd Annual Meeting of the Association for Computational Linguistics (ACL).