[ACL’25] A Drop-In Solution for On-the-Fly Adaptation of Speculative Decoding in Large Language Models

Published in the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025

We propose a drop-in solution that adapts speculative decoding in large language models (LLMs) on the fly, improving inference efficiency while maintaining accuracy.

Recommended citation: Jiesong Liu, Brian Park, Xipeng Shen. (2025). "A Drop-In Solution for On-the-Fly Adaptation of Speculative Decoding in Large Language Models." 63rd Annual Meeting of the Association for Computational Linguistics (ACL).