[ACL’25 (In submission)] A Drop-In Solution for On-the-Fly Adaptation of Speculative Decoding in Large Language Models

Published in OpenReview, 2025

We propose a drop-in solution that adapts speculative decoding in LLMs on the fly, improving inference efficiency while maintaining accuracy.

Recommended citation: Jiesong Liu, Brian Park, Xipeng Shen. (2025). "A Drop-In Solution for On-the-Fly Adaptation of Speculative Decoding in Large Language Models."