ECHO: Elastic Speculative Decoding with Sparse Gating for High-Concurrency Scenarios
ICML Oral/Spotlight, 2026
Hey, I'm Yuhao Shen, a direct-entry Ph.D. student at the College of Control Science and Engineering, Zhejiang University, advised by Prof. Cong Wang. I also received my B.E. degree from Zhejiang University.
My research lies in MLSys, LLM Inference, AI Infrastructure, and Edge Computing. Over the past two years, I have focused on speculative sampling and decoding. I was previously a research intern at Qwen Application and received an internship offer from the Tencent Hunyuan Qingyun Project. Currently, I am researching RL rollout acceleration in the Tongyi Qwen Foundation Model Infra group. Outside academia, I enjoy basketball, video games, and photography.
Zhejiang University, Hangzhou, China
Direct Ph.D. in Control Science and Engineering, 2024 - 2029
Advisor: Cong Wang

Zhejiang University, Hangzhou, China
Bachelor of Engineering in Control Science and Engineering, 2020 - 2024
Advisor: Cong Wang
ICML Oral/Spotlight, 2026
ACL 2026 Findings
IEEE 45th International Conference on Distributed Computing Systems (ICDCS), 2025
IEEE Transactions on Neural Networks and Learning Systems, 2024
arXiv (2026)
Research Intern, Qwen Foundation Model
May 2026 - Present
Topic: Speculative Decoding, RL Infra
Advisors: Yucheng Li, Huiqiang Jiang
Hangzhou, China

Research Intern, Qwen Application
Nov. 2025 - May 2026
Topic: Speculative Decoding, AI Infra
Advisors: Ye Shuang, Jun Dai, Lei Chen
Hangzhou, China