Venue: Preprint
Year: 2026
Selected
Large Language Models Can Control Their Own Attention Span
Authors: Namgyu Ho*, Huzama Ahmad*, Woosung Koh*, Cicero Nogueira dos Santos, Tal Schuster, Se-Young Yun

Ph.D. Candidate, KAIST AI.
I work on efficient language modeling — the architectures and systems that make large models cheaper to run at long context. Advised by Se-Young Yun in the OSI Lab. Recent work spans speculative decoding, sparse attention, and letting models control their own attention span.
Authors: Huzama Ahmad, Se-Young Yun
Active: May 2026 – Present
Stealth — details after publication