Ke-Han Lu

I’m Ke-Han Lu, a second-year Ph.D. student at National Taiwan University, advised by Prof. Hung-yi Lee. My research focuses on multimodal language models, particularly cross-modal alignment and leveraging large language models to enhance multimodal understanding.

Selected Publications

For the full publication list, please refer to my Google Scholar page.

  • arXiv preprint
    DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment
    Ke-Han Lu et al.
  • Interspeech 2025
    Speech-IFEval: Evaluating Instruction-Following and Quantifying Catastrophic Forgetting in Speech-Aware Language Models
    Ke-Han Lu, Chun-Yi Kuan, Hung-yi Lee
  • ICLR 2025
    Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
    Chien-Yu Huang et al.
  • ICASSP 2025
    Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data
    Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, Chao-Han Huck Yang, Jagadeesh Balam, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee
  • Technical Report
    Building a Taiwanese Mandarin Spoken Language Model: A First Attempt
    Chih-Kai Yang, Yu-Kuan Fu, Chen-An Li, Yi-Cheng Lin, Yu-Xiang Lin, Wei-Chih Chen, Ho Lam Chung, Chun-Yi Kuan, Wei-Ping Huang, Ke-Han Lu*, Tzu-Quan Lin, Hsiu-Hsuan Wang, En-Pei Hu, Chan-Jan Hsu, Liang-Hsuan Tseng, I Chiu, Ulin Sanga, Xuanjun Chen, Po-chun Hsu, Shu-wen Yang, Hung-yi Lee
    [Paper]
  • Interspeech 2024
    DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment
    Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, He Huang, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee
    [Paper]
  • ICASSP 2024
    Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech
    Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-yi Lee
  • IEEE SLT 2022
    A Context-aware Knowledge Transferring Strategy for CTC-based ASR
    Ke-Han Lu, Kuan-Yu Chen
  • IEEE/ACM Transactions on Audio, Speech, and Language Processing
    Non-autoregressive ASR Modeling using Pre-trained Language Models for Chinese Speech Recognition
    Fu-Hao Yu, Kuan-Yu Chen, Ke-Han Lu
    [Paper]
  • Poster spotlight, VQA workshop, CVPR 2021
    A Transformer-based Cross-modal Fusion Model with Adversarial Training for VQA Challenge 2021
    Ke-Han Lu, Bo-Han Fang, Kuan-Yu Chen

Education

  • Ph.D. student in Communication Engineering, National Taiwan University
    • Feb 2024 - Present
    • Supervisor: Prof. Hung-yi Lee
  • M.S. in Computer Science and Information Engineering, National Taiwan University of Science and Technology
    • Sep 2020 - Feb 2023
    • Supervisor: Prof. Kuan-Yu Chen
  • B.S. in Computer Science and Information Engineering, National Taiwan University of Science and Technology
    • Sep 2016 - Jun 2020

Honors

  • NVIDIA Academic Grant Program
  • NSTC Graduate Research Fellowship (NSTC-GRF)
  • 16th TaiwanTech Outstanding Youth Award

Skills

  • Programming: Python, PyTorch, JavaScript, LaTeX
  • Software and tools: Linux, Docker, Git, NeMo, Megatron-LM, ESPNET, Huggingface Transformers, fairseq
  • Languages: Mandarin (native), English (fluent)

© 2025 Ke-Han Lu.