Hi! I am Jiawei Du, currently a Research Assistant in the Graduate Institute of Networking and Multimedia at National Taiwan University, advised by Dr. Hung-Yi Lee. My research interests lie in speech and audio processing, particularly audio large language models for real-time conversational and multi-party speech understanding.
I received my M.S. degree in Computer Science and Information Engineering from National Taiwan University, where I worked closely with Dr. Jyh-Shing Roger Jang and Dr. Hung-Yi Lee on audio anti-spoofing. During my graduate studies, I was a Research Intern at Samsung Research SRC-B, focusing on streaming and lightweight neural audio codecs. Prior to that, I earned my B.S. degree (ranked 1st) in Information and Telecommunications Engineering from Ming Chuan University under the supervision of Dr. Shu-Yin Chiang. I also studied in Computer Science and Engineering at Shanghai Jiao Tong University as an exchange student.

Jiawei Du*, Xuanjun Chen*, Haibin Wu, Lin Zhang, I-Ming Lin, I-Hsiang Chiu, Wenze Ren, Yuan Tseng, Yu Tsao, Jyh-Shing Roger Jang, Hung-Yi Lee (* equal contribution)
IEEE Transactions on Audio, Speech and Language Processing (TASLP) 2026
CodecFake+ is a large-scale dataset and taxonomy for detecting codec-based deepfake speech generated by neural audio codecs. It provides diverse training and evaluation data across many codec architectures and enables more systematic analysis for building stronger audio anti-spoofing models.
Jiawei Du*, Xuanjun Chen*, Haibin Wu, Lin Zhang, I-Ming Lin, I-Hsiang Chiu, Wenze Ren, Yuan Tseng, Yu Tsao, Jyh-Shing Roger Jang, Hung-Yi Lee (* equal contribution)
IEEE Transactions on Audio, Speech and Language Processing (TASLP) 2026
CodecFake+ is a large-scale dataset and taxonomy for detecting codec-based deepfake speech generated by neural audio codecs. It provides diverse training and evaluation data across many codec architectures and enables more systematic analysis for building stronger audio anti-spoofing models.

Haibin Wu, Jiawei Du*, Xuanjun Chen*, Yi-Cheng Lin*, Kai-Wei Chang*, Ke-Han Lu*, Alexander H. Liu*, Ho-Lam Chung*, Yuan-Kuei Wu*, Dongchao Yang*, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James Glass, Shinji Watanabe, Hung-Yi Lee (* equal contribution)
IEEE Spoken Language Technology Workshop (SLT) 2024 Special Session
Codec-SUPERB introduces a lightweight and standardized benchmark for evaluating neural audio codec models across multiple speech tasks. It enables fair comparison under consistent settings and reveals key trade-offs in preserving linguistic content, speaker characteristics, and audio quality at low bitrates.
Haibin Wu, Jiawei Du*, Xuanjun Chen*, Yi-Cheng Lin*, Kai-Wei Chang*, Ke-Han Lu*, Alexander H. Liu*, Ho-Lam Chung*, Yuan-Kuei Wu*, Dongchao Yang*, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James Glass, Shinji Watanabe, Hung-Yi Lee (* equal contribution)
IEEE Spoken Language Technology Workshop (SLT) 2024 Special Session
Codec-SUPERB introduces a lightweight and standardized benchmark for evaluating neural audio codec models across multiple speech tasks. It enables fair comparison under consistent settings and reveals key trade-offs in preserving linguistic content, speaker characteristics, and audio quality at low bitrates.

Jiawei Du*, I-Ming Lin*, I-Hsiang Chiu*, Xuanjun Chen, Haibin Wu, Wenze Ren, Yu Tsao, Hung-Yi Lee, Jyh-Shing Roger Jang (* equal contribution)
IEEE Spoken Language Technology Workshop (SLT) 2024
DFADD is the first dataset for audio deepfake detection that focuses on speech synthesized by diffusion- and flow-matching-based TTS models, and reveals that current anti-spoofing systems still struggle with these more realistic fake audios.
Jiawei Du*, I-Ming Lin*, I-Hsiang Chiu*, Xuanjun Chen, Haibin Wu, Wenze Ren, Yu Tsao, Hung-Yi Lee, Jyh-Shing Roger Jang (* equal contribution)
IEEE Spoken Language Technology Workshop (SLT) 2024
DFADD is the first dataset for audio deepfake detection that focuses on speech synthesized by diffusion- and flow-matching-based TTS models, and reveals that current anti-spoofing systems still struggle with these more realistic fake audios.

Jiawei Du*, Xuanjun Chen*, Haibin Wu, Jyh-Shing Roger Jang, Hung-Yi Lee (* equal contribution)
Interspeech 2024
This paper proposes a neural codec-based method to detect adversarial samples for speaker verification by comparing ASV score differences before and after codec re-synthesis. Experiments across 15 open-source neural codecs show that the approach outperforms seven prior baselines, with Descript Audio Codec giving the best results.
Jiawei Du*, Xuanjun Chen*, Haibin Wu, Jyh-Shing Roger Jang, Hung-Yi Lee (* equal contribution)
Interspeech 2024
This paper proposes a neural codec-based method to detect adversarial samples for speaker verification by comparing ASV score differences before and after codec re-synthesis. Experiments across 15 open-source neural codecs show that the approach outperforms seven prior baselines, with Descript Audio Codec giving the best results.