Jiawei (Gavin) Du 杜嘉炜
Research Assistant
Speech Processing and Machine Learning Lab
National Taiwan University

Hi! I am Jiawei Du, currently a Research Assistant in the Graduate Institute of Networking and Multimedia at National Taiwan University, advised by Dr. Hung-Yi Lee. My research interests lie in speech and audio processing, particularly audio large language models for real-time conversational and multi-party speech understanding.

I received my M.S. degree in Computer Science and Information Engineering from National Taiwan University, where I worked closely with Dr. Jyh-Shing Roger Jang and Dr. Hung-Yi Lee on audio anti-spoofing. During my graduate studies, I was a Research Intern at Samsung Research SRC-B, focusing on streaming and lightweight neural audio codecs. Prior to that, I earned my B.S. degree (ranked 1st) in Information and Telecommunications Engineering from Ming Chuan University under the supervision of Dr. Shu-Yin Chiang. I also studied Computer Science and Engineering at Shanghai Jiao Tong University as an exchange student.


Education
  • Georgia Institute of Technology
    Atlanta
    Incoming Ph.D. Student
    School of Electrical and Computer Engineering
    Aug. 2026 - present
  • National Taiwan University
    Taipei
    M.S. in Computer Science and Information Engineering
    Sep. 2022 - Jun. 2025
  • Ming Chuan University
    Taoyuan
    B.S. in Information and Telecommunications Engineering
    Sep. 2018 - Jun. 2022
Experience
  • National Taiwan University
    Taipei
    Research Assistant
    Speech Processing and Machine Learning Lab
    Sep. 2025 - Jun. 2026
  • Samsung Research SRC-B
    Beijing
    Research Intern
    Speech Team
    Feb. 2025 - May 2025
  • Shanghai Jiao Tong University
    Shanghai
    Exchange Student in Computer Science and Engineering
    Sep. 2020 - Jan. 2021
News
2026
One journal paper accepted to IEEE TASLP.
May 02
I decided to pursue my Ph.D. in Electrical and Computer Engineering at Georgia Tech. Looking forward to Atlanta!
Apr 15
2025
I started as a Research Assistant in GINM at National Taiwan University.
Sep 08
I graduated from National Taiwan University (M.S. in CSIE).
Jun 10
I completed my four-month research internship at Samsung, a great experience!
May 15
Selected Publications
CodecFake+: A Large-Scale Neural Audio Codec-Based Deepfake Speech Dataset

Jiawei Du*, Xuanjun Chen*, Haibin Wu, Lin Zhang, I-Ming Lin, I-Hsiang Chiu, Wenze Ren, Yuan Tseng, Yu Tsao, Jyh-Shing Roger Jang, Hung-Yi Lee (* equal contribution)

IEEE Transactions on Audio, Speech and Language Processing (TASLP) 2026

CodecFake+ is a large-scale dataset and taxonomy for detecting deepfake speech generated by neural audio codecs. It provides diverse training and evaluation data across many codec architectures and enables more systematic analysis for building stronger audio anti-spoofing models.

Codec-SUPERB @ SLT 2024: A Lightweight Benchmark for Neural Audio Codec Models

Haibin Wu, Jiawei Du*, Xuanjun Chen*, Yi-Cheng Lin*, Kai-Wei Chang*, Ke-Han Lu*, Alexander H. Liu*, Ho-Lam Chung*, Yuan-Kuei Wu*, Dongchao Yang*, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James Glass, Shinji Watanabe, Hung-Yi Lee (* equal contribution)

IEEE Spoken Language Technology Workshop (SLT) 2024 Special Session

Codec-SUPERB introduces a lightweight and standardized benchmark for evaluating neural audio codec models across multiple speech tasks. It enables fair comparison under consistent settings and reveals key trade-offs in preserving linguistic content, speaker characteristics, and audio quality at low bitrates.

DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset

Jiawei Du*, I-Ming Lin*, I-Hsiang Chiu*, Xuanjun Chen, Haibin Wu, Wenze Ren, Yu Tsao, Hung-Yi Lee, Jyh-Shing Roger Jang (* equal contribution)

IEEE Spoken Language Technology Workshop (SLT) 2024

DFADD is the first audio deepfake detection dataset focused on speech synthesized by diffusion- and flow-matching-based TTS models. It reveals that current anti-spoofing systems still struggle with this more realistic fake audio.

Neural Codec-based Adversarial Sample Detection for Speaker Verification

Jiawei Du*, Xuanjun Chen*, Haibin Wu, Jyh-Shing Roger Jang, Hung-Yi Lee (* equal contribution)

Interspeech 2024

This paper proposes a neural codec-based method to detect adversarial samples for speaker verification by comparing ASV score differences before and after codec re-synthesis. Experiments across 15 open-source neural codecs show that the approach outperforms seven prior baselines, with Descript Audio Codec giving the best results.
