Whisper-to-speech voice conversion

This demo page presents audio samples for the paper “Streaming ASR encoder for whisper-to-speech online voice conversion”. For each speaker, it provides the original whispered recording (original whisper), the output of the PPG-VC approach (ppg_original), and the outputs of our two conversion systems (default_hubert and our_hubert).

CHAINS dataset
speaker irm04
original whisper	ppg_original	default_hubert	our_hubert
speaker irm03
original whisper	ppg_original	default_hubert	our_hubert
SPONTAN dataset
speaker 0002
original whisper	ppg_original	default_hubert	our_hubert
speaker 0005
original whisper	ppg_original	default_hubert	our_hubert
