Audio samples for the paper “Streaming ASR encoder for whisper-to-speech online voice conversion”
This demo page presents audio samples for the paper “Streaming ASR encoder for whisper-to-speech online voice conversion”. The examples cover the unconverted whispered speech (original whisper), the PPG-VC approach (ppg_original), and our two conversion systems (default_hubert and our_hubert).
CHAINS dataset
speaker irm04
speaker irm03
SPONTAN dataset
speaker 0002
speaker 0005