Audio samples for the paper “Streaming ASR encoder for whisper-to-speech online voice conversion”
This demo page presents audio samples for the paper “Streaming ASR encoder for whisper-to-speech online voice conversion”. It includes examples of the original whispered speech (original whisper), the PPG-VC approach (ppg_original), and two of our conversion systems (default_hubert and our_hubert).
CHAINS dataset
| speaker irm04 |
| speaker irm03 |
SPONTAN dataset
| speaker 0002 |
| speaker 0005 |