Audio samples for the paper “Streaming ASR encoder for whisper-to-speech online voice conversion”
This demo page presents audio samples for the paper “Streaming ASR encoder for whisper-to-speech online voice conversion”. It includes examples of the unconverted whisper (original whisper), the PPG-VC approach (ppg_original), and two of our conversion systems (default_hubert and our_hubert).
| CHAINS dataset | |||
| speaker irm04 | |||
|  |  |  |  | 
| speaker irm03 | |||
|  |  |  |  | 
| SPONTAN dataset | |||
| speaker 0002 | |||
|  |  |  |  | 
| speaker 0005 | |||
|  |  |  |  |