Non-parallel voice conversion based on source-to-target direct mapping

Categories: Dynamic Time Warping | Fourier Transform | Mel-spectrogram

Tags: 2020 | Hoirin Kim | Sunghee Jung | Yeunju Choi | Youngjoo Suh

Recent work utilizing phonetic posteriorgrams (PPGs) for non-parallel voice conversion has significantly increased the usability of voice conversion, since the source and target databases no longer need to have matching content. In this approach, the PPGs serve as the linguistic bridge between source and target speaker features. However, PPG-based non-parallel voice conversion has the limitation that it needs two cascaded networks at conversion time, making it less suitable for real-time applications and vulnerable to the source speaker's intelligibility at the conversion stage. To address t...
