Example of an expert musician singing the same scale
Other examples of conditioning from the dataset
For each example, we plot the ground truth pitch contour along with the coarse conditioning signal extracted from it (described in Section 5.2), and the generated pitch contour along with the coarse contour extracted from it.
Note: Ground truth audio is resynthesized with our Spectrogram Generator + vocoder.
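As a rough illustration of what a coarse contour looks like next to a fine pitch contour, the sketch below block-averages a frame-level contour and holds each average over its block. This is not the extraction method of section 5.2, only a simple stand-in. The function name `coarse_contour`, the `window` parameter, the assumption that every frame is voiced, and the toy pitch values (in cents) are all hypothetical.

```python
def coarse_contour(pitch, window=5):
    """Block-average a frame-level pitch contour.

    Hypothetical stand-in for a coarse conditioning signal; it is not the
    method described in section 5.2. Assumes every frame is voiced.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    coarse = []
    for start in range(0, len(pitch), window):
        block = pitch[start:start + window]
        level = sum(block) / len(block)      # mean pitch of this block
        coarse.extend([level] * len(block))  # hold it so the length is unchanged
    return coarse


# Toy contour in cents: a rising glide followed by a held note.
fine = [0.0, 10.0, 20.0, 30.0, 40.0, 100.0, 100.0]
print(coarse_contour(fine, window=5))
# prints [20.0, 20.0, 20.0, 20.0, 20.0, 100.0, 100.0]
```

A larger `window` discards more of the fine melodic detail, so the coarse signal constrains only the overall shape of the contour.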
Example 1
Although the coarse contours of the input and the generated contour are quite similar, the generated samples tend to be smoother than the input.
Ground truth resynthesized audio (from which coarse pitch conditioning is extracted)
Generated audio
Example 2
Ground truth resynthesized audio (from which coarse pitch conditioning is extracted)