Descending Scale in Kafi Thaat

Conditioning Signal

Audio Samples

Conditioning Contour

Conditioned Generation

Fig 5: A staircase descending scale (in blue) used as the coarse input. This input is processed as described in Section 5.2 and fed into the model. The generated fine-grained contour (in orange) shows glides (mindh) and jerky movements (gamak) characteristic of Hindustani music.
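To make the coarse input concrete, the following is a minimal sketch of how a staircase descending scale in Kafi thaat could be built as a frame-wise pitch contour. It is an illustration, not the processing from Section 5.2. The function name `staircase_descending`, the tonic of 220 Hz, the note duration, the frame rate, and the use of equal-tempered semitones are all assumptions made for this example.

```python
# Kafi thaat scale degrees as semitone offsets above the tonic (Sa):
# Sa, Re, komal Ga, Ma, Pa, Dha, komal Ni, upper Sa.
KAFI_SEMITONES = [0, 2, 3, 5, 7, 9, 10, 12]


def staircase_descending(tonic_hz=220.0, note_dur_s=0.5, frame_rate_hz=100):
    """Return (times_s, f0_hz) for a piecewise-constant descending scale.

    All defaults are assumed values chosen for illustration only.
    Pitches use equal temperament, which is a simplification.
    """
    frames_per_note = int(round(note_dur_s * frame_rate_hz))
    f0_hz = []
    # Walk from the upper Sa down to the tonic, holding each note flat.
    for semitone in reversed(KAFI_SEMITONES):
        pitch = tonic_hz * 2.0 ** (semitone / 12.0)
        f0_hz.extend([pitch] * frames_per_note)
    times_s = [i / frame_rate_hz for i in range(len(f0_hz))]
    return times_s, f0_hz


times_s, f0_hz = staircase_descending()
```

Each note is held at a constant pitch, so the contour has no glides or ornaments of its own; any mindh or gamak in the output has to come from the model.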

Spectrogram of the generated sample in Fig 5

Example of an expert musician singing the same scale

Other examples of conditioning from the dataset

For each example, we plot the ground-truth pitch contour together with the coarse conditioning signal extracted from it (described in Section 5.2), and the generated pitch contour together with the coarse contour extracted from it.

Note: Ground-truth audio is resynthesized with our Spectrogram Generator + vocoder.

Example 1

Even though the coarse contours of the input and the generated sample are very similar, the generated samples tend to be smoother than the input.