GaMaDHaNi: Hierarchical Generative Modeling of Melodic Vocal Contours in Hindustani Classical Music

5.2 Coarse Pitch Conditioning

Descending Scale in Kafi Thaat

Conditioning Signal

Depiction of conditioning signal in sheet music.

Audio Samples

Fig 5: A staircase descending scale (in blue) as a coarse input. This input is then processed as described in Section 5.2 and fed into the model. The generated fine-grained contour (in orange) has glides (mindh) and jerky movements (gamak) characteristic of Hindustani music.
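To make the shape of the coarse input concrete, the following is a minimal sketch of how a staircase descending scale in Kafi Thaat could be built as a piecewise-constant pitch contour. The function name `staircase_descending`, the tonic of 220 Hz, the note duration, and the frame rate are illustrative assumptions and are not taken from the paper; the further processing applied before the signal reaches the model (Section 5.2) is not reproduced here.

```python
import numpy as np

# Kafi Thaat scale degrees as semitone offsets above the tonic (Sa):
# Sa, Re, komal Ga, Ma, Pa, Dha, komal Ni, upper Sa.
KAFI_SEMITONES = [0, 2, 3, 5, 7, 9, 10, 12]


def staircase_descending(tonic_hz=220.0, note_sec=0.5, frame_rate=100):
    """Return a piecewise-constant pitch contour (Hz) from upper Sa down to Sa.

    All default values are hypothetical choices for illustration only.
    """
    frames_per_note = int(round(note_sec * frame_rate))
    # Reverse the scale so that it descends, then convert semitones to cents.
    cents = np.array(KAFI_SEMITONES[::-1], dtype=float) * 100.0
    # Hold each note for a fixed number of frames to form the staircase.
    cents = np.repeat(cents, frames_per_note)
    # Convert cents above the tonic to frequency in Hz.
    return tonic_hz * 2.0 ** (cents / 1200.0)


contour = staircase_descending()
```

Each of the eight scale degrees is held flat for the same number of frames, so the contour contains no glides or ornaments; in the figure, those appear only in the generated output.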

Spectrogram of generated sample in Fig 5

Example of an expert musician singing the same scale

Other examples of conditioning from the dataset

For each example, we plot the ground-truth pitch contour together with the coarse conditioning signal extracted from it (as described in Section 5.2), and the generated pitch contour together with the coarse contour extracted from it.
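As a rough illustration of what extracting a coarse contour from a fine one can look like, the sketch below smooths a pitch contour with a moving average and then holds the value over fixed-length blocks. This is a stand-in technique chosen for illustration, not the procedure of Section 5.2, which is not reproduced on this page; the function name `coarse_contour` and the `win` and `hop` values are assumptions.

```python
import numpy as np


def coarse_contour(f0_cents, win=51, hop=25):
    """Moving-average smoothing followed by sample-and-hold.

    Hypothetical stand-in for a coarse-contour extractor; `win` (odd) and
    `hop` are illustrative values, not settings from the paper.
    """
    f0 = np.asarray(f0_cents, dtype=float)
    pad = win // 2
    # Edge-pad so the smoothed contour keeps the input length.
    padded = np.pad(f0, pad, mode="edge")
    kernel = np.ones(win) / win
    smooth = np.convolve(padded, kernel, mode="valid")
    # Keep one value every `hop` frames and hold it, giving a blocky contour.
    held = np.repeat(smooth[::hop], hop)[: len(f0)]
    return held


# Usage: a slow descent with a fast oscillation superimposed on it.
t = np.linspace(0.0, 1.0, 400)
fine = 1200.0 * (1.0 - t) + 50.0 * np.sin(2.0 * np.pi * 20.0 * t)
coarse = coarse_contour(fine)
```

Applying the same extractor to both the ground-truth and the generated contour is what makes the two coarse curves in the plots directly comparable.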

Note: Ground-truth audio is resynthesized with our Spectrogram Generator + vocoder.

Example 1

Although the coarse contours of the input and of the generated sample are very similar, the generated contour tends to be smoother than the input.

(Top) Ground truth contour and input coarse conditioning. (Bottom) Generated contour and ‘coarse’ contour extracted from the generated sample.

Example 2

(Top) Ground truth contour and input coarse conditioning. (Bottom) Generated contour and ‘coarse’ contour extracted from the generated sample.
