
COLA == Training Instability? #51

Open
zaptrem opened this issue May 5, 2024 · 0 comments

Comments


zaptrem commented May 5, 2024

I'm training a Vocos decoder for my DAC autoencoder. When I set hop_length = 256 and n_fft = 1024 in the iSTFT head, the discriminators quickly win within 1000 steps. However, this doesn't happen when I set n_fft = 512, 768, or 1026. Do you know why this is happening, and whether using 1026 would affect quality? I don't completely understand the COLA (constant overlap-add) property.
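For reference, whether a given window/hop combination satisfies COLA can be checked numerically with SciPy's `check_COLA`. This is just an illustrative sketch for the settings mentioned above, assuming a periodic Hann window; the actual window used by the Vocos iSTFT head may differ:

```python
# Sketch: numerically check the constant overlap-add (COLA) property
# for the n_fft values mentioned above at hop_length = 256.
# Assumes a periodic Hann analysis window (typical for STFT); this is
# illustrative and not taken from the Vocos code itself.
from scipy.signal import check_COLA
from scipy.signal.windows import hann

hop_length = 256
for n_fft in (512, 768, 1024, 1026):
    window = hann(n_fft, sym=False)      # periodic Hann
    noverlap = n_fft - hop_length        # samples shared between frames
    print(n_fft, check_COLA(window, n_fft, noverlap))
```

Note that a periodic Hann window satisfies COLA whenever the hop divides the window length evenly (e.g. 50% or 75% overlap), so 512, 768, and 1024 with hop 256 are all COLA-compliant under that assumption.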
