Text-to-Speech Model Employing Content-Style Disentanglement Using GANs

This is the code for a natural-language-generating Text-to-Speech model built on Generative Adversarial Networks (GANs). It employs content-style disentanglement to synthesize high-fidelity audio with the correct verbal content and the desired auditory style and tone.

I developed this model while working as a Research Assistant at the Center for Cloud Computing and Big Data.

Preprocess

python preprocess/make_dataset_vctk.py vctk.h5
python preprocess/make_single_samples.py vctk.h5 index.json
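
The two preprocessing commands above have to run in order, because the second reads the vctk.h5 file that the first writes. As a convenience, they could be chained from one small Python wrapper like the sketch below. This wrapper is not part of the repository: the names build_commands and run_preprocessing are hypothetical, and it assumes the scripts accept exactly the arguments shown above and are run from the repository root.

```python
import subprocess
import sys


def build_commands(h5_path="vctk.h5", index_path="index.json"):
    """Return the two preprocessing commands as argv lists, in run order.

    Hypothetical helper: the script paths and arguments simply mirror the
    commands listed in this README.
    """
    return [
        [sys.executable, "preprocess/make_dataset_vctk.py", h5_path],
        [sys.executable, "preprocess/make_single_samples.py", h5_path, index_path],
    ]


def run_preprocessing(h5_path="vctk.h5", index_path="index.json"):
    """Run both steps in order, stopping at the first failure."""
    for argv in build_commands(h5_path, index_path):
        # check=True raises CalledProcessError if a step exits non-zero,
        # so the second step never runs against a missing or partial HDF5 file.
        subprocess.run(argv, check=True)
```

Calling run_preprocessing() from the repository root would then be equivalent to running the two commands by hand.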

Training

python main.py