Text-to-Speech Model employing Content-Style Disentanglement using GANs

This is the code for a natural-language-generating Text-to-Speech model built upon Generative Adversarial Networks (GANs). It employs content-style disentanglement to synthesize high-fidelity audio with the correct verbal content and the desired auditory style and tone.

I developed this model while working as a Research Assistant at the Center for Cloud Computing and Big Data.

Preprocess

python preprocess/make_dataset_vctk.py vctk.h5
python preprocess/make_single_samples.py vctk.h5 index.json

Training

python main.py
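
The three commands above depend on each other: the second preprocessing script reads the vctk.h5 file named in the first, and training comes last. A minimal sketch of a wrapper that runs them in order follows. The helper names `build_pipeline` and `run_pipeline` are hypothetical and not part of this repository. The sketch also assumes the commands are launched from the repository root, and that the first argument to each preprocessing script is the HDF5 path, as the file names suggest.

```python
import subprocess


def build_pipeline(h5_path="vctk.h5", index_path="index.json", python="python"):
    """Return the README's commands, in order, as argv lists.

    Hypothetical helper: the script paths and argument order are copied
    from the commands in this README, nothing else is assumed.
    """
    return [
        [python, "preprocess/make_dataset_vctk.py", h5_path],
        [python, "preprocess/make_single_samples.py", h5_path, index_path],
        [python, "main.py"],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the pipeline at the first failing step,
            # so training never starts on incomplete preprocessing output.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the commands.
    run_pipeline(build_pipeline(), dry_run=True)
```

Passing `dry_run=False` runs the steps for real. Because the same `h5_path` is threaded through both preprocessing steps, the two file names cannot drift apart.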
