SOFA_AI: Singing-Oriented Forced Aligner for Automatic Inference

Introduction

SOFA_AI (Singing-Oriented Forced Aligner for Automatic Inference) combines FunASR and SOFA to produce phoneme-level labels for dry vocals directly, without requiring lyric annotations or speech transcripts. It can reduce the manual phoneme-labeling workload when preparing data for DiffSinger.

Note:

The current code was written and corrected with the help of ChatGPT-4 and may contain bugs and recognition errors. If you find a problem, please open an issue.

There are plans to integrate the openai/whisper project and to explore combining ASR with SOFA for confidence assessment. Stay tuned.

How to Use

Environment Setup

Create and enter a Python 3.10 environment:

```
conda create -n SOFA_AI python=3.10 -y
conda activate SOFA_AI
```

Visit the official PyTorch website and install the torch build that matches your device.

(Optional, to avoid downloading multiple versions of the same library) Install PyTorch Lightning separately:
```
pip install lightning
```

Clone the repository and enter the code directory:

```
git clone https://github.com/colstone/SOFA_AI.git
cd SOFA_AI
```

Install the remaining libraries:
```
pip install -r requirements.txt
```
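
Before running inference, it can help to confirm that the environment matches the steps above. The following is a minimal sketch, not part of SOFA_AI itself: the module names in `REQUIRED_MODULES` are assumptions and should be adjusted to whatever `requirements.txt` actually lists.

```python
import importlib.util
import sys

# Assumed module names; adjust them to match requirements.txt.
REQUIRED_MODULES = ("torch", "lightning", "funasr")


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_version_ok(version_info=sys.version_info, expected=(3, 10)):
    """Check that the interpreter matches the Python version used in the setup steps."""
    return tuple(version_info[:2]) == expected


if __name__ == "__main__":
    if not python_version_ok():
        print(f"Expected Python 3.10, found {sys.version.split()[0]}")
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated `SOFA_AI` environment. An empty "missing" list only means the modules can be located, not that the installed versions are compatible.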

Inference

Run the code:
```
python SOFA_AI.py
```
On startup, the script downloads the FunASR model from ModelScope. Once the download finishes, it prompts for:
- WAV file or folder path: Drag and drop the WAV file or folder into the command line window.
- SOFA model path: Drag and drop the SOFA model into the command line window.
- Dictionary path: Drag and drop the dictionary into the command line window.
- Phoneme label format (TextGrid or HTK lab): Enter textgrid or htk.
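
Paths dropped into a terminal often arrive wrapped in quotes or with trailing spaces. The sketch below shows one way such answers could be tidied before use. It is an illustration only: the helper names are hypothetical, and this README does not say whether `SOFA_AI.py` already performs this cleanup.

```python
from pathlib import Path

VALID_FORMATS = ("textgrid", "htk")


def clean_dropped_path(raw):
    """Strip whitespace and one pair of surrounding quotes from a dropped path."""
    text = raw.strip()
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        text = text[1:-1]
    return Path(text)


def collect_wavs(path):
    """Return the WAV files for a single file or a folder (non-recursive)."""
    if path.is_dir():
        return sorted(p for p in path.iterdir() if p.suffix.lower() == ".wav")
    return [path] if path.suffix.lower() == ".wav" else []


def normalize_format(raw):
    """Lower-case the answer and reject anything other than textgrid or htk."""
    choice = raw.strip().lower()
    if choice not in VALID_FORMATS:
        raise ValueError(f"Expected one of {VALID_FORMATS}, got {raw!r}")
    return choice
```

For example, `clean_dropped_path('"D:/vocals/take 1.wav"')` yields a `Path` without the quotes, and `normalize_format("TextGrid")` returns `"textgrid"`.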

Then, simply wait for the code to finish running.

If you need the text (character) labs or pinyin labs inferred by FunASR, for example to correct labels by hand or to run MFA/SOFA yourself, look in the character or pinyin folder.
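
To review the FunASR transcripts in bulk, the labs from both folders can be loaded and printed side by side. The sketch below assumes the files use a `.lab` suffix, are UTF-8 encoded, and share file names across the `character` and `pinyin` folders. None of this is confirmed by this README, so adjust the paths and suffix to what you actually see on disk.

```python
from pathlib import Path


def load_transcripts(folder, suffix=".lab"):
    """Map each file stem in the folder to its stripped text content."""
    return {
        p.stem: p.read_text(encoding="utf-8").strip()
        for p in sorted(Path(folder).glob(f"*{suffix}"))
    }


def pair_transcripts(character_dir, pinyin_dir):
    """Return (stem, characters, pinyin) rows for stems present in both folders."""
    chars = load_transcripts(character_dir)
    pinyin = load_transcripts(pinyin_dir)
    return [(stem, chars[stem], pinyin[stem]) for stem in sorted(chars) if stem in pinyin]


if __name__ == "__main__":
    # Folder locations are assumptions; point these at your own output folders.
    for stem, text, py in pair_transcripts("character", "pinyin"):
        print(f"{stem}\t{text}\t{py}")
```

Reading the two columns together makes recognition errors easier to catch before the labs are reused for alignment.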

Open Source Projects Used in This Project

qiuqiao/SOFA: SOFA: Singing-Oriented Forced Aligner

alibaba-damo-academy/FunASR: A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.

We sincerely thank the developers/development teams of the above projects.
