FLORES-101

FLORES-101 is a many-to-many multilingual translation benchmark dataset covering 101 languages.
- Paper: The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation
- Download: FLORES-101 dataset
- Evaluation server: Dynabench; instructions to submit a model
Looking for FLORESv1, which included Nepali, Sinhala, Pashto, and Khmer? Click here
One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the FLORES evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated into 101 languages by professional translators through a carefully controlled process. The resulting dataset enables better assessment of model quality on the long tail of low-resource languages, including the evaluation of many-to-many multilingual translation systems, as all translations are multilingually aligned. By publicly releasing such a high-quality and high-coverage dataset, we hope to foster progress in the machine translation community and beyond.
The data can be downloaded from https://dl.fbaipublicfiles.com/flores101/dataset/flores101_dataset.tar.gz (see the commands below).
For evaluation, we use SentencePiece BLEU (spBLEU): the text is first tokenized with a SentencePiece (SPM) model with a 256K-token vocabulary, and BLEU is then computed on the sentence-piece-tokenized text. This requires installing sacrebleu from a specific branch:
git clone --single-branch --branch adding_spm_tokenized_bleu https://github.com/ngoyal2707/sacrebleu.git
cd sacrebleu
python setup.py install
cd ~/
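spBLEU is ordinary BLEU applied to SentencePiece pieces rather than whitespace words, which makes scores comparable across languages with very different orthographies. As a rough illustration only (not the sacrebleu implementation, which adds corpus-level aggregation and smoothing), a minimal single-reference BLEU over already-tokenized pieces looks like this:

```python
import math
from collections import Counter

def bleu(hyp_tokens, ref_tokens, max_n=4):
    """Sentence-level BLEU over pre-tokenized text.

    spBLEU is the same computation, fed SentencePiece pieces
    (e.g. "▁The", "▁cat") instead of words.
    """
    if not hyp_tokens:
        return 0.0
    precisions = []
    for n in range(1, max_n + 1):
        # Clipped n-gram precision: overlap of hypothesis and reference n-grams.
        hyp_ngrams = Counter(tuple(hyp_tokens[i:i + n])
                             for i in range(len(hyp_tokens) - n + 1))
        ref_ngrams = Counter(tuple(ref_tokens[i:i + n])
                             for i in range(len(ref_tokens) - n + 1))
        overlap = sum((hyp_ngrams & ref_ngrams).values())
        total = max(sum(hyp_ngrams.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty discourages overly short hypotheses.
    bp = math.exp(min(0.0, 1 - len(ref_tokens) / len(hyp_tokens)))
    return 100 * bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A perfect match scores 100; in practice you would obtain the pieces from the 256K-vocabulary SPM model shipped with the benchmark and use sacrebleu for the official numbers.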
wget https://dl.fbaipublicfiles.com/flores101/dataset/flores101_dataset.tar.gz
tar -xvzf flores101_dataset.tar.gz
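After extraction, the tarball unpacks into per-split folders with one sentence per line per language (we assume the layout `flores101_dataset/{split}/{lang}.{split}`, e.g. `dev/eng.dev`; adjust the path pattern if your copy differs). Since all translations are multilingually aligned, line i of every language file translates the same source sentence, so zipping two files yields a parallel corpus. A minimal loader sketch:

```python
from pathlib import Path

def load_split(root, lang, split="dev"):
    """Read one FLORES split for one language, one sentence per line.

    Assumes files are laid out as {root}/{split}/{lang}.{split};
    this layout is an assumption about the released tarball.
    """
    path = Path(root) / split / f"{lang}.{split}"
    return path.read_text(encoding="utf-8").splitlines()

# Example (paths assume the extracted tarball directory):
# eng = load_split("flores101_dataset", "eng")
# npi = load_split("flores101_dataset", "npi")
# parallel = list(zip(eng, npi))  # aligned English-Nepali sentence pairs
```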