Can the mT5 model on Hugging Face be used for machine translation?

The mT5 model is pretrained on the mC4 corpus, covering 101 languages:

Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Sotho, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, West Frisian, Xhosa, Yiddish, Yoruba, Zulu.

Can it do machine translation?

Many users have tried something like this but it fails to generate a translation:

from transformers import MT5ForConditionalGeneration, T5Tokenizer

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

tokenizer = T5Tokenizer.from_pretrained("google/mt5-small")

article = "translate to french: The capital of France is Paris."

# Calling the tokenizer directly replaces the deprecated prepare_seq2seq_batch; the input ids are the same.
batch = tokenizer([article], return_tensors="pt")
output_ids = model.generate(input_ids=batch.input_ids, num_return_sequences=1, num_beams=8, length_penalty=0.1)

tokenizer.decode(output_ids[0])

[out]:

>>> <pad> <extra_id_0></s>

How do we make the mT5 model do machine translation?

asked Oct 29 '25 17:10 by alvas


1 Answer

Can it do machine translation?

From the documentation:

Note: mT5 was only pre-trained on mC4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.

Therefore, no, it cannot do machine translation out of the box.
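The `<pad> <extra_id_0></s>` output in the question is consistent with this. mT5's pretraining objective is span corruption: spans of the input are replaced by sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...) and the model learns to emit each sentinel followed by the missing text. A prefix such as "translate to french:" was never seen as a task instruction, so the model just starts a sentinel sequence. The toy function below is only an illustration of that input/target format in plain Python, with whitespace tokens and hand-picked spans as simplifying assumptions; it is not the actual pretraining code.

```python
from typing import List, Tuple


def span_corrupt(tokens: List[str], spans: List[Tuple[int, int]]) -> Tuple[str, str]:
    """Replace each (start, end) token span with a sentinel token.

    Spans must be sorted and non-overlapping; `end` is exclusive.
    Returns the corrupted input text and the target text.
    """
    corrupted: List[str] = []
    target: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        corrupted.extend(tokens[cursor:start])  # keep text before the span
        corrupted.append(sentinel)              # hide the span in the input
        target.append(sentinel)                 # target names the sentinel...
        target.extend(tokens[start:end])        # ...then the hidden text
        cursor = end
    corrupted.extend(tokens[cursor:])
    target.append(f"<extra_id_{len(spans)}>")   # closing sentinel
    return " ".join(corrupted), " ".join(target)


tokens = "The capital of France is Paris .".split()
corrupted, target = span_corrupt(tokens, [(1, 2), (5, 6)])
print(corrupted)  # The <extra_id_0> of France is <extra_id_1> .
print(target)     # <extra_id_0> capital <extra_id_1> Paris <extra_id_2>
```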

See also https://github.com/huggingface/transformers/issues/8704

How do we make the mT5 model do machine translation?

Fine-tune it on parallel data, that is, pairs of source sentences and their reference translations.
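As a rough illustration of what fine-tuning on parallel data involves, here is a minimal sketch of a single training step. It assumes `torch`, `transformers` and `sentencepiece` are installed. The two-sentence corpus, the task prefix, the learning rate and the helper names `build_examples` and `train_step_loss` are all made up for this example; a real run needs far more data, many steps, batching and evaluation.

```python
from typing import Dict, List, Tuple

# Hypothetical toy corpus of (English, French) pairs, for illustration only.
PARALLEL_DATA: List[Tuple[str, str]] = [
    ("The capital of France is Paris.", "La capitale de la France est Paris."),
    ("I like tea.", "J'aime le thé."),
]

# Arbitrary task prefix; mT5 only learns what it means during fine-tuning.
PREFIX = "translate English to French: "


def build_examples(pairs: List[Tuple[str, str]], prefix: str = PREFIX) -> Dict[str, List[str]]:
    """Turn (source, target) pairs into prefixed input texts and target texts."""
    return {
        "inputs": [prefix + src for src, _ in pairs],
        "targets": [tgt for _, tgt in pairs],
    }


def train_step_loss(examples: Dict[str, List[str]], model_name: str = "google/mt5-small") -> float:
    """Run one optimisation step and return its loss (downloads the checkpoint)."""
    import torch
    from transformers import AutoTokenizer, MT5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = MT5ForConditionalGeneration.from_pretrained(model_name)
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)  # assumed learning rate

    # text_target tokenizes the translations and stores them under "labels".
    batch = tokenizer(
        examples["inputs"],
        text_target=examples["targets"],
        padding=True,
        truncation=True,
        max_length=64,
        return_tensors="pt",
    )
    # Padding positions in the labels are set to -100 so the loss ignores them.
    labels = batch["labels"].clone()
    labels[labels == tokenizer.pad_token_id] = -100

    model.train()
    loss = model(
        input_ids=batch["input_ids"],
        attention_mask=batch["attention_mask"],
        labels=labels,
    ).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Calling `train_step_loss(build_examples(PARALLEL_DATA))` performs one update. After enough such updates on real parallel data, `model.generate` on an input with the same prefix should start producing translations rather than sentinel tokens.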

Several mT5 models that have already been fine-tuned for translation are shared at https://huggingface.co/models?pipeline_tag=translation&sort=downloads&search=mt5

If you want to fine-tune mT5 on your own data, here is a sample reference notebook: https://github.com/ejmejm/multilingual-nmt-mt5/blob/main/nmt_full_version.ipynb

answered Oct 31 '25 07:10 by alvas