
What's the prefix for "Helsinki-NLP/opus-mt-zh-en" when using "AutoModelForSeq2SeqLM"? #98

Open
blackli7 opened this issue Feb 23, 2024 · 0 comments


blackli7 commented Feb 23, 2024

I was trying to translate a large amount of Chinese text into English, so I tried to use "AutoModelForSeq2SeqLM" instead of "Pipeline" to run the translation in parallel on GPUs. But I found that the two give different results.

When using "Pipeline" as follows:
from transformers import pipeline

pipe1 = pipeline("translation", model="Helsinki-NLP/opus-mt-zh-en")
outputs1 = pipe1("比特币系统性能的428倍!世界第三代公链拥有方与上海这所高校达成合作")
print(outputs1)
I got "That's 428 times the amount of Bitcoin's systematic power! The third generation of public chain owners in the world has made a partnership with Shanghai's high school."

However, when using "AutoModelForSeq2SeqLM":
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-zh-en")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-zh-en")
encoded_input = tokenizer(
    ["比特币系统性能的428倍!世界第三代公链拥有方与上海这所高校达成合作"],
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=512,
)
output = model.generate(**encoded_input, max_new_tokens=128, num_beams=1, num_return_sequences=1)
output = tokenizer.batch_decode(output, skip_special_tokens=True)
print(output)
I got "The world's third generation public chain owners have worked with Shanghai's college."

It seems "AutoModelForSeq2SeqLM" only translates the later part of the Chinese sentence. I guess this is because the prefix for this task is missing, but I find that "model.config.prefix" is None. Could you tell me the prefix for this task and this model, or help me do the translation the right way with "AutoModelForSeq2SeqLM"?
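For reference, here is how I tried to compare what the two paths feed into the model (a minimal sketch; I'm assuming the pipeline's preprocess() method is the right place to look, and the variable names are just mine):

from transformers import AutoTokenizer, pipeline

model_name = "Helsinki-NLP/opus-mt-zh-en"
text = "比特币系统性能的428倍!世界第三代公链拥有方与上海这所高校达成合作"

pipe1 = pipeline("translation", model=model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Token ids the pipeline actually passes to the model
pipeline_inputs = pipe1.preprocess(text)
print(pipeline_inputs["input_ids"])

# Token ids from my manual tokenization
manual_inputs = tokenizer([text], return_tensors="pt", padding=True,
                          truncation=True, max_length=512)
print(manual_inputs["input_ids"])

# Where I expected a task prefix to show up; this prints None for me
print(pipe1.model.config.prefix)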

The Colab notebook is https://colab.research.google.com/drive/1OTi_Nc8x1UTFUufbx7HsJfhn4VCUCkAs?usp=sharing
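Since my end goal is to translate a large amount of text in batches on GPUs, this is roughly the loop I plan to use with "AutoModelForSeq2SeqLM" once the outputs match the pipeline (a minimal sketch; the batch size and device handling are just placeholders from my setup):

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "Helsinki-NLP/opus-mt-zh-en"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)
model.eval()

def translate_batch(sentences, batch_size=32):
    # Translate a list of Chinese sentences in fixed-size batches on the GPU
    results = []
    for i in range(0, len(sentences), batch_size):
        batch = sentences[i:i + batch_size]
        encoded = tokenizer(batch, return_tensors="pt", padding=True,
                            truncation=True, max_length=512).to(device)
        with torch.no_grad():
            generated = model.generate(**encoded, max_new_tokens=128, num_beams=1)
        results.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return results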
