This repository has been archived by the owner on Feb 23, 2024. It is now read-only.

[google cloud translate] as a result of translation, the second sentence disappeared #9

Closed
InspiringPeople opened this issue Apr 3, 2020 · 3 comments
Labels
api: translation Issues related to the Cloud Translation API API. 🚨 This issue needs some love. triage me I really want to be triaged.

Comments

@InspiringPeople

Environment details

  1. Specify the API at the beginning of the title (for example, "BigQuery: ...")
    Google Cloud Translate
  2. OS type and version
    Linux ANLZ-G1 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  3. Python version and virtual environment information: python --version
    python 3.7
  4. google-cloud- version: pip show google-<service> or pip freeze
    google-cloud-translate==2.0.1

Steps to reproduce

  1. Use the Google Cloud Translate API to translate two Korean sentences that have no space after the punctuation mark separating them (., ! and so on).

Code example

# Example: translate the same Korean text with and without a space after the period
from google.cloud import translate_v2 as translate

translate_client = translate.Client()

# Case 1: no space after the period
comments = '야 너 왜 자꾸 문제 생겨.자꾸 이러면 못써'
myTrans1 = translate_client.translate(comments, target_language='en')
print('case 1 : ', myTrans1['translatedText'])

# Case 2: one space after the period
comments = '야 너 왜 자꾸 문제 생겨. 자꾸 이러면 못써'
myTrans2 = translate_client.translate(comments, target_language='en')
print('case 2 : ', myTrans2['translatedText'])

Result

case 1 : Hey, why do you keep having problems?
case 2 : Hey, why do you keep having trouble? If I keep doing this, I can't use it

This is 100% reproducible. Running it requires a credential key (GOOGLE_APPLICATION_CREDENTIALS) because Google Cloud Translate V2 is a paid service.

The only difference between the two inputs is whether there is a space after the period. However, in case 1 (no space), the second sentence disappeared from the result without being translated.

@busunkim96 busunkim96 transferred this issue from googleapis/google-cloud-python Apr 3, 2020
@product-auto-label product-auto-label bot added the api: translation Issues related to the Cloud Translation API API. label Apr 3, 2020
@busunkim96
Contributor

Interesting. This looks like something coming from the backend. I tried the v2 REST API directly, the v3 client, and translate.google.com.

REST API:

https://translation.googleapis.com/language/translate/v2/?q=%EC%95%BC%20%EB%84%88%20%EC%99%9C%20%EC%9E%90%EA%BE%B8%20%EB%AC%B8%EC%A0%9C%20%EC%83%9D%EA%B2%A8.%EC%9E%90%EA%BE%B8%20%EC%9D%B4%EB%9F%AC%EB%A9%B4%20%EB%AA%BB%EC%8D%A8&source=ko&target=en&key=YOUR_API_KEY
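For anyone who wants to try other inputs against the REST endpoint, here is a minimal sketch of how a URL like the one above can be built with the standard library. The helper name `build_translate_url` is hypothetical, and `YOUR_API_KEY` is a placeholder as in the URL above; the sketch only builds the URL and does not send a request.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the URL above.
BASE_URL = "https://translation.googleapis.com/language/translate/v2/"


def build_translate_url(text, source="ko", target="en", api_key="YOUR_API_KEY"):
    """Hypothetical helper: return a v2 REST URL for one piece of text."""
    params = {"q": text, "source": source, "target": target, "key": api_key}
    # quote_via=quote encodes spaces as %20; the default (quote_plus) would use "+".
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


# The no-space input from the report.
print(build_translate_url("야 너 왜 자꾸 문제 생겨.자꾸 이러면 못써"))
```

The period is left unencoded because it is an unreserved URL character, so the missing space after it reaches the backend unchanged.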

v3 client:

from google.cloud import translate

client = translate.TranslationServiceClient()

parent = client.location_path('PROJECT_ID', "global")

text_no_space = '야 너 왜 자꾸 문제 생겨.자꾸 이러면 못써'
text_with_space = '야 너 왜 자꾸 문제 생겨. 자꾸 이러면 못써'


for text in [text_no_space, text_with_space]:
    response = client.translate_text(
        parent=parent,
        contents=[text],
        mime_type="text/plain",
        source_language_code="ko",
        target_language_code="en",
    )
    # Display the translation for each input text provided
    for translation in response.translations:
        print(u"Translated text: {}".format(translation.translated_text))

From poking around with different inputs, this doesn't seem to always happen:
너 왜 자꾸 문제 생겨.이러면 못써 -> Why do you keep having problems?
너 왜 자꾸 문제 생겨.너 이러면 못써 -> Why do you keep having trouble, you can't use this

If feasible, you could insert spaces around the punctuation as a work-around.
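A minimal sketch of that workaround, assuming it is enough to add a space only where punctuation is immediately followed by a Hangul syllable (so decimals such as 3.14 are left alone). The helper name `add_space_after_punctuation` and the choice of punctuation marks are assumptions for illustration, not part of the client library.

```python
import re

# Zero-width position between a punctuation mark (. , ! ?) and a Hangul
# syllable (U+AC00 to U+D7A3). The punctuation set is an assumption.
_MISSING_SPACE = re.compile(r"(?<=[.,!?])(?=[\uac00-\ud7a3])")


def add_space_after_punctuation(text):
    """Hypothetical helper: insert a space where one is missing after punctuation."""
    return _MISSING_SPACE.sub(" ", text)


text_no_space = '야 너 왜 자꾸 문제 생겨.자꾸 이러면 못써'
print(add_space_after_punctuation(text_no_space))
```

The cleaned string can then be passed to `translate_client.translate(...)` or `client.translate_text(...)` in place of the raw input. Text that already has the space is returned unchanged.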

@czahedi Where is the correct place to provide feedback on translation quality?

@czahedi

czahedi commented Apr 3, 2020

Howdy! Just checked with the translation team and they have a public feedback / issue tracking forum here: https://issuetracker.google.com/issues?q=componentid:187144

@czahedi

czahedi commented Apr 3, 2020

@InspiringPeople I encourage you to file an issue on the tracker linked above so your feedback goes directly to the product team, and to use Bu Sun's workaround in the meantime. Thanks!

@czahedi czahedi closed this as completed Apr 3, 2020
@yoshi-automation yoshi-automation added triage me I really want to be triaged. 🚨 This issue needs some love. labels Apr 7, 2020