Allow passing files not at beginning to 'ResumableUpload' #165

Open
retnikt opened this issue Dec 19, 2019 · 6 comments
Labels: api: storage, type: feature request

Comments


retnikt commented Dec 19, 2019

Environment details

Ubuntu, Python 3.7, Google Cloud Storage 1.23.0

Steps to reproduce

  1. Create a Blob
  2. Use blob.upload_from_file(fp) where fp is not at the beginning of the stream (i.e. it has been sought)

Code example

import io
from google.cloud import storage
client = storage.Client()
bucket = client.bucket("somebucket")
blob = bucket.blob("path/to/blob.txt")
fp = io.BytesIO(b'\0' * 100)
fp.seek(40)
blob.upload_from_file(fp)

Traceback

Traceback (most recent call last):
  ...
  File "/.../google/cloud/storage/blob.py", line 1262, in upload_from_file
    created_json = self._do_upload(
  File "/.../google/cloud/storage/blob.py", line 1172, in _do_upload
    response = self._do_resumable_upload(
  File ".../google/cloud/storage/blob.py", line 1110, in _do_resumable_upload
    upload, transport = self._initiate_resumable_upload(
  File "/.../google/cloud/storage/blob.py", line 1059, in _initiate_resumable_upload
    upload.initiate(
  File "/.../google/resumable_media/requests/upload.py", line 338, in initiate
    method, url, payload, headers = self._prepare_initiate_request(
  File "/.../google/resumable_media/_upload.py", line 415, in _prepare_initiate_request
    raise ValueError(u"Stream must be at beginning.")
ValueError: Stream must be at beginning.

Description

I cannot see any reason for the check if stream.tell() != 0. It seems to serve only to frustrate users trying to upload from custom unseekable file-like objects (e.g. googleapis/google-cloud-python#7282), or to upload only parts of files. Can it be removed, or moved somewhere more relevant in the codebase if it serves a purpose only there?
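For readers who want to see the behavior without a bucket, here is a minimal sketch of the check as the traceback describes it. `check_stream_at_start` is a hypothetical stand-in written for illustration, not the library's actual function; only the `tell() != 0` condition and the error message come from the report above.

```python
import io


def check_stream_at_start(stream):
    """Hypothetical stand-in for the check described in this issue."""
    if stream.tell() != 0:
        raise ValueError("Stream must be at beginning.")


fp = io.BytesIO(b"\0" * 100)
fp.seek(40)

# A stream that has been sought away from position 0 is rejected.
try:
    check_stream_at_start(fp)
    rejected = False
except ValueError:
    rejected = True

# Rewinding satisfies the check, but a reader would then see all
# 100 bytes rather than only the 60 bytes after the original position.
fp.seek(0)
check_stream_at_start(fp)
```

Rewinding is therefore not equivalent to what the report asks for, which is to upload from the current position onward.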

Author

retnikt commented Dec 19, 2019

whoa, ten-thousandth issue!


shunghsiyu commented Dec 26, 2019

I second this. It would be nice not to need a wrapper around an unseekable file-like object to make it work with upload_from_file.

The if stream.tell() != 0 check seems to belong to googleapis/google-resumable-media-python, though.
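The kind of wrapper mentioned above might look roughly like the sketch below. `OffsetStream` is a hypothetical name, not part of any Google library. It counts the bytes it hands out, so `tell()` starts at 0 without calling `tell()` or `seek()` on the underlying object. Whether the upload code needs further methods (for example `seek()` when recovering from a failed request) has not been verified here.

```python
import io


class OffsetStream:
    """Hypothetical wrapper: reports positions relative to where the
    underlying stream was when it was wrapped."""

    def __init__(self, raw):
        self._raw = raw
        self._consumed = 0  # bytes handed out through this wrapper

    def read(self, size=-1):
        data = self._raw.read(size)
        self._consumed += len(data)
        return data

    def tell(self):
        # Based only on bytes read via the wrapper, so the underlying
        # stream does not have to support tell() or seek().
        return self._consumed


fp = io.BytesIO(bytes(range(100)))
fp.seek(40)

wrapped = OffsetStream(fp)
start = wrapped.tell()    # position as seen through the wrapper
first = wrapped.read(10)  # bytes 40..49 of the underlying buffer
```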

@crwilcox transferred this issue from googleapis/google-cloud-python on Jan 31, 2020

vperezb commented Apr 22, 2020

Hi all, I arrived here because I was struggling to upload an image that I had reduced in size using PIL. Like you, I was trying to upload an object from a "bytes" value.

I hope this is the right way to participate in an issue like this one: by providing a fix that worked in my particular case and that would have helped me when I first reached the issue.

From the issue snippet provided by @retnikt :

from google.cloud import storage
client = storage.Client()
bucket = client.bucket("somebucket")
blob = bucket.blob("path/to/blob.txt")
fp = io.BytesIO(b'\0' * 100)
fp.seek(40)
blob.upload_from_file(fp)

I've just changed the last line to use the upload_from_string method instead of upload_from_file, together with the getvalue() method of the io.BytesIO object.

import io
from google.cloud import storage
client = storage.Client()
bucket = client.bucket("somebucket")
blob = bucket.blob("path/to/blob.txt")
fp = io.BytesIO(b'\0' * 100)
fp.seek(40)
blob.upload_from_string(fp.getvalue())

Hope it helps other people reaching this issue for the same reason I did. Apologies if this is not the way to do it.
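One caveat with this workaround: `BytesIO.getvalue()` returns the entire buffer regardless of the current position, so the `fp.seek(40)` has no effect on what gets uploaded. If the intent is to send only the bytes after the seek position, `fp.read()` returns that part. The short sketch below uses only the standard library to show the difference.

```python
import io

fp = io.BytesIO(bytes(range(100)))
fp.seek(40)

whole = fp.getvalue()  # entire buffer; the position is ignored and unchanged
rest = fp.read()       # only the bytes from the current position onward
```

Either result is a `bytes` object, so it can be passed on in the same way as `fp.getvalue()` in the snippet above.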

Author

retnikt commented Apr 22, 2020

@vperezb the issue with this is that if the file-like object is a stream rather than in memory (e.g. from the network or disk), then using a string requires loading all the data into memory.
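To illustrate the memory point, the sketch below reads from the current position in fixed-size chunks, so only one chunk is held at a time. `iter_chunks` is a hypothetical helper for illustration and does not claim to mirror how the library reads its input.

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield successive chunks starting at the stream's current position."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


fp = io.BytesIO(b"\0" * 100)
fp.seek(40)

# 60 bytes remain after the seek, read here 32 bytes at a time.
sizes = [len(chunk) for chunk in iter_chunks(fp, 32)]
```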

@tseaver changed the title from "Storage: Blob.upload_from_file Stream must be at beginning" to "Allow passing unseekable files to 'Blob.upload_from_file'" on Aug 17, 2020
@tseaver changed the title from "Allow passing unseekable files to 'Blob.upload_from_file'" to "Allow passing files not at beginning to 'Blob.upload_from_file'" on Aug 17, 2020
Contributor

tseaver commented Aug 17, 2020

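google.resumable_media requires the stream to be at the beginning; this is the source of the "Stream must be at beginning." error in the traceback above.

It is possible that some combination of a stream not at its start and a passed total_bytes might still work, but this issue needs to be traced in that repository.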
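For anyone tracing this, the byte count to pass alongside a stream that is not at its start would presumably be the number of bytes remaining after the current position. The sketch below computes that for a seekable stream; `remaining_bytes` is a hypothetical helper, and whether the library accepts such a value is exactly the open question here, so this is untested against it.

```python
import io
import os


def remaining_bytes(stream):
    """Bytes between the current position and the end (seekable streams only)."""
    current = stream.tell()
    end = stream.seek(0, os.SEEK_END)
    stream.seek(current)  # restore the original position
    return end - current


fp = io.BytesIO(b"\0" * 100)
fp.seek(40)
size = remaining_bytes(fp)
```

This approach does not help with unseekable streams, where the remaining size cannot be discovered without consuming the data.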

@tseaver transferred this issue from googleapis/python-storage on Aug 17, 2020
@tseaver changed the title from "Allow passing files not at beginning to 'Blob.upload_from_file'" to "Allow passing files not at beginning to 'ResumableUpload'" on Aug 17, 2020
@yoshi-automation added the "triage me" and "🚨" labels on Aug 17, 2020
@busunkim96 added the "type: feature request" label and removed the "🚨" and "triage me" labels on Aug 18, 2020
@product-auto-label (bot) added the "api: storage" label on Mar 4, 2021

Thirunayan22 commented Mar 30, 2021

(Quoting @vperezb's comment of Apr 22, 2020 in full, including the upload_from_string(fp.getvalue()) workaround.)

Thank you, worked for me


7 participants