Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seek file before retry #94

Merged
merged 13 commits into from
Dec 22, 2020
Merged

Seek file before retry #94

merged 13 commits into from
Dec 22, 2020

Conversation

maingoh
Copy link
Contributor

@maingoh maingoh commented Dec 22, 2020

When a retry is done due to an API error, the files where not seeked back to the beginning, thus the second time we were sending an empty file.

Also fixed pip install requirements

deepomatic/api/http_helper.py Show resolved Hide resolved
Comment on lines 219 to 220
if hasattr(f, 'seek'):
f.seek(0)
Copy link
Contributor

@thomas-riccardi thomas-riccardi Dec 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about non-seekable files? maybe we should fail instead of keeping the current bug of sending empty data?

Related issue in google storage python client: googleapis/google-resumable-media-python#165
they mandate seekable streams. (that being said I remember seeing in their code that they memory buffered everything that is in flight and not yet confirmed by API, via application-level multi part uploads, so it's strange... maybe it has changed?)

or maybe it doesn't make sense for the files argument to have non seekable inputs; but the scenario should apply to other ways of passing input data: data argument.

Copy link
Contributor Author

@maingoh maingoh Dec 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which non seekable files are you refering to ? Overall either those are bytes or stream that once read need to be reseeked.
I don't think the bug will still be there bug after this PR.

I don't want to break the current behavior now, though we can think about it for later (I would like also to remove the magic one day #33)

Copy link
Contributor

@thomas-riccardi thomas-riccardi Dec 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for reference, after meeting with hugo and some tests:

  • files accepts non-streamable too, the real constraint is: if it's streamable and non seekable, then we cannot support retries: raise error on second try (maybe on first too?)
  • data accepts generator/iterator, or a dict of such thing: hard to check all that for now. maybe types constraints will help.

Comment on lines -5 to -11
try:
# for pip >= 10
from pip._internal.req import parse_requirements
except ImportError:
# for pip <= 9.0.3
from pip.req import parse_requirements

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe only fix the bug in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prevented me to install it on my computer, it would be nice to include it in this PR

@@ -359,7 +359,7 @@ def download_file(url):
filename = os.path.join(tempfile.gettempdir(),
hashlib.sha1(url.encode()).hexdigest() + ext)
if os.path.exists(filename): # avoid redownloading
logger.info("Skipping download of {}: file already exist in ".format(url, filename))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe only fix the bug in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flake got bumped automatically, I think either we freeze it, or if we let it flexible we can embed those kind of small fixes in PRs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requirements-dev.txt should be fully frozen, even for python libs: it's not used by the lib users, it's OK to freeze/pin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok it wasn't in requirement.dev so i moved it there and froze it

Co-authored-by: Thomas Riccardi <thomas@deepomatic.com>
Copy link
Contributor

@thomas-riccardi thomas-riccardi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo last typos to fix

@sonarcloud
Copy link

sonarcloud bot commented Dec 22, 2020

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@maingoh maingoh merged commit 8f80d3c into master Dec 22, 2020
@maingoh maingoh deleted the seek_file_retry branch December 22, 2020 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants