Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: multipart uploads broken by progress logging #2040

Merged
merged 1 commit into from
Jul 22, 2024

Conversation

thundergolfer
Copy link
Contributor

@thundergolfer thundergolfer commented Jul 22, 2024

Describe your changes

Multi-part uploads broke in #1969.

$ dd if=/dev/zero of=/tmp/tempfile bs=1024 count=1423195
1423195+0 records in
1423195+0 records out
1457351680 bytes (1.5 GB, 1.4 GiB) copied, 1.77675 s, 820 MB/s
$ ls -la /tmp/tempfile
-rw-rw-r-- 1 ubuntu ubuntu 1457351680 Jul 22 16:29 /tmp/tempfile
$ MODAL_PROFILE=modal_labs modal volume put volumes-syn-mon /tmp/tempfile tempfile
╭─ Error ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ 0                                                                                                                                                                                     │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
(modal) ubuntu@ip-10-1-1-198:~/modal$ echo $?
1

The issue is that in a multipart upload the first part to finish completes the progress item which makes every subsequent progress update from the other tasks crash. When they crash the HTTP client produces an obscure error because it got an unexpected exception from the readable.

@kramstrom actually suggested preventing these bugs from crashing uploads but I wanted them to crash so we'd find the bugs quicker: https://github.com/modal-labs/modal-client/pull/1969/files#r1678674924

I didn't realize though that the bugs would surface in such an obscure way:

ClientConnectionError("Failed to send bytes into the underlying connection Connection<ConnectionKey(host='modal-blobs.s3-accelerate.amazonaws.com', port=443, is_ssl=True, ssl=True

Fixes

  1. Now want to do the try except wrapping because we didn't catch this bug for ~1 week.
  2. Move the completion of the progress item so that it happens only once per upload not once-per-part.
Backward/forward compatibility checks

Check these boxes or delete any item (or this section) if not relevant for this PR.

  • Client+Server: this change is compatible with old servers
    • no change to communication with server here
  • Client forward compatibility: this change ensures client can accept data intended for later versions of itself
    • no change to IDL here

Note on protobuf: protobuf message changes in one place may have impact to
multiple entities (client, server, worker, database). See points above.


@thundergolfer
Copy link
Contributor Author

@kramstrom I don't have time to add a test before merge as a customer needs the fix, but I think we should add a test of volume put when multipart and progress logging is active.

@thundergolfer
Copy link
Contributor Author

@prbot approve

Copy link

@modal-pr-review-automation modal-pr-review-automation bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved 👍. @RohanArepally will follow-up review this.

@thundergolfer thundergolfer force-pushed the jonathon/fix-uploads-downloads branch from abe89e1 to 1b4a7b0 Compare July 22, 2024 19:08
@thundergolfer thundergolfer merged commit 7f7810d into main Jul 22, 2024
22 checks passed
@thundergolfer thundergolfer deleted the jonathon/fix-uploads-downloads branch July 22, 2024 19:26
@kramstrom
Copy link
Contributor

I think we can even skip the progress_report_cb(complete=True) call which only serves the purpose of deleting the progress bar, which will happen anyway once the upload and post-processing is complete, so it will only stay there during post-processing if we delete it which I think is fine

Copy link
Contributor

@RohanArepally RohanArepally left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants