Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record zstd:chunked format, and annotations, in BlobInfoCache #2487

Merged
merged 7 commits into from
Aug 7, 2024

Conversation

mtrmac
Copy link
Collaborator

@mtrmac mtrmac commented Jul 22, 2024

WIP, premature to review.

This should allow re-using on-registry zstd:chunked data on push, instead of re-pushing every time, or making the output not chunked.

This includes #2321, I’m filing it early just to have an URL to link to in #2189.

@mtrmac mtrmac force-pushed the chunked-bic2 branch 2 times, most recently from 9a182e2 to 1d6935c Compare July 24, 2024 20:28
mtrmac added a commit to mtrmac/libpod that referenced this pull request Jul 24, 2024
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac mtrmac force-pushed the chunked-bic2 branch 2 times, most recently from 8ffa23f to e0ff591 Compare July 27, 2024 15:30
mtrmac added a commit to mtrmac/libpod that referenced this pull request Jul 27, 2024
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac mtrmac force-pushed the chunked-bic2 branch 3 times, most recently from 6256d87 to 8622767 Compare July 27, 2024 20:01
mtrmac added a commit to mtrmac/libpod that referenced this pull request Jul 29, 2024
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac mtrmac force-pushed the chunked-bic2 branch 7 times, most recently from 9d1e0c0 to ca35041 Compare July 30, 2024 18:36
@mtrmac mtrmac force-pushed the chunked-bic2 branch 4 times, most recently from 5b924e1 to 1c9557d Compare August 6, 2024 08:55
mtrmac added a commit to mtrmac/libpod that referenced this pull request Aug 6, 2024
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac
Copy link
Collaborator Author

mtrmac commented Aug 6, 2024

To test:

## CAREFUL, this is destructive
# podman rmi -a
# rm -f /var/lib/containers/cache/blob-info-cache-v1.sqlite 

- When asking for zstd:chunked, we can reuse an existing blob, incl. annotations
# podman pull quay.io/libpod/alpine
## compressses to chunked
# podman --log-level=debug push --compression-format zstd:chunked --force-compression quay.io/libpod/alpine localhost:50000/chunked
## image is chunked, has annotations
# skopeo inspect --raw docker://localhost:50000/chunked | jq .
## OLD: does not find a match, compresses again
## NEW: reuses existing layer
# podman --log-level=debug push quay.io/libpod/alpine localhost:50000/push2
## OLD: image is compressed using gzip
## NEW: image is chunked, has annotations
# skopeo inspect --raw docker://localhost:50000/push2 | jq .

- When asking for zstd, we can reuse zstd:chunked, incl. annotations
## OLD: Does not find a match, compressses again
## NEW: reuses existing layer
# podman --log-level=debug push --compression-format zstd --force-compression quay.io/libpod/alpine localhost:50000/zstd
## OLD: image is zstd, no annotations
## NEW: image is chunked, has annotations
# skopeo inspect --raw docker://localhost:50000/zstd | jq .

- A copy of zstd:chunked source does not recompress, but the data is only recorded as zstd
# rm -f /var/lib/containers/cache/blob-info-cache-v1.sqlite
## With BIC erased, does not find a match, compressses again
# skopeo --debug copy --dest-compress-format zstd:chunked docker://quay.io/libpod/alpine docker://localhost:50000/recompress
# rm -f /var/lib/containers/cache/blob-info-cache-v1.sqlite 
## Create a dir: copy of the chunked image (and establish blob existence)
# skopeo --debug copy docker://localhost:50000/recompress dir:chunked-dir
## A chunked -> chunked copy of a known-existing image reuses existing blobs
# skopeo --debug copy docker://localhost:50000/recompress docker://localhost:50000/chunked-copy2
## Pull from dir: instead of docker://, it is not partial = TOC-based, and just records general zstd data.
# podman --log-level=debug pull dir:chunked-dir
## See what is the image ID of the pulled image
# podman images
## Reuses an existing blob, does not repush
# podman --log-level=debug push --tls-verify=false 4819930274ea localhost:50000/repush
## The image is zstd (not chunked!) does not have annotations
# skopeo inspect --tls-verify=false --raw docker://localhost:50000/repush | jq .

If we don't know an uncompressed digest, don't try using "".

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The cache implementations are recording both the base and specific compression variant;
CandidateLocations2 all call CandidateTemplateWithCompression to choose the
appropriate variants to return based on CandidateLocations2Options.

This way, neither the BIC implementations nor the transports are not responsible for
converting zstd:chunked entries to zstd entries if the user wants the latter.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Introduce distinct uploadedCompressorBaseVariantName and
uploadedCompressorSpecificVariantName fields; that way we now never
call RecordDigestCompressorData with inconsistent zstd / zstd:chunked in one field,
so we can always record data when we see, or create, a zstd:chunked layer,
removing the current hack.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
- Add a CompressionAnnotations field
- Allow turning a known-zstd blob into a zstd:chunked one if we
  know the right annotations

This just adds the fields, nothing sets them yet, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
- Return the required annotations, if we have them
- If we have a zstd blob and the BIC contains the annotations,
  we don't check for the blob's presence initially. In that case,
  don't skip it if we find the BIC annotations.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... instead of only treating it as zstd.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/libpod that referenced this pull request Aug 6, 2024
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac
Copy link
Collaborator Author

mtrmac commented Aug 6, 2024

Now ready for review. Podman tests are exercised in containers/podman#23397 .

@giuseppe PTAL ; Cc: @kolyshkin

@mtrmac mtrmac marked this pull request as ready for review August 6, 2024 21:16
@mtrmac mtrmac changed the title WIP: Record zstd:chunked format, and annotations, in BlobInfoCache Record zstd:chunked format, and annotations, in BlobInfoCache Aug 6, 2024
Copy link
Member

@giuseppe giuseppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

LGTM

@rhatdan
Copy link
Member

rhatdan commented Aug 7, 2024

LGTM

@rhatdan rhatdan merged commit 8c7c58c into containers:main Aug 7, 2024
10 checks passed
mtrmac added a commit to mtrmac/libpod that referenced this pull request Aug 7, 2024
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac mtrmac deleted the chunked-bic2 branch August 7, 2024 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants