Caching ignores filenames when COPYing a directory of files #2241

Open
fordhurley opened this issue Sep 9, 2022 · 2 comments · May be fixed by #3203
Labels
area/caching, area/performance, cmd/copy, kind/bug, priority/p1, priority/p2

fordhurley commented Sep 9, 2022

Actual behavior

Kaniko appears to ignore filenames when deciding whether it can use a cached layer for a COPY command. If a file is renamed and the directory containing it is then COPYed, Kaniko can reuse a cached layer that still contains the file under its old name.

Expected behavior

Kaniko should not reuse a cached layer that was created from a different set of input files. The resulting image should contain exactly the files that the COPY command adds from the current build context.
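
To illustrate how a rename can go unnoticed, here is a minimal shell sketch of a cache key computed from file contents alone. It is not kaniko's implementation: `content_only_digest` is a hypothetical helper, the file contents are placeholders, and the sketch assumes GNU coreutils `sha256sum` is available.

```shell
#!/bin/sh
# content_only_digest DIR
# Hypothetical helper, not kaniko's code: hashes each file's bytes, sorts the
# per-file hashes, and hashes that list. File names never enter the digest.
content_only_digest() {
    (
        cd "$1" &&
        find . -type f -exec sha256sum {} + |
            cut -d' ' -f1 |
            LC_ALL=C sort |
            sha256sum |
            cut -d' ' -f1
    )
}

# Demo with placeholder contents: rename 2.txt to 3.txt and compare keys.
ctx=$(mktemp -d)
mkdir "$ctx/test"
printf 'one\n' > "$ctx/test/1.txt"
printf 'two\n' > "$ctx/test/2.txt"
before=$(content_only_digest "$ctx/test")
mv "$ctx/test/2.txt" "$ctx/test/3.txt"
after=$(content_only_digest "$ctx/test")
if [ "$before" = "$after" ]; then
    echo "key unchanged after rename: a stale layer would be reused"
fi
rm -rf "$ctx"
```

Any key built this way changes when file contents change but not when a file is renamed, which matches the behavior reported here.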

To Reproduce

Steps to reproduce the behavior:

  1. Build an image using the Dockerfile below, copying in 1.txt and 2.txt.
  2. Run the image to verify that the image contains those two files.
  3. Rename 2.txt to 3.txt in the host file system.
  4. Build a new image. Note the log message "Using caching version of cmd: COPY test test".
  5. Run the image to verify that the image still contains 1.txt and 2.txt and not 3.txt.
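
The steps above could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the file contents are placeholders (the report does not show the originals), and `-ti` is dropped from the `docker run` command so that it also works without a terminal. Running `kaniko_build` requires Docker, a `config.json` with registry credentials in the context directory, and push access to both repositories.

```shell
#!/bin/sh
# make_context DIR: recreate the build context from the report.
make_context() {
    mkdir -p "$1/test"
    cat > "$1/Dockerfile" <<'EOF'
FROM alpine:3.16

COPY test test

CMD ["ls", "-al", "test"]
EOF
    # Placeholder contents; only the file names matter for this issue.
    printf 'contents of file one\n' > "$1/test/1.txt"
    printf 'contents of file two\n' > "$1/test/2.txt"
}

# rename_step DIR: step 3, rename 2.txt to 3.txt on the host.
rename_step() {
    mv "$1/test/2.txt" "$1/test/3.txt"
}

# kaniko_build DIR CACHE_REPO DESTINATION: the build command from the report,
# with the context directory and repositories passed as arguments.
kaniko_build() {
    docker run --rm \
        -v "$1":/workspace \
        -v "$1/config.json":/kaniko/.docker/config.json:ro \
        gcr.io/kaniko-project/executor:v1.9.0 \
        --cache=true \
        --cache-repo="$2" \
        --cache-copy-layers \
        --dockerfile=Dockerfile \
        --destination="$3"
}

# Usage (not run automatically):
#   make_context "$PWD/repro"
#   kaniko_build "$PWD/repro" <cache-repo> <destination>   # steps 1-2
#   rename_step "$PWD/repro"                                # step 3
#   kaniko_build "$PWD/repro" <cache-repo> <destination>   # steps 4-5
```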

Additional Information

  • Dockerfile

    FROM alpine:3.16
    
    COPY test test
    
    CMD ["ls", "-al", "test"]
    
  • Build Context

    .
    ├── Dockerfile
    └── test
        ├── 1.txt
        └── 2.txt
    

    Builds use the command:

    docker run -ti --rm \
     -v `pwd`:/workspace \
     -v `pwd`/config.json:/kaniko/.docker/config.json:ro \
     gcr.io/kaniko-project/executor:v1.9.0 \
     --cache=true \
     --cache-repo=fordhurley/kaniko-cache-issue-cache \
     --cache-copy-layers \
     --dockerfile=Dockerfile \
     --destination=fordhurley/kaniko-cache-issue
    
  • Kaniko Image (fully qualified with digest)

    Unable to find image 'gcr.io/kaniko-project/executor:v1.9.0' locally
    v1.9.0: Pulling from kaniko-project/executor
    Digest: sha256:1f982af0b54be748221d9a35dcfa608660ab3d51229aa56bde5416f75aff7561
    Status: Downloaded newer image for gcr.io/kaniko-project/executor:v1.9.0
    

Triage Notes for the Maintainers

Description Yes/No
Please check if this is a new feature you are proposing
Please check if the build works in docker but not in kaniko
Please check if this error is seen when you use --cache flag
Please check if your dockerfile is a multistage dockerfile
@fordhurley
Author

This doesn't really fit the issue report template, but I believe this shell output shows a concise reproduction of the issue:

$ ls -al test
total 16
drwxr-xr-x  4 ford  staff  128 Sep  9 16:49 .
drwxr-xr-x  8 ford  staff  256 Sep  9 16:24 ..
-rw-r--r--  1 ford  staff   21 Sep  9 16:03 1.txt
-rw-r--r--  1 ford  staff   28 Sep  9 16:59 2.txt

$ docker run -ti --rm \
    -v `pwd`:/workspace \
    -v `pwd`/config.json:/kaniko/.docker/config.json:ro \
    gcr.io/kaniko-project/executor:v1.9.0 \
    --cache=true \
    --cache-repo=fordhurley/kaniko-cache-issue-cache \
    --cache-copy-layers \
    --dockerfile=Dockerfile \
    --destination=fordhurley/kaniko-cache-issue
Unable to find image 'gcr.io/kaniko-project/executor:v1.9.0' locally
v1.9.0: Pulling from kaniko-project/executor
Digest: sha256:1f982af0b54be748221d9a35dcfa608660ab3d51229aa56bde5416f75aff7561
Status: Downloaded newer image for gcr.io/kaniko-project/executor:v1.9.0
INFO[0000] Retrieving image manifest alpine:3.16
INFO[0000] Retrieving image alpine:3.16 from registry index.docker.io
INFO[0000] Retrieving image manifest alpine:3.16
INFO[0000] Returning cached image manifest
INFO[0001] Built cross stage deps: map[]
INFO[0001] Retrieving image manifest alpine:3.16
INFO[0001] Returning cached image manifest
INFO[0001] Retrieving image manifest alpine:3.16
INFO[0001] Returning cached image manifest
INFO[0001] Executing 0 build triggers
INFO[0001] Building stage 'alpine:3.16' [idx: '0', base-idx: '-1']
INFO[0001] Checking for cached layer fordhurley/kaniko-cache-issue-cache:d3e764c4be2943fb8a0415e179e4f042c2ffa2f3bf5c047930899306bad2671e...
INFO[0001] No cached layer found for cmd COPY test test
INFO[0001] Unpacking rootfs as cmd COPY test test requires it.
INFO[0001] COPY test test
INFO[0001] Taking snapshot of files...
INFO[0001] CMD ["ls", "-al", "test"]
INFO[0001] No files changed in this command, skipping snapshotting.
INFO[0001] Pushing layer fordhurley/kaniko-cache-issue-cache:d3e764c4be2943fb8a0415e179e4f042c2ffa2f3bf5c047930899306bad2671e to cache now
INFO[0001] Pushing image to fordhurley/kaniko-cache-issue-cache:d3e764c4be2943fb8a0415e179e4f042c2ffa2f3bf5c047930899306bad2671e
INFO[0003] Pushed index.docker.io/fordhurley/kaniko-cache-issue-cache@sha256:e7d67dd91f8f24b1f61d1931934b55c1fa919e848f5e2d5fe5a7e7b166368091
INFO[0003] Pushing image to fordhurley/kaniko-cache-issue
INFO[0004] Pushed index.docker.io/fordhurley/kaniko-cache-issue@sha256:90cdd719174ca145c37741bb81db69549c8d5a66f26933f2b2529d55760529ba

$ docker run --rm index.docker.io/fordhurley/kaniko-cache-issue@sha256:90cdd719174ca145c37741bb81db69549c8d5a66f26933f2b2529d55760529ba
Unable to find image 'fordhurley/kaniko-cache-issue@sha256:90cdd719174ca145c37741bb81db69549c8d5a66f26933f2b2529d55760529ba' locally
docker.io/fordhurley/kaniko-cache-issue@sha256:90cdd719174ca145c37741bb81db69549c8d5a66f26933f2b2529d55760529ba: Pulling from fordhurley/kaniko-cache-issue
9b18e9b68314: Already exists
d1109d7eb4ad: Pull complete
Digest: sha256:90cdd719174ca145c37741bb81db69549c8d5a66f26933f2b2529d55760529ba
Status: Downloaded newer image for fordhurley/kaniko-cache-issue@sha256:90cdd719174ca145c37741bb81db69549c8d5a66f26933f2b2529d55760529ba
total 16
drwxr-xr-x    2 root     root          4096 Sep  9 21:21 .
drwxr-xr-x    1 root     root          4096 Sep  9 21:22 ..
-rw-r--r--    1 root     root            21 Sep  9 21:21 1.txt
-rw-r--r--    1 root     root            28 Sep  9 21:21 2.txt

$ mv test/2.txt test/3.txt

$ ls -al test
total 16
drwxr-xr-x  4 ford  staff  128 Sep  9 17:22 .
drwxr-xr-x  8 ford  staff  256 Sep  9 16:24 ..
-rw-r--r--  1 ford  staff   21 Sep  9 16:03 1.txt
-rw-r--r--  1 ford  staff   28 Sep  9 16:59 3.txt

$ docker run -ti --rm \
    -v `pwd`:/workspace \
    -v `pwd`/config.json:/kaniko/.docker/config.json:ro \
    gcr.io/kaniko-project/executor:v1.9.0 \
    --cache=true \
    --cache-repo=fordhurley/kaniko-cache-issue-cache \
    --cache-copy-layers \
    --dockerfile=Dockerfile \
    --destination=fordhurley/kaniko-cache-issue
INFO[0000] Retrieving image manifest alpine:3.16
INFO[0000] Retrieving image alpine:3.16 from registry index.docker.io
INFO[0000] Retrieving image manifest alpine:3.16
INFO[0000] Returning cached image manifest
INFO[0001] Built cross stage deps: map[]
INFO[0001] Retrieving image manifest alpine:3.16
INFO[0001] Returning cached image manifest
INFO[0001] Retrieving image manifest alpine:3.16
INFO[0001] Returning cached image manifest
INFO[0001] Executing 0 build triggers
INFO[0001] Building stage 'alpine:3.16' [idx: '0', base-idx: '-1']
INFO[0001] Checking for cached layer fordhurley/kaniko-cache-issue-cache:d3e764c4be2943fb8a0415e179e4f042c2ffa2f3bf5c047930899306bad2671e...
INFO[0001] Using caching version of cmd: COPY test test
INFO[0001] Skipping unpacking as no commands require it.
INFO[0001] COPY test test
INFO[0001] Found cached layer, extracting to filesystem
INFO[0001] CMD ["ls", "-al", "test"]
INFO[0001] No files changed in this command, skipping snapshotting.
INFO[0001] Pushing image to fordhurley/kaniko-cache-issue
INFO[0003] Pushed index.docker.io/fordhurley/kaniko-cache-issue@sha256:e4d710db85b0fda62cb0ea455b6db0c2a9cb0517c50bf9ad9727e9a95291b147

$ docker run --rm index.docker.io/fordhurley/kaniko-cache-issue@sha256:e4d710db85b0fda62cb0ea455b6db0c2a9cb0517c50bf9ad9727e9a95291b147
Unable to find image 'fordhurley/kaniko-cache-issue@sha256:e4d710db85b0fda62cb0ea455b6db0c2a9cb0517c50bf9ad9727e9a95291b147' locally
docker.io/fordhurley/kaniko-cache-issue@sha256:e4d710db85b0fda62cb0ea455b6db0c2a9cb0517c50bf9ad9727e9a95291b147: Pulling from fordhurley/kaniko-cache-issue
9b18e9b68314: Already exists
d1109d7eb4ad: Already exists
Digest: sha256:e4d710db85b0fda62cb0ea455b6db0c2a9cb0517c50bf9ad9727e9a95291b147
Status: Downloaded newer image for fordhurley/kaniko-cache-issue@sha256:e4d710db85b0fda62cb0ea455b6db0c2a9cb0517c50bf9ad9727e9a95291b147
total 16
drwxr-xr-x    2 root     root          4096 Sep  9 21:21 .
drwxr-xr-x    1 root     root          4096 Sep  9 21:22 ..
-rw-r--r--    1 root     root            21 Sep  9 21:21 1.txt
-rw-r--r--    1 root     root            28 Sep  9 21:21 2.txt

@fordhurley
Author

And the equivalent steps using docker build:

$ ls -al test
total 16
drwxr-xr-x  4 ford  staff  128 Sep  9 17:37 .
drwxr-xr-x  8 ford  staff  256 Sep  9 16:24 ..
-rw-r--r--  1 ford  staff   21 Sep  9 16:03 1.txt
-rw-r--r--  1 ford  staff   28 Sep  9 16:59 2.txt

$ docker build -t kaniko-cache-issue .
[+] Building 0.7s (8/8) FINISHED
 => [internal] load build definition from Dockerfile                                                                     0.0s
 => => transferring dockerfile: 36B                                                                                      0.0s
 => [internal] load .dockerignore                                                                                        0.0s
 => => transferring context: 2B                                                                                          0.0s
 => [internal] load metadata for docker.io/library/alpine:3.16                                                           0.6s
 => [auth] library/alpine:pull token for registry-1.docker.io                                                            0.0s
 => [internal] load build context                                                                                        0.0s
 => => transferring context: 122B                                                                                        0.0s
 => [1/2] FROM docker.io/library/alpine:3.16@sha256:bc41182d7ef5ffc53a40b044e725193bc10142a1243f395ee852a8d9730fc2ad     0.0s
 => CACHED [2/2] COPY test test                                                                                          0.0s
 => exporting to image                                                                                                   0.0s
 => => exporting layers                                                                                                  0.0s
 => => writing image sha256:f2fa9cdd45517aeff7a4374848ba51f89a98f5e3fdd31edb42969f339114928f                             0.0s
 => => naming to docker.io/library/kaniko-cache-issue                                                                    0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them

$ docker run --rm kaniko-cache-issue
total 16
drwxr-xr-x    2 root     root          4096 Sep  9 20:08 .
drwxr-xr-x    1 root     root          4096 Sep  9 21:37 ..
-rw-r--r--    1 root     root            21 Sep  9 20:03 1.txt
-rw-r--r--    1 root     root            28 Sep  9 20:07 2.txt

$ mv test/2.txt test/3.txt

$ ls -al test
total 16
drwxr-xr-x  4 ford  staff  128 Sep  9 17:37 .
drwxr-xr-x  8 ford  staff  256 Sep  9 16:24 ..
-rw-r--r--  1 ford  staff   21 Sep  9 16:03 1.txt
-rw-r--r--  1 ford  staff   28 Sep  9 16:59 3.txt

$ docker build -t kaniko-cache-issue .
[+] Building 0.4s (7/7) FINISHED
 => [internal] load build definition from Dockerfile                                                                      0.0s
 => => transferring dockerfile: 36B                                                                                       0.0s
 => [internal] load .dockerignore                                                                                         0.0s
 => => transferring context: 2B                                                                                           0.0s
 => [internal] load metadata for docker.io/library/alpine:3.16                                                            0.3s
 => [internal] load build context                                                                                         0.0s
 => => transferring context: 122B                                                                                         0.0s
 => [1/2] FROM docker.io/library/alpine:3.16@sha256:bc41182d7ef5ffc53a40b044e725193bc10142a1243f395ee852a8d9730fc2ad      0.0s
 => CACHED [2/2] COPY test test                                                                                           0.0s
 => exporting to image                                                                                                    0.0s
 => => exporting layers                                                                                                   0.0s
 => => writing image sha256:43ca5ae45c022bdc97cca3031b70d934c5aa079243be27afdccc274e00b33bda                              0.0s
 => => naming to docker.io/library/kaniko-cache-issue                                                                     0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them

$ docker run --rm kaniko-cache-issue
total 16
drwxr-xr-x    2 root     root          4096 Sep  9 20:09 .
drwxr-xr-x    1 root     root          4096 Sep  9 21:38 ..
-rw-r--r--    1 root     root            21 Sep  9 20:03 1.txt
-rw-r--r--    1 root     root            28 Sep  9 20:07 3.txt

Note the last line of output that shows 3.txt, as expected.

@aaron-prindle added the kind/bug, area/caching, cmd/copy, priority/p1, priority/p2, and area/performance labels on Jun 21, 2023
SJrX pushed a commit to SJrX/kaniko that referenced this issue Jun 16, 2024
Issues GoogleContainerTools#2241 and GoogleContainerTools#1678 both describe cases where renames can lead to incorrect images being used with caching. This
commit adds the path of the file (relative to the build context) to the hash.

A different approach would be to change the underlying function in CacheHasher to include the name (and maybe the file size). This was avoided
for two reasons:

1. It was unclear whether this would change or break the computed digests outside the context of caching.
2. The CacheHasher does not know the prefix to strip in the filename to compute the hash.
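
As a rough illustration of the idea in this commit message, the sketch below hashes each file together with its path relative to the build context. It is not the code from the linked commit: `context_relative_digest` is a hypothetical helper, the file contents are placeholders, and the sketch assumes GNU coreutils `sha256sum`.

```shell
#!/bin/sh
# context_relative_digest CONTEXT_DIR SRC
# Hypothetical helper: hashes every file under CONTEXT_DIR/SRC together with
# its path relative to CONTEXT_DIR. Running find from inside the context
# keeps the absolute location of the context out of the digest.
context_relative_digest() {
    (
        cd "$1" &&
        find "$2" -type f -exec sha256sum {} + |
            LC_ALL=C sort |
            sha256sum |
            cut -d' ' -f1
    )
}

# Demo: the same context in two locations, then a rename in one of them.
a=$(mktemp -d)
b=$(mktemp -d)
for ctx in "$a" "$b"; do
    mkdir "$ctx/test"
    printf 'one\n' > "$ctx/test/1.txt"
    printf 'two\n' > "$ctx/test/2.txt"
done
key_a=$(context_relative_digest "$a" test)
key_b=$(context_relative_digest "$b" test)
mv "$b/test/2.txt" "$b/test/3.txt"
key_renamed=$(context_relative_digest "$b" test)
if [ "$key_a" = "$key_b" ]; then
    echo "identical contexts in different locations: keys match"
fi
if [ "$key_a" != "$key_renamed" ]; then
    echo "after rename: key changes, so the cached layer is not reused"
fi
rm -rf "$a" "$b"
```

Using a context-relative path rather than an absolute one keeps the key stable when the same context is checked out in a different directory, which is the prefix-stripping concern raised in reason 2.
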
@SJrX linked a pull request on Jun 16, 2024 that will close this issue