Cached layers duplicated some of the time #1138

Closed
raijinsetsu opened this issue Mar 16, 2020 · 5 comments

@raijinsetsu

Actual behavior
My image size jumped from 1.1GB to over 2GB overnight. After reviewing the images, it looks like the new image contains two versions of some layers. Most notably, one layer holds the entire node_modules folder for this large mono-repository (~900MB), and the new image appears to contain two different versions of it: docker downloads two layers of more than 900MB each. All other layers are 100KB or less, which makes the extra 900MB stand out.

Clearing the cache corrects the issue and subsequent builds shrink in size.

Expected behavior
We expect each layer to be included only once.
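
One way to spot this kind of duplication without pulling the image is to compare layer sizes in the image manifest. The sketch below assumes a list of entries shaped like the `layers` array of a Docker/OCI image manifest (`digest` and `size` fields). The function name, the 100 MB threshold, and the 5% tolerance are illustrative choices and do not come from kaniko.

```python
from itertools import combinations


def find_suspect_duplicates(layers, min_size=100 * 1024 * 1024, tolerance=0.05):
    """Return pairs of large layers whose sizes are within `tolerance` of each other.

    `layers` is a list of dicts with "digest" and "size" (bytes) keys, as in the
    `layers` array of an image manifest. A similar size is only a hint that two
    layers carry the same content; it is not proof.
    """
    large = [layer for layer in layers if layer["size"] >= min_size]
    suspects = []
    for a, b in combinations(large, 2):
        bigger = max(a["size"], b["size"])
        if abs(a["size"] - b["size"]) <= tolerance * bigger:
            suspects.append((a["digest"], b["digest"]))
    return suspects


if __name__ == "__main__":
    # Hypothetical manifest data: two ~900 MB layers among small ones.
    example = [
        {"digest": "sha256:aaa", "size": 80_000},
        {"digest": "sha256:bbb", "size": 905_000_000},
        {"digest": "sha256:ccc", "size": 60_000},
        {"digest": "sha256:ddd", "size": 910_000_000},
    ]
    for first, second in find_suspect_duplicates(example):
        print(f"possible duplicate content: {first} and {second}")
```

A flagged pair still needs to be confirmed by inspecting the layer contents, since unrelated layers can happen to have similar sizes.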

To Reproduce
It is unclear how to reproduce this outside our repository.

Additional Information

  • Dockerfile
    Note that our Dockerfile is made from an EJS template.

ARG node_version

FROM node:${node_version}

ARG rest_path

# ===== Create Build Environment
WORKDIR ${rest_path}
ENV REST_BUILD Docker

# ===== Copy Sources for Bootstrap
# copy just files needed for bootstrap
COPY modules/ modules/
COPY packages/ng/ packages/ng/
COPY scripts/ scripts/
COPY configs/ configs/
COPY check-changes.sh lerna.json tsconfig.json package*.json ./
COPY tokens/READ_ONLY_TOKEN.npmrc /root/.npmrc

# copy just the package.json and package-lock.json files so other changes do not affect
#   bootstrapping
<%- packageFiles %>

# ===== Bootstrap
# Build the modules and link local modules
RUN scripts/bootstrap.sh -f

# ===== Compile
COPY packages/base-build/ packages/base-build/
COPY packages/server/ packages/server/
COPY apps/admin-panel/ apps/admin-panel/
RUN scripts/build.sh --ci

# creates the build-data

# note: need to send it as a file and not an ARG due to https://github.com/GoogleContainerTools/kaniko/issues/1008
# ARG COMMIT_SHA
COPY last-commit-sha ./
RUN npx lerna run --stream --scope=@meditech/webapi-server grunt -- create-build-data:ci &&\
    npx lerna run --stream --scope=base-build grunt -- copy:buildData

# copy source for later compilations
COPY packages/cloud-platform-integration/ packages/cloud-platform-integration/
COPY packages/subscription-messaging-common/ packages/subscription-messaging-common/
COPY packages/subscription-messaging-publisher/ packages/subscription-messaging-publisher/
COPY packages/cloud-platform/ packages/cloud-platform/
COPY apps/cp-cloud-admin-plugin/ apps/cp-cloud-admin-plugin/
COPY apps/cp-user-profile/ apps/cp-user-profile/
COPY apps/invite/ apps/invite/
COPY apps/tenant-admin/ apps/tenant-admin/
  • Build Context
    Too many files to list...
  • Kaniko Image (fully qualified with digest)
    gcr.io/kaniko-project/executor:v0.18.0
    digest: sha256:2b54a743d46b5c4eff5772c68177958c6876bdf77b31bcda5a6af376b6c31428
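
The `<%- packageFiles %>` placeholder in the Dockerfile is filled in by the EJS template, and the issue does not show what it expands to. As a rough illustration only, the sketch below generates one `COPY` line per directory that contains a `package.json` or `package-lock.json`, so that dependency manifests can be copied before the rest of the sources. The function name and directory layout are assumptions, and the sketch is written in Python rather than EJS.

```python
import os


def package_copy_lines(root, names=("package.json", "package-lock.json")):
    """Build Dockerfile COPY lines for every package manifest under `root`.

    Paths are emitted relative to `root` with forward slashes, and sorted so
    the generated Dockerfile does not change between runs.
    """
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip installed dependencies; their manifests are not build inputs.
        dirnames[:] = [d for d in dirnames if d != "node_modules"]
        rel_dir = os.path.relpath(dirpath, root).replace(os.sep, "/")
        if rel_dir == ".":
            continue  # top-level manifests are handled by an explicit COPY line
        found = sorted(f for f in filenames if f in names)
        if found:
            sources = " ".join(f"{rel_dir}/{f}" for f in found)
            lines.append(f"COPY {sources} {rel_dir}/")
    return sorted(lines)
```

Keeping the output sorted matters here because a reordered Dockerfile would change the commands the builder sees, even when no file content changed.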

Triage Notes for the Maintainers

Description Yes/No
Please check if this a new feature you are proposing
Please check if the build works in docker but not in kaniko
Please check if this error is seen when you use --cache flag
Please check if your dockerfile is a multistage dockerfile
@tejal29 tejal29 added this to the Release v1.0.0 milestone Mar 17, 2020
@tejal29
Member

tejal29 commented Mar 18, 2020

Thanks @raijinsetsu. I was able to reproduce this issue with the following Dockerfile, using gcr.io/kaniko-project/executor:debug:

FROM alpine:latest

COPY large.tar .
RUN ["echo", "test"]

I ran kaniko/executor with --cache.

I observed that the cache layer corresponding to RUN ["echo", "test"] also contained large.tar:

docker run -it --entrypoint /busybox/sh -v /usr/local/google/home/tejaldesai/.config/gcloud:/root/.config/gcloud -v /usr/local/google/home/tejaldesai/workspace/kaniko/integration:/workspace gcr.io/kaniko-project/executor:debug


/ # kaniko/executor -f dockerfiles/Dockerfile1 --context=dir://workspace --destination=gcr.io/tejal-test/test-cache-latest --cache
INFO[0000] Resolved base name alpine:latest to alpine:latest 
...  
INFO[0003] Checking for cached layer gcr.io/tejal-test/test-cache-latest/cache:84e2eefcc6fb24bcc4fd82a0d530141c09e1c59bf14621a85f41a6466b58a84f... 
INFO[0004] Using caching version of cmd: COPY large.tar . 
INFO[0004] Checking for cached layer gcr.io/tejal-test/test-cache-latest/cache:a6a737350f9f5578e3aa5c0fd024252a79a1f9ece9bd7272f9bb057a94ba3200... 
INFO[0005] No cached layer found for cmd RUN ["echo", "test"] 
INFO[0005] Unpacking rootfs as cmd RUN ["echo", "test"] requires it. 
INFO[0005] Taking snapshot of full filesystem...        
INFO[0005] Resolving paths                              
INFO[0005] COPY large.tar .                             
INFO[0005] Found cached layer, extracting to filesystem 
INFO[0009] RUN ["echo", "test"]                         
INFO[0009] cmd: echo                                    
INFO[0009] args: [test]                                 
test
INFO[0009] Taking snapshot of full filesystem...        
INFO[0009] Resolving paths                              
INFO[0015] Pushing layer gcr.io/tejal-test/test-cache-latest/cache:a6a737350f9f5578e3aa5c0fd024252a79a1f9ece9bd7272f9bb057a94ba3200 to cache now 

The cached layer "gcr.io/tejal-test/test-cache-latest/cache:a6a737350f9f5578e3aa5c0fd024252a79a1f9ece9bd7272f9bb057a94ba3200" doubled in size (206 MB) and hence the image size doubled.
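
To check which files a layer actually carries, the layer tarball can be listed directly. The sketch below assumes the layer has already been saved locally as a plain or compressed tar file (how it is obtained is left out) and uses Python's standard `tarfile` module to report the largest entries. In the reproduction above, a stray `large.tar` in the `RUN` layer would show up at the top of that list.

```python
import tarfile


def largest_entries(layer_path, top=5):
    """Return the `top` largest regular files in a layer tarball as (name, size) pairs."""
    # Mode "r:*" lets tarfile detect gzip, bzip2, or xz compression automatically.
    with tarfile.open(layer_path, mode="r:*") as layer:
        files = [(m.name, m.size) for m in layer.getmembers() if m.isfile()]
    # Largest first, so unexpected big files stand out immediately.
    return sorted(files, key=lambda item: item[1], reverse=True)[:top]
```

For example, `largest_entries("/path/to/layer.tar", top=3)` (a hypothetical path) returns the three biggest files in that layer.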

I verified this on master, and it looks like the bug has been fixed.
Running the same Dockerfile with gcr.io/tejal-test/executor:debug:

tejaldesai@@skaffold (prototype)$ docker run -it --entrypoint /busybox/sh -v /usr/local/google/home/tejaldesai/.config/gcloud:/root/.config/gcloud -v /usr/local/google/home/tejaldesai/workspace/kaniko/integration:/workspace gcr.io/tejal-test/executor:debug
/ # kaniko/executor -f dockerfiles/Dockerfile1 --context=dir://workspace --destination=gcr.io/tejal-test/test-cache-edge --cache
INFO[0000] Resolved base name alpine:latest to alpine:latest 
INFO[0000] Using dockerignore file: /workspace/.dockerignore 
INFO[0000] Resolved base name alpine:latest to alpine:latest 
INFO[0000] Retrieving image manifest alpine:latest      
INFO[0001] Retrieving image manifest alpine:latest      
INFO[0002] Built cross stage deps: map[]                
INFO[0002] Retrieving image manifest alpine:latest      
INFO[0003] Retrieving image manifest alpine:latest      
INFO[0004] Checking for cached layer gcr.io/tejal-test/test-cache-edge/cache:84e2eefcc6fb24bcc4fd82a0d530141c09e1c59bf14621a85f41a6466b58a84f... 
INFO[0004] No cached layer found for cmd COPY large.tar . 
INFO[0004] Unpacking rootfs as cmd COPY large.tar . requires it. 
INFO[0004] Taking snapshot of full filesystem...        
INFO[0004] Resolving paths                              
INFO[0005] COPY large.tar .                             
INFO[0005] Resolving paths                              
INFO[0005] Taking snapshot of files...                  
INFO[0011] Pushing layer gcr.io/tejal-test/test-cache-edge/cache:84e2eefcc6fb24bcc4fd82a0d530141c09e1c59bf14621a85f41a6466b58a84f to cache now 
INFO[0011] RUN ["echo", "test"]                         
INFO[0011] cmd: echo                                    
INFO[0011] args: [test]                                 
test
INFO[0011] Taking snapshot of full filesystem...        
INFO[0011] Resolving paths                              
INFO[0011] No files were changed, appending empty layer to config. No layer added to image. 
INFO[0011] Pushing layer gcr.io/tejal-test/test-cache-edge/cache:ba4c47cafb2974c0220e883c181f93ba7e95c39f0f15515c071b056e934f2bd1 to cache now 

Note this line in the logs:

INFO[0011] No files were changed, appending empty layer to config. No layer added to image. 

The corresponding cache layer "gcr.io/tejal-test/test-cache-edge/cache:ba4c47cafb2974c0220e883c181f93ba7e95c39f0f15515c071b056e934f2bd1" is created with a size of 35M.
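
The "No files were changed" log line reflects the expected behaviour: a layer should hold only what changed since the previous snapshot. The sketch below illustrates that idea with plain dictionaries mapping paths to content hashes. It is a simplified model, not kaniko's snapshot implementation, and every name in it is made up for the example.

```python
def diff_snapshot(previous, current):
    """Return the paths a new layer would need: added or modified files.

    `previous` and `current` map file paths to a content hash. Deleted files
    (which real layers record as whiteouts) are ignored to keep this short.
    """
    return sorted(
        path for path, digest in current.items()
        if previous.get(path) != digest
    )


# Filesystem before and after `COPY large.tar .` in this toy model.
before_copy = {"/bin/sh": "h1"}
after_copy = {"/bin/sh": "h1", "/large.tar": "h2"}
# `RUN ["echo", "test"]` writes nothing, so the filesystem is unchanged.
after_run = dict(after_copy)

# Comparing against the snapshot taken after COPY yields an empty layer.
changed = diff_snapshot(after_copy, after_run)
print(changed or "no files changed, layer would be empty")

# Comparing against a snapshot taken before COPY makes large.tar look new,
# which is one way a RUN layer could end up carrying it a second time.
print(diff_snapshot(before_copy, after_run))
```

In this model the difference between the two builds comes down to which snapshot the `RUN` step is compared against. Whether that matches the actual fix in kaniko is not stated in this thread.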

@tejal29
Member

tejal29 commented Mar 18, 2020

I am going to keep this open pending verification after the v0.19.0 release.

@raijinsetsu
Author

I've switched my builds over to 0.19.0. I'll let you know how it goes.

@raijinsetsu
Author

I have confirmed that 0.19.0 has corrected the issue for me. Thanks!

@tejal29 tejal29 closed this as completed Mar 19, 2020
@tejal29
Member

tejal29 commented Mar 19, 2020

Thanks a lot @raijinsetsu for confirming!
