Adding cached layers for kaniko builds #300

Closed
priyawadhwa opened this issue Aug 21, 2018 · 1 comment · Fixed by #353

@priyawadhwa
Collaborator

@mattmoor had this super cool idea, which I've copied below for reference:

tl;dr FTL-style caching for kaniko

Today FTL elides recomputing the dependency layer by publishing an image like:

  gcr.io/mattmoor-images/image-to-publish/cache/python-blah-blah:<hash-of-stuff>

... when asked to publish: gcr.io/mattmoor-images/image-to-publish:foo-bar

<hash-of-stuff> includes requirements.txt, should include the base image version,
and could include a timestamp (e.g. the current day) to enable some level of freshness.
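
As a rough illustration of how such a tag could be derived, here is a minimal sketch. It is not FTL's actual implementation: `cache_tag`, `cache_reference`, and their parameters are hypothetical names, and the choice of SHA-256 with length-prefixed fields is an assumption.

```python
import hashlib
from datetime import date
from typing import Optional


def cache_tag(requirements: bytes, base_digest: str, day: Optional[date] = None) -> str:
    """Hash the inputs that should invalidate the dependency layer (hypothetical helper)."""
    fields = [
        requirements,                               # contents of requirements.txt
        base_digest.encode(),                       # resolved base image digest
        (day.isoformat() if day else "").encode(),  # optional freshness bucket (per day)
    ]
    h = hashlib.sha256()
    for field in fields:
        # Length-prefix each field so adjacent fields cannot run together.
        h.update(len(field).to_bytes(8, "big"))
        h.update(field)
    return h.hexdigest()


def cache_reference(repo: str, name: str, tag: str) -> str:
    """Build '<repo>/cache/<name>:<tag>', mirroring the shape of the reference above."""
    return f"{repo}/cache/{name}:{tag}"
```

Passing `day=None` drops the freshness component, so the tag only changes when the requirements or the base digest change.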


The idea here is that kaniko would, prior to materializing FROM, fast-forward as far as it has cached:

  FROM ubuntu:latest           # This would be resolved to digest (first step in pull anyways)

  RUN apt-get update           # Check cache for hash(^^ digest, hash("apt-get update"))
  RUN apt-get install foo bar  # Check cache for hash(^^ hash, hash("apt-get install foo bar"))

  ADD baz /blah                # Check cache for hash(^^ hash, hash(relevant files))
  USER sockpuppet              # ...
  WORKDIR /app                 # ...

  RUN echo Hello World         # ...
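
The chained keys in the comments above could be computed along these lines. This is only a sketch under stated assumptions: `chain_keys` and the `Step` tuple are hypothetical, SHA-256 is an arbitrary choice, and for ADD the sketch mixes the instruction text in with the file contents, which the comments above do not specify.

```python
import hashlib
from typing import List, Optional, Tuple


def _digest(*parts: bytes) -> str:
    """SHA-256 over length-prefixed parts, so part boundaries stay unambiguous."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


# A step is (instruction text, relevant file contents or None if it reads no files).
Step = Tuple[str, Optional[bytes]]


def chain_keys(base_digest: str, steps: List[Step]) -> List[str]:
    """Return one cache key per step; each key depends on every earlier step."""
    keys = []
    parent = base_digest  # the chain starts from the resolved FROM digest
    for instruction, files in steps:
        step_hash = _digest(instruction.encode(), files or b"")
        parent = _digest(parent.encode(), step_hash.encode())
        keys.append(parent)
    return keys
```

Because each key folds in its parent, changing one instruction (or the files it adds) changes that key and every key after it, while earlier keys stay the same.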


If at any point we miss the cache, we treat the prior hit as the new "FROM" 
and begin evaluating from the miss.
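
A minimal sketch of that fast-forward step, assuming the per-step cache keys are already known and that cache lookups can be modeled as membership in a set (a real implementation would query the registry instead). `fast_forward` is a hypothetical name.

```python
from typing import List, Optional, Set, Tuple


def fast_forward(keys: List[str], cached: Set[str]) -> Tuple[int, Optional[str]]:
    """Walk the chained keys in order and stop at the first miss.

    Returns (index of the first step that must be executed, key of the last hit).
    The last hit is None when even the first step misses, in which case the
    build starts from the original FROM image.
    """
    last_hit = None
    for index, key in enumerate(keys):
        if key not in cached:
            return index, last_hit
        last_hit = key
    # Every step was cached: nothing left to execute.
    return len(keys), last_hit
```

Lookups stop at the first miss because later keys are derived from earlier ones; a hit further down the chain cannot be reused once its parent has changed.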

Phase two of this would be to enable the caching layer to simulate non-RUN operations 
(e.g. ADD/COPY/USER/WORKDIR) against the registry API without downloading the base image.
This would enable Dockerfiles like the following to iterate *very* rapidly
without ever downloading the base or cache (a la FTL):

  FROM ubuntu:latest           # Same digest, different day

  RUN apt-get update           # No change
  RUN apt-get install foo bar  # No change

  ADD baz /blah                # Oh noes, a change, but upload the layer and continue
  USER sockpuppet              # Metadata-only, post a new config
  WORKDIR /app                 # Metadata-only, post a new config
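
To illustrate the metadata-only case, here is a sketch that applies USER and WORKDIR to an image config held as a plain dict, without touching any filesystem layer. The `User` and `WorkingDir` field names come from the image config format; `apply_metadata` itself is hypothetical, and the sketch follows the framing above in treating WORKDIR as config-only (Docker also creates the directory if it does not exist).

```python
import copy
import posixpath


def apply_metadata(config: dict, instruction: str) -> dict:
    """Return a new config with a USER or WORKDIR instruction applied."""
    op, _, arg = instruction.strip().partition(" ")
    op, arg = op.upper(), arg.strip()
    new_config = copy.deepcopy(config)  # never mutate the cached config
    if op == "USER":
        new_config["User"] = arg
    elif op == "WORKDIR":
        # Relative paths resolve against the current working directory.
        current = config.get("WorkingDir") or "/"
        new_config["WorkingDir"] = posixpath.normpath(posixpath.join(current, arg))
    else:
        raise ValueError(f"not a metadata-only instruction: {op}")
    return new_config
```

The resulting config could then be posted alongside the unchanged layer list, which is what makes these steps cheap compared with RUN.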

As Matt suggested, a prototype of the first phase would be a good starting point. Once we have one, we could do some basic benchmarking comparing no-cache kaniko, cached kaniko, and regular "docker build".

@dlorenc
Collaborator

dlorenc commented Aug 21, 2018

IIUC phase one of this would be basically equivalent to docker build --cache-from, except we could infer the cached layers and build a slightly larger cache.

Phase two would be an improvement on that for a subset of Dockerfiles.
