How to save and load docker images using actions/cache or other mechanisms? #103495

Nefcanto · 2024-02-04T06:11:50Z

Here is an excerpt from my workflow YAML file:

      - name: Build production docker
        run: |
          docker build -t ghcr.io/my-org/my-repo/api:latest . --no-cache --progress=plain

      - name: Log in to GitHub Container Registry
        run: |
          echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin

      - name: Push the image
        run: |
          docker push ghcr.io/my-org/my-repo/api:latest

      - name: Logout from GitHub Container Registry
        run: |
          docker logout

According to the Actions logs, the docker build takes a long time, and a major part of that is downloading the image named in the FROM instruction from Docker Hub.

The image name is holism/api. I want to cache this image, but I'm stuck on how to cache it and how to load it in subsequent workflow runs.

I can't find good documentation for this, and ChatGPT is not helping either. Can anybody please help me with this?

ghost · 2024-02-11T17:27:26Z

GitHub Actions doesn't directly support Docker image caching in the same way it does for dependencies with actions/cache.

ghost · 2024-02-11T17:51:15Z

You can achieve similar results by combining a few techniques. First, make sure Docker's BuildKit is enabled, since it has improved caching mechanisms. It is the default builder since Docker Engine 23.0; on older versions you can enable it by setting the environment variable DOCKER_BUILDKIT=1 before your docker commands. BuildKit stores its cache separately from Docker images and layers, and you can use actions/cache to persist that BuildKit cache between runs. Try something like this:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4

    # The default "docker" driver typically cannot export a local cache,
    # so create a builder that uses the docker-container driver.
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3

    - name: Cache Docker layers
      uses: actions/cache@v4
      with:
        path: /tmp/.buildx-cache
        key: ${{ runner.os }}-buildx-${{ github.sha }}
        restore-keys: |
          ${{ runner.os }}-buildx-

    # --load copies the result into the local image store so that the
    # later "docker push" step can find it.
    - name: Build and cache Docker image
      run: |
        docker buildx build --tag ghcr.io/my-org/my-repo/api:latest . \
          --cache-from=type=local,src=/tmp/.buildx-cache \
          --cache-to=type=local,dest=/tmp/.buildx-cache-new \
          --load

    # Swap in the new cache so the directory does not grow on every run.
    - name: Move new cache
      run: |
        rm -rf /tmp/.buildx-cache
        mv /tmp/.buildx-cache-new /tmp/.buildx-cache

The steps that log in to GitHub Container Registry, push the image, and log out stay the same.
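
As another possible mechanism (not something tested in this thread), BuildKit can also keep its layer cache in the GitHub Actions cache service through the `gha` cache backend, which avoids moving the cache directory by hand. The following is a minimal sketch, assuming the `docker/setup-buildx-action`, `docker/login-action`, and `docker/build-push-action` actions and the same placeholder image name used above. It caches build layers only and is not claimed to fix slow base-image pulls.

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # lets GITHUB_TOKEN push to ghcr.io
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push with the gha cache backend
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ghcr.io/my-org/my-repo/api:latest
          # Read from and write to the GitHub Actions cache service.
          cache-from: type=gha
          cache-to: type=gha,mode=max
```

Here `mode=max` asks BuildKit to export intermediate layers as well as the final ones, which tends to help multi-stage builds at the cost of a larger cache.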

Nefcanto (Author) · 2024-02-12T05:03:10Z

@h0vhann1syan, thank you for responding and helping. As I understand it, BuildKit is for the build phase, and that part does not bother us. We are stuck at the pull phase: our build needs some base images, and pulling them takes a long time on every run. That's where we want to leverage caching. We tried docker save and docker load for that reason, and it got even worse; on GitHub Actions, downloading over the network seems to be faster than the CPU-heavy save and load.

So, does BuildKit cache the pulled images too?

ghost · 2024-02-12T05:27:14Z

For your specific use case, where pulling base images is the bottleneck, BuildKit's caching probably won't give you the direct benefit you're looking for, because it focuses on caching build steps rather than the initial image pulls. If your base images don't change frequently, you could set up a Docker registry proxy (a pull-through cache) closer to your GitHub Actions runners. This takes more setup, but it could reduce pull times if your images are large and network transfer is a significant share of your build time. GitHub-hosted runners run on virtual machines in Microsoft Azure, so a proxy registry hosted close to them could potentially speed up image pulls.
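
One way the proxy idea could be wired into a workflow is the Docker daemon's `registry-mirrors` setting, which applies to Docker Hub pulls only. This is a rough sketch under several assumptions: `https://mirror.example.com` is a placeholder for a pull-through cache you would run yourself, the runner has `jq` and systemd available, and the build uses the daemon's own builder rather than a separate Buildx container.

```yaml
- name: Use a pull-through mirror for Docker Hub (hypothetical URL)
  run: |
    # Assumption: https://mirror.example.com is a placeholder, not a real service.
    sudo mkdir -p /etc/docker
    # Start from the existing daemon config if there is one.
    if [ -f /etc/docker/daemon.json ]; then
      sudo cat /etc/docker/daemon.json > /tmp/daemon.json
    else
      echo '{}' > /tmp/daemon.json
    fi
    # Add the registry-mirrors key while keeping any other settings.
    jq '. + {"registry-mirrors": ["https://mirror.example.com"]}' /tmp/daemon.json \
      | sudo tee /etc/docker/daemon.json > /dev/null
    # The daemon only reads the mirror list at startup.
    sudo systemctl restart docker
```

Whether this helps depends on the mirror actually being faster to reach from the runner than Docker Hub, which this thread does not establish.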

Nefcanto (Author) · 2024-02-12T06:41:59Z

@h0vhann1syan, we tried hosting our images on ghcr.io because we had heard that doing so would drastically improve pull times. For us it changed nothing; the pull time was the same. So I guess creating a proxy won't help either. The only option is to somehow tell Docker that this image was already pulled in a previous run, so that it reads it from the cache instead of pulling it again. Yet we are stuck on this simple step.

ghost · 2024-02-12T07:20:56Z

Tag your base images with specific versions instead of relying on latest. That way you know exactly which version of the image you're using and can tell more reliably whether a pull is necessary. Then use a conditional step to check for the image: before the step where you would normally pull your base image, add a step that checks whether that specific version is already present in the local Docker image store. If it is, you can skip the pull step.
Conceptually, you can do something like this:

- name: Check if Base Image Exists
  id: check-image
  run: |
    # docker image inspect exits non-zero when the image is not stored locally.
    if docker image inspect holism/api:specific-version > /dev/null 2>&1; then
      echo "exists=true" >> "$GITHUB_OUTPUT"
    else
      echo "exists=false" >> "$GITHUB_OUTPUT"
    fi

- name: Pull Base Image
  if: steps.check-image.outputs.exists == 'false'
  run: docker pull holism/api:specific-version

In this setup, the "Check if Base Image Exists" step sets an output variable based on whether the specified version of your base image exists locally. The "Pull Base Image" step then runs only if the image does not exist, as determined by its if condition.

Nefcanto (Author) · 2024-02-12T08:13:44Z

@h0vhann1syan, I appreciate your time. Thank you. The problem is that the check would always report the image as missing, because every run of our GitHub Action gets a new, clean runner that does not have our image. That's exactly the problem here.

We have many APIs and many web interfaces, and each one has its own GitHub Action. We build each of them a couple of times per day.

To put numbers on it, we have more than 50 actions, run more than 200 times per day in total, and they all depend on our base image. This means we pull it more than 200 times per day. Each pull consumes 2 GB of bandwidth and wastes more than 1 minute of build time, which adds up to over 400 GB and over 200 minutes per day.

If we use docker save and docker load it reduces the bandwidth 200 times, but it even consumes more of our build time.
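
For readers who want to try the save/load approach described here, a minimal sketch with actions/cache might look like the following. The tarball path, the cache key, and the `specific-version` tag are placeholders, and it assumes the later build can resolve FROM from the local image store. As noted in this thread, loading the tarball may well take longer than pulling, so it is worth timing both.

```yaml
- name: Restore base image tarball
  id: base-image-cache
  uses: actions/cache@v4
  with:
    path: /tmp/base-image.tar
    # Change the key whenever the base image tag changes.
    key: base-image-holism-api-specific-version

- name: Load base image from cache
  if: steps.base-image-cache.outputs.cache-hit == 'true'
  run: docker load --input /tmp/base-image.tar

- name: Pull and save base image
  if: steps.base-image-cache.outputs.cache-hit != 'true'
  run: |
    docker pull holism/api:specific-version
    # The tarball is uploaded by the cache action's post-job step.
    docker save --output /tmp/base-image.tar holism/api:specific-version
```

Caches are scoped to a repository, so each repository that uses the base image would store its own copy of the tarball.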

1 reply

nisbet-hubbard Sep 17, 2024

> If we use docker save and docker load it reduces the bandwidth 200 times, but it even consumes more of our build time.

This is a good example of the trade-off between bandwidth and build time when it comes to Docker. docker load is a major bottleneck, so it really boils down to where a team’s priority lies.

Similarly with the Docker Cache action: ScribeMD/docker-cache#785

