
Attempt to build without QEMU #420

Merged: 4 commits into main on Apr 12, 2023

Conversation

JayKickliter
Contributor

Building with QEMU is extremely simple, but it comes at the cost of very long Aarch64 builds. Let's see if we can make building on the host architecture work while selectively enabling TPM on only x86.

This is a PoC PR until it's marked as "ready for review"; expect a corresponding level of jank.

@shawaj
Contributor

shawaj commented Apr 12, 2023

Not sure if this is useful, and it's not free like GitHub Actions, but I thought I'd mention it anyway in case you hadn't heard of it...
https://buildjet.com/for-github-actions/docs/introduction

Sadly they don't seem to have a free option for open source projects!

Hopefully GitHub will actually implement native AArch64 runners at some point too.

@madninja
Member

madninja commented Apr 12, 2023

Interesting, thanks. We have an ARM machine in AWS as well, but the problem is that if we're building multi-arch images, you're still paying the overhead on one side. Love how much faster BuildJet appears to be, though.

@shawaj
Contributor

shawaj commented Apr 12, 2023

@madninja - another idea for you: in some of our containers, for speed, we split the arm64 and amd64 build stages into two separate matrix jobs, and then use a follow-up workflow to combine them into a single multi-arch image. Not sure if this will work nicely on quay.io, but it works for ghcr.io and Docker Hub:
https://github.com/NebraLtd/hm-diag/blob/master/.github/workflows/compile-docker.yml

This would also allow you to use a separate Dockerfile for aarch64 with none of the TPM stuff in it, and then combine the resulting images into a single multi-arch manifest. If you do it this way, the builds are often quicker because the arm64 and amd64 parts run on different GitHub runners concurrently.
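
For illustration, a minimal sketch of that split-then-combine layout (job names, image tags, and the registry below are placeholders, not this repo's actual workflow; checkout, registry login, and QEMU setup steps are omitted for brevity):

jobs:
  build-image:
    strategy:
      matrix:
        arch: [amd64, arm64]
    runs-on: ubuntu-latest
    steps:
      - uses: docker/setup-buildx-action@v2
      - uses: docker/build-push-action@v4
        with:
          platforms: linux/${{ matrix.arch }}
          push: true
          tags: quay.io/team/helium-gateway:latest-${{ matrix.arch }}

  combine-manifest:
    needs: build-image
    runs-on: ubuntu-latest
    steps:
      # Stitch the two single-arch tags into one multi-arch manifest
      - run: |
          docker buildx imagetools create \
            -t quay.io/team/helium-gateway:latest \
            quay.io/team/helium-gateway:latest-amd64 \
            quay.io/team/helium-gateway:latest-arm64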

Lastly - do you even need to build the Rust stuff in the image at all? Since you have the release already being compiled in the build stage of this ci.yml workflow, could you not just copy the build artifacts from there into the docker_buildx stage?

@JayKickliter
Contributor Author

Lastly - do you even need to build the Rust stuff in the image at all? Since you have the release already being compiled in the build stage of this ci.yml workflow, could you not just copy the build artifacts from there into the docker_buildx stage?

I've wondered the same thing. It seems ideal.

@JayKickliter
Contributor Author

However, I've really wanted to reduce complexity. Having a single Dockerfile that anyone cloning the repo can build with docker build . is a primary goal. That said, this PR increases complexity a little.

@JayKickliter
Contributor Author

@isergieienkov can you review this please? I am able to run the x86/TPM image, but I don't have TPM hardware. I was kind of expecting an error that didn't happen.

@shawaj
Contributor

shawaj commented Apr 12, 2023

Lastly - do you even need to build the Rust stuff in the image at all? Since you have the release already being compiled in the build stage of this ci.yml workflow, could you not just copy the build artifacts from there into the docker_buildx stage?

I've wondered the same thing. It seems ideal.

I have something working here - which is similar but not exactly that - pulling the release from the releases section of this github repo instead of directly from the workflow...
https://github.com/NebraLtd/hm-gatewayrs/pull/41/files

I think doing something similar is possible by:

  1. Changing needs in the docker_buildx stage to needs: [hygiene, build]
  2. Adding this to the docker_buildx stage in ci.yml:
      - name: Setup | Artifacts
        uses: actions/download-artifact@v3
        with:
          path: helium-gateway-*.tar.gz
  3. The Dockerfile could then look like this, with no build stage (I haven't tested this yet, but I think it should work):
FROM alpine:3.17.3
ENV RUST_BACKTRACE=1
ENV GW_LISTEN="0.0.0.0:1680"
ARG TARGETPLATFORM

# We will never enable TPM on anything other than x86
RUN \
if [ "$TARGETPLATFORM" = "linux/amd64" ]; \
    then apk add --no-cache --update \
    libstdc++ \
    tpm2-tss-esys \
    tpm2-tss-fapi \
    tpm2-tss-mu \
    tpm2-tss-rc \
    tpm2-tss-tcti-device ; \
fi

WORKDIR /etc/helium_gateway

# COPY can't see variables set in a RUN step, so copy every release tarball
# into the image and select the right one for this platform below.
COPY helium-gateway-*.tar.gz ./

RUN if [ "$TARGETPLATFORM" = "linux/amd64" ]; \
        then BUILD_BOARD="x86_64-tpm-debian-gnu" ; \
        else BUILD_BOARD="aarch64-unknown-linux-musl" ; \
    fi && \
    tar -xzf helium-gateway-*"$BUILD_BOARD".tar.gz && \
    mv /etc/helium_gateway/helium_gateway /usr/local/bin/helium_gateway && \
    rm -f helium-gateway-*.tar.gz

CMD ["helium_gateway", "server"]

This could also then be more easily extended for other architectures in the future.

@shawaj
Contributor

shawaj commented Apr 12, 2023

I just realised you had said this @JayKickliter

However, I've really wanted to reduce complexity. Having a single Dockerfile that anyone cloning the repo can build with docker build . is a primary goal. That said, this PR increases complexity a little.

My solution would not allow for that, per se, as it would require the artifacts from the previous job; otherwise the Dockerfile would fail.

The only thing I can think of to solve that bit is a three-stage Dockerfile:

  • first stage: ci-runner
  • second stage: builder
  • third stage: runner

Then in CI you can just use --target ci-runner, and on any other machine --target runner (a rough sketch follows below).

Using buildx, this should automatically build only the necessary stages in each environment...
Ref: https://docs.docker.com/build/building/multi-stage/#differences-between-legacy-builder-and-buildkit
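
For illustration, a rough sketch of that three-stage layout (base images, paths, and the cargo invocation are assumptions, not this repo's actual setup):

# Stage for CI, which has already downloaded the prebuilt release tarball as an artifact
FROM alpine:3.17.3 AS ci-runner
WORKDIR /etc/helium_gateway
COPY helium-gateway-*.tar.gz .
RUN tar -xzf helium-gateway-*.tar.gz && \
    mv helium_gateway /usr/local/bin/helium_gateway && \
    rm -f helium-gateway-*.tar.gz
CMD ["helium_gateway", "server"]

# Stage for anyone who clones the repo: build from source
FROM rust:1.68-alpine AS builder
WORKDIR /src
COPY . .
RUN apk add --no-cache musl-dev && cargo build --release

# Default target: copy the freshly built binary into a small runtime image
FROM alpine:3.17.3 AS runner
COPY --from=builder /src/target/release/helium_gateway /usr/local/bin/helium_gateway
CMD ["helium_gateway", "server"]

CI would run docker build --target ci-runner ., while a plain docker build . (or --target runner) builds from source; with BuildKit, stages that the selected target doesn't depend on are skipped entirely.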

@madninja
Member

madninja commented Apr 12, 2023

I'd love to see a PR to make the build times even better than this, but this is a great start. @shawaj, if you'd like to propose a PR to improve the Docker build setup further, I'd be grateful.

@JayKickliter JayKickliter marked this pull request as ready for review April 12, 2023 21:29
@madninja madninja merged commit d94ed0f into main Apr 12, 2023
@madninja madninja deleted the jsk/try-no-qemu-again branch April 12, 2023 21:31
@madninja
Member

Bonus points for making the Docker build use the Rust cache too... that'll speed it up quite a bit.
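
For what it's worth, a minimal sketch of one way to do that with BuildKit cache mounts (base image, paths, and the cargo invocation here are assumptions):

# syntax=docker/dockerfile:1
FROM rust:1.68-alpine AS builder
WORKDIR /src
COPY . .
# Cache mounts persist the cargo registry and target dir across builds, so
# unchanged dependencies aren't recompiled every time; the binary is copied
# out of target/ because the mount disappears after this RUN step.
RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/src/target \
    apk add --no-cache musl-dev && \
    cargo build --release && \
    cp target/release/helium_gateway /usr/local/bin/helium_gateway

In CI the buildx cache would also need to be persisted between runs (for example via a cache exporter) for this to pay off.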

@shawaj
Contributor

shawaj commented Apr 12, 2023

@madninja I'll definitely give it a go, as part of us trying to add the armv6/armv7 builds from before.

I'm not super familiar with Rust caching and how that works... Where does that come from? Is it just local, or is it saved online somewhere?
