Need help with Dockerfile to cache build products #1563

Closed
richb-hanover opened this issue Jan 18, 2023 · 11 comments

Comments

@richb-hanover
Contributor

The existing Dockerfile works great, but has an annoying attribute. The container doesn't cache many of the build products that take a long time to create. This means that many operations endure a two- (or five-) minute startup delay before they operate as desired. (After that, changes I make to the Book or Website are speedy.)

Although I created the original Dockerfile, I do not know enough about either Rust or Docker to be able to work out a good strategy to improve caching.

My request: Is there someone who could tease out a solution to this caching problem? Here are specific steps / symptoms:

  • docker build -t prql . works as expected. It takes a looooong time to build the cargo tools, but this is a one-time operation.

  • docker run -it -v $(pwd)/:/src -p 3000:3000 prql starts the container and drops you into the container's command line. The USING_DOCKER.md file gives more instructions for checking various tools, or check that section's README.md file.

  • After starting the container, cd book; mdbook serve -n 0.0.0.0 -p 3000 starts the process of building the Book and making it available to a web browser at http://localhost:3000

  • But... that command frequently (seemingly) needs to recompile many of the same build products as created when building the container

  • Similarly, cd /src; task test-rust (seemingly) always rebuilds a ton of files - even if I ran the same task test-rust command moments earlier.

Any thoughts? Many thanks.

PS There's an intriguing article in SO that describes one technique at: https://stackoverflow.com/a/58474618/1827982
PPS Unfortunately, the SO article cites a blog post that has gone 404. The Wayback Machine has it at: https://web.archive.org/web/20221028051630/https://blog.mgattozzi.dev/caching-rust-docker-builds/

@richb-hanover
Contributor Author

PPPS - A Docker container that has good caching might be another way to address #1561

This is my time to put in another plug for using Docker... I find it very convenient to use the container because it bundles up all the tools (rust compiler, cargo, mdbook, node, npm, etc) into what I think of as a "PRQL Development Engine" that's independent of the platform (Mac, Linux, Windows) that you're running on. A one-time docker build -t prql . recreates the container using the specified (known-compatible) versions of all the tools.

All the source code lives in the repo on your local machine. You fire up the container and start editing files. The container notices the changes, "does its thing", and shows the results.

Because the files always remain on your local machine, when you're done editing, you can use your git tools to commit/push/etc. as normal.

@eitsupi
Member

eitsupi commented Jan 18, 2023

Generally, I believe this can be done by using Docker Volume.
https://docs.docker.com/storage/volumes/

Something like:

$ docker volume create my-vol
$ docker run --rm -it --mount source=my-vol,target=/cache-dir my-image bash

You can define the command in the Taskfile or use docker compose.

@richb-hanover
Contributor Author

richb-hanover commented Jan 18, 2023

Ahah... If I understand correctly, you're recommending setting up a Docker Volume that will permanently hold the build products. (This makes perfect sense.) I can see how to give that "cache directory" a name (say, /cache-dir) in the container.

I guess I'm asking a Rust/Cargo question now: what do I need to do to configure the tools to write their products to that cache directory (and look for them there instead of building anew)? Many thanks!

@eitsupi
Member

eitsupi commented Jan 18, 2023

The Cargo cache is created under the directory specified in the CARGO_HOME environment variable, so we can use any directory we wish.
I have not tried it, but something like this should work.

docker run --rm -it -e CARGO_HOME=/cargo-home --mount source=my-vol,target=/cargo-home my-image bash 

Of course we should be able to mount the volume to the default CARGO_HOME without setting the environment variable (I don't know where the CARGO_HOME is for that image!)
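
One way to find that default, as a minimal sketch: Cargo uses the CARGO_HOME environment variable when it is set, and falls back to $HOME/.cargo otherwise. The helper name resolve_cargo_home is hypothetical, and running it inside the image is an assumption about how it would be used, not something tested against the PRQL Dockerfile.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PRQL repo): prints the directory
# Cargo would treat as its home.
# Cargo honors $CARGO_HOME when it is set and non-empty here; otherwise
# the default location is $HOME/.cargo.
resolve_cargo_home() {
  printf '%s\n' "${CARGO_HOME:-$HOME/.cargo}"
}

resolve_cargo_home
```

Running this in the container's shell would show which path to use as the volume target if the CARGO_HOME override is dropped.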

@richb-hanover
Contributor Author

richb-hanover commented Jan 18, 2023

Terrific information! (I always enjoy working with smart people!)

I will take a crack at this later this week. (And your comment about docker-compose begins to make sense, too - the command is accumulating a lot of options...) Thanks.

@max-sixty
Member

@eitsupi is correct! I'll add one more piece:

There are two places where caching matters: the dependencies and the build artifacts. Downloaded dependencies are cached under CARGO_HOME, and the build artifacts go to CARGO_TARGET_DIR, which defaults to target.

Currently target is in .dockerignore, so it's not mapped over when building an image or running a container. If we include it (i.e. remove from .dockerignore), it means the image will be huge, because it will include all the cache, but it would have the advantage that the artifacts will remain on disk after stopping the container. It might be worth seeing if removing it gives a better workflow for you @richb-hanover .

Probably the ideal thing to do is indeed the command that @eitsupi suggests, plus another volume for $(pwd)/target. That way the cache is saved between container runs without bloating the image.

We could write a docker-compose.yaml file which had these settings, which would give us a short known good command.
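
Until such a file exists, the combined settings can be sketched as a single docker run command. This is an untested sketch under stated assumptions: the volume names prql-cargo-home and prql-target are hypothetical, /cargo-home follows @eitsupi's example above, and mounting the second volume at /src/target assumes the repo is bind-mounted at /src as in the original docker run command. The function only prints the command (a dry run), so it can be inspected before being executed.

```shell
#!/bin/sh
# Dry-run sketch: assembles and prints a docker run command; it does not
# start a container. Volume names below are hypothetical.
prql_run_cmd() {
  # Directory to bind-mount as /src; defaults to the current directory.
  src_dir="${1:-$PWD}"

  cmd="docker run --rm -it"
  # Bind-mount the repo and publish the port, as in the original command.
  cmd="$cmd -v ${src_dir}:/src -p 3000:3000"
  # Keep Cargo's downloaded dependencies in a named volume.
  cmd="$cmd -e CARGO_HOME=/cargo-home"
  cmd="$cmd --mount source=prql-cargo-home,target=/cargo-home"
  # Keep build artifacts (the target directory) in a second named volume.
  cmd="$cmd --mount source=prql-target,target=/src/target"
  cmd="$cmd prql"

  printf '%s\n' "$cmd"
}

prql_run_cmd
```

Once the printed command looks right, it can be pasted into a shell or into a Taskfile entry. Because both caches live in named volumes, they survive --rm and are reused by the next container.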

It's also worth looking at whether folks have done this in other projects. I know this is much less common in Rust than in other languages, given that Cargo is already good at isolation, but I'm guessing there's some prior art out there.

@max-sixty
Member

This SO answer is an example of what I suggested. Lmk if any of those work for you @richb-hanover

@richb-hanover
Contributor Author

@max-sixty Thanks. I'm up to my elbows in other (non-computer) projects now, so I'll get back to this after a while.

@richb-hanover
Contributor Author

NB: #1774 doesn't address caching (in this issue). It simply installs hugo so that I can now develop the Book and Website via the Docker container.

@eitsupi
Member

eitsupi commented Jun 2, 2023

Closed by #2624

@eitsupi closed this as not planned Jun 2, 2023
@richb-hanover
Contributor Author

Yes, close this. Our efforts can go toward improving the Dev Container.
