Change base image #2116

Closed
jpkrohling opened this issue Mar 5, 2020 · 10 comments

Comments

@jpkrohling
Contributor

Requirement - what kind of business use case are you trying to solve?

When an application container has a problem in production, it's often useful to be able to enter the container and execute a few commands, like curl, or to copy files out of the container using commands like kubectl cp. Unfortunately, using a scratch image means that there's nothing in the container, which breaks both use cases. Concretely, I'm debugging a problem with the badger storage and tried to get the files out of the container using kubectl cp.

While all those problems can be worked around by either building a custom image with another base image, or by using another image to mount the volumes we want to access, not having a proper base image slows down progress when trying to find the root cause of problems related to Jaeger.
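As a rough sketch of the kubectl cp use case described above: kubectl cp relies on a tar binary being present inside the container, which is why it fails against a scratch-based image. The helper name copy_from_pod, the DRY_RUN switch, and the namespace, pod, and path values below are all hypothetical and only serve as illustration.

```shell
#!/bin/sh
# Hypothetical helper: copy a path out of a running pod with kubectl cp.
# kubectl cp needs `tar` inside the container, so this only works when
# the image ships one (a scratch-based image does not).
copy_from_pod() {
  # $1 = namespace, $2 = pod, $3 = path in the container, $4 = local target
  src="$1/$2:$3"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Only print the command that would be executed.
    echo "kubectl cp $src $4"
  else
    kubectl cp "$src" "$4"
  fi
}

# Dry run with made-up names; no cluster is contacted.
DRY_RUN=1 copy_from_pod observability jaeger-0 /badger ./badger-copy
```

Dropping DRY_RUN runs the real command against the current kubectl context.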

Proposal - what do you suggest to solve the problem or improve the existing situation?

There are a few images that we could use. The image I have the most experience with is Red Hat's UBI, which is used in quite a few operators, including OpenTelemetry's (minimal) and Jaeger's (regular). I know this is a secure and well-maintained image, but we should certainly consider other images as well. I'm open to suggestions here.

Using the ubi-minimal would add about 90MB to the image size, which isn't that much, especially considering that this is mostly relevant for the initial pull.

@ghost added the needs-triage label Mar 5, 2020
@yurishkuro
Member

Adding 90MB means a 6x increase in the current image size. I would rather publish another group of -dev images.

@jpkrohling
Contributor Author

It really sounds dramatic when we talk about a "6x" increase, but in absolute terms, 90MB takes less than a minute to download on most domestic broadband connections, and the base layer is shared across all components.

Is there a concrete concern about the image size, other than the time it takes to download it?
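For a sense of scale, the download-time estimate above can be checked with a small calculation. This is only a sketch: the 25 Mbit/s link speed is an assumed figure, not one from this thread, and the function name download_seconds is made up. Note the unit conversion from megabytes (image size) to megabits (link speed).

```shell
#!/bin/sh
# download_seconds SIZE_MB LINK_MBIT
# Seconds needed to transfer SIZE_MB megabytes over a link of
# LINK_MBIT megabits per second (1 byte = 8 bits).
download_seconds() {
  awk -v size_mb="$1" -v link_mbit="$2" \
    'BEGIN { printf "%.1f\n", size_mb * 8 / link_mbit }'
}

# 90 MB over an assumed 25 Mbit/s connection.
download_seconds 90 25
```

With those inputs the result is 28.8 seconds, ignoring protocol overhead and registry throttling.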

@yurishkuro
Member

yurishkuro commented Mar 5, 2020

Jaeger has often been praised for being lean, and I think our original 10MB images contributed to that (now they are 20MB). I also think that having dev tools in the image is something that only a small % of users need, e.g. when setting up Jaeger for the first time, figuring out networking namespaces, etc.

@jpkrohling
Contributor Author

"Lean" as in the container image size, or as in binary size? The container isn't placed as a whole in memory, just the binary. The process binary is exactly the same no matter the base image, and the RSS is virtually identical:

## scratch
$ ps -q 362569 uxwww
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
jpkroeh+  362569  0.5  0.1 141136 25336 ?        Ssl  10:13   0:00 /go/bin/all-in-one-linux --sampling.strategies-file=/etc/jaeger/sampling_strategies.json

## ubi-minimal
$ ps -q 363519 uxwww
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
jpkroeh+  363519  1.2  0.1 141136 24304 ?        Ssl  10:14   0:00 /go/bin/all-in-one-linux --sampling.strategies-file=/etc/jaeger/sampling_strategies.json

## ubi
$ ps -q 364489 uxwww
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
jpkroeh+  364489  1.0  0.1 141136 25260 ?        Ssl  10:23   0:00 /go/bin/all-in-one-linux --sampling.strategies-file=/etc/jaeger/sampling_strategies.json
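The RSS column in the ps output above is reported in kilobytes. As a sketch of an alternative way to read the same figure, the kernel exposes it as VmRSS in /proc/PID/status on Linux. The function name rss_kb is made up for this example, and it is run against the current shell rather than a Jaeger process.

```shell
#!/bin/sh
# rss_kb PID
# Print the resident set size of PID in kB, as reported by the
# VmRSS field of /proc/PID/status (Linux only).
rss_kb() {
  awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Smoke test: RSS of the current shell process.
rss_kb "$$"
```

Running it against the PID of each all-in-one process would allow the same side-by-side comparison as the three ps invocations.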

I uploaded the three images to my quay namespace, so that you can try them out yourself :-)

quay.io/jpkroehling/all-in-one:ubi-regular
quay.io/jpkroehling/all-in-one:ubi
quay.io/jpkroehling/all-in-one:scratch

Looks like I was wrong about the image sizes. ubi-minimal would add about 34.7MB and ubi (regular) would add 73.4MB: https://quay.io/repository/jpkroehling/all-in-one?tab=tags
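To put the corrected figures next to the earlier "6x" estimate, here is a small sketch that computes the resulting image size and growth factor. It assumes a 20 MB scratch-based image as the baseline (the approximate figure mentioned earlier in this thread); the function name size_ratio is made up.

```shell
#!/bin/sh
# size_ratio BASE_MB ADDED_MB
# Print the resulting image size and how many times larger it is
# than the baseline.
size_ratio() {
  awk -v base="$1" -v added="$2" \
    'BEGIN { printf "%.1f MB (%.1fx)\n", base + added, (base + added) / base }'
}

size_ratio 20 34.7   # ubi-minimal on top of the assumed 20 MB baseline
size_ratio 20 73.4   # ubi (regular) on top of the assumed 20 MB baseline
```

Under that assumption, ubi-minimal yields 54.7 MB (2.7x) and regular ubi yields 93.4 MB (4.7x).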

> I also think that having dev tools in the image is something that only a small % of users need, e.g. when setting up Jaeger for the first time, figuring out networking namespaces, etc.

You are right that this is mostly useful during initial setup, but right now, we have a case where having those tools would have helped us diagnose a production problem faster and more easily. Without those tools, we have to resort to either rebuilding a Jaeger image with a different base, or mounting the same badger volumes in another, more suitable image.

While we can ask people to change their Deployment objects (or Jaeger CR, if they are using the operator) to use their custom images, or a -dev image, those deployments are usually managed using automation tools, preventing most people from directly changing those objects in production.

In short: I don't really understand what we gain by preventing people from entering the container when they are facing an issue. There's no hit in performance, there's no hit in memory usage, and there's little to no hit in disk space (especially in a production setup, as jaeger-[collector|query|agent] would share the same base).

@jpkrohling
Contributor Author

@yurishkuro, do you have any feedback on the definition of "lean", based on my comment above?

@yurishkuro
Member

I'm still not clear why we can't publish both scratch and ubi-minimal. Can we not provide a global config option in the Operator to pick one vs. the other?

If we want a shell & basic tool set, why not use alpine?

@jpkrohling
Contributor Author

jpkrohling commented Mar 13, 2020

We can certainly publish two sets of images, I just don't see the benefit outweighing the maintenance work that it adds (CI, docs, user questions, ...). The only benefit of using scratch vs. ubi-minimal is that disk space per node (not per container) would be about 35MB lower.

If we do decide to provide the two base images, the operator can certainly use a new flag to decide which image to use.

About Alpine: I have nothing against it, and we can very well use it instead of ubi-minimal. I'm just not familiar enough with it to recommend it.

@yurishkuro
Member

I'd suggest we switch from scratch to alpine, which is 7x smaller than ubi-minimal.

@jpkrohling
Contributor Author

Sure, I'll try to get a PR ready soon for alpine, but I'm still wondering why "7x smaller" (or rather, one seventh the size) is even relevant when we are talking about such small figures...

@yurishkuro
Member

@jpkrohling should this be closed? It seems #2545 made a switch to alpine even for non-debug images.
