Runbook: Add extra runbook information for distroless images #8235

Merged
merged 7 commits into from
Jun 3, 2024
Changes from 4 commits
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -4,7 +4,7 @@

### Grafana Mimir

* [CHANGE] Build: `grafana/mimir` docker image is now based on `gcr.io/distroless/static-debian12` image. Alpine-based docker image is still available as `grafana/mimir-alpine`, until Mimir 2.15. #8204
* [CHANGE] Build: `grafana/mimir` docker image is now based on `gcr.io/distroless/static-debian12` image. Alpine-based docker image is still available as `grafana/mimir-alpine`, until Mimir 2.15. #8204 #8235
* [CHANGE] Ingester: `/ingester/flush` endpoint is now only allowed to execute only while the ingester is in `Running` state. The 503 status code is returned if the endpoint is called while the ingester is not in `Running` state. #7486
* [CHANGE] Distributor: Include label name in `err-mimir-label-value-too-long` error message: #7740
* [CHANGE] Ingester: enabled 1 out 10 errors log sampling by default. All the discarded samples will still be tracked by the `cortex_discarded_samples_total` metric. The feature can be configured via `-ingester.error-sample-rate` (0 to log all errors). #7807
85 changes: 85 additions & 0 deletions docs/sources/mimir/manage/mimir-runbooks/_index.md
@@ -2485,6 +2485,91 @@ gsutil cp $file ${file%#*}
done < full-deleted-file-list
```

### Debugging distroless container images (in Kubernetes)

Mimir publishes "distroless" container images. A [distroless image](https://github.com/GoogleContainerTools/distroless/blob/main/README.md)
contains very little outside of what is needed to run a single binary.
It doesn't include any text editors, process managers, package managers, or other debugging tools, unless the application itself requires them.

This can pose a challenge when diagnosing problems: the container has no shell
to attach to and no tools for inspecting configuration files and so on.

To debug a distroless container, we can attach a more complete
container to the existing container's namespace. This brings in all of the
tools we may need without disturbing the existing environment:
the running container does not need to be restarted to attach the debug tools.

#### Creating a debug container

The `kubectl debug` command starts an ephemeral debug container in an existing pod
and attaches it to the same namespace as the other containers in that pod. More detail about the command and
how to debug running pods is available in [the Kubernetes docs](https://kubernetes.io/docs/tasks/debug/debug-application/debug-running-pod/#ephemeral-container).

```bash
kubectl --namespace mimir debug -it pod/compactor-0 --image=ubuntu:latest --target=compactor -c mimir-debug-container
```

- `pod/name` is the pod to attach to (`pod/compactor-0` in the example).
- `--target=` is the container within that pod whose process namespace the debug container shares.
- `--image=` is the image to use for the debug container.
- `-c` (long form `--container`) is the name to give the ephemeral container. This is optional, but useful if you want to re-use the container later.
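
The same invocation can be wrapped in a small shell function so that the options are spelled out once. This is only a sketch: the function name `mimir_debug`, its argument order, and the `KUBECTL` override variable are hypothetical and not part of Mimir or kubectl. The long-form options `--stdin`, `--tty`, and `--container` are the kubectl equivalents of `-i`, `-t`, and `-c`.

```bash
# Hypothetical helper (not part of Mimir): start an ephemeral debug container.
# KUBECTL can be overridden, for example with `echo`, to preview the arguments.
KUBECTL="${KUBECTL:-kubectl}"

mimir_debug() {
  local namespace="$1"               # Kubernetes namespace, e.g. mimir
  local pod="$2"                     # pod to attach to, e.g. compactor-0
  local target="$3"                  # container whose process namespace is shared
  local image="${4:-ubuntu:latest}"  # debug image; any image with the tools you need

  "$KUBECTL" --namespace "$namespace" debug --stdin --tty "pod/$pod" \
    --image="$image" --target="$target" --container=mimir-debug-container
}
```

For example, `mimir_debug mimir compactor-0 compactor` is equivalent to the command shown earlier, and `KUBECTL=echo mimir_debug mimir compactor-0 compactor` only prints the arguments without contacting the cluster.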

Review comment (Contributor): [nit] Should we use the long form of the option here as well for consistency?

You can now see all of the processes running in this space. For example:

```
/ # ps aux
PID USER TIME COMMAND
1 root 5:36 /usr/bin/mimir -flags
31 root 0:00 /bin/bash
36 root 0:00 ps aux
```

PID 1 is the process running in the target container. You can now use
tools from your debug image to interact with the running process. Note, however,
that your root path and important environment variables such as `$PATH` differ from
those of the target container.

The root filesystem of the target container is available in `/proc/1/root`. For
example, `/data` would be found at `/proc/1/root/data`, and
binaries of the target container would be somewhere like `/proc/1/root/usr/bin/mimir`.
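
The mapping from a path in the target container to the same path as seen from the debug container is just a prefix. A minimal sketch, where `target_path` is a hypothetical helper name and the target process is assumed to be PID 1, as in the `ps` output above:

```bash
# Hypothetical helper: translate a path inside the target container into the
# path where it is visible from the debug container.
target_path() {
  local path="$1"      # path as seen by the target container, e.g. /data
  local pid="${2:-1}"  # PID of the target process; assumed to be 1
  printf '/proc/%s/root/%s\n' "$pid" "${path#/}"
}

# Example use inside the debug shell:
#   ls -l "$(target_path /data)"
#   cat "$(target_path /etc/hostname)"
```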

#### Copying files from a distroless container

Because distroless images do not have `tar` in them, it is not possible to copy files using `kubectl cp`.

To work around this, you can create a debug container attached to the pod (as described above) and then use `kubectl cp` against that.
The debug container must still be running for this to work. This means that if you started the debug container to get a shell,
you need to keep that shell open while you do the following.

For example, after having created a debug container called `mimir-debug-container` for the `compactor-0` pod, run the following to copy `/etc/hostname` from the compactor pod to `./hostname` on your local machine:

```bash
kubectl --namespace mimir cp compactor-0:/proc/1/root/etc/hostname -c mimir-debug-container ./hostname
```

- `-c` is the name of the debug container to execute in.

Note, however, that `kubectl cp` cannot follow symlinks. To get around this, we can use `kubectl exec`
in the debug container to create a tar archive instead.

For example, you can create a tar of the path you are interested in, and then extract it locally:

```bash
kubectl --namespace mimir exec compactor-0 -c mimir-debug-container -- tar cf - "/proc/1/root/etc/cortex" | tar xf -
```
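
With the command above, the extracted files keep the `proc/1/root/` prefix in their local paths. The sketch below archives relative to `/proc/1/root` using tar's `-C` option and extracts into a directory of your choice. The function name `fetch_dir`, its arguments, and the `KUBECTL` override variable are hypothetical, and it assumes the debug image ships a `tar` that supports `-C` (GNU tar does).

```bash
# Hypothetical helper: copy a directory out of the target container through
# the debug container, without the /proc/1/root prefix in the extracted paths.
KUBECTL="${KUBECTL:-kubectl}"

fetch_dir() {
  local namespace="$1" pod="$2" container="$3"
  local path="$4"  # path inside the target container, e.g. /etc/cortex
  local dest="$5"  # local directory to extract into

  mkdir -p "$dest" || return 1
  # Remote tar: change into the target's root, then archive the relative path.
  # Local tar: extract the stream into the destination directory.
  "$KUBECTL" --namespace "$namespace" exec "$pod" --container "$container" -- \
    tar -C /proc/1/root -cf - "${path#/}" | tar -xf - -C "$dest"
}
```

For example, `fetch_dir mimir compactor-0 mimir-debug-container /etc/cortex ./compactor-files` would place the files under `./compactor-files/etc/cortex`.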

#### Cleanup and Limitations

One downside of using [ephemeral containers](https://kubernetes.io/docs/concepts/workloads/pods/ephemeral-containers/#understanding-ephemeral-containers)
(which is what `kubectl debug` is a wrapper around) is that they cannot be changed
after they have been added to a pod. This includes not being able to delete them.
If the process in the debug container has finished (for example, the shell has exited), the container
remains in the `Terminated` state. This is harmless, and the container stays there until the pod is deleted (for example, due to a rollout).

However, if you wish to clean up the ephemeral containers, you must re-create the pod.
To do this, delete the (target) pod and allow the Deployment or
StatefulSet to recreate it. Proceed with caution, particularly with StatefulSets and un-zoned deployments.
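
Before re-creating a pod, it can help to check which ephemeral containers it carries. The following is a sketch only: the function names and the `KUBECTL` override variable are hypothetical, and `compactor-0` in the `mimir` namespace is the example pod used throughout this section.

```bash
# Hypothetical helpers (not part of Mimir) for inspecting and cleaning up.
# KUBECTL can be overridden, for example with `echo`, to preview the arguments.
KUBECTL="${KUBECTL:-kubectl}"

# Print the names of the ephemeral containers that were added to a pod.
list_ephemeral_containers() {
  local namespace="$1" pod="$2"
  "$KUBECTL" --namespace "$namespace" get pod "$pod" \
    --output jsonpath='{.spec.ephemeralContainers[*].name}'
}

# Delete the pod so that its controller re-creates it without the
# ephemeral containers. This restarts the workload; use with caution.
recreate_pod() {
  local namespace="$1" pod="$2"
  "$KUBECTL" --namespace "$namespace" delete pod "$pod"
}
```

For example, run `list_ephemeral_containers mimir compactor-0` first, and `recreate_pod mimir compactor-0` only once you are sure the pod can be safely restarted.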

## Log lines

### Log line containing 'sample with repeated timestamp but different value'