From 8be53a44bbcdc0eafaf7fa8294cbcd4b353a24eb Mon Sep 17 00:00:00 2001
From: Zespre Chang
Date: Mon, 10 Jul 2023 14:32:31 +0800
Subject: [PATCH 1/2] Add section about image cleaning for disk space

Signed-off-by: Zespre Chang
---
 docs/faq.md                        | 23 +++++++++++++++++++++++
 versioned_docs/version-v1.1/faq.md | 23 +++++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/docs/faq.md b/docs/faq.md
index f5e5b85fb3f..354d9d1c1fd 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -48,3 +48,26 @@ New password for default administrator (user-xxxxx):
 ### I added an additional disk with partitions. Why is it not getting detected?
 
 As of Harvester v1.0.2, we no longer support adding additional partitioned disks, so be sure to delete all partitions first (e.g., using `fdisk`).
+
+### Why are there some Harvester pods that become ErrImagePull/ImagePullBackOff?
+
+This is likely because your Harvester cluster is an air-gapped setup, and some pre-loaded container images are missing. Kubernetes has a mechanism that does garbage collection against bloated image stores. When the partition which stores container images is over 85% full, `kubelet` will try to prune some least used images to save disk space until the occupancy is lower than 80%. These numbers (85% and 80%) are default High/Low thresholds that come with Kubernetes.
+
+To recover from this state, do one of the following depending on the cluster's configuration:
+- Pull the missing images from sources outside of the cluster (if it's an air-gapped environment, you might need to set up an HTTP proxy beforehand)
+- Manually import the images from the Harvester ISO image
+- Find the missing images on one node on the other nodes, and then export the images from the node still with them and import them on the missing one
+
+To prevent this from happening, we recommend cleaning up unused container images from the previous version after each successful Harvester upgrade if the image store disk space is stressed. We provided a [harv-purge-images script](https://github.com/harvester/upgrade-helpers/blob/main/bin/harv-purge-images.sh) that makes cleaning up disk space easy, especially for container image storage. The script has to be executed on each Harvester node. For example, if the cluster was originally in v1.1.2, and now it gets upgraded to v1.2.0, you can do the following to discard the container images that are only used in v1.1.2 but no longer needed in v1.2.0:
+
+```shell
+# on each node
+$ ./harv-purge-images.sh v1.1.2 v1.2.0
+```
+
+:::caution
+
+- The script only downloads the image lists and compares the two to calculate the difference between the two versions. It does not communicate with the cluster and, as a result, does not know what version the cluster was upgraded from.
+- We published image lists for each version released since v1.1.0. For clusters older than v1.1.0, users have to clean up the old images manually.
+
+:::
diff --git a/versioned_docs/version-v1.1/faq.md b/versioned_docs/version-v1.1/faq.md
index 0affe008cca..486ca7778f3 100644
--- a/versioned_docs/version-v1.1/faq.md
+++ b/versioned_docs/version-v1.1/faq.md
@@ -48,3 +48,26 @@ New password for default administrator (user-xxxxx):
 ### I added an additional disk with partitions. Why is it not getting detected?
 
 As of Harvester v1.0.2, we no longer support adding additional partitioned disks, so be sure to delete all partitions first (e.g., using `fdisk`).
+
+### Why are there some Harvester pods that become ErrImagePull/ImagePullBackOff?
+
+This is likely because your Harvester cluster is an air-gapped setup, and some pre-loaded container images are missing. Kubernetes has a mechanism that does garbage collection against bloated image stores. When the partition which stores container images is over 85% full, `kubelet` will try to prune some least used images to save disk space until the occupancy is lower than 80%. These numbers (85% and 80%) are default High/Low thresholds that come with Kubernetes.
+
+To recover from this state, do one of the following depending on the cluster's configuration:
+- Pull the missing images from sources outside of the cluster (if it's an air-gapped environment, you might need to set up an HTTP proxy beforehand)
+- Manually import the images from the Harvester ISO image
+- Find the missing images on one node on the other nodes, and then export the images from the node still with them and import them on the missing one
+
+To prevent this from happening, we recommend cleaning up unused container images from the previous version after each successful Harvester upgrade if the image store disk space is stressed. We provided a [harv-purge-images script](https://github.com/harvester/upgrade-helpers/blob/main/bin/harv-purge-images.sh) that makes cleaning up disk space easy, especially for container image storage. The script has to be executed on each Harvester node. For example, if the cluster was originally in v1.1.1, and now it gets upgraded to v1.1.2, you can do the following to discard the container images that are only used in v1.1.1 but no longer needed in v1.1.2:
+
+```shell
+# on each node
+$ ./harv-purge-images.sh v1.1.1 v1.1.2
+```
+
+:::caution
+
+- The script only downloads the image lists and compares the two to calculate the difference between the two versions. It does not communicate with the cluster and, as a result, does not know what version the cluster was upgraded from.
+- We published image lists for each version released since v1.1.0. For clusters older than v1.1.0, users have to clean up the old images manually.
+
+:::

From d249e5ea8b944878ec3ba42d77521ca5c6ca584d Mon Sep 17 00:00:00 2001
From: Lucas Saintarbor
Date: Wed, 19 Jul 2023 13:37:37 -0700
Subject: [PATCH 2/2] Update docs/faq.md

Adding @starbops feedback

Co-authored-by: Zespre Chang
---
 docs/faq.md                        | 41 ++++++++++++++++++++++++++----
 versioned_docs/version-v1.1/faq.md | 41 ++++++++++++++++++++++++++----
 2 files changed, 72 insertions(+), 10 deletions(-)

diff --git a/docs/faq.md b/docs/faq.md
index 354d9d1c1fd..9a20d1a6463 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -51,12 +51,43 @@ As of Harvester v1.0.2, we no longer support adding additional partitioned disks
 
 ### Why are there some Harvester pods that become ErrImagePull/ImagePullBackOff?
 
-This is likely because your Harvester cluster is an air-gapped setup, and some pre-loaded container images are missing. Kubernetes has a mechanism that does garbage collection against bloated image stores. When the partition which stores container images is over 85% full, `kubelet` will try to prune some least used images to save disk space until the occupancy is lower than 80%. These numbers (85% and 80%) are default High/Low thresholds that come with Kubernetes.
+This is likely because your Harvester cluster is an air-gapped setup, and some pre-loaded container images are missing. Kubernetes has a mechanism that does garbage collection against bloated image stores. When the partition which stores container images is over 85% full, `kubelet` tries to prune the images based on the last time they were used, starting with the oldest, until the occupancy is lower than 80%. These numbers (85% and 80%) are default High/Low thresholds that come with Kubernetes.
 
 To recover from this state, do one of the following depending on the cluster's configuration:
-- Pull the missing images from sources outside of the cluster (if it's an air-gapped environment, you might need to set up an HTTP proxy beforehand)
-- Manually import the images from the Harvester ISO image
-- Find the missing images on one node on the other nodes, and then export the images from the node still with them and import them on the missing one
+- Pull the missing images from sources outside of the cluster (if it's an air-gapped environment, you might need to set up an HTTP proxy beforehand).
+- Manually import the images from the Harvester ISO image.
+
+:::note
+
+Taking v1.1.2 as an example, download the Harvester ISO image from the official URL. Then extract the image list from the ISO image to decide which image tarball to import. For instance, we want to import the missing container image `rancher/harvester-upgrade`:
+
+```shell
+$ curl -sfL https://releases.rancher.com/harvester/v1.1.2/harvester-v1.1.2-amd64.iso -o harvester.iso
+
+$ xorriso -osirrox on -indev harvester.iso -extract /bundle/harvester/images-lists images-lists
+
+$ grep -R "rancher/harvester-upgrade" images-lists/
+images-lists/harvester-images-v1.1.2.txt:docker.io/rancher/harvester-upgrade:v1.1.2
+```
+
+Find out the location of the image tarball, and extract it from the ISO image. Then decompress the extracted zstd image tarball.
+
+```shell
+$ xorriso -osirrox on -indev harvester.iso -extract /bundle/harvester/images/harvester-images-v1.1.2.tar.zst harvester.tar.zst
+
+$ zstd -d --rm harvester.tar.zst
+```
+
+Upload the image tarball to the Harvester nodes that need to be recovered. Finally, execute the following commands to import the container images on each of them.
+
+```shell
+$ ctr -n k8s.io images import harvester.tar
+$ rm harvester.tar
+```
+
+:::
+
+- Find the images that are missing from the affected node on the other nodes, then export them from a node where they still exist and import them on the affected node.
 
 To prevent this from happening, we recommend cleaning up unused container images from the previous version after each successful Harvester upgrade if the image store disk space is stressed. We provided a [harv-purge-images script](https://github.com/harvester/upgrade-helpers/blob/main/bin/harv-purge-images.sh) that makes cleaning up disk space easy, especially for container image storage. The script has to be executed on each Harvester node. For example, if the cluster was originally in v1.1.2, and now it gets upgraded to v1.2.0, you can do the following to discard the container images that are only used in v1.1.2 but no longer needed in v1.2.0:
 
@@ -68,6 +99,6 @@ $ ./harv-purge-images.sh v1.1.2 v1.2.0
 :::caution
 
 - The script only downloads the image lists and compares the two to calculate the difference between the two versions. It does not communicate with the cluster and, as a result, does not know what version the cluster was upgraded from.
-- We published image lists for each version released since v1.1.0. For clusters older than v1.1.0, users have to clean up the old images manually.
+- We published image lists for each version released since v1.1.0. For clusters older than v1.1.0, you have to clean up the old images manually.
 
 :::
diff --git a/versioned_docs/version-v1.1/faq.md b/versioned_docs/version-v1.1/faq.md
index 486ca7778f3..64492ab8399 100644
--- a/versioned_docs/version-v1.1/faq.md
+++ b/versioned_docs/version-v1.1/faq.md
@@ -51,12 +51,43 @@ As of Harvester v1.0.2, we no longer support adding additional partitioned disks
 
 ### Why are there some Harvester pods that become ErrImagePull/ImagePullBackOff?
 
-This is likely because your Harvester cluster is an air-gapped setup, and some pre-loaded container images are missing. Kubernetes has a mechanism that does garbage collection against bloated image stores. When the partition which stores container images is over 85% full, `kubelet` will try to prune some least used images to save disk space until the occupancy is lower than 80%. These numbers (85% and 80%) are default High/Low thresholds that come with Kubernetes.
+This is likely because your Harvester cluster is an air-gapped setup, and some pre-loaded container images are missing. Kubernetes has a mechanism that does garbage collection against bloated image stores. When the partition which stores container images is over 85% full, `kubelet` tries to prune the images based on the last time they were used, starting with the oldest, until the occupancy is lower than 80%. These numbers (85% and 80%) are default High/Low thresholds that come with Kubernetes.
 
 To recover from this state, do one of the following depending on the cluster's configuration:
-- Pull the missing images from sources outside of the cluster (if it's an air-gapped environment, you might need to set up an HTTP proxy beforehand)
-- Manually import the images from the Harvester ISO image
-- Find the missing images on one node on the other nodes, and then export the images from the node still with them and import them on the missing one
+- Pull the missing images from sources outside of the cluster (if it's an air-gapped environment, you might need to set up an HTTP proxy beforehand).
+- Manually import the images from the Harvester ISO image.
+
+:::note
+
+Taking v1.1.2 as an example, download the Harvester ISO image from the official URL. Then extract the image list from the ISO image to decide which image tarball to import. For instance, we want to import the missing container image `rancher/harvester-upgrade`:
+
+```shell
+$ curl -sfL https://releases.rancher.com/harvester/v1.1.2/harvester-v1.1.2-amd64.iso -o harvester.iso
+
+$ xorriso -osirrox on -indev harvester.iso -extract /bundle/harvester/images-lists images-lists
+
+$ grep -R "rancher/harvester-upgrade" images-lists/
+images-lists/harvester-images-v1.1.2.txt:docker.io/rancher/harvester-upgrade:v1.1.2
+```
+
+Find out the location of the image tarball, and extract it from the ISO image. Then decompress the extracted zstd image tarball.
+
+```shell
+$ xorriso -osirrox on -indev harvester.iso -extract /bundle/harvester/images/harvester-images-v1.1.2.tar.zst harvester.tar.zst
+
+$ zstd -d --rm harvester.tar.zst
+```
+
+Upload the image tarball to the Harvester nodes that need to be recovered. Finally, execute the following commands to import the container images on each of them.
+
+```shell
+$ ctr -n k8s.io images import harvester.tar
+$ rm harvester.tar
+```
+
+:::
+
+- Find the images that are missing from the affected node on the other nodes, then export them from a node where they still exist and import them on the affected node.
 
 To prevent this from happening, we recommend cleaning up unused container images from the previous version after each successful Harvester upgrade if the image store disk space is stressed. We provided a [harv-purge-images script](https://github.com/harvester/upgrade-helpers/blob/main/bin/harv-purge-images.sh) that makes cleaning up disk space easy, especially for container image storage. The script has to be executed on each Harvester node. For example, if the cluster was originally in v1.1.1, and now it gets upgraded to v1.1.2, you can do the following to discard the container images that are only used in v1.1.1 but no longer needed in v1.1.2:
 
@@ -68,6 +99,6 @@ $ ./harv-purge-images.sh v1.1.1 v1.1.2
 :::caution
 
 - The script only downloads the image lists and compares the two to calculate the difference between the two versions. It does not communicate with the cluster and, as a result, does not know what version the cluster was upgraded from.
-- We published image lists for each version released since v1.1.0. For clusters older than v1.1.0, you have to clean up the old images manually.
+- We published image lists for each version released since v1.1.0. For clusters older than v1.1.0, you have to clean up the old images manually.
 
 :::
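
As a rough companion to the garbage-collection explanation in the FAQ entry added by this series, the sketch below reports how full a filesystem is and compares the result with the 85% high threshold quoted there. It is only an illustration under stated assumptions: the `IMAGE_STORE` variable, the `usage_pct` and `classify` helpers, and the default path `/` are hypothetical and not part of Harvester or the `harv-purge-images` script. Point `IMAGE_STORE` at whatever path actually holds the container images on your nodes.

```shell
#!/bin/sh
# Hypothetical helper, not part of Harvester: compare a filesystem's usage
# with the default image GC thresholds quoted in the FAQ (85% high, 80% low).
HIGH_THRESHOLD=85
LOW_THRESHOLD=80

# Print the usage percentage (integer, without the % sign) of the
# filesystem that holds the path given as $1.
usage_pct() {
  df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Classify a usage percentage. The FAQ describes GC as starting once usage
# is over the high threshold; this sketch flags usage at or above it.
classify() {
  if [ "$1" -ge "$HIGH_THRESHOLD" ]; then
    echo "gc-expected"
  else
    echo "ok"
  fi
}

# Assumption: "/" is only a placeholder default for the image store path.
IMAGE_STORE="${IMAGE_STORE:-/}"
pct=$(usage_pct "$IMAGE_STORE")
echo "$IMAGE_STORE is ${pct}% full ($(classify "$pct")); GC would aim for below ${LOW_THRESHOLD}%"
```

Running the script on each node before and after an upgrade gives a quick indication of whether cleaning up old images is worth doing.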