Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Known issue csi #2889

Merged
merged 9 commits into from
May 22, 2024
25 changes: 13 additions & 12 deletions docs/docs-content/release-notes/known-issues.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,18 +14,19 @@ to review and stay informed about the status of known issues in Palette. As issu

The following table lists all known issues that are currently active and affecting users.

| Description | Workaround | Publish Date | Product Component |
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------- | ----------------- |
| Conducting cluster node scaling operations on a cluster undergoing a backup can lead to issues and potential unresponsiveness. | To avoid this, ensure no backup operations are in progress before scaling nodes or performing other cluster operations that change the cluster state | April 14, 2024 | Clusters |
| Palette automatically creates an AWS security group for worker nodes using the format `<cluster-name>-node`. If a security group with the same name already exists in the VPC, the cluster creation process fails. | To avoid this, ensure that no security group with the same name exists in the VPC before creating a cluster. | April 14, 2024 | Clusters |
| K3s version 1.27.7 has been marked as _Deprecated_. This version has a known issue that causes clusters to crash. | Upgrade to a newer version of K3s to avoid the issue, such as versions 1.26.12, 1.28.5, and 1.27.11. You can learn more about the issue in the [K3s GitHub issue](https://github.com/k3s-io/k3s/issues/9047) page. | April 14, 2024 | Packs, Clusters |
| When deploying a multi-node AWS EKS cluster with the Container Network Interface (CNI) [Calico](../integrations/calico.md), the cluster deployment fails. | A workaround is to use the AWS VPC CNI in the interim while the issue is resolved. | April 14, 2024 | Packs, Clusters |
| If a Kubernetes cluster deployed onto VMware is deleted, and later re-created with the same name, the cluster creation process fails. The issue is caused by existing resources remaining inside the PCG, or the System PCG, that are not cleaned up during the cluster deletion process. | Refer to the [VMware Resources Remain After Cluster Deletion](../troubleshooting/pcg.md#scenario---vmware-resources-remain-after-cluster-deletion) troubleshooting guide for resolution steps. | April 14, 2024 | Clusters |
| In a VMware environment, self-hosted Palette instances do not receive a unique cluster ID when deployed, which can cause issues during a node repave event, such as a Kubernetes version upgrade. Specifically, Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) will experience start problems due to the lack of a unique cluster ID. | To resolve this issue, refer to the [Volume Attachment Errors Volume in VMware Environment](../troubleshooting/palette-upgrade.md#volume-attachment-errors-volume-in-vmware-environment) troubleshooting guide. | April 14, 2024 | Self-Hosted |
| Day-2 operations related to infrastructure changes, such as modifying the node size and count, when using MicroK8s are not taking effect. | No workaround is available. | April 14, 2024 | Packs, Clusters |
| If a cluster that uses the Rook-Ceph pack experiences network issues, it's possible for the file mount to become and remain unavailable even after the network is restored. | This a known issue disclosed in the [Rook GitHub repository](https://github.com/rook/rook/issues/13818). To resolve this issue, refer to [Rook-Ceph](../integrations/rook-ceph.md#file-mount-becomes-unavailable-after-cluster-experiences-network-issues) pack documentation. | April 14, 2024 | Packs, Edge |
| Edge clusters on Edge hosts with ARM64 processors may experience instability issues that cause cluster failures. | ARM64 support is limited to a specific set of Edge devices. Currently, Nvidia Jetson devices are supported. | April 14, 2024 | Edge |
| During the cluster provisioning process of new edge clusters, the palette webhook pods may not always deploy successfully, causing the cluster to be stuck in the provisioning phase. This issue does not impact deployed clusters. | Review the [Palette Webhook Pods Fail to Start](../troubleshooting/edge.md#scenario---palette-webhook-pods-fail-to-start) troubleshooting guide for resolution steps. | April 14, 2024 | Edge |
| Description | Workaround | Publish Date | Product Component |
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | ----------------- |
| Deploying a self-hosted Palette or VerteX, to a vSphere environment fails if vCenter has standalone hosts directly under a Datacenter. Persistent Volume (PV) provisioning fails due to an upstream issue with the vSphere Container Storage Interface (CSI) for all versions before v3.2.0. Palette and VerteX use the vSphere CSI version 3.12 internally | Remove standalone hosts directly under the Datacenter from vCenter and allow the volume provisioning to complete. After the volume is provisioned, you can add the standalone hosts back. You can also use a service account that does not have access to the standalone hosts as the user that deployed Palette. | May 21, 2024 | Self-Hosted |
karl-cardenas-coding marked this conversation as resolved.
Show resolved Hide resolved
karl-cardenas-coding marked this conversation as resolved.
Show resolved Hide resolved
| Conducting cluster node scaling operations on a cluster undergoing a backup can lead to issues and potential unresponsiveness. | To avoid this, ensure no backup operations are in progress before scaling nodes or performing other cluster operations that change the cluster state | April 14, 2024 | Clusters |
| Palette automatically creates an AWS security group for worker nodes using the format `<cluster-name>-node`. If a security group with the same name already exists in the VPC, the cluster creation process fails. | To avoid this, ensure that no security group with the same name exists in the VPC before creating a cluster. | April 14, 2024 | Clusters |
| K3s version 1.27.7 has been marked as _Deprecated_. This version has a known issue that causes clusters to crash. | Upgrade to a newer version of K3s to avoid the issue, such as versions 1.26.12, 1.28.5, and 1.27.11. You can learn more about the issue in the [K3s GitHub issue](https://github.com/k3s-io/k3s/issues/9047) page. | April 14, 2024 | Packs, Clusters |
| When deploying a multi-node AWS EKS cluster with the Container Network Interface (CNI) [Calico](../integrations/calico.md), the cluster deployment fails. | A workaround is to use the AWS VPC CNI in the interim while the issue is resolved. | April 14, 2024 | Packs, Clusters |
| If a Kubernetes cluster deployed onto VMware is deleted, and later re-created with the same name, the cluster creation process fails. The issue is caused by existing resources remaining inside the PCG, or the System PCG, that are not cleaned up during the cluster deletion process. | Refer to the [VMware Resources Remain After Cluster Deletion](../troubleshooting/pcg.md#scenario---vmware-resources-remain-after-cluster-deletion) troubleshooting guide for resolution steps. | April 14, 2024 | Clusters |
| In a VMware environment, self-hosted Palette instances do not receive a unique cluster ID when deployed, which can cause issues during a node repave event, such as a Kubernetes version upgrade. Specifically, Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) will experience start problems due to the lack of a unique cluster ID. | To resolve this issue, refer to the [Volume Attachment Errors Volume in VMware Environment](../troubleshooting/palette-upgrade.md#volume-attachment-errors-volume-in-vmware-environment) troubleshooting guide. | April 14, 2024 | Self-Hosted |
karl-cardenas-coding marked this conversation as resolved.
Show resolved Hide resolved
karl-cardenas-coding marked this conversation as resolved.
Show resolved Hide resolved
| Day-2 operations related to infrastructure changes, such as modifying the node size and count, when using MicroK8s are not taking effect. | No workaround is available. | April 14, 2024 | Packs, Clusters |
| If a cluster that uses the Rook-Ceph pack experiences network issues, it's possible for the file mount to become and remain unavailable even after the network is restored. | This a known issue disclosed in the [Rook GitHub repository](https://github.com/rook/rook/issues/13818). To resolve this issue, refer to [Rook-Ceph](../integrations/rook-ceph.md#file-mount-becomes-unavailable-after-cluster-experiences-network-issues) pack documentation. | April 14, 2024 | Packs, Edge |
| Edge clusters on Edge hosts with ARM64 processors may experience instability issues that cause cluster failures. | ARM64 support is limited to a specific set of Edge devices. Currently, Nvidia Jetson devices are supported. | April 14, 2024 | Edge |
karl-cardenas-coding marked this conversation as resolved.
Show resolved Hide resolved
| During the cluster provisioning process of new edge clusters, the palette webhook pods may not always deploy successfully, causing the cluster to be stuck in the provisioning phase. This issue does not impact deployed clusters. | Review the [Palette Webhook Pods Fail to Start](../troubleshooting/edge.md#scenario---palette-webhook-pods-fail-to-start) troubleshooting guide for resolution steps. | April 14, 2024 | Edge |

## Resolved Known Issues

Expand Down
Loading