docs: add new vsphere known issue (#3896)
* docs: first draft

* Apply suggestions from code review

Co-authored-by: Karl Cardenas <29551334+karl-cardenas-coding@users.noreply.github.com>

* docs: add clarifying details

* docs: move old content to deprecated

* docs: clarify known issue not automatically resolved

* docs: add warning to more obvious places

* docs: add link to warning and xlinks

* Apply suggestions from code review

Co-authored-by: Karl Cardenas <29551334+karl-cardenas-coding@users.noreply.github.com>

* ci: auto-formatting prettier issues

---------

Co-authored-by: Lenny Chen <lenny.chen@spectrocloud.com>
Co-authored-by: Karl Cardenas <29551334+karl-cardenas-coding@users.noreply.github.com>
Co-authored-by: lennessyy <lennessyy@users.noreply.github.com>
(cherry picked from commit b46a973)
lennessyy committed Sep 13, 2024
1 parent ef93ecb commit 96d50a9
Showing 10 changed files with 247 additions and 143 deletions.
126 changes: 126 additions & 0 deletions docs/deprecated/troubleshooting/enterprise-install.md
@@ -0,0 +1,126 @@
## Volume Attachment Errors in VMware Environment

If you deployed Palette in a VMware vSphere environment and are experiencing volume attachment errors for the MongoDB
pods during the upgrade process, it may be due to duplicate resources in the cluster causing resource creation errors.
Palette versions between 4.0.0 and 4.3.0 are affected by a known issue where cluster resources are not receiving unique
IDs. Use the following steps to correctly identify the issue and resolve it.

### Debug Steps

1. Open up a terminal session in an environment that has network access to the Kubernetes cluster.

2. Configure the kubectl CLI to connect to the Kubernetes cluster of the self-hosted Palette or VerteX instance. Refer to
[Access Cluster with CLI](../clusters/cluster-management/palette-webctl.md) for additional guidance.

3. Verify the MongoDB pods are not starting correctly by issuing the following command.

```shell
kubectl get pods --namespace=hubble-system --selector='app=spectro,role=mongo'
```

```shell {4} hideClipboard
NAME READY STATUS RESTARTS AGE
mongo-0 2/2 Running 0 17h
mongo-1 2/2 Running 0 17h
mongo-2 0/2 ContainerCreating 0 16m
```

4. Inspect the pod that is not starting correctly. Use the following command to describe the pod. Replace `mongo-2`
with the name of the pod that is not starting.

```shell
kubectl describe pod mongo-2 --namespace=hubble-system
```

5. Review the event output for any errors. If an error related to the volume attachment is present, proceed to the next
step.

```shell hideClipboard
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedAttachVolume 106s (x16 over 18m) attachdetach-controller AttachVolume.Attach failed for volume "pvc-94cbb8f5-9145-4b18-9bf9-ee027b64d0c7" : volume attachment is being deleted
Warning FailedMount 21s (x4 over 16m) kubelet Unable to attach or mount volumes: unmounted volumes=[mongo-data], unattached volumes=[spectromongokey kube-api-access-sz5lz mongo-data spectromongoinit spectromongopost]: timed out waiting for the condition
```

6. The remaining steps may need to be performed on all MongoDB pods and their associated Persistent Volume (PV) and
Persistent Volume Claim (PVC). Complete the steps sequentially for each MongoDB pod that encounters the volume
attachment error.

:::warning

Only do the steps for one MongoDB pod at a time to prevent data loss. Wait for the pod to come up correctly before
proceeding to the next pod.

:::

7. Delete the PVC associated with the MongoDB pod. Replace `mongo-2` with the name of the pod that is not starting.

```shell
kubectl delete pvc mongo-data-mongo-2 --namespace=hubble-system
```

8. Identify the PV associated with the MongoDB pod. Use the following command to find the PV bound to the PVC of the
MongoDB pod you started with. In this example, the PV associated with `mongo-2` is
`pvc-94cbb8f5-9145-4b18-9bf9-ee027b64d0c7`. Make a note of this name.

```shell
kubectl get pv | grep 'mongo-data-mongo-2'
```

```shell hideClipboard
pvc-94cbb8f5-9145-4b18-9bf9-ee027b64d0c7 20Gi RWO Delete Bound hubble-system/mongo-data-mongo-2 spectro-storage-class 18h
```

9. Using the PV name from the previous step, delete the PV.

```shell
kubectl delete pv pvc-94cbb8f5-9145-4b18-9bf9-ee027b64d0c7
```

:::tip

The kubectl command may hang after you issue the delete command. If it does, press `Ctrl+C` to exit the command and
proceed to the next step.

:::

10. Delete the MongoDB pod that was not starting correctly. Replace `mongo-2` with the name of the pod that is not
starting.

```shell
kubectl delete pod mongo-2 --namespace=hubble-system
```

11. Wait for the pod to come up correctly. Use the following command to verify the pod is up and available.

```shell
kubectl get pods --namespace=hubble-system --selector='app=spectro,role=mongo'
```

```shell {4} hideClipboard
NAME READY STATUS RESTARTS AGE
mongo-0 2/2 Running 0 18h
mongo-1 2/2 Running 0 18h
mongo-2 2/2 Running 0 68s
```

:::warning

Once the pod is in the **Running** status, wait for at least five minutes for the replication to complete before
proceeding with the other pods.

:::

Palette will proceed with the upgrade and attempt to upgrade the remaining MongoDB pods. Repeat the steps for each
of the MongoDB pods that are not starting correctly due to the volume attachment error.

The upgrade process will continue once all MongoDB pods are up and available. Verify the new nodes deployed
successfully by checking the status of the nodes. Log in to the
[system console](../enterprise-version/system-management/system-management.md#access-the-system-console), navigate
to the left **Main Menu** and select **Enterprise Cluster**. The **Nodes** tab will display the status of the nodes in
the cluster.
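
The pod check in steps 3 and 11 and the PV lookup in step 8 can be scripted so that the affected pod and its volume are
picked out by exact column match instead of by eye. The following is a minimal sketch, not part of Palette or kubectl.
The function names `pods_not_running`, `pvc_for_pod`, and `pv_for_claim` are hypothetical helpers, and the
`mongo-data-<pod>` PVC naming is an assumption taken from the `mongo-data-mongo-2` example in step 7.

```shell
# Hypothetical helpers (not shipped with Palette). Each one reads kubectl table output on stdin.

# Print the names of pods whose STATUS column (field 3) is not "Running".
# Expects `kubectl get pods` output and skips the header row.
pods_not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Print the PVC name for a MongoDB pod.
# Assumption: PVCs follow the mongo-data-<pod> pattern shown in step 7.
pvc_for_pod() {
  printf 'mongo-data-%s\n' "$1"
}

# Print the name of the PV whose CLAIM column (field 6) equals the given namespace/claim.
# Expects `kubectl get pv` output. The comparison is exact, so a claim ending in
# "mongo-2" does not also match one ending in "mongo-20", which a plain grep would.
pv_for_claim() {
  awk -v claim="$1" '$6 == claim { print $1 }'
}

# Example usage against a live cluster (commented out so the helpers can be sourced safely):
#   kubectl get pods --namespace=hubble-system --selector='app=spectro,role=mongo' | pods_not_running
#   kubectl get pv | pv_for_claim "hubble-system/$(pvc_for_pod mongo-2)"
```

The helpers only read and filter output. The delete commands in steps 7, 9, and 10 are intentionally left as manual
actions so that you still handle one pod at a time, as the warning in step 6 requires.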

![A view of three nodes in a healthy status](/troubleshootig_palette-upgrade_nodes-healthy.webp)

If you continue to encounter issues, contact our support team by emailing
[support@spectrocloud.com](mailto:support@spectrocloud.com) so that we can provide you with further guidance.
9 changes: 9 additions & 0 deletions docs/docs-content/enterprise-version/upgrade/upgrade-notes.md
@@ -52,6 +52,15 @@ Palette 4.0 includes the following major enhancements that require user interven

### Upgrade with VMware

:::warning

A known issue impacts all self-hosted Palette instances older than 4.4.14. Before upgrading a Palette instance with a
version older than 4.4.14, ensure that you execute a utility script to make all your cluster IDs unique in your
Persistent Volume Claim (PVC) metadata. For more information, refer to the
[Troubleshooting Guide](../../troubleshooting/enterprise-install.md#non-unique-vsphere-cns-mapping).

:::

From the Palette system console, click the **Update version** button. Palette will be temporarily unavailable while
system services update.

@@ -9,13 +9,16 @@ keywords: ["self-hosted", "enterprise"]
---

This guide takes you through the process of upgrading a self-hosted airgap Palette instance installed on VMware vSphere.

:::warning

Before upgrading Palette to a new major version, you must first update it to the latest patch version of the latest
minor version available. Refer to the [Supported Upgrade Paths](../upgrade.md#supported-upgrade-paths) section for
details.

:::warning

If you are upgrading from a Palette version that is older than 4.4.14, ensure that you have executed the utility script
to make the CNS mapping unique for the associated PVC. For more information, refer to the
[Troubleshooting guide](../../../troubleshooting/enterprise-install.md#non-unique-vsphere-cns-mapping).

:::

If your setup includes a PCG, you must also
@@ -8,13 +8,15 @@ tags: ["palette", "self-hosted", "vmware", "non-airgap", "upgrade"]
keywords: ["self-hosted", "enterprise"]
---

This guide takes you through the process of upgrading a self-hosted Palette instance installed on VMware vSphere.
This guide takes you through the process of upgrading a self-hosted Palette instance installed on VMware vSphere. Before
upgrading Palette to a new major version, you must first update it to the latest patch version of the latest minor
version available. Refer to the [Supported Upgrade Paths](../upgrade.md#supported-upgrade-paths) section for details.

:::warning

Before upgrading Palette to a new major version, you must first update it to the latest patch version of the latest
minor version available. Refer to the [Supported Upgrade Paths](../upgrade.md#supported-upgrade-paths) section for
details.
If you are upgrading from a Palette version that is older than 4.4.14, ensure that you have executed the utility script
to make the CNS mapping unique for the associated PVC. For more information, refer to the
[Troubleshooting guide](../../../troubleshooting/enterprise-install.md#non-unique-vsphere-cns-mapping).

:::
