Bug: Timed Out Problem with Edge Cluster Agent Installation #4105

Open
szeyan543 opened this issue Jul 9, 2024 · 4 comments

@szeyan543

Describe the bug.

During the edge cluster agent installation, the process times out while waiting for the agent deployment to complete. The installation proceeds as expected through creating the required resources, but the deployment does not become ready within the allotted 300 seconds, and the install fails with a timeout error.

2024-07-09 11:31:23 cronjob auto-upgrade-cronjob created
2024-07-09 11:31:24 persistentvolumeclaim/openhorizon-agent-pvc created
2024-07-09 11:31:24 persistent volume claim created
2024-07-09 11:31:25 deployment.apps/agent created
2024-07-09 11:31:25 Waiting up to 300 seconds for the agent deployment to complete...
error: timed out waiting for the condition
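
(For anyone hitting the same error, the commands below are a generic sketch of how to see why the deployment never becomes ready; the openhorizon-agent namespace matches the output later in this thread:

kubectl -n openhorizon-agent get pods
kubectl -n openhorizon-agent describe deploy/agent
kubectl -n openhorizon-agent get events --sort-by=.metadata.creationTimestamp

A pod stuck in ImagePullBackOff usually points at the image registry, while a pod stuck in Pending usually points at the persistent volume claim.)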

Describe the steps to reproduce the behavior.

No response

Expected behavior.

No response

Screenshots.

No response

Operating Environment

Linux

Additional Information

No response

@szeyan543 szeyan543 added the bug label Jul 9, 2024
@szeyan543 (Author)

@joewxboy

@dlarson04 (Contributor)

Responded with the following in the LF Edge messaging app:

Hi,
The only time I have seen this is when something goes wrong with the persistent volume claims. Please run the following commands and paste the results:

kubectl get storageclasses

and

kubectl get persistentvolumeclaims -A

and

kubectl -n <namespace_name> get deploy/agent -o=jsonpath='{$.spec.template.spec.containers[*].image}'; echo ""

@szeyan543 (Author)

Hello, here are the results from the commands:

root@k:~# kubectl get storageclasses
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  14h

root@k:~# kubectl get persistentvolumeclaims -A
NAMESPACE           NAME                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
default             docker-registry-pvc     Bound    pvc-f1e564ad-ee0c-49a7-b688-184a06064550   10Gi       RWO            local-path     <unset>                 14h
openhorizon-agent   openhorizon-agent-pvc   Bound    pvc-dcd879b7-8d93-4025-982e-8f8582cf6eee   10Gi       RWO            local-path     <unset>                 5m31s

root@k:~# kubectl -n openhorizon-agent get deploy/agent -o=jsonpath='{$.spec.template.spec.containers[*].image}'; echo ""
10.43.195.246:5000/openhorizon-agent/amd64_anax_k8s:latest
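
(As a side check, one way to confirm whether that in-cluster registry is being served over plain HTTP rather than HTTPS, assuming it exposes the standard Docker registry v2 API, is:

curl -v http://10.43.195.246:5000/v2/_catalog

If the registry answers over HTTP while the container runtime expects HTTPS, image pulls for the agent deployment will fail and the deployment will never become ready.)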

@dlarson04 (Contributor)

We debugged this and there was a mismatch between HTTP and HTTPS on the k3s cluster. Fixing the cluster resolved the problem.
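
(The exact change is not recorded in this issue. A common way to resolve this kind of HTTP/HTTPS mismatch on k3s is to tell the container runtime that the local registry speaks plain HTTP via /etc/rancher/k3s/registries.yaml; the registry address below is taken from the earlier output, and the rest is a hedged sketch rather than the fix applied here:

cat <<'EOF' | sudo tee /etc/rancher/k3s/registries.yaml
mirrors:
  "10.43.195.246:5000":
    endpoint:
      - "http://10.43.195.246:5000"
EOF
sudo systemctl restart k3s

After k3s restarts with the registry marked as HTTP, re-running the agent installation should allow the deployment to pull its image and become ready.)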
