supervisor does no retry to connect to opamp server forever #33408

Closed
cforce opened this issue Jun 6, 2024 · 5 comments · Fixed by #34159
Labels
bug Something isn't working cmd/opampsupervisor needs triage New item requiring triage

Comments

@cforce

cforce commented Jun 6, 2024

Component(s)

cmd/opampsupervisor

What happened?

When the supervisor starts and cannot reach the OpAMP backend because of connectivity errors, it should not give up trying to connect.
At the very least, the timeout before giving up should be configurable.
For resilience in an unstable (e.g. cellular) network environment, the current behavior otherwise requires an external scheduler such as systemd to restart the supervisor, instead of the supervisor retrying by itself. Such external restarts would also increase CPU load. An "endless" loop with a retry timeout is the best practice for retries in client-to-server communication.

Despite the errors, the log indicates that retries are happening (e.g., the "will retry" message). However, if it seems like it's not retrying, it might be due to:

Immediate failures: The connection attempts might be failing in such quick succession that it appears as if there is no retry mechanism.
There might be configuration settings that limit or control the retry behavior which I don't know about. Why does the supervisor have such fixed (instead of unlimited) retry policies or limits?
I assume the supervisor code is written this way to handle error situations, but it should retry resiliently.

Collector version

0.101

Environment information

No response

OpenTelemetry Collector configuration

No response

Log output

2024-06-05T08:03:37.098+0200    DEBUG   commander/commander.go:74       Starting agent  {"agent": "./otelcollector"}
2024-06-05T08:03:37.100+0200    DEBUG   commander/commander.go:93       Agent process started   {"pid": 196962}
2024-06-05T08:03:37.262+0200    DEBUG   commander/commander.go:160      Stopping agent process  {"pid": 196962}
2024-06-05T08:03:37.267+0200    DEBUG   supervisor/logger.go:21 Agent disconnected: websocket: close 1000 (normal): Normal closure
2024-06-05T08:03:37.272+0200    DEBUG   commander/commander.go:176      Agent process successfully stopped.     {"pid": 196962}
2024-06-05T08:03:37.272+0200    DEBUG   supervisor/supervisor.go:151    Supervisor starting     {"id": "01HZKFXTR406AFVGQT5ZYC0GEK"}
2024-06-05T08:03:37.272+0200    DEBUG   supervisor/supervisor.go:369    Connecting to OpAMP server...   {"endpoint": "ws://xxx:XX/v1/opamp", "headers": {"Agent-ID":[""],"Authorization":["Secret-Key XXXXXXXXXXXXXXXXXXXXXXXXX"]}}
2024-06-05T08:03:37.272+0200    DEBUG   supervisor/supervisor.go:419    Starting OpAMP client...
2024-06-05T08:03:37.272+0200    DEBUG   supervisor/supervisor.go:426    OpAMP Client started.
2024-06-05T08:03:37.274+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:37.274+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:37.274+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:37.540+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:37.540+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:37.540+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:38.023+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:38.023+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:38.023+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:39.100+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:39.100+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:39.100+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:40.228+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:40.228+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:40.228+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:42.295+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:42.295+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:42.295+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:46.047+0200    ERROR   supervisor/supervisor.go:381    Failed to connect to the server {"error": "websocket: bad handshake"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*Supervisor).startOpAMP.func2
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/supervisor.go:381
github.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnConnectFailed
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/types/callbacks.go:147
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:144
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:46.047+0200    ERROR   supervisor/logger.go:26 Server responded with status=404 Not Found
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).tryConnectOnce
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:166
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:201
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:46.047+0200    ERROR   supervisor/logger.go:26 Connection failed (websocket: bad handshake), will retry.
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor.(*opAMPLogger).Errorf
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/supervisor/logger.go:26
github.com/open-telemetry/opamp-go/client.(*wsClient).ensureConnected
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:207
github.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:245
github.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/wsclient.go:330
github.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1
        /root/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.14.0/client/internal/clientcommon.go:197
2024-06-05T08:03:47.273+0200    ERROR   opampsupervisor/main.go:24      failed to connect to the OpAMP server: %!w(<nil>)
main.main
        /builds/otelcollector/opentelemetry-collector-contrib/cmd/opampsupervisor/main.go:24
runtime.main
        /usr/local/go/src/runtime/proc.go:267

Additional context

No response

@cforce cforce added bug Something isn't working needs triage New item requiring triage labels Jun 6, 2024
Contributor

github-actions bot commented Jun 6, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@JaredTan95
Member

Do you mean that when the connection to the OpAMP server fails, the supervisor should exit the process?

@cforce
Author

cforce commented Jun 9, 2024

The opposite: it shall never exit, but retry to (re)connect forever.

@tigrannajaryan
Member

The opposite: it shall never exit, but retry to (re)connect forever.

+1. This is the intent.

@cforce
Author

cforce commented Jun 26, 2024

Will that MR solve it? It seems the review is on your side, @tigrannajaryan ;)

#33275

3 participants