Describe the bug
In #654 we introduced the ability to view the ingester ring state via an endpoint on ingesters. However, this endpoint uses the heartbeat period in place of the heartbeat timeout when rendering the page, which is a bug. The heartbeat period is often much shorter than the heartbeat timeout (-ingester.ring.heartbeat-period defaults to 5s, while -ingester.ring.heartbeat-timeout defaults to 1min), so ingesters show up as "Unhealthy" even when they are not.
Notice how the lifecycler passes HeartbeatPeriod to the ring handler, while "ring.go", used by the distributor, passes the correct HeartbeatTimeout.
To Reproduce
1. Start Mimir (SHA or version).
2. Access the /ingester/ring endpoint on ingesters and notice that ingesters are reported as unhealthy.
3. Access the /ingester/ring endpoint on distributors; the same ingesters look healthy there.
Expected behavior
The ring page should show the same information whether it is accessed via distributors or ingesters. In particular, ingesters should not use the heartbeat period in place of the heartbeat timeout.
Additional Context
Link to the code: https://github.com/grafana/dskit/blob/25baa36b7a6fca2025c4e15d7e95d4810c85fa16/ring/http.go#L109
Original public Slack thread: https://grafana.slack.com/archives/C039863E8P7/p1653301187771079