Prefer querying ingester zones with the least number of non-ACTIVE ingesters #6727

Merged: 3 commits merged into main from charleskorn/prioritise-non-pending-zones on Nov 26, 2023

@charleskorn charleskorn commented Nov 24, 2023

What this PR does

This PR builds on #6726 and tries to mitigate the issue described near the end of that PR's description:

Note that the downside of this change is that queriers will attempt to query ingesters that are possibly still starting up, and may not be ready to receive requests. In this case, hedging will cause the querier to try other ingesters, if available, after at most 2s, or sooner if the connection to the ingester fails before hedging is triggered, and one of the following will happen:

  • If ingesters in other zones are healthy, then the impact is simply increased latency. An improvement for this would be to prioritize querying zones where all ingesters are ACTIVE, but I will add this in a follow up PR.
  • ...

This PR mitigates the issue by prioritising ingester zones with the fewest non-ACTIVE ingesters when querying. A querier is then less likely to choose a zone containing a PENDING ingester, and so less likely to send a request to an ingester that is still starting up, have that request fail, and need to initiate requests to a third zone.
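
As a rough illustration of the prioritisation idea only (this is not the code from this PR or from grafana/dskit#440), the ordering could be sketched as below. The `Ingester` type, the `InstanceState` constants, and the `zonesByFewestNonActive` function are hypothetical names invented for this sketch, and breaking ties by zone name is an assumption made here to keep the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// InstanceState is a simplified, hypothetical stand-in for a ring instance state.
type InstanceState int

const (
	Active InstanceState = iota
	Pending
	Joining
	Leaving
)

// Ingester is a hypothetical, minimal view of a ring member.
type Ingester struct {
	ID    string
	Zone  string
	State InstanceState
}

// zonesByFewestNonActive returns the zone names ordered so that zones with
// the fewest non-ACTIVE ingesters come first. Ties are broken by zone name
// (an assumption of this sketch, not necessarily what dskit does).
func zonesByFewestNonActive(ingesters []Ingester) []string {
	nonActive := map[string]int{}
	for _, ing := range ingesters {
		// Make sure every zone appears in the map, even with a count of zero.
		if _, seen := nonActive[ing.Zone]; !seen {
			nonActive[ing.Zone] = 0
		}
		if ing.State != Active {
			nonActive[ing.Zone]++
		}
	}

	zones := make([]string, 0, len(nonActive))
	for zone := range nonActive {
		zones = append(zones, zone)
	}

	sort.Slice(zones, func(i, j int) bool {
		ci, cj := nonActive[zones[i]], nonActive[zones[j]]
		if ci != cj {
			return ci < cj
		}
		return zones[i] < zones[j]
	})
	return zones
}

func main() {
	ingesters := []Ingester{
		{"ingester-a-1", "zone-a", Active},
		{"ingester-a-2", "zone-a", Pending},
		{"ingester-b-1", "zone-b", Active},
		{"ingester-b-2", "zone-b", Active},
		{"ingester-c-1", "zone-c", Pending},
		{"ingester-c-2", "zone-c", Joining},
	}
	// prints [zone-b zone-a zone-c]
	fmt.Println(zonesByFewestNonActive(ingesters))
}
```

In this example zone-b has no non-ACTIVE ingesters, so it would be tried first, and zone-c, with two ingesters still starting up, would only be used if a request to an earlier zone fails or is hedged.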

This PR requires grafana/dskit#440 to be merged first.

Which issue(s) this PR fixes or relates to

Related: #6726
Related: grafana/dskit#440

Checklist

  • Tests updated.
  • [n/a] Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@charleskorn charleskorn force-pushed the charleskorn/query-pending-ingesters branch from 0d4a8f7 to 78a56d9 November 24, 2023 03:15
@charleskorn charleskorn force-pushed the charleskorn/prioritise-non-pending-zones branch from 1ff80ba to dc876f9 November 24, 2023 03:16
@charleskorn charleskorn marked this pull request as ready for review November 24, 2023 03:28
@charleskorn charleskorn requested review from grafanabot and a team as code owners November 24, 2023 03:28
@charleskorn charleskorn force-pushed the charleskorn/prioritise-non-pending-zones branch from dc876f9 to a8e619a November 26, 2023 21:53
Base automatically changed from charleskorn/query-pending-ingesters to main November 26, 2023 22:04
@charleskorn charleskorn force-pushed the charleskorn/prioritise-non-pending-zones branch from a8e619a to 9a59932 November 26, 2023 22:12
@charleskorn charleskorn enabled auto-merge (squash) November 26, 2023 22:12
@charleskorn charleskorn merged commit 42ccb78 into main Nov 26, 2023
28 checks passed
@charleskorn charleskorn deleted the charleskorn/prioritise-non-pending-zones branch November 26, 2023 22:25