Add distributor inflight request size limit #2413
LGTM, a few small (non-blocking) suggestions / questions.
Thanks Dimitar for working on this!
5c99e7f to 12c584b
LGTM, thanks! 👏
I realized that we only expose the
I added that ☝️ + a label value for
8e122f2 to ee22ea5
pkg/distributor/distributor.go
Outdated
maxInflightPushRequestsFlag = "distributor.instance-limits.max-inflight-push-requests"
maxIngestionRateFlag = "distributor.instance-limits.max-ingestion-rate"
maxInflightPushRequestsFlag = "distributor.instance-limits.max-inflight-push-requests"
maxInflightPushRequestsBytesFlag = "distributor.instance-limits.max-inflight-push-requests-total-bytes"
nit: Most other places don't include "total" but the flag does. Should this be changed to be consistent?
👍 I think I just didn't delete the suffix properly in 0eeed94. I will change
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
…l-size Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
11a4a3f to 91dd66d
@@ -20,6 +20,7 @@
* [ENHANCEMENT] Store-gateway, listblocks: list of blocks now includes stats from `meta.json` file: number of series, samples and chunks. #2425
* [ENHANCEMENT] Added more buckets to `cortex_ingester_client_request_duration_seconds` histogram metric, to correctly track requests taking longer than 1s (up until 16s). #2445
* [ENHANCEMENT] Azure client: Improve memory usage for large object storage downloads. #2408
* [ENHANCEMENT] Distributor: Add `-distributor.instance-limits.max-inflight-push-requests-bytes`. This limit protects the distributor against multiple large requests that together may cause an OOM, but are only a few, so do not trigger the `max-inflight-push-requests` limit. #2413
One could ask: if we have this new limit available, what is the use of `max-inflight-push-requests`? But let's leave that for now to not break backwards compatibility.
We don't know if one works better than the other. Size limits are probably more closely related to the actual work a distributor does than the number of requests is, but there may be some overhead in managing the limit (e.g. having to keep it up to date with distributor resources). I expect this to settle with time; then we can delete the other limit.
Good job, thanks!
LGTM (modulo a nit). Thanks!
Empirically this number lags the actual memory consumption. I suspect that computing it after protobuf decoding is part of the problem; the size is known after Line 151 in 8e900c2
but you could actually reduce (and then do equivalent things on the OTEL path)
Signed-off-by: Dimitar Dimitrov dimitar.dimitrov@grafana.com
What this PR does
Introduces a limit for the total size of the inflight push requests in the distributor: `-distributor.instance-limits.max-inflight-push-requests-bytes`. This limit is useful in combination with the existing `-distributor.instance-limits.max-inflight-push-requests`. The existing limit does not account for large requests that may be below the inflight requests count. The new limit is meant to protect the distributor against multiple large requests. By default the limit is disabled, because enabling it would constrain memory utilization on existing Mimir installations.
Which issue(s) this PR fixes or relates to
Fixes #2226
Checklist
`CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`