Increase default Jaeger queue size for store-gateways and queriers #7068

Merged
charleskorn merged 7 commits into main from charleskorn/jaeger-queue-size on Jan 9, 2024


@charleskorn charleskorn commented Jan 8, 2024

What this PR does

This PR upstreams a change we've made internally at Grafana Labs which reduces the number of trace spans dropped due to the Jaeger client queue becoming full.

It increases the Jaeger reporter queue size (JAEGER_REPORTER_MAX_QUEUE_SIZE) for store-gateways and queriers in both the Helm chart and Jsonnet.

Note that the increase for queriers when using Jsonnet was already applied in https://github.com/grafana/mimir/pull/6764/files#diff-b51f5403fbd4f68fb4ad925c7fdd104a50c4165085b19f75b54405472f74b8f0.

Note to reviewers: if the pattern of adding a jaegerReporterMaxQueueSize Helm value for each component looks good, I'll follow up this PR with another to add this for all components.

Which issue(s) this PR fixes or relates to

(none)

Checklist

  • Tests updated.
  • [n/a] Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • [n/a] about-versioning.md updated with experimental features.


@dimitarvdimitrov dimitarvdimitrov left a comment

LGTM, thanks

@charleskorn charleskorn marked this pull request as ready for review January 9, 2024 00:14
@charleskorn charleskorn requested a review from a team as a code owner January 9, 2024 00:14

@56quarters 56quarters left a comment

LGTM with one non-blocking comment.

@@ -61,6 +61,8 @@
),
// Dynamically set GOMEMLIMIT based on memory request.
GOMEMLIMIT: std.toString(std.floor($.util.siToBytes($.store_gateway_container.resources.requests.memory))),

JAEGER_REPORTER_MAX_QUEUE_SIZE: '1000',
Do we want to make this a configuration setting in jsonnet for parity with Helm?

Reply from charleskorn (Contributor, Author):

Marco suggested not doing that in https://github.com/grafana/mimir/pull/6764/files#diff-f3070297971f928d50b986208d8f23344d5c0a1d37f2c1422eed86feea6bb31a, so I'm following the pattern established there. I'd be happy either way.

I went for a different approach in Helm because I believe it's harder (or even impossible) there to extend the default value of something, so it'd be tricky for someone to extend our default set of environment variables, which includes JAEGER_REPORTER_MAX_QUEUE_SIZE, with additional variables. I might be wrong, though.
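
To illustrate the Jsonnet side of this trade-off, here is a minimal, self-contained sketch of how a hard-coded default can still be overridden or extended downstream through object inheritance. The field names `store_gateway_env_map` and `store_gateway_env` are assumptions made for the example and are not taken from the diff in this PR; the value '5000' and the extra JAEGER_SAMPLER_TYPE variable are arbitrary.

```jsonnet
// Hypothetical stand-in for a library that ships a default environment map.
// The field names here are illustrative assumptions, not Mimir's actual ones.
local base = {
  // Hidden field holding the defaults, keyed by variable name.
  store_gateway_env_map:: {
    JAEGER_REPORTER_MAX_QUEUE_SIZE: '1000',
  },

  // Rendered list of name/value pairs, derived from the map above.
  // `$` is late-bound, so overrides applied by a downstream user are picked up.
  store_gateway_env: [
    { name: k, value: $.store_gateway_env_map[k] }
    for k in std.objectFields($.store_gateway_env_map)
  ],
};

// A downstream user merges into the default map with `+::`:
// one existing default is changed and one new variable is added.
base {
  store_gateway_env_map+:: {
    JAEGER_REPORTER_MAX_QUEUE_SIZE: '5000',
    JAEGER_SAMPLER_TYPE: 'const',
  },
}
```

Because the merge happens on a map rather than a list, the user keeps any defaults they do not mention, which is the kind of extension that is harder to express when a default is a plain list of environment variables in Helm values.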

@charleskorn charleskorn enabled auto-merge (squash) January 9, 2024 23:03
@charleskorn charleskorn merged commit f4fb287 into main Jan 9, 2024
30 checks passed
@charleskorn charleskorn deleted the charleskorn/jaeger-queue-size branch January 9, 2024 23:05
charleskorn added a commit that referenced this pull request Jan 10, 2024
… size for all components configurable in Helm chart (#7086)

* Set JAEGER_REPORTER_MAX_QUEUE_SIZE for write and backend containers deployed by Jsonnet.

* Upstream default values of JAEGER_REPORTER_MAX_QUEUE_SIZE used for query-frontends, ingesters and rulers at Grafana Labs

* Add support for setting Jaeger reporter max queue size to Helm chart

* Update default values in Helm chart for ruler, ingester and query-frontends

* Add changelog entries, and fix location of Helm chart changelog entry added in #7068.