
Clamp GOMAXPROCS when higher than runtime.NumCPU #8201

Merged
merged 2 commits into main from dimitar/clamp-gomaxprocs on May 29, 2024

Conversation

dimitarvdimitrov
Contributor

Background

We are trying to automatically set GOMAXPROCS based on the number of CPUs that an ingester pod requests in Kubernetes. We're going with 2x the requested cores. The reason for this is that the default value of GOMAXPROCS is NumCPU, and running on a large node while utilizing only a small percentage of it results in high scheduling overhead.
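
Purely as an illustration (not part of this PR), the deployment-side automation could look roughly like the sketch below. It assumes the pod's CPU request is exposed to the container in millicores through an environment variable (for example via the Kubernetes downward API); the `CPU_REQUEST_MILLIS` name is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
)

func main() {
	// Hypothetical env var carrying the pod's CPU request in millicores,
	// e.g. populated via the Kubernetes downward API (resourceFieldRef).
	millis, err := strconv.Atoi(os.Getenv("CPU_REQUEST_MILLIS"))
	if err != nil || millis <= 0 {
		return // fall back to the Go default, GOMAXPROCS = NumCPU
	}

	// Aim for 2x the requested cores, rounding up.
	target := (2*millis + 999) / 1000

	fmt.Printf("setting GOMAXPROCS=%d (NumCPU=%d)\n", target, runtime.NumCPU())
	runtime.GOMAXPROCS(target)
}
```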

Problem

Since we also don't want to restrict the nodes on which pods run, the computed GOMAXPROCS can sometimes exceed the number of cores of the node a pod lands on. In those cases, setting GOMAXPROCS to a value higher than NumCPU has the opposite effect: it increases scheduling overhead instead of reducing it.

What this PR does

Clamps the value of GOMAXPROCS to runtime.NumCPU. The idea of this PR is to make automating the GOMAXPROCS setting in deployment tooling easier by having some support from the code.

Considerations

It's possible that the original NumCPU is not correctly detected, or that the operator intended to run with a higher GOMAXPROCS. I didn't want to add more configuration options, so instead we log a warning message. If there is a real use case for GOMAXPROCS > NumCPU, we can add a configuration option later.
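
A minimal sketch of the resulting behaviour (not the actual Mimir code; the function name and log wording are illustrative):

```go
package main

import (
	"log"
	"runtime"
)

// clampGOMAXPROCS caps the desired GOMAXPROCS at runtime.NumCPU and logs a
// warning when it has to lower the value, along the lines described in this PR.
func clampGOMAXPROCS(desired int) int {
	if numCPU := runtime.NumCPU(); desired > numCPU {
		log.Printf("warning: requested GOMAXPROCS=%d is greater than NumCPU=%d, clamping to %d", desired, numCPU, numCPU)
		desired = numCPU
	}
	runtime.GOMAXPROCS(desired)
	return desired
}

func main() {
	// For example, a value computed by deployment tooling as 2x the pod's CPU request.
	clampGOMAXPROCS(2 * runtime.NumCPU())
}
```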

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

dimitarvdimitrov requested a review from a team as a code owner on May 28, 2024 14:43
@dimitarvdimitrov
Contributor Author

cc @agardiman since you raised this at the last community call

pracucci (Collaborator) left a comment

LGTM. I'm wondering if there should be a minimum GOMAXPROCS for components using mmap (ingesters), regardless of the number of CPU cores in the node, due to the "golang mmap issue".

@dimitarvdimitrov
Contributor Author

AFAIK we have no visibility into how much of the process's capacity mmap might be using, so I'm not sure how to make this call - is 2 enough? 10? Do you have any ideas?

@dimitarvdimitrov
Contributor Author

> golang mmap issue

I had a look at a single ingester process via Linux's perf, looking at the proportion of time spent in handle_mm_fault (60s of profiling at 500 Hz: `perf record -F 500 -g -p <PID> -- sleep 60`). Looking at the `perf script` output, only 18 stack traces out of 44146 total include handle_mm_fault: about 0.04%.

Flamegraph & `perf script` output: out.script.zip (attachment) and a flamegraph image.
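
For reference, a rough sketch (not part of this PR) of how such a count can be pulled out of the `perf script` output, assuming stack traces are separated by blank lines:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// Reads `perf script` output on stdin and reports how many blank-line-separated
// stack traces contain handle_mm_fault, e.g. `perf script | go run countfaults.go`
// (the file name is made up).
func main() {
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 1<<20), 1<<20) // perf script lines can be long

	total, hits := 0, 0
	inStack, hit := false, false
	flush := func() {
		if inStack {
			total++
			if hit {
				hits++
			}
		}
		inStack, hit = false, false
	}

	for sc.Scan() {
		line := sc.Text()
		if strings.TrimSpace(line) == "" {
			flush() // a blank line ends the current stack
			continue
		}
		inStack = true
		if strings.Contains(line, "handle_mm_fault") {
			hit = true
		}
	}
	flush()

	if total == 0 {
		fmt.Println("no stacks found")
		return
	}
	fmt.Printf("%d/%d stacks (%.2f%%) include handle_mm_fault\n",
		hits, total, 100*float64(hits)/float64(total))
}
```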

Unfortunately, this is an extremely narrow view of ingester usage, and the perf-based method doesn't scale at all. At least it suggests that in a steady state the overhead of mmap can be negligible.

I'll merge this PR since I don't think the clamping is likely to be a negative change. Happy to continue discussing mmap and GOMAXPROCS.

dimitarvdimitrov merged commit 3803a60 into main on May 29, 2024
29 checks passed
dimitarvdimitrov deleted the dimitar/clamp-gomaxprocs branch on May 29, 2024 11:53
narqo pushed a commit to narqo/grafana-mimir that referenced this pull request Jun 6, 2024
* Clamp GOMAXPROCS when higher than runtime.NumCPU

* Add CHANGELOG.md entry

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>