
Clamp GOMAXPROCS when higher than runtime.NumCPU #8201

Merged
merged 2 commits into main from dimitar/clamp-gomaxprocs on May 29, 2024

Conversation

dimitarvdimitrov
Contributor

Background

We are trying to automatically set GOMAXPROCS based on the number of CPUs that an ingester pod requests in Kubernetes. We're going with 2x the requested cores. The reason for this is that the default value of GOMAXPROCS is NumCPU, and running on a large node while utilizing only a small percentage of it results in high scheduling overhead.
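
Purely as an illustration (not part of this PR), the deployment-side automation could look roughly like the sketch below. It assumes the pod's CPU request is exposed to the container in millicores through an environment variable (for example via the Kubernetes downward API); the `CPU_REQUEST_MILLIS` name is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
)

func main() {
	// Hypothetical env var carrying the pod's CPU request in millicores,
	// e.g. populated via the Kubernetes downward API (resourceFieldRef).
	millis, err := strconv.Atoi(os.Getenv("CPU_REQUEST_MILLIS"))
	if err != nil || millis <= 0 {
		return // fall back to the Go default, GOMAXPROCS = NumCPU
	}

	// Aim for 2x the requested cores, rounding up.
	target := (2*millis + 999) / 1000

	fmt.Printf("setting GOMAXPROCS=%d (NumCPU=%d)\n", target, runtime.NumCPU())
	runtime.GOMAXPROCS(target)
}
```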

Problem

Since we also don't want to restrict the nodes on which pods run, the computed GOMAXPROCS can sometimes exceed the number of cores of the node a pod lands on. In those cases, setting GOMAXPROCS to a value higher than NumCPU has the opposite effect: it increases scheduling overhead instead of reducing it.

What this PR does

Clamps the value of GOMAXPROCS to runtime.NumCPU. The idea of this PR is to make automating the GOMAXPROCS setting in deployment tooling easier by having some support from the code.

Considerations

It's possible that the original NumCPU is not correctly detected, or that the operator intended to run with a higher GOMAXPROCS. I didn't want to add more configuration options, so instead we log a warning message. If there is a real use case for GOMAXPROCS > NumCPU, we can add a configuration option later.
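
A minimal sketch of the resulting behaviour (not the actual Mimir code; the function name and log wording are illustrative):

```go
package main

import (
	"log"
	"runtime"
)

// clampGOMAXPROCS caps the desired GOMAXPROCS at runtime.NumCPU and logs a
// warning when it has to lower the value, along the lines described in this PR.
func clampGOMAXPROCS(desired int) int {
	if numCPU := runtime.NumCPU(); desired > numCPU {
		log.Printf("warning: requested GOMAXPROCS=%d is greater than NumCPU=%d, clamping to %d", desired, numCPU, numCPU)
		desired = numCPU
	}
	runtime.GOMAXPROCS(desired)
	return desired
}

func main() {
	// For example, a value computed by deployment tooling as 2x the pod's CPU request.
	clampGOMAXPROCS(2 * runtime.NumCPU())
}
```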

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

dimitarvdimitrov requested a review from a team as a code owner on May 28, 2024 14:43
@dimitarvdimitrov
Contributor Author

cc @agardiman since you raised this at the last community call

pracucci (Collaborator) left a comment

LGTM. I'm wondering if there should be a minimum GOMAXPROCS for components using mmap (ingesters), regardless of the number of CPU cores in the node, due to the "golang mmap issue".

@dimitarvdimitrov
Contributor Author

AFAIK we have no visibility into how much of the process's capacity mmap might be using, so I'm not sure how to make this call - is 2 enough? 10? Do you have any ideas?

@dimitarvdimitrov
Contributor Author

> golang mmap issue

I had a look at a single ingester process via Linux's perf, looking at the proportion of time spent in handle_mm_fault (60s of profiling at 500 Hz: `perf record -F 500 -g -p <PID> -- sleep 60`). Looking at the `perf script` output, only 18 stack traces out of 44146 total include handle_mm_fault: about 0.04%.

Flamegraph & `perf script` output: out.script.zip (attachment) and a flamegraph image.
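
For reference, a rough sketch (not part of this PR) of how such a count can be pulled out of the `perf script` output, assuming stack traces are separated by blank lines:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// Reads `perf script` output on stdin and reports how many blank-line-separated
// stack traces contain handle_mm_fault, e.g. `perf script | go run countfaults.go`
// (the file name is made up).
func main() {
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 1<<20), 1<<20) // perf script lines can be long

	total, hits := 0, 0
	inStack, hit := false, false
	flush := func() {
		if inStack {
			total++
			if hit {
				hits++
			}
		}
		inStack, hit = false, false
	}

	for sc.Scan() {
		line := sc.Text()
		if strings.TrimSpace(line) == "" {
			flush() // a blank line ends the current stack
			continue
		}
		inStack = true
		if strings.Contains(line, "handle_mm_fault") {
			hit = true
		}
	}
	flush()

	if total == 0 {
		fmt.Println("no stacks found")
		return
	}
	fmt.Printf("%d/%d stacks (%.2f%%) include handle_mm_fault\n",
		hits, total, 100*float64(hits)/float64(total))
}
```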

Unfortunately, this is an extremely narrow view of ingester usage, and the perf-based method doesn't scale at all. At least it suggests that in a steady state the overhead of mmap can be negligible.

I'll merge this PR since I don't think the clamping is likely to be a negative change. Happy to continue discussing mmap and GOMAXPROCS.

dimitarvdimitrov merged commit 3803a60 into main on May 29, 2024
29 checks passed
dimitarvdimitrov deleted the dimitar/clamp-gomaxprocs branch on May 29, 2024 11:53
narqo pushed a commit to narqo/grafana-mimir that referenced this pull request Jun 6, 2024
* Clamp GOMAXPROCS when higher than runtime.NumCPU

* Add CHANGELOG.md entry

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>