Put consumers in their own scheduling group #10398

Merged

Conversation

@travisdowns (Member) commented Apr 26, 2023

The idea here is that consumers can (and do) poll Redpanda quickly for new messages when they are at the tip of the log. Since consumption in this state is (relatively) CPU-heavy and has no IO, only a few concurrent consumers can saturate a shard's CPU, because at least one of them is running the fetch flow at any given time (e.g., if a consumer takes 2 ms to handle one fetch and try the next, and it takes RP 1 ms of CPU to handle the fetch, just 3 consumers will end up saturating the CPU).

When this occurs it slows down all other tasks, like produce requests, raft stuff, etc. It would be better to give the CPU cycles to those tasks, because the consumers are flexible here anyway: if their task gets delayed for 1 or 2 ms, it just means they return more data in their fetch, and they can sustain the same throughput at a slight latency cost (but consumer latency is mostly hidden anyway, since it happens asynchronously as "readahead" in the client, so it seems to be rarely considered an important metric).

This change puts consumers into their own scheduling group, meaning fetch tasks compete mostly with each other for CPU time, while other tasks get their fair share of the CPU rather than being just a few entries lumped into the same queue as many fetch tasks.

Partial fix for redpanda-data/core-internal#435.
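
For readers unfamiliar with the mechanism, here is a minimal sketch of creating a Seastar scheduling group, assuming a recent Seastar release; the group name and share count below are illustrative, not the actual values used by Redpanda:

```cpp
#include <seastar/core/future.hh>
#include <seastar/core/scheduling.hh>

// Create a dedicated scheduling group for fetch work. The second argument
// is the group's share count (its relative CPU weight against other groups);
// giving the fetch group the same shares as the default group keeps its
// priority equal while isolating its tasks into their own run queue.
seastar::future<seastar::scheduling_group> make_fetch_group() {
    return seastar::create_scheduling_group("fetch", 100); // illustrative values
}
```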

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

Improvements

  • Improved CPU balance between fetch requests and other system activity when there is a high fetch load. Previously, fetchers reading from the tip of the log could partly starve out other types of work (which produce a smaller number of tasks), causing overall poor performance; now fetch requests have less impact on other tasks running on the CPU.

@BenPope (Member) left a comment

Makes a lot of sense to me.

@piyushredpanda piyushredpanda marked this pull request as ready for review April 26, 2023 23:17
@piyushredpanda piyushredpanda marked this pull request as draft April 26, 2023 23:18
By analogy with smp_groups, which is already public, we need to
access this outside of application.cc to pass the right scheduling
groups through to the instantiations of services in our redpanda
fixture.

This scheduling group is used for consumer fetch processing. We assign
it the same priority as the default group (where most other kafka
handling takes place), but by putting it into its own group we prevent
non-fetch requests from being significantly delayed when fetch requests
use all the CPU.

Issue redpanda-data/core-internal#435

Add the with_scheduling_group call to switch to the kafka scheduling
group in the fetch handling flow. See prior change for context.

Issue redpanda-data/core-internal#435.
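
As a rough sketch of what such a call does (illustrative names only, not the actual Redpanda code): because Seastar continuations inherit the scheduling group that spawned them, wrapping the entry point of the fetch flow is enough to account all downstream fetch work to the new group.

```cpp
#include <seastar/core/future-util.hh>
#include <seastar/core/scheduling.hh>

// Hypothetical stand-in for the real fetch handling flow.
seastar::future<> do_fetch_flow() {
    return seastar::make_ready_future<>();
}

seastar::future<> handle_fetch(seastar::scheduling_group fetch_sg) {
    // Tasks created inside the lambda (and their continuations) run in
    // fetch_sg rather than the default group, so heavy fetch traffic
    // competes mostly with itself for CPU time.
    return seastar::with_scheduling_group(fetch_sg, [] {
        return do_fetch_flow();
    });
}
```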
@travisdowns (Member Author)

/ci-repeat 1

@travisdowns (Member Author)

Test failure seems spurious (a timeout while stopping), though I did not find a match, so I filed:

#10402

@travisdowns (Member Author) commented Apr 27, 2023

Some perf results from localhost:

1 rdkafka producer like so:

 ./rdkafka_performance -P -c '300000000' -s '1024' -t xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx -b localhost:9092 -u -r 140000 -K4

2 rdkafka consumers like so:

./rdkafka_performance -C -t xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx -b localhost -G foo -u -oend

Latency reported by the client; the change happens about 1/3rd of the way through, when I switched from using the separate group (left part) to the old style of a single default group (right part):

(image: client-reported latency graph)

The latency jumps by about 5x, and the server can actually no longer keep up with the rate, dropping from 140K (the fixed rate) to less than 100K messages sustained.

For this particular test I can't see much change to the consumer latency. Here's the consumer latency for a different run: the red line is where I changed from the new strategy (on the left, this change) to the old strategy (on the right, before this change):

(image: consumer latency graph)

The consumers, I guess, are mostly waiting on the producers to get data they can read, so being put in their own group, which may delay their tasks relative to producers, doesn't seem to hurt much. Perhaps it even helps in a sense: if a consumer is going to poll and get nothing, it would actually rather a producer task ran first to append to the log, so that the fetch is actually successful.

Introduce a config tunable which would let users set the scheduling
group to use for kafka fetch requests back to the default group.

Also useful for quick A/B testing of the perf effect of the change.
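
A sketch of how such a tunable could be consumed, assuming a boolean property; the name use_fetch_scheduler_group is a guess at a plausible spelling and may not match the actual config key:

```cpp
#include <seastar/core/scheduling.hh>

// Hypothetical helper: choose the group used for fetch handling based on a
// boolean config. When disabled, fall back to Seastar's default group,
// which restores the pre-change behaviour and makes A/B comparisons easy.
seastar::scheduling_group
fetch_scheduling_group(bool use_fetch_scheduler_group,
                       seastar::scheduling_group fetch_sg) {
    return use_fetch_scheduler_group ? fetch_sg
                                     : seastar::default_scheduling_group();
}
```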
@piyushredpanda (Contributor)

Failure in new run seems to be #10219
@travisdowns is this ready to review now? Ideally we get this backported to v22.3.x and v23.1.x for the release (today).

@piyushredpanda piyushredpanda marked this pull request as ready for review April 27, 2023 13:21
@piyushredpanda piyushredpanda merged commit f23f4ba into redpanda-data:dev Apr 27, 2023
@piyushredpanda (Contributor)

/backport v23.1.x

@dotnwat (Member) commented Apr 28, 2023

Excellent write-up @travisdowns, thanks for the cover letter details.
