kafka: expire pending group members #3761

Merged (2 commits) on Feb 14, 2022

@dotnwat (Member) commented Feb 8, 2022

Cover letter

Prior to this change, a pending member that never rejoined could remain
a pending member indefinitely. Now we kick them out.

When pending members stick around, it may cause issues when trying to
determine whether all members have joined. The all-members-joined check
returns false if any pending member exists, so a pending member left
lingering in the group may prevent the group from properly transitioning
through the state machine. For example, the inline join completion in
handle_join_group and the complete_join callback both have scenarios
gated on this condition, which would prevent the group from moving into
the sync phase.
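
The gating described above can be sketched roughly as follows. This is a minimal illustration, not Redpanda's actual implementation: `toy_group`, `add_pending`, `expire_pending`, and the fields below are hypothetical names, and the caller passes in the current time instead of arming a real timer.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

using clock_type = std::chrono::steady_clock;

// Hypothetical, heavily simplified stand-in for a consumer group.
struct toy_group {
    // Pending members (assigned an id but not yet rejoined), mapped to the
    // deadline by which they must rejoin.
    std::unordered_map<std::string, clock_type::time_point> pending;
    // Members that have rejoined during the current rebalance.
    std::unordered_set<std::string> joined;
    // Number of full (non-pending) members the group expects to rejoin.
    std::size_t member_count = 0;

    void add_pending(
      const std::string& id,
      clock_type::time_point now,
      clock_type::duration timeout) {
        pending[id] = now + timeout;
    }

    // Any lingering pending member makes this return false, which is what
    // keeps the group from moving on to the sync phase.
    bool all_members_joined() const {
        return pending.empty() && joined.size() == member_count;
    }

    // Remove every pending member whose deadline has passed and report how
    // many were removed.
    std::size_t expire_pending(clock_type::time_point now) {
        std::size_t removed = 0;
        for (auto it = pending.begin(); it != pending.end();) {
            if (it->second <= now) {
                it = pending.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }
};

// Walks through the stuck-group scenario and its resolution.
void example_usage() {
    toy_group g;
    const auto t0 = clock_type::time_point{};
    g.member_count = 1;
    g.joined.insert("consumer-1");
    g.add_pending("consumer-2", t0, std::chrono::seconds(5));

    // The pending member never rejoins, so the join cannot complete.
    assert(!g.all_members_joined());
    // Before the deadline nothing is expired.
    assert(g.expire_pending(t0 + std::chrono::seconds(1)) == 0);
    // After the deadline the pending member is removed and the gate opens.
    assert(g.expire_pending(t0 + std::chrono::seconds(6)) == 1);
    assert(g.all_members_joined());
}
```

Without the `expire_pending` step, the group in `example_usage` would report that not all members have joined for as long as `consumer-2` stayed away.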

Release notes

  • Fixes a potential bug in consumer groups in which a pending member could remain stuck in a group because Redpanda did not set an expiration time for pending members.

if (g._partition) {
    return fmt::format("{}", g._partition->ntp());
} else {
    return std::string("<none>");
}
Contributor

constexpr string_view for these static strings?

Member Author

I wish but I dunno how the types would be compatible
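
One way to see the type problem raised here: the formatted branch builds a temporary `std::string` at run time, so returning a `std::string_view` from the function would hand back a view of a destroyed temporary, and an `auto`-deduced return type cannot mix the two. A sketch under assumptions: `toy_partition`, `describe`, and `none_label` are hypothetical names, and `std::to_string` stands in for `fmt::format` so the example needs only the standard library.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// The static fallback itself can be a compile-time constant.
constexpr std::string_view none_label = "<none>";

// Hypothetical stand-in for the partition object in the snippet under review.
struct toy_partition {
    int id;
};

// The return type stays std::string: the first branch owns freshly built
// text, so the static label has to be copied into a string to match.
std::string describe(const std::optional<toy_partition>& p) {
    if (p) {
        return "partition-" + std::to_string(p->id);
    }
    return std::string(none_label);
}

// Exercises both branches.
void example_usage() {
    assert(describe(toy_partition{7}) == "partition-7");
    assert(describe(std::nullopt) == "<none>");
}
```

The constant avoids repeating the literal, but the copy into `std::string` on the fallback branch remains, which may be why the suggestion was not taken up.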

@emaxerrno
Contributor

good find!

@emaxerrno emaxerrno closed this Feb 8, 2022
@emaxerrno
Contributor

sorry! i meant to hit 'comment'

@emaxerrno emaxerrno reopened this Feb 8, 2022
@dotnwat dotnwat force-pushed the group-expire-pending-members branch from 99a59dd to 912f89d Compare February 9, 2022 04:17
Member

@mmaslankaprv mmaslankaprv left a comment

this looks great, the only thing is that some of the tests are failing.

Prior to this change, a pending member that never rejoined could remain
a pending member indefinitely. Now we kick them out.

When pending members stick around, it may cause issues when trying to
determine whether all members have joined. The all-members-joined check
returns false if any pending member exists, so a pending member left
lingering in the group may prevent the group from properly transitioning
through the state machine. For example, the inline join completion in
handle_join_group and the complete_join callback both have scenarios
gated on this condition, which would prevent the group from moving into
the sync phase.

Signed-off-by: Noah Watkins <noah@redpanda.com>
On join we dump out a lot of metadata about the group and members at
trace level. This is to aid in debugging stuck groups if they occur
again.

Signed-off-by: Noah Watkins <noah@redpanda.com>
@dotnwat
Member Author

dotnwat commented Feb 10, 2022

@mmaslankaprv force pushed ⬆️ to fix use-after-free bug that was causing the CI failure.

@dotnwat
Member Author

dotnwat commented Feb 13, 2022

ping @mmaslankaprv

dotnwat added a commit that referenced this pull request Feb 26, 2022
This pull request was closed.
5 participants