Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raft flush fixes #2937

Merged
merged 3 commits into from
Nov 11, 2021
Merged

Conversation

mmaslankaprv
Copy link
Member

@mmaslankaprv mmaslankaprv commented Nov 11, 2021

Cover letter

Fixed tracking flushed offset in an events of log truncation (both prefix and suffix). Fixed checking if configuration update is taking place when transferring leadership.

Fixes: #2932

Release notes

When truncating log we need to update raft flushed offset to reflect
truncated log state

Signed-off-by: Michal Maslanka <michal@vectorized.io>
@mmaslankaprv mmaslankaprv force-pushed the raft-flush-fixes branch 2 times, most recently from 5c37af5 to b1f2142 Compare November 11, 2021 11:04
@mmaslankaprv mmaslankaprv marked this pull request as ready for review November 11, 2021 11:06
@jcsp
Copy link
Contributor

jcsp commented Nov 11, 2021

These changes look good to me, although I think we're still going to have a very far-behind _flush_offset on followers when doing acks=1 writes, right? That is harmless as long as _commit_offset on the leader is only essential to these types of situation, but if we are e.g. exposing _commit_offset as a statistic, it's going to look pretty weird to human eyes.

I think it would make sense to also auto-flush followers when they are more than N bytes behind the head of the log, even if it's not strictly necessary for the raft code. But that could be a separate PR, as I'd like to merge this one promptly to resolve the failing test.

jcsp
jcsp previously approved these changes Nov 11, 2021
Copy link
Contributor

@jcsp jcsp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's wait for one more +1 on this before merging, since it's such a fiddly area.

@mmaslankaprv
Copy link
Member Author

These changes look good to me, although I think we're still going to have a very far-behind _flush_offset on followers when doing acks=1 writes, right? That is harmless as long as _commit_offset on the leader is only essential to these types of situation, but if we are e.g. exposing _commit_offset as a statistic, it's going to look pretty weird to human eyes.

I think it would make sense to also auto-flush followers when they are more than N bytes behind the head of the log, even if it's not strictly necessary for the raft code. But that could be a separate PR, as I'd like to merge this one promptly to resolve the failing test.

I agree that we should periodically flush on followers. The committed_index is never exposed to Kafka API so this should not be misleading to users. I know @rystsov is going to work on periodic flushes in raft

src/v/raft/consensus.cc Outdated Show resolved Hide resolved
src/v/raft/consensus.cc Outdated Show resolved Hide resolved
ztlpn
ztlpn previously approved these changes Nov 11, 2021
Signed-off-by: Michal Maslanka <michal@vectorized.io>
In redpanda raft implementation supports relaxed consistency. With
relaxed consistency we not always update `_commit_index` even tho
entries were replicated to all the nodes. When performing leadership
transfer we need to take that into account. When configuration is
successfully replicated to majority of nodes and it is not in joint
consensus state we are safe to proceed with the transfer.

Signed-off-by: Michal Maslanka <michal@vectorized.io>
Copy link
Contributor

@rystsov rystsov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mmaslankaprv mmaslankaprv merged commit ab330cb into redpanda-data:dev Nov 11, 2021
dotnwat added a commit that referenced this pull request Nov 11, 2021
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failure in test_prefix_truncate_recovery (acks=1, start_empty=False)
4 participants