Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcdctl: find leader when promoting learner #28

Closed
wants to merge 1 commit into from

Conversation

jingyih
Copy link
Owner

@jingyih jingyih commented Apr 26, 2019

Background:
In #15, we decided to have client only send member promote request to leader in order to simply the implementation of learner progress check on server side.

What this PR does:
This PR partially addresses the issue of "only send member promote request to leader". etcdctl member promote will scan over the provided endpoints to find leader, then send promote request to leader endpoint.

Remaining issue:
User can still create their own clientv3 client with random endpoint in cluster and use it to send member promote request. If the endpoint is not a leader, request will fail with error ErrNotLeader according to logic added in #19. But at least etcdctl member promote command will try to find leader endpoint.

cc @gyuho @WIZARD-CXY @jpbetz

find leader endpoint in the given endpoints, send member promote request
to leader endpoint.
@jingyih jingyih force-pushed the find_leader_in_etcdctl_member_promote branch from 24b0337 to 285a96f Compare April 26, 2019 23:03
@WIZARD-CXY
Copy link

hmm, I don't have better idea to solve all the problem. I think it might just work for now.

@WIZARD-CXY
Copy link

maybe @jpbetz has better idea? for example add a helper function in clientv3 that can make client talk to the leader directly?

leaderEp string
leaderFound bool
)
for _, ep := range eps {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ctx has a deadline, loop multiple times over the endpoints until the deadline expires or a leader is found? This would reduce the odd of a leader change causing this operation to fail.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore if we can do a server side request foward like mentioned in #28 (comment)

@jpbetz
Copy link

jpbetz commented Apr 27, 2019

EtcdServer.LeaseTimeToLive() appears to have forward to leader logic:

// forward to leader
for cctx.Err() == nil {
leader, err := s.waitLeader(cctx)
if err != nil {
return nil, err
}

Can we use the same approach for PromoteLeader (and MoveLeader)? And avoid having the logic in the client?

@jpbetz
Copy link

jpbetz commented Apr 27, 2019

If we can forwarding use waitLeader() we should be able to check for ErrGRPCNotLeader when deciding if an error from the forward request should be retired in the for cctx.Err() == nil loop.

Maybe introduce an forwardToLeader(ctx context.Context, forwardFn func() error) error function that manages the retries and is shared by LeaseTimeToLive, PromoteLeader, MoveLeader and any other calls that need to be handled by the leader? It could keep attempting to find the leader and send the request to it, retrying only on ErrGRPCNotLeader error, and halting if the context is Done.

@jingyih
Copy link
Owner Author

jingyih commented Apr 29, 2019

Thanks @jpbetz. I'll take a closer look on server side forwarding. We might want to keep the client side logic (at least for move leader) for backward compatibility reasons (when new etcdctl send move leader request to old server).

@jingyih
Copy link
Owner Author

jingyih commented May 3, 2019

We decided to implement server side forwarding of the member promote request. Closing this PR.

@jingyih jingyih closed this May 3, 2019
@WIZARD-CXY
Copy link

agreed

@jingyih jingyih deleted the find_leader_in_etcdctl_member_promote branch September 7, 2019 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants