embed: Can't restart etcd #6042

Closed
purpleidea opened this issue Jul 26, 2016 · 18 comments

Comments

@purpleidea
Contributor

Etcd now has an embed API. Sweet :)

I believe there is still a bind problem when starting a server, stopping it, and then starting it again. On the second start, you'll get a "bind: address already in use" error because the listener isn't closed properly.

Opening this issue at the request of @heyitsanthony

I think this may be causally related to #2920

Thanks!

@gyuho
Contributor

gyuho commented Jul 26, 2016

Can you provide a code snippet to reproduce this? Thanks.

@purpleidea
Contributor Author

@gyuho Fair enough. I don't have a trivial snippet at the moment, so I'll reopen when I get one. Thanks!

@purpleidea
Contributor Author

I forget if this is related to: golang/go#4674 or not.

@purpleidea
Contributor Author

Apologies for the delay, but I can reliably reproduce this issue, so I will re-open this. I have an easy POC inside of https://github.com/purpleidea/mgmt/

@purpleidea purpleidea reopened this Apr 11, 2019
@purpleidea
Contributor Author

Here is the reproducer:

start up three members...

./mgmt run --hostname h1 --tmp-prefix --no-pgp empty
./mgmt run --hostname h2 --tmp-prefix --no-pgp --seeds http://127.0.0.1:2379 --client-urls http://127.0.0.1:2381 --server-urls http://127.0.0.1:2382 empty
./mgmt run --hostname h3 --tmp-prefix --no-pgp --seeds http://127.0.0.1:2379 --client-urls http://127.0.0.1:2383 --server-urls http://127.0.0.1:2384 empty

tell the ideal cluster size to be three...

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 put /_mgmt/chooser/dynamicsize/idealclustersize 3

check that it is...

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 member list

add two more clients...

./mgmt run --hostname h4 --tmp-prefix --no-pgp --seeds http://127.0.0.1:2379 --client-urls http://127.0.0.1:2385 --server-urls http://127.0.0.1:2386 empty
./mgmt run --hostname h5 --tmp-prefix --no-pgp --seeds http://127.0.0.1:2379 --client-urls http://127.0.0.1:2387 --server-urls http://127.0.0.1:2388 empty

tell the cluster size to be 4...

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 put /_mgmt/chooser/dynamicsize/idealclustersize 4

one more member will be started now...

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 member list

set it back to three...

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 put /_mgmt/chooser/dynamicsize/idealclustersize 3

make note of who shut down...

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 member list

bring it back to 4...

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 put /_mgmt/chooser/dynamicsize/idealclustersize 4

you'll most likely see a member that was previously started try to start again... repeat the 3->4->3 cycle above if not.

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 member list

in the logs of that member:

main.go:385: etcd: runtime error: listen tcp 127.0.0.1:2384: bind: address already in use
server start failed

I believe I am using the embed API correctly, but if there is some additional shutdown/unbind step that I should be performing that I am not, then please let me know. Thanks!
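
As a point of comparison for the "additional shutdown/unbind step" question, here is a rough sketch of a stop/start cycle that does rebind cleanly. It is built on the standard library's `net/http` server, not on etcd's embed package; the `restartable` type and its methods are hypothetical names invented for this example. The idea it illustrates is that the stop path closes the listener and waits for the serving goroutine to return before anything tries to bind the same address again.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"time"
)

// restartable is a hypothetical wrapper (not part of etcd) around an HTTP
// server that can be started and stopped repeatedly on one address.
type restartable struct {
	addr string
	srv  *http.Server
	done chan struct{}
}

// Start binds the address and serves in the background.
func (r *restartable) Start() error {
	ln, err := net.Listen("tcp", r.addr)
	if err != nil {
		return err
	}
	r.addr = ln.Addr().String() // remember the concrete port for later restarts
	r.srv = &http.Server{Handler: http.NotFoundHandler()}
	r.done = make(chan struct{})
	go func(srv *http.Server, done chan struct{}) {
		defer close(done)
		srv.Serve(ln) // returns once Shutdown or Close is called
	}(r.srv, r.done)
	return nil
}

// Stop closes the listener and waits for the serving goroutine to exit, so
// a later Start does not race with a socket that is still open.
func (r *restartable) Stop() error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	err := r.srv.Shutdown(ctx)
	<-r.done
	return err
}

func main() {
	r := &restartable{addr: "127.0.0.1:0"}
	for i := 0; i < 3; i++ {
		if err := r.Start(); err != nil {
			fmt.Println("start failed:", err)
			return
		}
		fmt.Println("serving on", r.addr)
		if err := r.Stop(); err != nil {
			fmt.Println("stop failed:", err)
			return
		}
	}
}
```

Whether the embed API's own shutdown gives the same guarantee is exactly what this issue is asking; the sketch only shows the behaviour one would expect from a restartable server.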

@stale

stale bot commented Apr 7, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 7, 2020
@purpleidea
Contributor Author

hi bot! please stop pinging here

@stale stale bot removed the stale label Apr 7, 2020
@stale

stale bot commented Jul 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jul 6, 2020
@cretz

cretz commented Jul 6, 2020

Bump, I don't believe this is resolved (IIRC my integration tests cannot stop and then restart embedded etcd due to this issue). At the least, it deserves confirmation that it's fixed before closing.

@stale stale bot removed the stale label Jul 6, 2020
@purpleidea
Contributor Author

@cretz It's not fixed last I checked, but I'm giving up fighting with the bots, it's very end-user hostile I think. One bump should be enough for the lifetime of the bug. And the bot bugged about eight others today. :/

@stale

stale bot commented Oct 4, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 4, 2020
@purpleidea
Contributor Author

Stop closing this, bot. This is important.

@stale stale bot removed the stale label Oct 4, 2020
@ptabor
Contributor

ptabor commented Oct 5, 2020

I believe it might be about etcd not using 'SO_REUSEADDR'. Without this setting, the port stays 'locked' for an additional 60-120s to avoid receiving left-over communication meant for the port's previous users.

Here are two pretty good articles about this:
https://stackoverflow.com/questions/3229860/what-is-the-meaning-of-so-reuseaddr-setsockopt-option-linux/3233022#3233022
and https://hea-www.harvard.edu/~fine/Tech/addrinuse.html
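
To make the socket option concrete, here is a small Go sketch of setting SO_REUSEADDR explicitly before bind, using `net.ListenConfig` and its `Control` hook. The helper name `listenReuseAddr` is invented for this example, the code is Unix-only because of the `syscall` constants, and it is not taken from etcd. Note also that Go's standard listener may already set this option by default on some platforms, so treat this as an illustration of the mechanism rather than a confirmed fix.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"syscall"
)

// listenReuseAddr (hypothetical helper) opens a TCP listener with
// SO_REUSEADDR set on the socket before it is bound. Unix-only sketch.
func listenReuseAddr(addr string) (net.Listener, error) {
	lc := net.ListenConfig{
		// Control runs after the socket is created and before bind.
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			if err := c.Control(func(fd uintptr) {
				sockErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, syscall.SO_REUSEADDR, 1)
			}); err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), "tcp", addr)
}

func main() {
	l, err := listenReuseAddr("127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	addr := l.Addr().String()
	l.Close()

	// Bind the same address again right after closing it.
	l2, err := listenReuseAddr(addr)
	if err != nil {
		fmt.Println("re-listen failed:", err)
		return
	}
	defer l2.Close()
	fmt.Println("re-listening on", l2.Addr())
}
```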

@purpleidea
Contributor Author

purpleidea commented Nov 26, 2020 via email

@stale

stale bot commented Feb 25, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 25, 2021
@purpleidea
Contributor Author

bot

@ptabor
Contributor

ptabor commented Apr 5, 2021

I think / hope it got fixed in:

#12702

Please test and reopen.

@purpleidea
Contributor Author

@ptabor Fantastic news, thanks! I'll test in 3.5
