Connecting to GCSClient without a local raylet hangs #219

Open
glennmoy opened this issue Oct 23, 2023 · 0 comments

glennmoy (Contributor) commented Oct 23, 2023

Follow-up to the comment thread in #211:

The issue is that the Connect(client) call returns Status::OK regardless of whether the GCS server has been started.

The client first reports after 5 seconds that it can't connect, then after a minute kills the session with EXIT_FAILURE.
Again, both timeouts are set by RayConfig parameters.
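
To make the two-stage behaviour concrete, here is a minimal sketch of the decision logic as I understand it. This is not Ray's implementation: ReconnectAction, ReconnectTimeouts and ClassifyOutage are hypothetical names, and the 5 s / 60 s defaults are simply the values observed above rather than the actual RayConfig parameter names.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical names throughout; this is NOT Ray's API.
enum class ReconnectAction { kKeepWaiting, kWarn, kExitFailure };

// Assumed defaults, taken from the behaviour observed in this issue.
struct ReconnectTimeouts {
  std::chrono::seconds warn_after{5};   // "reports after 5 seconds"
  std::chrono::seconds fail_after{60};  // "after a minute kills the session"
};

// Maps the time spent without a GCS connection to the action taken.
ReconnectAction ClassifyOutage(std::chrono::seconds elapsed,
                               const ReconnectTimeouts &t = ReconnectTimeouts{}) {
  assert(t.warn_after <= t.fail_after);
  if (elapsed >= t.fail_after) return ReconnectAction::kExitFailure;
  if (elapsed >= t.warn_after) return ReconnectAction::kWarn;
  return ReconnectAction::kKeepWaiting;
}
```

The point is that nothing in this path hands a status back to the caller: the only outcomes are a log message or process exit, which is why Connect(client) still looks successful.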

If the client does not exist, then the thread executing the server (I think) throws the error, which is only reported, not caught, in the Julia REPL:

https://github.com/ray-project/ray/blob/cde6e887cbb21a9cae2632e3e4b883d913d38a05/src/ray/rpc/gcs_server/gcs_rpc_client.h#L212-L216

Unfortunately, the gcs_is_down_ field is private. However, there is a way to check whether the server is alive that uses a callback.

However, I don't think it's worth implementing this directly. The timeout should take care of things; it's just that the error won't be nicely caught or reported in Julia, but we can add that as a follow-up.
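
For the follow-up, one possible shape for the callback-based check is to wrap it in a blocking call with a deadline, so the caller gets a plain bool instead of a hang. The sketch below is generic C++ and does not use Ray's actual interface: AliveCallback, AsyncAliveCheck and IsAliveWithin are hypothetical names, and the real check would have to be adapted to whatever signature the GCS client exposes.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <memory>

// Hypothetical callback-style liveness check (NOT Ray's real signature):
// the check is expected to call `done(true)` once if the server answers.
using AliveCallback = std::function<void(bool)>;
using AsyncAliveCheck = std::function<void(AliveCallback)>;

// Blocks until the callback fires or `timeout` elapses.
// Returns false on timeout or when the callback reports "not alive".
bool IsAliveWithin(const AsyncAliveCheck &check,
                   std::chrono::milliseconds timeout) {
  // shared_ptr keeps the promise valid even if the callback fires late,
  // after this function has already returned on timeout.
  auto result = std::make_shared<std::promise<bool>>();
  std::future<bool> fut = result->get_future();
  check([result](bool alive) { result->set_value(alive); });
  if (fut.wait_for(timeout) != std::future_status::ready) return false;
  return fut.get();
}
```

This assumes the callback is invoked at most once; a second invocation would make set_value throw. A wrapper like this could then be called before Connect(client) and turned into a regular Julia error on the other side.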

Originally posted by @glennmoy in #211 (comment)
