
Investigate "Compute Engine Metadata server unavailable" #42

Closed
machow opened this issue Mar 29, 2021 · 6 comments

Comments

@machow
Contributor

machow commented Mar 29, 2021

When I run tasks that require saving to GCS, I get a sort of cryptic error:

Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Compute Engine Metadata server unavailable

Seems to be related to using token="cloud" in this code:

# token="cloud" asks gcsfs to fetch credentials from the Compute Engine metadata server
fs = gcsfs.GCSFileSystem(project=gcs_project, token="cloud")

See also:

Tried to pull out a reproducible example:

# Reproduce the metadata lookup that gcsfs performs for token="cloud"
from google.auth.compute_engine import _metadata
from google.auth.transport.requests import Request
import requests

req = Request(requests.Session())
# Fails with "Compute Engine Metadata server unavailable" when the
# metadata server can't be reached (e.g. outside of GCE/GKE)
info = _metadata.get_service_account_info(req)
print(info)
@hunterowens
Member

I faced this error while debugging another job earlier.

IIRC, the steps to fix were:

  1. remove the token and project arguments from the Python-level initialization of the filesystem
  2. make sure that, on the command line (or in the shell before launching jupyter, airflow, etc.), you had run gcloud config set project $PROJECT_ID so that the command-line gcloud had the project properly set
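
A minimal sketch of step 1, assuming the gcloud project and default credentials are already configured in the environment (bucket name is hypothetical, for illustration only):

import gcsfs

# With no token/project arguments, gcsfs falls back to its default credential
# lookup: google.auth's application default credentials (GOOGLE_APPLICATION_CREDENTIALS
# or the gcloud config), and then the metadata server when running on GCE/GKE
fs = gcsfs.GCSFileSystem()
fs.ls("some-bucket")  # hypothetical bucket name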

@machow
Contributor Author

machow commented Mar 29, 2021

Ah, thanks! After some more digging, I ended up getting it to work by:

  • deleting project arg, setting token to google_default
  • in shell, running unset GOOGLE_APPLICATION_CREDENTIALS followed by gcloud init
  • in docker-compose.yaml
    • setting the env var GOOGLE_CLOUD_PROJECT="cal-itp-data-infra"
    • changing the mounted config volume to point to the user's home (/home/airflow/) rather than /root (though I think we set the user id, so the config was readable?).

If someone wants it to use different credentials, they can set GOOGLE_APPLICATION_CREDENTIALS.
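
A rough sketch of the resulting setup, assuming GOOGLE_CLOUD_PROJECT is set in the environment as described above:

import gcsfs

# token="google_default" uses google.auth's application default credentials
# (GOOGLE_APPLICATION_CREDENTIALS if set, otherwise the mounted gcloud config)
# instead of the metadata server; the project is discovered from the environment
fs = gcsfs.GCSFileSystem(token="google_default")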

edit:

See this doc for details: https://googleapis.dev/python/google-auth/latest/user-guide.html#using-external-identities

@machow
Contributor Author

machow commented Mar 29, 2021

One weird note is that I set the project, and can see it in the config, but it doesn't seem like the authentication process cares...

$ gcloud config list
[core]
account = michael.c@jarv.us
disable_usage_reporting = True
project = cal-itp-data-infra

Your active configuration is: [default]
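
One way to check which project and credentials google-auth itself resolves (a sketch; the project it reports may differ from the one shown by gcloud config list):

import google.auth

# Returns the credentials and project the client libraries will actually use,
# discovered via env vars, application default credentials, or the metadata server
credentials, project = google.auth.default()
print(project)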

@hunterowens
Member

Let's keep this open until we understand the root cause, but my understanding is that this is now fixed.

@machow
Contributor Author

machow commented Apr 1, 2021

I've had to dive a bit deeper while looking into the KubernetesPodOperator. There's a nice explanation in this airflow doc. It sounds like GCP's metadata service is what allows cloud instances to authenticate automatically.
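
A quick way to check whether that metadata service is reachable from the current environment (a sketch reusing the private _metadata module from the example above, so its API may change):

from google.auth.compute_engine import _metadata
from google.auth.transport.requests import Request
import requests

req = Request(requests.Session())
# True only where metadata.google.internal is reachable (e.g. on GCE/GKE);
# locally this should print False
print(_metadata.ping(req))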

I have no idea why it was working locally though :/. Maybe our credentials were cached from some other activity?

@machow
Contributor Author

machow commented May 10, 2021

Closing, since I think we resolved this by pointing it to our volume-mounted credentials (and noting that the metadata server is an internal cloud service).

@machow machow closed this as completed May 10, 2021