[Bug] Ray Autoscaler does not work with Azure due to deprecated method #19523

Closed
2 tasks done
franklsf95 opened this issue Oct 19, 2021 · 16 comments · Fixed by #19603
Labels: bug, triage

Comments

@franklsf95
Contributor

Search before asking

  • I searched the issues and found no similar issues.

Ray Component

Ray Clusters

What happened + What you expected to happen

Running ray up azure.yaml fails. The full stack trace is in Azure/azure-sdk-for-python#21313.

The reason is that Microsoft is deprecating the get_client_from_cli_profile method; see Azure/azure-sdk-for-python#21337. The relevant frame from the trace:

  File "/Users/lsf/opt/miniconda3/lib/python3.8/site-packages/ray/autoscaler/_private/_azure/node_provider.py", line 7, in <module>
    from azure.common.client_factory import get_client_from_cli_profile

Versions / Dependencies

Ray 1.7.0
Python 3.8.11
Latest install of azure-cli
macOS

Reproduction script

ray up azure.yaml, where the YAML file can be the example YAML file.

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
@franklsf95 added the bug and triage labels Oct 19, 2021
@richardliaw
Contributor

Nice catch! Push a fix?

cc @eisber @gramhagen

@franklsf95
Contributor Author

@richardliaw I'm trying, but it doesn't look like there's a drop-in replacement for azure.common.client_factory.get_client_from_cli_profile. Might need someone more familiar with the Autoscaler to fix this.

@lmazuel

lmazuel commented Oct 19, 2021

Hello, I'm the manager of the Azure Python SDK at MS, feel free to reach out to me by email (use my github alias and add microsoft.com at the end) if you need support on this and have questions. Long story short, we could have done better as a company here to make the deprecation story more visible, so I'm happy to help connect the dots if necessary.

@franklsf95
Contributor Author

@lmazuel Thanks for following up on this! I'll be happy to work on a PR to fix this if I could get some help. There are only two callsites to azure.common:

from azure.common.client_factory import get_client_from_cli_profile
from azure.common.credentials import get_azure_cli_credentials

I did not find much documentation on azure-identity (https://docs.microsoft.com/en-us/python/api/azure-identity/?view=azure-python). How can I use the AzureCliCredential class to obtain a credential and replace the two APIs from azure.common? It would be great if you could point me to some code examples.

@lmazuel

lmazuel commented Oct 20, 2021

So it used to be this:

        from azure.common.client_factory import get_client_from_cli_profile
        from azure.mgmt.compute import ComputeManagementClient
        client = get_client_from_cli_profile(ComputeManagementClient)

and it should now be this:

        from azure.identity import AzureCliCredential
        from azure.mgmt.compute import ComputeManagementClient
        client = ComputeManagementClient(AzureCliCredential(), subscription_id)

To get the default subscription ID (the subscription_id passed to the client above):

        from azure.common.credentials import get_cli_profile
        subscription_id = get_cli_profile().get_subscription_id()
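
Putting the snippets above together, here is a minimal sketch of a helper that builds a management client from Azure CLI credentials. The helper names client_from_cli and get_default_subscription_id are hypothetical (they are not part of Ray or the Azure SDK), and the sketch assumes the az CLI is on PATH and logged in. It reads the subscription ID from az account show rather than from azure.common, which is an assumption of this sketch; get_cli_profile as shown above should work as well.

```python
import json
import subprocess


def get_default_subscription_id(run=subprocess.run):
    """Return the active subscription ID reported by `az account show`.

    `run` is injectable so the function can be exercised without the CLI.
    """
    result = run(
        ["az", "account", "show", "--query", "id", "--output", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    # With --output json, the queried value is printed as a JSON string.
    return json.loads(result.stdout)


def client_from_cli(client_class, subscription_id=None, credential=None):
    """Hypothetical stand-in for get_client_from_cli_profile(client_class)."""
    if credential is None:
        # Imported lazily so the module loads without azure-identity installed.
        from azure.identity import AzureCliCredential

        credential = AzureCliCredential()
    if subscription_id is None:
        subscription_id = get_default_subscription_id()
    return client_class(credential, subscription_id)


# Example usage (requires azure-identity, azure-mgmt-compute and `az login`):
#   from azure.mgmt.compute import ComputeManagementClient
#   compute_client = client_from_cli(ComputeManagementClient)
```

Passing the credential and subscription ID explicitly keeps the call sites close to the old one-argument form while making both values easy to override.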

@gramhagen
Contributor

gramhagen commented Oct 20, 2021 via email

@franklsf95
Contributor Author

I'll take a stab at it today.

@gramhagen
Contributor

gramhagen commented Oct 21, 2021 via email

@franklsf95
Contributor Author

Great! I'll leave it to you then.

@mickare

mickare commented Dec 10, 2021

@bkpcoding

Hey, I am still getting this error. How can I resolve it?

Traceback (most recent call last):
  File "/home/sagar/anaconda3/envs/rllib/bin/ray", line 8, in <module>
    sys.exit(main())
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/ray/scripts/scripts.py", line 1989, in main
    return cli()
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/ray/scripts/scripts.py", line 971, in up
    create_or_update_cluster(
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/ray/autoscaler/_private/commands.py", line 233, in create_or_update_cluster
    config = _bootstrap_config(config, no_config_cache=no_config_cache)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/ray/autoscaler/_private/commands.py", line 316, in _bootstrap_config
    resolved_config = provider_cls.bootstrap_config(config)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/ray/autoscaler/_private/_azure/node_provider.py", line 309, in bootstrap_config
    return bootstrap_azure(cluster_config)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/ray/autoscaler/_private/_azure/config.py", line 22, in bootstrap_azure
    config = _configure_resource_group(config)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/ray/autoscaler/_private/_azure/config.py", line 58, in _configure_resource_group
    resource_client.resource_groups.create_or_update(
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/mgmt/resource/resources/v2021_04_01/operations/_resource_groups_operations.py", line 154, in create_or_update
    pipeline_response = self._client._pipeline.run(request, stream=False, **kwargs)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 211, in run
    return first_node.send(pipeline_request)  # type: ignore
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 71, in send
    response = self.next.send(request)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 71, in send
    response = self.next.send(request)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 71, in send
    response = self.next.send(request)
  [Previous line repeated 2 more times]
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/mgmt/core/policies/_base.py", line 47, in send
    response = self.next.send(request)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/core/pipeline/policies/_redirect.py", line 158, in send
    response = self.next.send(request)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/core/pipeline/policies/_retry.py", line 445, in send
    response = self.next.send(request)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/core/pipeline/policies/_authentication.py", line 117, in send
    self.on_request(request)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/core/pipeline/policies/_authentication.py", line 94, in on_request
    self._token = self._credential.get_token(*self._scopes)
  File "/home/sagar/anaconda3/envs/rllib/lib/python3.8/site-packages/azure/common/credentials.py", line 72, in get_token
    _, token, fulltoken = credentials._token_retriever()  # pylint:disable=protected-access
AttributeError: 'CredentialAdaptor' object has no attribute '_token_retriever'

@gramhagen
Contributor

Ugh. You're right, we missed one. I will have time to look at it later this week.

@gramhagen
Contributor

OK, sorry: at first glance I assumed this was an issue with the changed SDK function names, but I don't believe that's the cause here. @bkpcoding, can you tell me which versions of the Azure Python packages are in use, with something like pip list | grep azure?
Also, which version of Ray? Are you using something built from master? The line numbers in the error message look like they come from an older version, so I'm hoping this issue is actually fixed by the changes in #19603.
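
For anyone who prefers to collect the same version information from Python, here is a small sketch using the standard library's importlib.metadata. The function names is_relevant and installed_versions are hypothetical, and the filter (Ray, azure-* distributions, and msrestazure) is an assumption about which packages matter for this report.

```python
from importlib import metadata


def is_relevant(name):
    """True for Ray itself and for Azure SDK/CLI distributions."""
    lowered = name.lower()
    return (
        lowered == "ray"
        or lowered.startswith("azure-")
        or lowered == "msrestazure"
    )


def installed_versions():
    """Map each relevant installed distribution name to its version."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name and is_relevant(name):
            found[name] = dist.version
    return dict(sorted(found.items()))


# Example usage: print one "name version" pair per line.
#   for name, version in installed_versions().items():
#       print(name, version)
```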

@bkpcoding

I am currently using Ray version 1.9.1. As for Azure, these are the installed package versions:

azure-appconfiguration                  1.1.1
azure-batch                             11.0.0
azure-cli                               2.31.0
azure-cli-core                          2.31.0
azure-cli-telemetry                     1.0.6
azure-common                            1.1.27
azure-core                              1.21.1
azure-cosmos                            3.2.0
azure-datalake-store                    0.0.52
azure-graphrbac                         0.60.0
azure-identity                          1.7.1
azure-keyvault                          1.1.0
azure-keyvault-administration           4.0.0b3
azure-keyvault-keys                     4.5.0b5
azure-loganalytics                      0.1.1
azure-mgmt-advisor                      9.0.0
azure-mgmt-apimanagement                0.2.0
azure-mgmt-appconfiguration             2.0.0
azure-mgmt-applicationinsights          1.0.0
azure-mgmt-authorization                0.61.0
azure-mgmt-batch                        16.0.0
azure-mgmt-batchai                      7.0.0b1
azure-mgmt-billing                      6.0.0
azure-mgmt-botservice                   0.3.0
azure-mgmt-cdn                          11.0.0
azure-mgmt-cognitiveservices            13.0.0
azure-mgmt-compute                      23.1.0
azure-mgmt-consumption                  2.0.0
azure-mgmt-containerinstance            9.1.0
azure-mgmt-containerregistry            8.2.0
azure-mgmt-containerservice             16.1.0
azure-mgmt-core                         1.3.0
azure-mgmt-cosmosdb                     7.0.0b2
azure-mgmt-databoxedge                  1.0.0
azure-mgmt-datalake-analytics           0.2.1
azure-mgmt-datalake-nspkg               3.0.1
azure-mgmt-datalake-store               0.5.0
azure-mgmt-datamigration                10.0.0
azure-mgmt-deploymentmanager            0.2.0
azure-mgmt-devtestlabs                  4.0.0
azure-mgmt-dns                          8.0.0
azure-mgmt-eventgrid                    9.0.0
azure-mgmt-eventhub                     9.1.0
azure-mgmt-extendedlocation             1.0.0
azure-mgmt-hdinsight                    9.0.0
azure-mgmt-imagebuilder                 0.4.0
azure-mgmt-iotcentral                   9.0.0
azure-mgmt-iothub                       2.1.0
azure-mgmt-iothubprovisioningservices   0.3.0
azure-mgmt-keyvault                     9.3.0
azure-mgmt-kusto                        0.3.0
azure-mgmt-loganalytics                 11.0.0
azure-mgmt-managedservices              1.0.0
azure-mgmt-managementgroups             0.2.0
azure-mgmt-maps                         2.0.0
azure-mgmt-marketplaceordering          1.1.0
azure-mgmt-media                        7.0.0
azure-mgmt-monitor                      2.0.0
azure-mgmt-msi                          0.2.0
azure-mgmt-netapp                       5.1.0
azure-mgmt-network                      19.3.0
azure-mgmt-nspkg                        3.0.2
azure-mgmt-policyinsights               1.0.0
azure-mgmt-privatedns                   1.0.0
azure-mgmt-rdbms                        10.0.0
azure-mgmt-recoveryservices             2.0.0
azure-mgmt-recoveryservicesbackup       3.0.0
azure-mgmt-redhatopenshift              1.0.0
azure-mgmt-redis                        13.0.0
azure-mgmt-relay                        0.1.0
azure-mgmt-reservations                 0.6.0
azure-mgmt-resource                     20.0.0
azure-mgmt-search                       8.0.0
azure-mgmt-security                     2.0.0b1
azure-mgmt-servicebus                   6.0.0
azure-mgmt-servicefabric                1.0.0
azure-mgmt-servicefabricmanagedclusters 1.0.0
azure-mgmt-servicelinker                1.0.0b1
azure-mgmt-signalr                      1.0.0
azure-mgmt-sql                          3.0.1
azure-mgmt-sqlvirtualmachine            1.0.0b1
azure-mgmt-storage                      19.0.0
azure-mgmt-synapse                      2.1.0b3
azure-mgmt-trafficmanager               0.51.0
azure-mgmt-web                          4.0.0
azure-multiapi-storage                  0.7.0
azure-nspkg                             3.0.2
azure-storage-common                    1.4.2
azure-synapse-accesscontrol             0.5.0
azure-synapse-artifacts                 0.9.0
azure-synapse-managedprivateendpoints   0.3.0
azure-synapse-spark                     0.2.0
msrestazure                             0.6.4

@gramhagen
Contributor

Got it, thanks. Yes, the changes are not in 1.9.1 as far as I know. If you install the latest wheel built from master, does it work?
pip install -U "ray[default] @ https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-2.0.0.dev0-cp38-cp38-manylinux2014_x86_64.whl"

@bkpcoding

Yes, it does, thank you!
