Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Applies zone labels to newly created vsphere volumes #72687

Merged

Conversation

subramanian-neelakantan
Copy link
Contributor

@subramanian-neelakantan subramanian-neelakantan commented Jan 8, 2019

What type of PR is this?
/kind bug

What this PR does / why we need it:
When a volume is created on vSphere, the zone label does not get applied on it today. Without this information on the volume, a pod that uses this volume could get scheduled to a different zone where this volume is not visible/accessible. This issue can be solved by just labeling the volume with the zone information. This PR computes the zone of a volume in the PersistentVolumeLabel admission controller since vsphere is still an in-tree cloud provider and the out-of-tree plugin in still the works.

Which issue(s) this PR fixes:
Fixes #73301
Part of issue #67703

Special notes for your reviewer:
This part does not guarantee that allowedTopologies specified in a StorageClass will be honoured. That would be done as part of a different PR.

Does this PR introduce a user-facing change?:
Yes

This change applies zone labels to vSphere Volumes automatically. The zone labels are visible on the PV:
$ kubectl get pv --show-labels
NAME           CAPACITY   ACCESSMODES   STATUS    CLAIM            REASON    AGE       LABELS
pv-abc                5Gi        RWO                   Bound     default/claim1                    46s       failure-domain.beta.kubernetes.io/region=VC1,failure-domain.beta.kubernetes.io/zone=cluster-1

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 8, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @subramanian-neelakantan. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 8, 2019
@k8s-ci-robot k8s-ci-robot added sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 8, 2019
@subramanian-neelakantan
Copy link
Contributor Author

/test all

@k8s-ci-robot
Copy link
Contributor

@subramanian-neelakantan: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test all

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@subramanian-neelakantan
Copy link
Contributor Author

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. area/apiserver area/kubelet sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 9, 2019
@subramanian-neelakantan
Copy link
Contributor Author

@SandeepPissay Thanks for the review and both good suggestions.

The main concern I have with this PR is performance -

  • datastore to zone/region mapping is static but we seem to be computing the zone for the same datastore for every volume created on it.

Let us track this performance improvement as a separate task since it fits in easily as an incremental change. Here is the issue I just created to track this - #73955

  • computing volume labels logic is time consuming as it involves multiple remote calls. This is done even if zones feature is not enabled in k8s cluster.

There is no way to disable the 'zones' feature in a k8s cluster AFAICT. The 'Labels' section of the vsphere.conf should not be treated as a feature gate. It only over-rides the tag category name used for zone and region.

@SandeepPissay
Copy link
Contributor

SandeepPissay commented Feb 12, 2019

@subramanian-neelakantan I would like to see the content of vsphere.conf with this code change? Could you paste it here?

I'm wondering if we can have a flag in vSphere.conf to explicitly enable multi zones support in VCP.

@frapposelli
Copy link
Member

My understanding is that zones support is enabled only when the zones parameters are set in vsphere.conf otherwise support is disabled.

@SandeepPissay
Copy link
Contributor

If zones feature is enabled by explicitly setting zones params in vsphere.conf, then VCP/vsphere volume plugin should not be invoking tag service APIs at all if the zones param is not present(zones features disabled). But that is not the case in this PR, which is what I'm highlighting here.

If there is a way to determine if zones feature is disabled, I would suggest to skip tag service API calls into Virtual Center. Else, this PR will introduce performance regression.

@subramanian-neelakantan
Copy link
Contributor Author

With this change, zone support is enabled if the vCenter inventory (Datacenter/Cluster/Host) is tagged with zone/region tags. VCP looks for tags with the category name "zone" and "region" to identify this. The vsphere.conf is only used to over-ride the default category names of "zone" and "region". So Tag API calls are used to determine whether zones support is enabled or not.

My opinion is that we do not need another back door feature enablement (through vsphere.conf) for the zones support. It should be present once VCP has implemented this k8s feature. The performance implication of such a feature support, if any, should be addressed separately (see issue #73955).

OTOH, if we feel strongly against supporting zone feature by default in VCP, I can revert the part of code that makes "zone" and "region" as the default category names. That way, zones will be discovered from VC only when the vsphere.conf has such an entry:
[Labels]
zone = <vc_zone_tag_category_name>
region = <vc_region_tag_category_name>

@SandeepPissay
Copy link
Contributor

Most of VMware customers are not using zones and may not need this feature, so this change will introduce performance regression for them which I'm suggesting to avoid. This can be a simple flag that gets set during VCP init.

@frapposelli
Copy link
Member

/cc @dougm @jiatongw PTAL, I don't remember the context/details on why the zones feature was made entirely optional

@dougm
Copy link
Member

dougm commented Feb 13, 2019

We chose to make zones/tags optional to avoid breaking existing user configs on upgrades of K8s.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 14, 2019
@subramanian-neelakantan
Copy link
Contributor Author

Most of VMware customers are not using zones and may not need this feature, so this change will introduce performance regression for them which I'm suggesting to avoid. This can be a simple flag that gets set during VCP init.

Since I have not heard any objections to this approach, I have made this change in commit ba9a9cf to treat the "zone and region [Labels]" entries in vsphere.conf file to indicate whether zone support is enabled for VCP. When these entries are not present in vsphere.conf, VCP avoid zone based computation. In fact, it avoids discovering the tags from VC entirely.

@SandeepPissay PTAL.

@frapposelli
Copy link
Member

frapposelli commented Feb 15, 2019

thanks @dougm for the context

@subramanian-neelakantan the approach you took with the latest commit seems to follow the original intent of the Zones feature so

/lgtm

leaving approval to @SandeepPissay

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 15, 2019
Copy link
Contributor

@SandeepPissay SandeepPissay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

// GetVolumeLabels returns the well known zone and region labels for given volume
func (vs *VSphere) GetVolumeLabels(volumePath string) (map[string]string, error) {
// Create context
ctx, cancel := context.WithCancel(context.Background())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this code to after zones check as this is redundant if zones is not enabled.

@vladimirvivien
Copy link
Member

@SandeepPissay don't forget to mark this PR as /approved so it'll get merged.

@vladimirvivien
Copy link
Member

Will this be cherry picked to an earlier version?

@SandeepPissay
Copy link
Contributor

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewsykim, frapposelli, SandeepPissay, subramanian-neelakantan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/kubelet area/provider/vmware Issues or PRs related to vmware provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

vSphere volumes do not have zone labels auto applied
10 participants