Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix metrics-server lookup order #4884

Merged
merged 1 commit into from
Oct 21, 2021
Merged

Fix metrics-server lookup order #4884

merged 1 commit into from
Oct 21, 2021

Conversation

vasu1124
Copy link
Member

Signed-off-by: vasu1124 vasu1124@actvirtual.com

How to categorize this PR?
/area control-plane
/kind bug

What this PR does / why we need it:
The metrics-server with the current rule will always use Hostname to resolve the scraping target. The Hostname can only be resolved in infrastructure environments where automatically DNS entries are provisioned. In many environments, like OpenStack, metal, equinix, etc. there is no such DNS entry.
The setting ensures that the metric-server picks one of the keys (type in the given rule order) from the node resource, e.g.

  status:
    addresses:
    - address: 10.250.1.161
      type: InternalIP
    - address: shoot--project--name-worker-pool-z1-12345-abcd
      type: Hostname

The type: Hostname will always be available (no node w/o hostname!), but is not always resolveable. Hence, the metric-server in such cases will always fail to scrape the node.

This PR fixes the rule to choose InternalIP first, which should be a correct selection for allmost all environments, and skip to other the alternatives only if that type is not set:
InternalIP,InternalDNS,ExternalDNS,ExternalIP,Hostname

Which issue(s) this PR fixes:
Fixes #4626 #2455 #2019 and probably more

Special notes for your reviewer:
We have been arguing over this rule change over many issues (overlayed by a notation/typo fix with/without bracket).
The proposed rule should be safe. But please run via test matrix.

Release note:

fix metrics-server for scenarios where address resolution via hostname does not work.

Signed-off-by: vasu1124 <vasu1124@actvirtual.com>
@vasu1124 vasu1124 requested a review from a team as a code owner October 20, 2021 15:57
@gardener-robot gardener-robot added area/control-plane Control plane related kind/bug Bug needs/review size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Oct 20, 2021
Copy link
Member

@rfranzke rfranzke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @vpnachev @timuthy @MartinWeindel @jia-jerry
Any concerns/comments?

Copy link
Member

@vpnachev vpnachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

A was thinking the same in #4626 (comment)

Copy link
Member

@timuthy timuthy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@rfranzke rfranzke merged commit df42566 into gardener:master Oct 21, 2021
krgostev pushed a commit to krgostev/gardener that referenced this pull request Apr 21, 2022
Signed-off-by: vasu1124 <vasu1124@actvirtual.com>
krgostev pushed a commit to krgostev/gardener that referenced this pull request Jul 5, 2022
Signed-off-by: vasu1124 <vasu1124@actvirtual.com>
ialidzhikov added a commit to ialidzhikov/gardener-extension-provider-alicloud that referenced this pull request Jun 25, 2024
After gardener/gardener#4884 it should no longer be required to have the mutation of the metrics-server Deployment.
ialidzhikov added a commit to ialidzhikov/gardener-extension-provider-packet that referenced this pull request Jun 25, 2024
After gardener/gardener#4884 it should no longer be required to have the mutation of the metrics-server Deployment.
ialidzhikov added a commit to ialidzhikov/gardener-extension-provider-alicloud that referenced this pull request Sep 9, 2024
After gardener/gardener#4884 it should no longer be required to have the mutation of the metrics-server Deployment.
ialidzhikov added a commit to ialidzhikov/gardener-extension-provider-alicloud that referenced this pull request Sep 9, 2024
After gardener/gardener#4884 it should no longer be required to have the mutation of the metrics-server Deployment.
kevin-lacoo pushed a commit to gardener/gardener-extension-provider-alicloud that referenced this pull request Sep 18, 2024
)

* Drop obsolete code for the vpn-shoot Service mutation

After ReversedVPN feature is unconditionally enabled, there is no longer a Service kube-system/vpn-shoot in the Shoot cluster

* Drop obsolete code for the metrics-server Deployment mutation

After gardener/gardener#4884 it should no longer be required to have the mutation of the metrics-server Deployment.

* Add object selector to the shoot webhook
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/control-plane Control plane related kind/bug Bug size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants