topology-updater: introduce exclude-list #949

Merged

Conversation

Tal-or
Contributor

@Tal-or Tal-or commented Nov 2, 2022

The exclude-list allows filtering specific resource accounting
from NRT's objects per node basis.

The CRs created by the topology-updater are used by the scheduler-plugin
as a source of truth for making scheduling decisions.
As such, this feature allows hiding specific information
from the scheduler, which in turn
will affect the scheduling decision.
A common use case is when a user would like to perform scheduling
decisions that are based on a specific resource.
In that case, we can exclude all the other resources
which we don't want the scheduler to examine.

The exclude-list is provided to the topology-updater via a ConfigMap.
Resource type's names specified in the list should match the names
as shown here: https://pkg.go.dev/k8s.io/api/core/v1#ResourceName

This is a resurrection of an old work started here:
#545

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 2, 2022
@k8s-ci-robot
Contributor

Hi @Tal-or. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 2, 2022
@netlify

netlify bot commented Nov 2, 2022

Deploy Preview for kubernetes-sigs-nfd ready!

Name Link
🔨 Latest commit d495376
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-nfd/deploys/637bd2563d86300008d4ef95
😎 Deploy Preview https://deploy-preview-949--kubernetes-sigs-nfd.netlify.app

@marquiz
Contributor

marquiz commented Nov 4, 2022

Thanks @Tal-or for the patch. On a very very quick look it looks generally good 😄 I'll take a closer look next week

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 4, 2022
Contributor

@marquiz marquiz left a comment

I left a few comments, still a little bit of work but nothing major. Basically we should take care of the Helm deployment, too, and update documentation.

In the documentation we should somewhere describe this new feature, e.g. in docs/usage/nfd-topology-updater.md. We could also add topology-updater config reference (similar to docs/reference/worker-configuration-reference.md). We don't support dynamic configuration file update (similar to nfd-worker) and I think that's fine but I think that should be noted in the documentation as well.

About the commit message(s): it would be nice if the commit message explained a bit why this is added, e.g. with an example of an intended usage scenario (it's still a bit unclear to me 😅)

Also, could you wrap the body of the commit messages at 72 characters? (https://www.kubernetes.dev/docs/guide/pull-requests/#wrap-the-commit-message-body-at-72-characters)

@Tal-or Tal-or force-pushed the exclude_list branch 2 times, most recently from 6722288 to ed0e102 Compare November 10, 2022 16:49
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 12, 2022
@Tal-or Tal-or force-pushed the exclude_list branch 2 times, most recently from bd681d9 to c5d2c0b Compare November 15, 2022 20:11
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 15, 2022
@Tal-or
Contributor Author

Tal-or commented Nov 15, 2022

Explained the rationale for this feature and added an e2e test

@marquiz
Contributor

marquiz commented Nov 16, 2022

added an e2e test

👍👍 just need to fix the one linter problem there

@Tal-or
Contributor Author

Tal-or commented Nov 16, 2022

added an e2e test

👍👍 just need to fix the one linter problem there

On it. It's already fixed locally; I'll just address your last comment and upload a new revision.

Contributor

@marquiz marquiz left a comment

Good progress @Tal-or 👍 Just a few more small comments but almost there, I think

docs/reference/topology-updater-commandline-reference.md (outdated)
docs/reference/topology-updater-configuration-reference.md (outdated)
docs/usage/nfd-topology-updater.md (outdated)
docs/usage/nfd-topology-updater.md (outdated)
docs/usage/nfd-topology-updater.md (outdated)
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 16, 2022
@Tal-or Tal-or force-pushed the exclude_list branch 4 times, most recently from ec3e35c to 9cfa884 Compare November 16, 2022 15:13
Contributor

@marquiz marquiz left a comment

Thanks @Tal-or. Good that you spotted the helm chart parameters table in docs, too.

Just two more small nits 🙄 But after those I think I'm good

docs/usage/nfd-topology-updater.md (outdated)
pkg/resourcemonitor/excludelist.go
@marquiz
Contributor

marquiz commented Nov 21, 2022

@Tal-or #961 is merged. Could you rebase this one?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 21, 2022
The exclude-list allows filtering specific resource accounting
from NRT's objects on a per-node basis.

The CRs created by the topology-updater are used by the scheduler-plugin
as a source of truth for making scheduling decisions.
As such, this feature allows hiding specific information
from the scheduler, which in turn
will affect the scheduling decision.
A common use case is when a user would like to perform scheduling
decisions that are based on a specific resource.
In that case, we can exclude all the other resources
which we don't want the scheduler to examine.

The exclude-list is provided to the topology-updater via a ConfigMap.
Resource type names specified in the list should match the names
as shown here: https://pkg.go.dev/k8s.io/api/core/v1#ResourceName

This is a resurrection of an old work started here:
kubernetes-sigs#545

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
Different tests require different configurations
of the topology-updater DaemonSet.
Here, we decouple the configuration from the creation part
using `JustBeforeEach` so that each test container
has its own configuration.

Additional reading:
https://onsi.github.io/ginkgo/#separating-creation-and-configuration-justbeforeeach

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 21, 2022
@Tal-or
Contributor Author

Tal-or commented Nov 21, 2022

/hold cancel #961 merged

@Tal-or
Contributor Author

Tal-or commented Nov 21, 2022

/test pull-node-feature-discovery-build-image-cross-generic

Contributor

@marquiz marquiz left a comment

Thanks @Tal-or.

I have one more nit about the commit message title(s)

e2e: topology-updater: add e2e test

add e2e test for what, not the whole topology updater, I think 😄

Same thing with the commits updating deployment and docs.

@Tal-or
Contributor Author

Tal-or commented Nov 21, 2022

Thanks @Tal-or.

I have one more nit about the commit message title(s)

e2e: topology-updater: add e2e test

add e2e test for what, not the whole topology updater, I think 😄

Same thing with the commits updating deployment and docs.

Oops :) I thought that because it is under the topology update PR it's clear. I'll address that no problem

@marquiz
Contributor

marquiz commented Nov 21, 2022

I thought that because it is under the topology update PR it's clear. I'll address that no problem

NP. Yeah, it's clear now but later on when reading the Git log it's not so obvious anymore :)

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
Add a kustomization file with a config example for the exclude-list.

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
- Add a helm template with a config example for the exclude-list.
- Add mount for the topology-updater.conf file
- Update the templates Makefile target

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
Update the docs with explanations and examples
about the exclude-list feature.

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
@Tal-or
Contributor Author

Tal-or commented Nov 21, 2022

I thought that because it is under the topology update PR it's clear. I'll address that no problem

NP. Yeah, it's clear now but later on when reading the Git log it's not so obvious anymore :)

Done!

Contributor

@marquiz marquiz left a comment

Looks good to me now 👍

ping @fromanirh @fmuyassarov wanna take a look?

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: marquiz, Tal-or

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

@ffromani ffromani left a comment

LGTM!

Member

@fmuyassarov fmuyassarov left a comment

/lgtm
thanks @Tal-or

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 22, 2022
@ffromani
Contributor

/hold cancel
per #949 (review) and #949 (review)

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 22, 2022
@k8s-ci-robot k8s-ci-robot merged commit 592d6c6 into kubernetes-sigs:master Nov 22, 2022
@Tal-or
Contributor Author

Tal-or commented Nov 22, 2022

@marquiz Thank you for taking the time to review this work

@Tal-or Tal-or deleted the exclude_list branch November 22, 2022 08:09
@marquiz marquiz mentioned this pull request Dec 20, 2022
22 tasks