Add unremovable_nodes_count metric #3690
Welcome @evgenii-petrov-arrival!
A related test is failing:
--- FAIL: TestFindUnneededNodes (0.02s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x3b7791f]
Please fix your code.
Done. Sorry for not running gofmt, golint, and the tests before creating the pull request. I haven't figured out how vendoring is structured in the repo yet, and was depending on CI to catch any errors while I work that out.
@mwielgus, as far as I can tell, there is no way for me to mark your request for changes as addressed, so just in case I'm duplicating the request for re-review as a comment. I think I've addressed your request for changes, and the tests are passing now.
It seems to me that this duplicates the logic for gathering unready reasons that we already have for ScaleDownStatusProcessor. I think implementing this as a processor should be preferred, to avoid duplication, keep the aggregation logic encapsulated, and avoid bloating the scale-down logic even more.
I've looked at ScaleDownStatusProcessor after your comment (I wasn't aware of it before), and I don't think that adding a processor here would reduce complexity. Currently NoOpScaleDownStatusProcessor is the default ScaleDownStatusProcessor, and changing from NoOp to SomeOp seems to be a much larger change than adding a metric next to another similar metric. That said, moving both
@mwielgus, is there something I can do to advance this pull request through the review?
Adding a processor only for updating one metric does seem like a bit of an overkill to me. This PR isn't actually duplicating any existing logic, since we don't pivot the First of all, unremovable reasons are also set in Secondly, I think you're missing the fact that the |
Thanks @towca, both very good points. I think I've addressed them now, please take a look.
Sorry for missing your reply, thanks for addressing my comments! /lgtm
Please squash the commits down to 1-2 and we are good to merge.
2e0c653 to b6f5d55
Squashed to one commit, please take another look.
/lgtm
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: evgenii-petrov-arrival, mwielgus
I want to set up alerting for cases where nodes are unremovable due to user errors, like the lack of a "safe to deschedule" annotation, so this pull request adds a Gauge sliced by fmt.Sprintf("%s", simulator.UnremovableReason). I wasn't sure whether any concurrent access is involved, so I added a mutex for the counter map; please let me know if it is unnecessary.
I also wasn't sure how to test this; any suggestions are welcome.
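For readers following along, here is a minimal, standard-library-only sketch of the idea described above: a mutex-guarded map that counts unremovable nodes per reason, keyed by fmt.Sprintf("%s", reason). It is not the code from this pull request. The UnremovableReason type, its two sample values, the String method, the reasonCounter type, and the "reason" label name are all hypothetical stand-ins for illustration; in the real change the per-reason values would presumably feed a Prometheus gauge rather than be printed.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// UnremovableReason is a hypothetical stand-in for simulator.UnremovableReason.
type UnremovableReason int

// Sample reasons, assumed here for illustration only.
const (
	NoReason UnremovableReason = iota
	NotAutoscaled
	BlockedByPod
)

// String provides the text that fmt.Sprintf("%s", r) uses as the label value.
func (r UnremovableReason) String() string {
	switch r {
	case NotAutoscaled:
		return "NotAutoscaled"
	case BlockedByPod:
		return "BlockedByPod"
	default:
		return "NoReason"
	}
}

// reasonCounter counts unremovable nodes per reason.
// The mutex guards the map in case Add is called from several goroutines.
type reasonCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newReasonCounter() *reasonCounter {
	return &reasonCounter{counts: make(map[string]int)}
}

// Add records one unremovable node for the given reason.
func (c *reasonCounter) Add(r UnremovableReason) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[fmt.Sprintf("%s", r)]++
}

// Snapshot returns a copy of the current counts, safe to read without the lock.
func (c *reasonCounter) Snapshot() map[string]int {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := make(map[string]int, len(c.counts))
	for k, v := range c.counts {
		out[k] = v
	}
	return out
}

func main() {
	c := newReasonCounter()

	// Record reasons concurrently to exercise the mutex.
	var wg sync.WaitGroup
	for _, r := range []UnremovableReason{BlockedByPod, NotAutoscaled, BlockedByPod} {
		wg.Add(1)
		go func(r UnremovableReason) {
			defer wg.Done()
			c.Add(r)
		}(r)
	}
	wg.Wait()

	// Print in sorted order so the output is deterministic.
	snap := c.Snapshot()
	keys := make([]string, 0, len(snap))
	for k := range snap {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("unremovable_nodes_count{reason=%q} %d\n", k, snap[k])
	}
}
```

One way to test logic like this is to keep the counting separate from the metric registration, as the sketch does, and assert on the snapshot in a table-driven test. If the map is only ever touched from a single goroutine, the mutex could be dropped.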