Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add average replicas assign proposal #5225

Closed
wants to merge 6 commits into from

Conversation

ipsum-0320
Copy link
Contributor

@ipsum-0320 ipsum-0320 commented Jul 18, 2024

What type of PR is this?

/kind feature
/kind documentation

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes part of #5159

Special notes for your reviewer: @whitewindmills

Does this PR introduce a user-facing change?:

NONE

Other related issues: #4085

@karmada-bot karmada-bot added kind/feature Categorizes issue or PR as related to a new feature. kind/documentation Categorizes issue or PR as related to documentation. labels Jul 18, 2024
@karmada-bot karmada-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 18, 2024
@whitewindmills
Copy link
Member

/retest

@codecov-commenter
Copy link

codecov-commenter commented Jul 18, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 98.24561% with 1 line in your changes missing coverage. Please review.

Project coverage is 34.22%. Comparing base (ef7d528) to head (68134e6).

Files with missing lines Patch % Lines
.../scheduler/core/spreadconstraint/group_clusters.go 98.24% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5225      +/-   ##
==========================================
+ Coverage   34.15%   34.22%   +0.06%     
==========================================
  Files         643      643              
  Lines       44503    44551      +48     
==========================================
+ Hits        15201    15248      +47     
- Misses      28145    28146       +1     
  Partials     1157     1157              
Flag Coverage Δ
unittests 34.22% <98.24%> (+0.06%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -0,0 +1,263 @@
---
title: Karmada distributes the replicas evenly based on spread constraint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you re-summarize the title? in fact, we did not consider spread constraints during the design.

Therefore, we plan to introduce a new replica allocation strategy called AverageReplicas for Karmada's scheduler. This strategy will support evenly distributing the target replicas among the selected clusters, and it will:

1. Adhere to spread constraints.
2. Consider the available resources in the working clusters.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we usually call them member clusters.


By adding the AverageReplicas replica allocation strategy, we ensure that replicas can be evenly distributed among the selected clusters as much as possible. This even distribution means:

1. The allocation results adhere to the spread constraint (Selection phase).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the point is to make sure that it's as evenly distributed as possible, not spread constraints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the point is to make sure that it's as evenly distributed as possible, not spread constraints.
@whitewindmills yeah, I only mentioned spread constraints in passing... I will remove this part.

spec:
placement:
replicaScheduling:
replicaDivisionPreference: Average
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to fix the format of this yaml.
BTW, can you provide an example?

可以看到,如果要添加一个 AverageReplicas 副本分配策略,可以考虑给 replicaDivisionPreference 添加一个新的取值 Average。
-->

Note that `replicaDivisionPreference == Average` will only be effective when `replicaSchedulingType == Divided`. Additionally, since `selectBestClustersByRegion` currently does not consider whether the selected clusters can accommodate the corresponding number of replica resources, **the AverageReplicas strategy is only applicable to `selectBestClustersByCluster`.**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get it, we should only focus on the Assign phase, as for what the spread constraints are, we don't care.

@whitewindmills
Copy link
Member

invite these people to help with the review.
/cc @RainbowMango @XiShanYongYe-Chang @chaunceyjiang @chaosi-zju @zhzhuang-zju

@ipsum-0320
Copy link
Contributor Author

@whitewindmills yeah, thanks, I have received your comments and will make some changes.

@ipsum-0320
Copy link
Contributor Author

/retest

@karmada-bot
Copy link
Collaborator

@ipsum-0320: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


## Motivation

By adding the AverageReplicas replica allocation strategy, Karmada can distribute the target replicas as evenly as possible across the member clusters. At the same time, this even distribution will adhere to spread constraints and consider the available resources in the member clusters.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I set it to StaticWeight, and it's 1:1:1:1:1, can it achieve the average behavior you mentioned?

So I suggest you need to elaborate on the differences between your proposal and StaticWeight.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I have already considered this, as follows.

Therefore, we plan to introduce a new replica allocation strategy called AverageReplicas for Karmada's scheduler. This strategy will support evenly distributing the target replicas among the selected clusters, and it will:

  1. Adhere to spread constraints.
  2. Consider the available resources in the member clusters.

The StaticWeight does not consider spread constraints. There is some infomation about your question:

click me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding that difference to the current proposal?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding that difference to the current proposal?

yeah, i have added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking forward to your feedback on #4805 (comment).

For now, I can see two approaches:

  1. Enhance the StaticWeight
  2. Introduce a new AverageReplicas as proposed by this PR.

No matter which solution we are going to take, it's a great chance for us to revisit the StaticWeight feature, and make a clear behavior.

The logic here is essentially to determine how many replicas should be allocated to each cluster. According to the project's objectives, in the Assign phase, replicas should be evenly distributed among the selected clusters as much as possible. First, we need to calculate the ideal number of replicas each cluster should be allocated:

1. Define the number of selected clusters as L。
2. Calculate the average number of replicas each cluster should be allocated, defined as avg_rep = Replica / L。
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to solve the non-divisible scenario, like L=3 and Replica=8, then the avg_rep = 2, follow the algorithm metioned below, only avg_rep * L = 6 may be allocated.
I think we should set the sumPendingReplica=Replica - L * avg_rep = 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have refined the explanation of this issue, thanks.

<!--
b.如果 AvailableReplica <= 该 Cluster 对应的 avg_rep,那么给该 Cluster 分配满额的 AvailableReplica 个实例,同时将剩余的待分配 pendingReplica,也就是 avg_rep - AvailableReplica,添加到变量 sumPendingReplica(这是一个全局变量,不与任何一个 Cluster 绑定)中,该变量初始时为零。
-->
2. After completing the first round of traversal, first sort the unFullCluster by Cluster.Score, then start traversing the unFullCluster. For each cluster in it, allocate one replica, and decrement both the cluster's AvailableReplica and sumPendingReplica by 1. Then check if the cluster's AvailableReplica is zero, and if so, remove that cluster from the unFullCluster.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When i first view the proposal, I have misunderstanding because I cut the sentence not in a right way, should we use Cluster.Score instead of Cluster.Score, just a suggestion, not necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have noticed this issue. Thank you for your suggestion. @warjiang

@ipsum-0320
Copy link
Contributor Author

i have update the proposal, everyone is welcome to give me some feedback. /cc @whitewindmills @warjiang @RainbowMango @XiShanYongYe-Chang @chaunceyjiang @chaosi-zju @zhzhuang-zju

@ipsum-0320
Copy link
Contributor Author

/retest

@karmada-bot
Copy link
Collaborator

@ipsum-0320: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@whitewindmills
Copy link
Member

/ok-to-test

@ipsum-0320
Copy link
Contributor Author

/retest

spec:
placement:
replicaScheduling:
replicaDivisionPreference: Average
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AverageReplicas sounds more appropriate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I have completed the edit.

<!--
b.如果 AvailableReplica <= 该 Cluster 对应的 avg_rep,那么给该 Cluster 分配满额的 AvailableReplica 个实例,同时将剩余的待分配 pendingReplica,也就是 avg_rep - AvailableReplica,添加到变量 sumPendingReplica(这是一个全局变量,不与任何一个 Cluster 绑定)中,该变量初始时为零。
-->
2. After completing the first round of traversal, first sort the unFullCluster by Cluster.Score, then start traversing the unFullCluster. For each cluster in it, allocate one replica, and decrement both the cluster's AvailableReplica and sumPendingReplica by 1. Then check if the cluster's AvailableReplica is zero, and if so, remove that cluster from the unFullCluster.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if two clusters have the same score, we can prioritize the one with the higher number of available replicas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I have added the relevant explanation.

Copy link
Member

@XiShanYongYe-Chang XiShanYongYe-Chang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your proposal!

通过添加 AverageReplicas 副本分配策略,Karmada 能够尽可能平均地将目标副本分配到工作集群中。同时,这种平均是遵守传播约束(spread constraint)且考虑工作集群中的可用资源的。
-->

Such an allocation strategy can significantly improve the system's disaster recovery capability. Even if one cluster fails, it will not have a major impact on the overall system. This is highly beneficial for ensuring high availability of the service.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can it be significantly improved compared to the current strategy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified the wording and gave relevant explanations.

2.结合待分配的 Replica 计算平均每个 Cluster 要分配多少 replica,定义 avg_rep = Replica / L。
-->

After calculating the average number of replicas to be allocated to each cluster avg_rep, we start traversing the selected clusters and use the predefined TargetCluster struct to define the allocation results.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can give the TargetCluster structure a code link, so we can easily understand what it looks like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I add the the link of TargetCluster.

3.循环第二步,直至 sumPendingReplica 为 0,分配结束。
-->

Since we have ensured in the Select phase that the total AvailableReplica of the selected clusters is greater than the number of replicas to be allocated, there is no need to worry about resource insufficiency.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requirement is met in the current Select phase. However, it does not mean that this requirement is always met in the future evolution. Therefore, the resource insufficiency still needs to be considered during the design.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I have already pointed this out.

由于在 Select 阶段中,我们已经保证了所选出的 Cluster 的 AvailableReplica 总和是大于待分配的 Replica 的,因此不必担心资源不足的问题。
-->

It should be clearly stated that if the result is not a whole number when calculating avg_rep (avg_rep = Replica / L), the avg_rep will be rounded down, and the remaining Replicas due to rounding will be added to the sumPendingReplica variable. For example, suppose Replica = 8 and L = 3, then avg_rep = 8 / 3 = 2.67. We would round down avg_rep to 2, meaning each Cluster is expected to be allocated 2 Replicas. Since 2 * 3 = 6, there are still 2 Replicas left unallocated. These two Replicas will then be added to sumPendingReplica, making sumPendingReplica = sumPendingReplica + 2.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence can be explained in advance when the calculation method is described.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i have explained in advance.

@whitewindmills
Copy link
Member

@RainbowMango @chaunceyjiang @warjiang
still need your review. do you have any comment?

ipsum-0320 and others added 4 commits August 6, 2024 17:57
Signed-off-by: ipsum <trueman.0320@zju.edu.cn>
Signed-off-by: ipsum <ipsum@ipsumdeMacBook-Pro.local>
Signed-off-by: ipsum <trueman.0320@zju.edu.cn>
Signed-off-by: ipsum <ipsum@ipsumdeMacBook-Pro.local>
Signed-off-by: ipsum <trueman.0320@zju.edu.cn>
Signed-off-by: ipsum <ipsum@ipsumdeMacBook-Pro.local>
Signed-off-by: ipsum <ipsum@ipsumdeMacBook-Pro.local>
@whitewindmills
Copy link
Member

@ipsum-0320
Copy link
Contributor Author

@RainbowMango @chaunceyjiang @warjiang @XiShanYongYe-Chang Does anyone have any questions? 😁

@XiShanYongYe-Chang
Copy link
Member

LGTM

We are about to release a major version (v1.11) soon (by the end of this month), and it seems that this feature will not make it into this version. We can start to quickly advance this feature in early September.

@chaunceyjiang
Copy link
Member

/assign

@XiShanYongYe-Chang
Copy link
Member

Hi @RainbowMango @whitewindmills @chaunceyjiang can we get on with this mission?

@whitewindmills
Copy link
Member

I'm ok with it, and seems that it looks good to @XiShanYongYe-Chang.
ping @RainbowMango @chaunceyjiang

@RainbowMango
Copy link
Member

I'll look at it today.

@karmada-bot karmada-bot added the do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. label Sep 19, 2024
@karmada-bot
Copy link
Collaborator

Adding label do-not-merge/contains-merge-commits because PR contains merge commits, which are not allowed in this repository.
Use git rebase to reapply your commits on top of the target branch. Detailed instructions for doing so can be found here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@karmada-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from chaunceyjiang. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 19, 2024
@ipsum-0320 ipsum-0320 closed this Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. kind/documentation Categorizes issue or PR as related to documentation. kind/feature Categorizes issue or PR as related to a new feature. ok-to-test size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants