-
Notifications
You must be signed in to change notification settings - Fork 871
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add average replicas assign proposal #5225
Conversation
/retest |
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #5225 +/- ##
==========================================
+ Coverage 34.15% 34.22% +0.06%
==========================================
Files 643 643
Lines 44503 44551 +48
==========================================
+ Hits 15201 15248 +47
- Misses 28145 28146 +1
Partials 1157 1157
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@@ -0,0 +1,263 @@ | |||
--- | |||
title: Karmada distributes the replicas evenly based on spread constraint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you re-summarize the title? in fact, we did not consider spread constraints during the design.
Therefore, we plan to introduce a new replica allocation strategy called AverageReplicas for Karmada's scheduler. This strategy will support evenly distributing the target replicas among the selected clusters, and it will: | ||
|
||
1. Adhere to spread constraints. | ||
2. Consider the available resources in the working clusters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we usually call them member clusters.
|
||
By adding the AverageReplicas replica allocation strategy, we ensure that replicas can be evenly distributed among the selected clusters as much as possible. This even distribution means: | ||
|
||
1. The allocation results adhere to the spread constraint (Selection phase). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the point is to make sure that it's as evenly distributed as possible, not spread constraints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the point is to make sure that it's as evenly distributed as possible, not spread constraints.
@whitewindmills yeah, I only mentioned spread constraints in passing... I will remove this part.
spec: | ||
placement: | ||
replicaScheduling: | ||
replicaDivisionPreference: Average |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to fix the format of this yaml.
BTW, can you provide an example?
可以看到,如果要添加一个 AverageReplicas 副本分配策略,可以考虑给 replicaDivisionPreference 添加一个新的取值 Average。 | ||
--> | ||
|
||
Note that `replicaDivisionPreference == Average` will only be effective when `replicaSchedulingType == Divided`. Additionally, since `selectBestClustersByRegion` currently does not consider whether the selected clusters can accommodate the corresponding number of replica resources, **the AverageReplicas strategy is only applicable to `selectBestClustersByCluster`.** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it, we should only focus on the Assign
phase, as for what the spread constraints are, we don't care.
invite these people to help with the review. |
@whitewindmills yeah, thanks, I have received your comments and will make some changes. |
/retest |
@ipsum-0320: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
||
## Motivation | ||
|
||
By adding the AverageReplicas replica allocation strategy, Karmada can distribute the target replicas as evenly as possible across the member clusters. At the same time, this even distribution will adhere to spread constraints and consider the available resources in the member clusters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I set it to StaticWeight
, and it's 1:1:1:1:1
, can it achieve the average
behavior you mentioned?
So I suggest you need to elaborate on the differences between your proposal and StaticWeight
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, I have already considered this, as follows.
Therefore, we plan to introduce a new replica allocation strategy called AverageReplicas for Karmada's scheduler. This strategy will support evenly distributing the target replicas among the selected clusters, and it will:
- Adhere to spread constraints.
- Consider the available resources in the member clusters.
The StaticWeight does not consider spread constraints. There is some infomation about your question:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding that difference to the current proposal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding that difference to the current proposal?
yeah, i have added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking forward to your feedback on #4805 (comment).
For now, I can see two approaches:
- Enhance the StaticWeight
- Introduce a new
AverageReplicas
as proposed by this PR.
No matter which solution we are going to take, it's a great chance for us to revisit the StaticWeight feature, and make a clear behavior.
The logic here is essentially to determine how many replicas should be allocated to each cluster. According to the project's objectives, in the Assign phase, replicas should be evenly distributed among the selected clusters as much as possible. First, we need to calculate the ideal number of replicas each cluster should be allocated: | ||
|
||
1. Define the number of selected clusters as L。 | ||
2. Calculate the average number of replicas each cluster should be allocated, defined as avg_rep = Replica / L。 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How to solve the non-divisible scenario, like L=3
and Replica=8
, then the avg_rep = 2
, follow the algorithm metioned below, only avg_rep * L = 6
may be allocated.
I think we should set the sumPendingReplica=Replica - L * avg_rep = 2
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have refined the explanation of this issue, thanks.
<!-- | ||
b.如果 AvailableReplica <= 该 Cluster 对应的 avg_rep,那么给该 Cluster 分配满额的 AvailableReplica 个实例,同时将剩余的待分配 pendingReplica,也就是 avg_rep - AvailableReplica,添加到变量 sumPendingReplica(这是一个全局变量,不与任何一个 Cluster 绑定)中,该变量初始时为零。 | ||
--> | ||
2. After completing the first round of traversal, first sort the unFullCluster by Cluster.Score, then start traversing the unFullCluster. For each cluster in it, allocate one replica, and decrement both the cluster's AvailableReplica and sumPendingReplica by 1. Then check if the cluster's AvailableReplica is zero, and if so, remove that cluster from the unFullCluster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When i first view the proposal, I have misunderstanding because I cut the sentence not in a right way, should we use Cluster.Score
instead of Cluster.Score, just a suggestion, not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have noticed this issue. Thank you for your suggestion. @warjiang
i have update the proposal, everyone is welcome to give me some feedback. /cc @whitewindmills @warjiang @RainbowMango @XiShanYongYe-Chang @chaunceyjiang @chaosi-zju @zhzhuang-zju |
/retest |
@ipsum-0320: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/ok-to-test |
/retest |
spec: | ||
placement: | ||
replicaScheduling: | ||
replicaDivisionPreference: Average |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AverageReplicas
sounds more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I have completed the edit.
<!-- | ||
b.如果 AvailableReplica <= 该 Cluster 对应的 avg_rep,那么给该 Cluster 分配满额的 AvailableReplica 个实例,同时将剩余的待分配 pendingReplica,也就是 avg_rep - AvailableReplica,添加到变量 sumPendingReplica(这是一个全局变量,不与任何一个 Cluster 绑定)中,该变量初始时为零。 | ||
--> | ||
2. After completing the first round of traversal, first sort the unFullCluster by Cluster.Score, then start traversing the unFullCluster. For each cluster in it, allocate one replica, and decrement both the cluster's AvailableReplica and sumPendingReplica by 1. Then check if the cluster's AvailableReplica is zero, and if so, remove that cluster from the unFullCluster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if two clusters have the same score, we can prioritize the one with the higher number of available replicas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I have added the relevant explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your proposal!
通过添加 AverageReplicas 副本分配策略,Karmada 能够尽可能平均地将目标副本分配到工作集群中。同时,这种平均是遵守传播约束(spread constraint)且考虑工作集群中的可用资源的。 | ||
--> | ||
|
||
Such an allocation strategy can significantly improve the system's disaster recovery capability. Even if one cluster fails, it will not have a major impact on the overall system. This is highly beneficial for ensuring high availability of the service. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can it be significantly improved compared to the current strategy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified the wording and gave relevant explanations.
2.结合待分配的 Replica 计算平均每个 Cluster 要分配多少 replica,定义 avg_rep = Replica / L。 | ||
--> | ||
|
||
After calculating the average number of replicas to be allocated to each cluster avg_rep, we start traversing the selected clusters and use the predefined TargetCluster struct to define the allocation results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can give the TargetCluster structure a code link, so we can easily understand what it looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I add the the link of TargetCluster.
3.循环第二步,直至 sumPendingReplica 为 0,分配结束。 | ||
--> | ||
|
||
Since we have ensured in the Select phase that the total AvailableReplica of the selected clusters is greater than the number of replicas to be allocated, there is no need to worry about resource insufficiency. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This requirement is met in the current Select phase. However, it does not mean that this requirement is always met in the future evolution. Therefore, the resource insufficiency still needs to be considered during the design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I have already pointed this out.
由于在 Select 阶段中,我们已经保证了所选出的 Cluster 的 AvailableReplica 总和是大于待分配的 Replica 的,因此不必担心资源不足的问题。 | ||
--> | ||
|
||
It should be clearly stated that if the result is not a whole number when calculating avg_rep (avg_rep = Replica / L), the avg_rep will be rounded down, and the remaining Replicas due to rounding will be added to the sumPendingReplica variable. For example, suppose Replica = 8 and L = 3, then avg_rep = 8 / 3 = 2.67. We would round down avg_rep to 2, meaning each Cluster is expected to be allocated 2 Replicas. Since 2 * 3 = 6, there are still 2 Replicas left unallocated. These two Replicas will then be added to sumPendingReplica, making sumPendingReplica = sumPendingReplica + 2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence can be explained in advance when the calculation method is described.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i have explained in advance.
@RainbowMango @chaunceyjiang @warjiang |
Signed-off-by: ipsum <trueman.0320@zju.edu.cn> Signed-off-by: ipsum <ipsum@ipsumdeMacBook-Pro.local>
Signed-off-by: ipsum <trueman.0320@zju.edu.cn> Signed-off-by: ipsum <ipsum@ipsumdeMacBook-Pro.local>
Signed-off-by: ipsum <trueman.0320@zju.edu.cn> Signed-off-by: ipsum <ipsum@ipsumdeMacBook-Pro.local>
Signed-off-by: ipsum <ipsum@ipsumdeMacBook-Pro.local>
@RainbowMango @chaunceyjiang @warjiang @XiShanYongYe-Chang Does anyone have any questions? 😁 |
LGTM We are about to release a major version (v1.11) soon (by the end of this month), and it seems that this feature will not make it into this version. We can start to quickly advance this feature in early September. |
/assign |
Hi @RainbowMango @whitewindmills @chaunceyjiang can we get on with this mission? |
I'm ok with it, and seems that it looks good to @XiShanYongYe-Chang. |
I'll look at it today. |
Adding label Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
/kind documentation
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes part of #5159
Special notes for your reviewer: @whitewindmills
Does this PR introduce a user-facing change?:
Other related issues: #4085