controller: refuse to create partitions that the cluster doesn't have space for #3274

Closed
jcsp opened this issue Dec 15, 2021 · 2 comments
Labels
area/controller kind/enhance New feature or request

Comments


jcsp commented Dec 15, 2021

The minimum amount of space required for a partition is the fallocation step, which is currently hardcoded at 32MB (although it will be made adjustable in #2876), multiplied by the replication factor.

Calculating exactly whether a partition will fit is not trivial because we would have to guess which nodes it would get allocated to and consider their free space individually. However, imposing a general upper bound is straightforward: if the number of partitions multiplied by replication factor multiplied by fallocation size is greater than the total size of all data directories in the cluster, then that is too many partitions and we should refuse to create them.
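A minimal sketch of that upper-bound check, written in Python for brevity. The function name, its parameters, and the reading of "32MB" as 32 MiB are assumptions for illustration, not the controller's actual code:

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

# Assumption: the hardcoded 32MB fallocation step is treated as 32 MiB here.
FALLOC_STEP_BYTES = 32 * MIB


def exceeds_cluster_capacity(partition_count, replication_factor,
                             data_dir_sizes_bytes,
                             falloc_step_bytes=FALLOC_STEP_BYTES):
    """Return True if the request cannot possibly fit in the cluster.

    data_dir_sizes_bytes is a hypothetical list holding the total size of
    each node's data directory. This is only a coarse upper bound: a False
    result does not guarantee that every replica finds a node with space.
    """
    min_required = partition_count * replication_factor * falloc_step_bytes
    return min_required > sum(data_dir_sizes_bytes)


# Three nodes with 10 GiB data directories, replication factor 3:
# each partition needs at least 3 * 32 MiB = 96 MiB across the cluster.
nodes = [10 * GIB] * 3
print(exceeds_cluster_capacity(320, 3, nodes))  # False: exactly fills 30 GiB
print(exceeds_cluster_capacity(321, 3, nodes))  # True: one partition too many
```

Summing directory sizes keeps the check cheap and independent of placement decisions, at the cost of accepting some requests that a per-node calculation would reject.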

This check should apply during topic creation and also on requests to add more partitions to existing topics.
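As a rough illustration of applying one bound on both paths, the sketch below routes topic creation and partition addition through the same helper. The `Topic` type, the function names, and the choice to count replicas of existing topics toward the bound are all assumptions made for this example:

```python
from dataclasses import dataclass

MIB = 1024 * 1024
GIB = 1024 * MIB
FALLOC_STEP_BYTES = 32 * MIB  # assumption: 32MB read as 32 MiB


@dataclass
class Topic:
    partitions: int
    replication_factor: int


def min_required_bytes(topics, falloc_step_bytes=FALLOC_STEP_BYTES):
    """Lower bound on space: one falloc step per replica of every partition."""
    replicas = sum(t.partitions * t.replication_factor for t in topics.values())
    return replicas * falloc_step_bytes


def can_create_topic(topics, name, new_topic, total_data_dir_bytes):
    """Check a create-topic request against the cluster-wide bound."""
    proposed = {**topics, name: new_topic}
    return min_required_bytes(proposed) <= total_data_dir_bytes


def can_add_partitions(topics, name, extra, total_data_dir_bytes):
    """Check a request to add `extra` partitions to an existing topic."""
    old = topics[name]
    grown = Topic(old.partitions + extra, old.replication_factor)
    proposed = {**topics, name: grown}
    return min_required_bytes(proposed) <= total_data_dir_bytes


existing = {"a": Topic(partitions=100, replication_factor=3)}
total = 30 * GIB  # hypothetical sum of all data directory sizes

print(can_create_topic(existing, "b", Topic(221, 3), total))  # False
print(can_add_partitions(existing, "a", 220, total))          # True
```

Building the proposed topic table first and checking it once means both request types share a single definition of "too many partitions".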

What about...?

Q: Why don't we auto-adjust the fallocation step size downwards instead of refusing to create partitions? That way we could accommodate many more.
A: Once #2876 is done, users can do exactly this by hand if they really intend to have a huge number of very tiny partitions, but it shouldn't be the default. If it were, we would permit the creation of partitions that have no chance of ever reaching a full segment before hitting a full-disk condition.

Q: Should we use segment size instead of falloc step size as the minimum space required per partition?
A: Our default segment size is currently pretty large (1GB). A topic with many partitions may never write full segments if the rate of incoming traffic isn't reasonably high, so reserving a full segment per partition would overestimate the space actually needed.
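To put numbers on the two candidate minimums, here is a small worked calculation. The cluster size (three nodes with 10 GiB each), the replication factor of 3, and the reading of MB/GB as MiB/GiB are hypothetical values chosen only for illustration:

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def max_partitions(total_bytes, replication_factor, min_bytes_per_replica):
    """Largest partition count whose minimum footprint still fits."""
    return total_bytes // (replication_factor * min_bytes_per_replica)


total = 3 * 10 * GIB  # hypothetical cluster: three nodes, 10 GiB each

print(max_partitions(total, 3, 32 * MIB))  # 320 with the falloc step
print(max_partitions(total, 3, 1 * GIB))   # 10 with the default segment size
```

On this example cluster the segment-size rule would cap the cluster at 10 partitions, while the falloc-step rule allows 320.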

@jcsp jcsp added kind/enhance New feature or request area/controller labels Dec 15, 2021

jcsp commented Dec 15, 2021

Related: #2166, for the case where the system can handle the partition count but not the actual total length of each log.


jcsp commented Feb 22, 2022

This was resolved by #3398: the falloc step is used as the minimum space required per partition. This doesn't prevent the user from exhausting space (because the falloc step is just a lower bound on the space needed), but it makes it harder to instantly disable a cluster by requesting an excessive partition count on tiny disks.

@jcsp jcsp closed this as completed Feb 22, 2022