The minimum amount of space required for a partition is the fallocation step, which is currently hardcoded at 32MB (although it will be made adjustable in #2876), multiplied by the replication factor.
Calculating exactly whether a partition will fit is not trivial because we would have to guess which nodes it would get allocated to and consider their free space individually. However, imposing a general upper bound is straightforward: if the number of partitions multiplied by replication factor multiplied by fallocation size is greater than the total size of all data directories in the cluster, then that is too many partitions and we should refuse to create them.
This check should apply during topic creation and also on requests to add more partitions to existing topics.
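As a rough illustration of the upper bound described above, here is a minimal Python sketch. It is not the project's actual implementation: the function name, its parameters, and the idea of passing in a pre-summed total data directory size are all hypothetical, and 32MB is treated as 32 MiB for the arithmetic.

```python
# Hypothetical sketch of the proposed upper-bound check; names are illustrative only.

# Fallocation step from the issue text (assumed here to mean 32 MiB).
FALLOC_STEP_BYTES = 32 * 1024 * 1024


def exceeds_disk_upper_bound(
    partition_count: int,
    replication_factor: int,
    total_data_dir_bytes: int,
    falloc_step_bytes: int = FALLOC_STEP_BYTES,
) -> bool:
    """Return True if the request should be refused.

    partition_count is the number of partitions the request would result in.
    total_data_dir_bytes is the summed size of all data directories in the
    cluster (how that sum is obtained is outside this sketch).
    """
    # Every replica of every partition needs at least one fallocation step.
    min_required = partition_count * replication_factor * falloc_step_bytes
    return min_required > total_data_dir_bytes


if __name__ == "__main__":
    three_gib = 3 * 1024**3  # e.g. three nodes with 1 GiB data directories
    print(exceeds_disk_upper_bound(32, 3, three_gib))  # exactly fits the bound
    print(exceeds_disk_upper_bound(33, 3, three_gib))  # one partition too many
```

Because this is only an upper bound, a request that passes it can still fail to fit on the specific nodes it is allocated to.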
What about...?
Q: Why don't we auto-adjust the fallocation step size downwards instead of refusing to create partitions? That way we could accommodate many more.
A: Once #2876 is done, users can do exactly this by hand if they really intend to have a huge number of very tiny partitions, but it shouldn't be the default. If we did this automatically, we would permit the creation of partitions that have no chance of ever reaching a full segment before hitting a full-disk condition.
Q: Should we use segment size instead of falloc step size as the minimum space required per partition?
A: Our default segment size is currently pretty large (1GB). A topic with many partitions may never write full segments unless the rate of incoming traffic is reasonably high, so requiring a full segment per partition would overestimate the space such topics actually need.
This was resolved by #3398 -- the falloc step is used as the minimum space required per partition. This doesn't prevent the user exhausting space (because the falloc step is just a lower bound on space needed), but it makes it harder to instantly disable a cluster by requesting a too-high partition count on tiny disks.