Add emerge --jobs-tmpdir-space-threshold option #1345

Closed

Member

@zmedico zmedico commented Jun 16, 2024

--jobs-tmpdir-space-threshold[=RATIO]

Specifies the maximum ratio of used space allowed (a floating-point number) in PORTAGE_TMPDIR when starting a new job. With no argument, removes a previous space ratio threshold. For example, use a ratio of 0.85 to stop starting new jobs when the space usage in PORTAGE_TMPDIR exceeds 85%.

Bug: https://bugs.gentoo.org/934382
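As a rough illustration of the check described above, the following sketch computes the used-space ratio of a directory with `os.statvfs` and compares it to an optional threshold. The function names `used_space_ratio` and `can_start_job` are hypothetical and are not Portage's actual code; only the ratio expression follows the review excerpt in this PR.

```python
import os
from typing import Optional


def used_space_ratio(path: str) -> float:
    """Return the fraction of blocks in use on the filesystem holding path."""
    vfs_stat = os.statvfs(path)
    if vfs_stat.f_blocks == 0:
        # Assumption: a filesystem reporting zero blocks is treated as empty
        # so that the division below cannot fail.
        return 0.0
    return (vfs_stat.f_blocks - vfs_stat.f_bavail) / vfs_stat.f_blocks


def can_start_job(path: str, threshold: Optional[float]) -> bool:
    """Allow a new job unless usage exceeds threshold (None disables the check)."""
    if threshold is None:
        return True
    return used_space_ratio(path) <= threshold


if __name__ == "__main__":
    # "/var/tmp" stands in for PORTAGE_TMPDIR here; adjust as needed.
    print(can_start_job("/var/tmp", 0.85))
```

With a threshold of 0.85, `can_start_job` returns False once more than 85% of the blocks are in use, which matches the example in the option description.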

@zmedico zmedico marked this pull request as draft June 16, 2024 22:42
@zmedico zmedico requested a review from akhuettel June 16, 2024 22:43
@zmedico zmedico force-pushed the bug_934382_jobs_tmpdir_space_threshold branch 2 times, most recently from 9e29391 to c300856 Compare June 17, 2024 01:34
@zmedico zmedico requested a review from Flowdalic June 17, 2024 01:38
@Flowdalic
Member

I do not object to this change in any way.

But I believe we should eventually come to a point where we are (also) able to specify absolute thresholds besides a relative one. The two relevant resources in question are free disk space and free inodes. Ideally we could gather some data about the worst-case disk space and inode consumption after src_compile and use those worst-case values (plus a small additional safety margin) as default thresholds. Maybe we could gather some worst-case data from the tinderboxes?

But this is mostly unrelated to the change proposed in this PR. Though, I personally would probably have implemented absolute threshold parameters for free disk space and inodes first, and not a relative one.
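A minimal sketch of what absolute thresholds for free disk space and free inodes could look like, assuming both are read from `os.statvfs`. The names `FreeResources`, `free_resources`, and `has_headroom`, as well as the threshold values in the usage example, are hypothetical and not part of Portage.

```python
import os
from typing import NamedTuple


class FreeResources(NamedTuple):
    free_bytes: int
    free_inodes: int


def free_resources(path: str) -> FreeResources:
    """Read the space and inodes available to unprivileged users."""
    st = os.statvfs(path)
    # Note: some filesystems report 0 for inode counts, so a real
    # implementation would likely need to special-case that.
    return FreeResources(st.f_bavail * st.f_frsize, st.f_favail)


def has_headroom(free: FreeResources, min_free_bytes: int, min_free_inodes: int) -> bool:
    """True when both absolute thresholds are satisfied."""
    return free.free_bytes >= min_free_bytes and free.free_inodes >= min_free_inodes


if __name__ == "__main__":
    # Placeholder thresholds: 5 GiB free and 100,000 free inodes.
    print(has_headroom(free_resources("/var/tmp"), 5 * 2**30, 100_000))
```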

    if (
        (vfs_stat.f_blocks - vfs_stat.f_bavail) / vfs_stat.f_blocks
    ) > self._jobs_tmpdir_space_threshold:
        return False
Member Author

> Though, I personally would probably have implemented absolute threshold parameters for free disk space and inodes first, and not a relative one.

Since ratios have no units, it was simpler to implement this threshold as a ratio. I was thinking we could possibly re-use the same option for absolute thresholds if we add support for parsing units.

Member Author

We could make it possible to distinguish between used space or remaining space, using a plus or minus operator for example.
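To make the idea of reusing one option for ratios and absolute values more concrete, here is a sketch of a parser under an invented syntax: a bare number in [0, 1] is a ratio, a number with a K/M/G/T suffix is a byte count, and a leading minus sign means the value refers to remaining rather than used space. This syntax, the `Threshold` type, and `parse_threshold` are all assumptions for illustration; nothing in this PR defines them.

```python
import re
from dataclasses import dataclass

# Hypothetical binary unit suffixes.
_UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}
_PATTERN = re.compile(r"^([+-]?)(\d+(?:\.\d+)?)([KMGT]?)$")


@dataclass(frozen=True)
class Threshold:
    kind: str    # "ratio" or "bytes"
    basis: str   # "used" or "free"
    value: float


def parse_threshold(text: str) -> Threshold:
    """Parse strings such as "0.85", "+512M", or "-20G"."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"invalid threshold: {text!r}")
    sign, number, unit = match.groups()
    # Assumed convention: "-" selects remaining space, anything else used space.
    basis = "free" if sign == "-" else "used"
    if unit:
        return Threshold("bytes", basis, float(number) * _UNITS[unit])
    value = float(number)
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"ratio out of range: {text!r}")
    return Threshold("ratio", basis, value)
```

Under these assumptions, `parse_threshold("0.85")` keeps the meaning of the current option, while `parse_threshold("-20G")` would express "keep at least 20 GiB free".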

@zmedico

This comment was marked as resolved.

@Flowdalic
Member

Flowdalic commented Jun 18, 2024

> Sure, an inode threshold could possibly be useful, but in this context running out of inodes seems likely to indicate that the user statically allocated an insufficient number of inodes.

Probably. Still, @akhuettel ran into inode exhaustion, not a lack of free space. Of course, you could argue that it was a filesystem configuration error on the user's part, which would otherwise potentially go unnoticed.

> I think these thresholds probably depend too much on the user's PORTAGE_TMPDIR configuration for defaults to be really meaningful.

I am a little bit more optimistic here.

Sure, no (sane) default could prevent portage from running into filesystem-based resource exhaustion. Still, even without performing a thorough analysis of the worst-case filesystem space and inode consumption of packages in ::gentoo, we could probably come up with a reasonable default threshold for free filesystem space and inodes. Just pick some good candidates, for example firefox, chrome, compilers, and, of course, LibreOffice, to get an idea of their filesystem resource consumption. Then multiply by some factor as a safety margin, probably taking the number of concurrent portage jobs into account, to get an idea of what an initial default value could be.
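The arithmetic suggested here could be sketched as follows: take the largest observed peak among a few candidate packages, multiply by the number of concurrent jobs, and apply a safety factor. The function name, the safety factor of 1.5, and the numbers in the usage example are placeholders, not measurements from any real package.

```python
import math
from typing import Mapping


def estimate_default_threshold(
    peak_usage: Mapping[str, int], jobs: int, safety_factor: float = 1.5
) -> int:
    """Worst-case peak per package, scaled by job count and a safety margin."""
    if not peak_usage:
        raise ValueError("need at least one measurement")
    worst_case = max(peak_usage.values())
    return math.ceil(worst_case * jobs * safety_factor)


if __name__ == "__main__":
    # Made-up peak build-directory sizes in bytes, keyed by a placeholder name.
    sample_bytes = {"pkg-a": 10 * 2**30, "pkg-b": 40 * 2**30}
    print(estimate_default_threshold(sample_bytes, jobs=2))
```

The same function would apply unchanged to inode counts, since it only assumes one integer peak value per package.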

@zmedico zmedico force-pushed the bug_934382_jobs_tmpdir_space_threshold branch from c300856 to ba127a5 Compare June 19, 2024 01:26
--jobs-tmpdir-space-threshold[=RATIO]

  Specifies the maximum ratio of used space allowed (a
  floating-point number) in PORTAGE_TMPDIR when starting a
  new job. With no argument, removes a previous space
  ratio threshold. For example, use a ratio of 0.85 to stop
  starting new jobs when the space usage in PORTAGE_TMPDIR
  exceeds 85%. This option conflicts with FEATURES="keep-work".

Bug: https://bugs.gentoo.org/934382
Signed-off-by: Zac Medico <[email protected]>
@zmedico zmedico force-pushed the bug_934382_jobs_tmpdir_space_threshold branch from ba127a5 to 88a1d9f Compare June 19, 2024 01:47
@zmedico zmedico marked this pull request as ready for review June 19, 2024 02:06
@zmedico zmedico requested a review from thesamesam June 19, 2024 02:06
@zmedico
Member Author

zmedico commented Jun 19, 2024

> > Sure, an inode threshold could possibly be useful, but in this context running out of inodes seems likely to indicate that the user statically allocated an insufficient number of inodes.

> Probably. Still, @akhuettel ran into inode exhaustion, not a lack of free space. Of course, you could argue that it was a filesystem configuration error on the user's part, which would otherwise potentially go unnoticed.

Now that I've added --jobs-merge-wait-threshold in #1349, it has occurred to me that it can help to prevent inode exhaustion, since inode consumption would be partially related to the merge-wait queue length.

@zmedico
Member Author

zmedico commented Jun 19, 2024

Closing this in favor of #1351 which distinguishes between both used blocks and used files (inodes).
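For readers unfamiliar with the distinction, used blocks and used files (inodes) can both be expressed as ratios from a single `os.statvfs` call. This is only a sketch of the two quantities, with a hypothetical function name; it is not taken from #1351.

```python
import os
from typing import Tuple


def used_ratios(path: str) -> Tuple[float, float]:
    """Return (used block ratio, used inode ratio) for the filesystem at path."""
    st = os.statvfs(path)
    # Guard against filesystems that report zero totals.
    blocks = (st.f_blocks - st.f_bavail) / st.f_blocks if st.f_blocks else 0.0
    files = (st.f_files - st.f_favail) / st.f_files if st.f_files else 0.0
    return blocks, files
```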

@zmedico zmedico closed this Jun 19, 2024
@zmedico
Member Author

zmedico commented Jun 19, 2024

> > I think these thresholds probably depend too much on the user's PORTAGE_TMPDIR configuration for defaults to be really meaningful.

> I am a little bit more optimistic here.

> Sure, no (sane) default could prevent portage from running into filesystem-based resource exhaustion. Still, even without performing a thorough analysis of the worst-case filesystem space and inode consumption of packages in ::gentoo, we could probably come up with a reasonable default threshold for free filesystem space and inodes. Just pick some good candidates, for example firefox, chrome, compilers, and, of course, LibreOffice, to get an idea of their filesystem resource consumption. Then multiply by some factor as a safety margin, probably taking the number of concurrent portage jobs into account, to get an idea of what an initial default value could be.

In addition to the number of jobs, an automatically calculated threshold might also account for the baseline usage in PORTAGE_TMPDIR at the time that emerge is started, in case this file-system has some pre-existing content either inside or outside of ${PORTAGE_TMPDIR}/portage. I suppose FEATURES=fail-clean or lack thereof is another thing we might want to account for somehow.
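One possible reading of accounting for baseline usage is to record how much of the filesystem is already used when emerge starts, and then budget only the growth since that point. The sketch below follows that reading; the function names and the notion of a per-run byte budget are assumptions, and it does not model FEATURES=fail-clean.

```python
import os


def snapshot_used_bytes(path: str) -> int:
    """Bytes currently in use on the filesystem holding path."""
    st = os.statvfs(path)
    return (st.f_blocks - st.f_bavail) * st.f_frsize


def run_usage(current_used: int, baseline_used: int) -> int:
    """Growth since the baseline snapshot, never negative."""
    return max(0, current_used - baseline_used)


def within_budget(current_used: int, baseline_used: int, budget_bytes: int) -> bool:
    """True while the growth attributed to this run stays within its budget."""
    return run_usage(current_used, baseline_used) <= budget_bytes


if __name__ == "__main__":
    baseline = snapshot_used_bytes("/var/tmp")  # taken once at startup
    # Later, before starting each job:
    print(within_budget(snapshot_used_bytes("/var/tmp"), baseline, 20 * 2**30))
```

A limitation of this approach is that other processes writing to the same filesystem are counted against the run's budget, since `statvfs` cannot attribute usage to a directory.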
