
Fix queue hang when collector exits without successfully benchmarking #1432

Mark-Simulacrum opened this issue Aug 29, 2022 · 4 comments


@Mark-Simulacrum
Member

We have a problem today: there's no general error feedback from the collector to the site -- only per-benchmark collection results. That means that when something goes wrong during the startup phase (e.g., no benchmarks match a filter, or the SHA passed has no artifacts), the queue hangs until an operator manually intervenes and either removes or fixes the failing entry (currently done via a manual database edit).

I think a good solution here would be to add some general error mechanism by which the collector can report failures back to the site. Addressing the cases where the collector returns an Err already gets us 80% of the way -- we can work on avoiding panic! after that.
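As a rough illustration of the mechanism described above, here is a minimal sketch of a collector-side startup error that gets rendered into a report the site could use to dequeue the failing entry instead of hanging. All names here (`StartupError`, `error_report`) are hypothetical, not the actual rustc-perf API.

```rust
// Hypothetical sketch: a structured startup error from the collector,
// rendered into a message the site could store or post back on the PR.
#[derive(Debug)]
enum StartupError {
    NoBenchmarksMatched { filter: String },
    MissingArtifacts { sha: String },
}

// Turn the error into the report the site would receive. In the real
// mechanism this would be POSTed back to the site, which would remove
// the entry from the queue and comment on the PR.
fn error_report(err: &StartupError) -> String {
    match err {
        StartupError::NoBenchmarksMatched { filter } => {
            format!("collector error: no benchmarks matched filter `{filter}`")
        }
        StartupError::MissingArtifacts { sha } => {
            format!("collector error: no artifacts found for SHA `{sha}`")
        }
    }
}

fn main() {
    let err = StartupError::MissingArtifacts { sha: "abc123".into() };
    println!("{}", error_report(&err));
}
```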

@Kobzol
Contributor

Kobzol commented Aug 29, 2022

Regarding the input configuration (e.g. include etc.), which option would you prefer?

  1. When we handle @rust-timer build/queue and the parameters are bad, we don't even queue the PR into the DB, and just post a comment with an error.
  2. If bench_next fails, a comment is sent to the PR (and the entry is removed from the PR).

I think that for catching invalid input parameters, (1) would be preferable.

@Mark-Simulacrum
Member Author

IMO, (1) is nice-to-have validation, but we can't expect to duplicate all of that validation in the site -- that's not going to be maintainable. So I think pursuing (2), the more general option, is a better idea.

@Kobzol
Contributor

Kobzol commented Aug 29, 2022

Well, the site already depends on the collector, so it wouldn't be that hard to expose a few functions that check the validity of CLI parameters (we'll probably need to do that anyway to avoid panicking in bench_next). I guess we can do both.
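To make the "expose a few functions" idea concrete, here is a hedged sketch of a validation helper the collector crate could export so the site can reject bad parameters before queueing. The function name and signature are assumptions for illustration; the `include` filter mirrors the collector's CLI flag.

```rust
// Hypothetical sketch: a validation helper the collector crate could
// expose to the site. Checks that an `--include` filter matches at
// least one known benchmark before the entry is queued.
pub fn validate_include_filter(
    available_benchmarks: &[&str],
    include: Option<&str>,
) -> Result<(), String> {
    if let Some(filter) = include {
        let matches = available_benchmarks.iter().any(|b| b.contains(filter));
        if !matches {
            return Err(format!("no benchmarks match `--include {filter}`"));
        }
    }
    Ok(())
}

fn main() {
    let benchmarks = ["syn", "regex", "cargo"];
    // The site would call this when handling `@rust-timer queue` and
    // post the Err message as a PR comment instead of queueing.
    println!("{:?}", validate_include_filter(&benchmarks, Some("serde")));
}
```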

@Mark-Simulacrum
Member Author

Sure, if that's easy we can do so. Some of the deeper validation -- e.g., downloading the artifacts -- feels a little harder to do well at that point, so I think we still want the more general feedback mechanism anyway.
