Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add check to ensure we never parse multiple range intervals when mapping an instant query for splitting #2575

Merged
merged 6 commits into from
Jul 28, 2022

Conversation

pracucci
Copy link
Collaborator

What this PR does

getRangeInterval() makes me uncomfortable because it returns the 1st range interval found. Theoretically, due to how we use it, we should never call it with an expr that could have 2+ matrix selectors, but it's not enforced in the code.

In this PR I'm enforcing it, adding a check that will fail the mapping if we call it with an expr containing 2+ matrix selectors.

Which issue(s) this PR fixes or relates to

N/A

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Contributor

@ssncferreira ssncferreira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the idea looks good, but I have a concern with Subqueries 🤔


// Ignore the error since we never return it.
_, _ = anyNode(expr, func(entry parser.Node) (bool, error) {
matrix, ok := entry.(*parser.MatrixSelector)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this will return an empty array if expr is a subquery right?
If so, can we add a test to TestGetRangeIntervals?

Copy link
Contributor

@ssncferreira ssncferreira Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From looking at Suquery it can have an expression: https://github.com/prometheus/prometheus/blob/main/promql/parser/ast.go#L131-L146
so that means that we can have something like sum_over_time(rate(metric[1m])[5h:5m]), right?
In this case, I think this will return [1m] and it cannot be split 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point! I think the utilities should be agnostic, but we should avoid splitting subqueries in the caller. I'm proposing it here: 37076dd

We do something similar (but a bit more evoluted to support some subqueries) in query sharding:

Comment on lines 100 to 77
if isSubquery(e) {
// Subqueries are currently not supported by splitting, so we stop the mapping here.
return e, true, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what is this checking?

I see this function checks whether the first arg of a call is a subquery, but I don't understand why are we checking that (why only the first arg? why don't we just dig recursively?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I inherited this logic from the query sharding. See here:
mimir/pkg/frontend/querymiddleware/astmapper/sharding.go

In PromQL functions I don't think we can have functions where a subquery in a position different than 0. Can we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

histogram_quantile has the vector as the second argument

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, but it has to be a range aggregation call. Well, still I think it's worth fixing the function to be future-proof instead of taking those assumptions. But I see the point now. Thanks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually quantile_over_time(scalar, range-vector) takes the range-vector as 2nd parameter. I'm currently trying to write a test to break the current logic and then I will fix it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed here:
1e88552

Copy link
Contributor

@ssncferreira ssncferreira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🍪

…ing an instant query for splitting

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Copy link
Contributor

@ssncferreira ssncferreira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's try it 💥
Just a couple of comments, but non-blocking

@pracucci
Copy link
Collaborator Author

Wasn't able to find a way for an inner expression in Call to be a BinaryExpr 🕵️‍♀️
Do you know if it is possible?

@ssncferreira BinaryExpr can only have scalar or instant vectors on its legs. A subquery returns a range vector, so you can't use directly as a leg in a BinaryExpr. For example, this is invalid: up[5m:1m] + 1. What you can do is wrap the subquery with an aggregation function (e.g. rate(up{namespace="cortex-dev-01"}[5m:1m]) + 1) but then it's a function call the parent of the subquery, so we enter in the logic added in this PR.

@pracucci
Copy link
Collaborator Author

🤞

@pracucci pracucci merged commit c2bcffc into main Jul 28, 2022
@pracucci pracucci deleted the enhance-get-range-interval branch July 28, 2022 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants