Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streaming PromQL engine: optionally fall back to Prometheus' engine if query is not supported #7898

Merged
merged 8 commits into from
Apr 30, 2024

Conversation

charleskorn
Copy link
Contributor

@charleskorn charleskorn commented Apr 13, 2024

What this PR does

This PR adds optional support for falling back to Prometheus' engine if the streaming PromQL engine does not support the query expression.

The fallback behaviour can be controlled with a CLI flag, but is enabled by default.

If fallback is required, the reason is logged and recorded in the metric cortex_streaming_promql_engine_unsupported_queries_total.

Which issue(s) this PR fixes or relates to

(none)

Checklist

  • Tests updated.
  • [n/a] Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@charleskorn charleskorn marked this pull request as ready for review April 13, 2024 11:49
@charleskorn charleskorn requested review from jdbaldry and a team as code owners April 13, 2024 11:49
Copy link
Collaborator

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (modulo a minor comment)

reason string
}

func NewNotSupportedError(reason string) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason passed here is quite high cardinality (because generated dinamically), and it's used as a label value. Have you considered introducing few classes of pre-defined reasons, and then keeping the current dynamic string as "details"? The reason will be used in metrics, the reason + details in the info log.

Examples of pre-defined reasons:

  • unsupported aggregation (any aggregation)
  • unsupported function (any function)
  • unsupported function argument (any function, any argument)
  • unsupported expression (any expression)
  • offset
  • grouping using without

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maximum cardinality of this should be 109 series:

  • There are 16 binary operators that we don't support, although at present these will be reported as one "binary operators not supported" series
  • There are 11 aggregation functions that we don't support
  • There are 64 functions that we don't support
  • There are 10 aggregation over time functions that we don't support
  • There are 8 particular cases we don't support that will have their own series:
    • instant vector selector with offset
    • range vector selector with offset
    • grouping with without
    • number literals as the top-level expression
    • string literals as the top-level expression
    • unary expressions
    • range vector selectors as the top-level expression
    • subqueries

I'm open to other opinions, but I'm inclined to keep this as-is to keep things simple and make analysis easier. As we implement support for each of these, the associated series will disappear, and we can use Adaptive Metrics to aggregate away unnecessary labels like pod etc. in the meantime. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to merge this as-is, but if you have any opinions on this @pracucci let me know and I can change the behaviour in a follow-up PR.

charleskorn and others added 2 commits April 29, 2024 12:33
These metrics could be emitted by query-frontends too, so "querier" in
the name could be misleading.
@charleskorn
Copy link
Contributor Author

Note that I've renamed the metrics since raising this PR: originally I used cortex_querier_streaming_promql_engine_supported_queries_total, but I've dropped the querier prefix in 291ea4a as these metrics could also potentially be emitted by query-frontends.

Copy link
Contributor

@jhesketh jhesketh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm :-)

@charleskorn charleskorn merged commit bce1bbc into main Apr 30, 2024
29 checks passed
@charleskorn charleskorn deleted the charleskorn/engine-fallback branch April 30, 2024 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants