Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: Allow passing a storeMatcher[] to select matching stores when debugging the querier #2931

Merged
merged 1 commit into from
Aug 1, 2020

Conversation

geobeau
Copy link
Contributor

@geobeau geobeau commented Jul 23, 2020

  • Get the general design validated.
  • Update tests.
  • Update documentation.
  • I added CHANGELOG entry for this change.

Changes

Solves the api part of #2919
Add a new parameter on the API to pass an allow list of stores.
Only stores matching this list will be able to be used for the query.

Verification

WIP: manual testing and partial unit tests

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Yes something like this would work! ❤️

I have some ideas, I think we can make it more flexible out of the box! Let me know if that makes sense, should be quick for you to adapt this new idea with matchers 💪

pkg/query/api/v1.go Outdated Show resolved Hide resolved
pkg/query/api/v1.go Outdated Show resolved Hide resolved
@@ -269,6 +270,21 @@ func (api *API) parseReplicaLabelsParam(r *http.Request) (replicaLabels []string
return replicaLabels, nil
}

func (api *API) parseStoreAllowListParam(r *http.Request) (storeAllowList []string, _ *ApiError) {
const storeAllowListParam = "storeAllowList[]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now when I think of it I wonder if something like matchers would be quite neat to have.
Because when we add allowlist we will soon have request "can I have denylist"? ;p

I mean something like this API have: https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers

For example:

&storeMatch[]={__address__=~"1\.2\.3\.\d*"}

This will give us quite amazing flexibility as in future we can add more "labels" to maybe filter by source, health, or any other metadata ... what do you think? (:

Copy link
Contributor Author

@geobeau geobeau Jul 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it could rely on the same system as the external labels match:

  • When a store is registered we add a label for __thanos_store_address__ containing the address (and maybe a __thanos_store_health__ as well)
  • Then the user add the label in its regular query and we match it using the regular system
  • Before we dispatch the query to the other store, we drop all the __thanos_store_.* labels

Pro:

  • easy to implement
  • no need to use contexts
  • easy to use

Con:

  • magic is done is the background
  • we reserve some labels

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I was thinking about this as well, but this sounds scary. I think this one will need more thoughts. Let me propagate this question among maintainers and devs. I think the matcher idea is less intrusive.

We could in theory add those labels only in some --debug mode 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely cannot inject them as normal flow as it will surprise Querier clients. It's important to remember that users already build alerts etc so data consistency is the key

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder what e.g @brian-brazil would say about our ideas =D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no problems here with storeMatch - though I wonder if you really need regexes if this is just for ad-hoc debugging.

Tying this to general label semantics sounds like it's over-complicating things, and would likely cause problems down time line (plus is technically breaking as __thanos_store_address__ is a valid label value that could come from Prometheus).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, let's go for storeMatch then! 👍

Copy link
Member

@bwplotka bwplotka Jul 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Let's limit the scope of this change. Although I really like the fact that you could aggregate, join etc all metadata. ;p

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR. Much cleaner to use matcher instead of a regular allow list!

pkg/query/api/v1.go Outdated Show resolved Hide resolved
@@ -255,8 +260,14 @@ func (s *ProxyStore) Series(r *storepb.SeriesRequest, srv storepb.Store_SeriesSe
// NOTE: all matchers are validated in matchesExternalLabels method so we explicitly ignore error.
var ok bool
tracing.DoInSpan(gctx, "store_matches", func(ctx context.Context) {
storeAllowList := make(map[string]struct{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add todo to consider this in proto

@geobeau geobeau force-pushed the store-allow branch 6 times, most recently from d100144 to 4eb9993 Compare July 28, 2020 14:45
@geobeau geobeau changed the title WIP: Allow passing allow-list of store addrs for debug Query: Allow passing a storeMatcher[] to select matching stores when debugging the querier Jul 28, 2020
@geobeau
Copy link
Contributor Author

geobeau commented Jul 29, 2020

@bwplotka could you review it again? :)

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I think this will do the work. I would be happy to merge this. There are bunch of things we can improve in next PRs, but nothing critical. Thanks!

@@ -250,9 +256,20 @@ func (q *querier) selectFn(ctx context.Context, hints *storage.SelectHints, ms .
if err != nil {
return nil, errors.Wrap(err, "convert matchers")
}
var storeMatchers [][]storepb.LabelMatcher
for _, stms := range q.storeMatchers {
stm, err := storepb.TranslatePromMatchers(stms...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm what about translating them in build time (start time of query) not query time? (:

if len(storeMatcher) != 0 {
res := false
for _, stm := range storeMatcher {
stmMatch, err := labelSetMatches(clientLabels, stm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can simplify this error handling, not sure why simple matching can produce error tbh ;p

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better? The function is very similar to labelSetsMatch()

}

func generateMetadataClientLabels(s Client) storepb.LabelSet {
l := storepb.Label{Name: "__address__", Value: s.Addr()}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like shallow function to me ):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in a separate function because I wanted to separate the matching from the generation of labels. It also make the function easier to extend with new labels or be used elsewhere.

@bwplotka
Copy link
Member

Rebase needed though ):

@@ -222,6 +222,24 @@ Thanos Querier has the ability to perform concurrent select request per query. I
The maximum number of concurrent requests are being made per query is controller by `query.max-concurrent-select` flag.
Keep in mind that the maximum number of concurrent queries that are handled by querier is controlled by `query.max-concurrent`. Please consider implications of combined value while tuning the querier.

### Store filtering

It's possible to provide a set of matchers to the Querier api to select specific stores to be used during the query using the `storeMatch[]` parameter. It is useful when debugging a slow/broken store. It uses the same format as the matcher of [Prometheus' federate api](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). Note that at the moment the querier only support the `__address__` which contains the address of the store.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It's possible to provide a set of matchers to the Querier api to select specific stores to be used during the query using the `storeMatch[]` parameter. It is useful when debugging a slow/broken store. It uses the same format as the matcher of [Prometheus' federate api](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). Note that at the moment the querier only support the `__address__` which contains the address of the store.
It's possible to provide a set of matchers to the Querier api to select specific stores to be used during the query using the `storeMatch[]` parameter. It is useful when debugging a slow/broken store. It uses the same format as the matcher of [Prometheus' federate api](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). Note that at the moment the querier only support the `__address__` which contains the address of the store that were ...

I think it would be actually IP address that is visible as Endpoint on Store API. Not necessarily domain as user can configure dns://prometheus-foo.thanos-sidecar:10901 (:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better like this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! thx!

Copy link
Member

@bwplotka bwplotka Jul 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we link the docs here in changelog & rebase? Then it's more than enough for merge 💪 Thanks!

@geobeau geobeau force-pushed the store-allow branch 2 times, most recently from aaa03d4 to a651d8b Compare July 29, 2020 15:29
@geobeau
Copy link
Contributor Author

geobeau commented Jul 30, 2020

Rebased!

Add a new parameter on the API to pass an allow list of stores.
Only stores matching this list will be able to be used for the query.

Signed-off-by: Geoffrey Beausire <g.beausire@criteo.com>
@bwplotka
Copy link
Member

bwplotka commented Aug 1, 2020

Thanks!

@bwplotka bwplotka merged commit 76a6361 into thanos-io:master Aug 1, 2020
@geobeau
Copy link
Contributor Author

geobeau commented Aug 3, 2020

Nice, thanks for the reviews!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants