Match query-frontend/query-scheduler/querier custom deployments by default #376

Merged: 1 commit, Aug 24, 2021
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -23,6 +23,7 @@
* [CHANGE] Increased `CortexIngesterReachingSeriesLimit` critical alert threshold from 80% to 85%. #363
* [CHANGE] Decreased `-server.grpc-max-concurrent-streams` from 100k to 10k. #369
* [CHANGE] Decreased blocks storage ingesters graceful termination period from 80m to 20m. #369
+ * [CHANGE] Changed default `job_names` for query-frontend, query-scheduler and querier to match custom deployments too. #376
* [ENHANCEMENT] cortex-mixin: Make `cluster_namespace_deployment:kube_pod_container_resource_requests_{cpu_cores,memory_bytes}:sum` backwards compatible with `kube-state-metrics` v2.0.0. #317
* [ENHANCEMENT] Cortex-mixin: Include `cortex-gw-internal` naming variation in default `gateway` job names. #328
* [ENHANCEMENT] Ruler dashboard: added object storage metrics. #354
@@ -48,6 +49,7 @@
* [BUGFIX] Fixed `CortexInconsistentRuntimeConfig` metric. #335
* [BUGFIX] Fixed scaling dashboard to correctly work when a Cortex service deployment spans across multiple zones (a zone is expected to have the `zone-[a-z]` suffix). #365
* [BUGFIX] Fixed rollout progress dashboard to correctly work when a Cortex service deployment spans across multiple zones (a zone is expected to have the `zone-[a-z]` suffix). #366
+ * [BUGFIX] Fixed rollout progress dashboard to include query-scheduler too. #376
* [BUGFIX] Fixed `-distributor.extend-writes` setting on ruler when `unregister_ingesters_on_shutdown` is disabled. #369

## 1.9.0 / 2021-05-18
2 changes: 1 addition & 1 deletion cortex-mixin/alerts/alerts.libsonnet
@@ -593,7 +593,7 @@
expr: |||
memberlist_client_cluster_members_count
!= on (%s) group_left
- sum by (%s) (up{job=~".+/(admin-api|compactor|store-gateway|distributor|ingester.*|querier|cortex|ruler)"})
+ sum by (%s) (up{job=~".+/(admin-api|compactor|store-gateway|distributor|ingester.*|querier.*|cortex|ruler)"})
||| % [$._config.alert_aggregation_labels, $._config.alert_aggregation_labels],
'for': '5m',
labels: {
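The effect of widening `querier` to `querier.*` in the alert's `job` matcher can be checked offline. The sketch below is only an approximation: it assumes Prometheus-style full anchoring of regex matchers and imitates it with Python's `re.fullmatch`. The job label values are hypothetical examples in a `<namespace>/<deployment>` form, not values taken from this PR.

```python
import re

# Job matcher from the updated alert expression (the part inside job=~"...").
JOB_REGEX = r".+/(admin-api|compactor|store-gateway|distributor|ingester.*|querier.*|cortex|ruler)"


def job_matches(job: str) -> bool:
    """Return True if the whole job label value matches the alert's regex."""
    # Prometheus anchors regex matchers at both ends; fullmatch imitates that.
    return re.fullmatch(JOB_REGEX, job) is not None


# Hypothetical job label values, used for illustration only.
for job in ["prod/querier", "prod/querier-custom", "prod/query-frontend", "querier"]:
    print(job, job_matches(job))
```

Under these assumptions a custom `querier-*` deployment is now counted by the `up` selector, while a value without the namespace prefix still does not match because of the leading `.+/`.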
8 changes: 4 additions & 4 deletions cortex-mixin/config.libsonnet
@@ -26,12 +26,12 @@
// These are used by the dashboards and allow for the simultaneous display of
// microservice and single binary cortex clusters.
job_names: {
- ingester: '(ingester.*|cortex$)', // Match also ingester-blocks, which is used during the migration from chunks to blocks.
+ ingester: '(ingester.*|cortex$)', // Match also custom and per-zone ingester deployments.
distributor: '(distributor|cortex$)',
- querier: '(querier|cortex$)',
+ querier: '(querier.*|cortex$)', // Match also custom querier deployments.
ruler: '(ruler|cortex$)',
- query_frontend: '(query-frontend|cortex$)',
- query_scheduler: 'query-scheduler', // Not part of single-binary.
+ query_frontend: '(query-frontend.*|cortex$)', // Match also custom query-frontend deployments.
+ query_scheduler: 'query-scheduler.*', // Not part of single-binary. Match also custom query-scheduler deployments.
table_manager: '(table-manager|cortex$)',
store_gateway: '(store-gateway|cortex$)',
gateway: '(gateway|cortex-gw|cortex-gw-internal)',
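To see which deployment names the old and new `job_names` defaults accept, the patterns can be compared side by side. This is a rough sketch, not the mixin's own logic: how the mixin wraps these patterns (for example with a namespace prefix) is not shown in this diff, so the code matches against bare deployment names and assumes full anchoring via `re.fullmatch`. The helper `components_for` and the custom deployment names are hypothetical.

```python
import re

# Patterns copied from the diff: defaults before and after this change.
OLD_JOB_NAMES = {
    "querier": r"(querier|cortex$)",
    "query_frontend": r"(query-frontend|cortex$)",
    "query_scheduler": r"query-scheduler",
}
NEW_JOB_NAMES = {
    "querier": r"(querier.*|cortex$)",
    "query_frontend": r"(query-frontend.*|cortex$)",
    "query_scheduler": r"query-scheduler.*",
}


def components_for(job_names: dict, deployment: str) -> list:
    """Return the components whose pattern matches the whole deployment name."""
    return [
        name
        for name, pattern in job_names.items()
        if re.fullmatch(pattern, deployment)
    ]


# Hypothetical deployment names; "cortex" stands for the single binary.
for deployment in [
    "querier",
    "querier-custom",
    "query-frontend-custom",
    "query-scheduler-custom",
    "cortex",
]:
    old = components_for(OLD_JOB_NAMES, deployment)
    new = components_for(NEW_JOB_NAMES, deployment)
    print(f"{deployment}: old={old} new={new}")
```

Two things worth noting under these assumptions: `querier.*` does not accidentally match `query-frontend` or `query-scheduler` names, because they diverge at `query-`, and the trailing `.*` accepts any suffix, so an unrelated deployment whose name merely starts with `querier` would also be matched.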
2 changes: 1 addition & 1 deletion cortex-mixin/dashboards/rollout-progress.libsonnet
@@ -6,7 +6,7 @@ local utils = import 'mixin-utils/utils.libsonnet';
gateway_job_matcher: $.jobMatcher($._config.job_names.gateway),
gateway_write_routes_regex: 'api_(v1|prom)_push',
gateway_read_routes_regex: '(prometheus|api_prom)_api_v1_.+',
- all_services_regex: std.join('|', ['cortex-gw', 'distributor', 'ingester.*', 'query-frontend', 'querier', 'compactor', 'store-gateway', 'ruler', 'alertmanager']),
+ all_services_regex: std.join('|', ['cortex-gw', 'distributor', 'ingester.*', 'query-frontend.*', 'query-scheduler.*', 'querier.*', 'compactor', 'store-gateway', 'ruler', 'alertmanager']),
},

'cortex-rollout-progress.json':
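The `std.join('|', [...])` call builds a single alternation from the service list, which is why the old list silently left out query-scheduler. A minimal Python equivalent is sketched below. How the dashboard applies `all_services_regex` is not visible in this hunk, so the sketch assumes it is used as a fully anchored matcher against a service name; `is_tracked` and the sample names are hypothetical.

```python
import re

# Service lists copied from the diff: before and after this change.
OLD_SERVICES = ["cortex-gw", "distributor", "ingester.*", "query-frontend",
                "querier", "compactor", "store-gateway", "ruler", "alertmanager"]
NEW_SERVICES = ["cortex-gw", "distributor", "ingester.*", "query-frontend.*",
                "query-scheduler.*", "querier.*", "compactor", "store-gateway",
                "ruler", "alertmanager"]

# Python's str.join plays the role of Jsonnet's std.join('|', [...]).
old_regex = "|".join(OLD_SERVICES)
new_regex = "|".join(NEW_SERVICES)


def is_tracked(regex: str, service: str) -> bool:
    """Return True if the whole service name matches the joined alternation."""
    # Wrap in a group so the alternation is anchored as one unit.
    return re.fullmatch(f"(?:{regex})", service) is not None


# Hypothetical service names, used for illustration only.
for service in ["query-scheduler", "querier-custom", "compactor"]:
    print(service, is_tracked(old_regex, service), is_tracked(new_regex, service))
```

With these assumptions, `query-scheduler` and `querier-custom` are only tracked by the new regex, while `compactor` is tracked by both.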
2 changes: 1 addition & 1 deletion cortex-mixin/dashboards/slow-queries.libsonnet
@@ -16,7 +16,7 @@ local utils = import 'mixin-utils/utils.libsonnet';
targets: [
{
// Filter out the remote read endpoint.
- expr: '{cluster=~"$cluster",namespace=~"$namespace",name="query-frontend"} |= "query stats" != "/api/v1/read" | logfmt | org_id=~"${tenant_id}" | response_time > ${min_duration}',
+ expr: '{cluster=~"$cluster",namespace=~"$namespace",name=~"query-frontend.*"} |= "query stats" != "/api/v1/read" | logfmt | org_id=~"${tenant_id}" | response_time > ${min_duration}',
instant: false,
legendFormat: '',
range: true,
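The change in the stream selector is from an equality matcher (`name="query-frontend"`) to a regex matcher (`name=~"query-frontend.*"`). The difference can be sketched as follows, assuming that regex label matchers are fully anchored and imitating that with `re.fullmatch`. The function names and the sample `name` label values are hypothetical.

```python
import re


def matches_equality(name: str) -> bool:
    """Imitate the old selector: name="query-frontend"."""
    return name == "query-frontend"


def matches_regex(name: str) -> bool:
    """Imitate the new selector: name=~"query-frontend.*" (fully anchored)."""
    return re.fullmatch(r"query-frontend.*", name) is not None


# Hypothetical values of the `name` label, used for illustration only.
for name in ["query-frontend", "query-frontend-custom", "querier"]:
    print(name, matches_equality(name), matches_regex(name))
```

Under that assumption, logs from a custom `query-frontend-*` deployment are only selected by the new matcher, and the default `query-frontend` deployment is selected by both.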