
Verify behaviour of prometheus grafana functionality with old and new metrics endpoint #5501

Closed · 3 tasks
mmedenjak opened this issue Jul 18, 2022 · 7 comments · Fixed by #5526
@mmedenjak
Contributor

With the recent changes to metrics in 22.2, we have two metrics endpoints:

  • /metrics - the "old" metrics endpoint, mostly unchanged, but it now aggregates some metrics, which decreases the total number of metrics exposed. It should be backwards compatible.
  • /public_metrics - the "new" metrics endpoint: fewer metrics and better documented, but not yet complete enough for a full monitoring solution. We'll be adding new metrics in future releases.

For further info, see the PRD, the RFC for cardinality reduction, and the RFC for the new endpoint.

In terms of rpk generate prometheus-config and rpk generate grafana-dashboard, we should:

  • check they are not broken by the recent metrics changes on the "old" endpoint (must-have for 22.2.1); a quick manual check is sketched after this list
  • see what it would take to add scraping of the new endpoint as well (some small changes can go in 22.2.1, but can also go after and be backported)
  • add new dashboard sections consuming both endpoints (some small changes can go in 22.2.1, but can also go after and be backported)
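
As a quick manual check for the first item, both endpoints can be hit directly on the admin API port (9644 by default; the host and port here are illustrative):

curl -s localhost:9644/metrics | head
curl -s localhost:9644/public_metrics | head

Both should return Prometheus text-format output, and the first should still contain the metric names the existing dashboard queries rely on.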
@mmedenjak mmedenjak changed the title Verify behaviour of prometheus grafana functionality with old and new metrics endpoing Verify behaviour of prometheus grafana functionality with old and new metrics endpoint Jul 18, 2022
@r-vasquez
Contributor

r-vasquez commented Jul 19, 2022

rpk generate prometheus-config shouldn't be a problem, since the command doesn't hit either metrics endpoint (/metrics or /public_metrics).

It uses 3 flags (job-name, node-addrs, seed-addrs) to generate a YAML scrape_config that looks like the following (an example invocation follows the snippet):

- job_name: redpanda-metrics-test
  static_configs:
  - targets:
    - localhost:9644
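
For reference, an invocation along these lines produces the snippet above (the flag names are those listed in the comment; the seed address is illustrative):

rpk generate prometheus-config --job-name redpanda-metrics-test --seed-addrs localhost:9092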

@r-vasquez
Contributor

Reading the original Slack thread, it seems we wanted to change the endpoint in generate prometheus-config, but I'm not entirely sure what's needed here. What do you think, @BenPope?

@BenPope
Member

BenPope commented Jul 19, 2022

I don't really know how the scrape config works, but I guess it needs to output something like:

- job_name: redpanda-node
  static_configs:
  - targets:
    - localhost:9644
  metrics_path: /metrics
- job_name: redpanda-node-public
  static_configs:
  - targets:
    - localhost:9644
  metrics_path: /public_metrics

I think @VladLazar had a play with the scrape config on a test cluster.
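
For anyone trying this out, those jobs would sit under scrape_configs in a minimal prometheus.yml, which can then be sanity-checked with promtool; a sketch, with an illustrative scrape interval and target address:

global:
  scrape_interval: 10s

scrape_configs:
- job_name: redpanda-node
  metrics_path: /metrics
  static_configs:
  - targets:
    - localhost:9644
- job_name: redpanda-node-public
  metrics_path: /public_metrics
  static_configs:
  - targets:
    - localhost:9644

and then:

promtool check config prometheus.yml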

@BenPope
Member

BenPope commented Jul 19, 2022

It might be useful, for customers with multiple clusters, for the command to take another argument that attaches a label:

rpk generate prometheus-config --seed-addr localhost:9092 --job-name redpanda-node --add-label cluster-id:cluster-a

To get

- job_name: redpanda-node
  static_configs:
  - targets:
    - localhost:9644
    labels:
      cluster-id: cluster-a
  metrics_path: /metrics
- job_name: redpanda-node-public
  static_configs:
  - targets:
    - localhost:9644
    labels:
      cluster-id: cluster-a
  metrics_path: /public_metrics

@VladLazar
Contributor

I think @VladLazar had a play with the scrape config on a test cluster.

I managed to set this up in a k8s environment, but you don't edit the scrape config directly for that.
What Ben suggested looks about right to me.
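
For the k8s case, assuming the Prometheus Operator is in use (the thread doesn't say which setup was used), the public-metrics job would typically be expressed as a ServiceMonitor rather than a raw scrape config, roughly along these lines (the selector labels and port name are illustrative):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: redpanda-public-metrics
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: redpanda  # illustrative selector for the Redpanda Service
  endpoints:
  - port: admin                         # illustrative name of the admin API port on the Service
    path: /public_metrics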

@r-vasquez
Contributor

r-vasquez commented Jul 19, 2022

To summarize (give me a 👍 if it's correct), 2 changes are required for generate prometheus-config:

  1. Include a flag --metrics-path (defaulting to /metrics) that adds the metrics_path property to the generated YAML.
  2. Include a flag --add-label that receives a list of 'key:value' pairs to add under the labels YAML property (an illustrative combined invocation is sketched below).

CC: @VladLazar @BenPope
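
Purely for illustration, a combined invocation under those two proposed flags might look like this (flag names and syntax are not final; the label key uses an underscore since Prometheus label names may not contain '-'):

rpk generate prometheus-config --job-name redpanda-node-public --seed-addrs localhost:9092 --metrics-path /public_metrics --add-label cluster_id:cluster-a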

@BenPope
Member

BenPope commented Jul 19, 2022

Sounds reasonable to me. I think people will also want to scrape both, but they can concatenate the results of two calls, I guess. It'll simplify the way we document things and, over time, let us deprecate the old metrics endpoint.

Alternatively, we could have flags --internal-metrics and --public-metrics, which could be combined and would produce scrape configs for /metrics and /public_metrics respectively.
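
If that alternative were taken, the combined form might be invoked as follows (again just a sketch, flag names not final), producing one job per selected endpoint along the lines of the two-job config sketched earlier in the thread:

rpk generate prometheus-config --internal-metrics --public-metrics --seed-addrs localhost:9092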
