Queue piles up and metrics are delayed if targets are unreachable #34229

Open

gbxavier opened this issue Jul 23, 2024 · 8 comments

Component(s)

exporter/prometheusremotewrite, receiver/prometheus

What happened?

Description

I have a dedicated instance that scrapes metrics from pods exposing Prometheus endpoints. The instance has more than enough resources to process a large volume of metrics (resource utilization never reaches 60%). In this setup, most namespaces have network policies that prevent this scraper from reaching the pods found through Kubernetes service discovery.

At first, the metrics reach the target (Grafana Cloud) as expected, but memory consumption immediately starts to grow and the queue size slowly increases until it reaches capacity and enqueueing starts failing.

The number of metrics received and sent remains constant, but over time the delay between a metric being "seen" by the collector and being sent to the backend grows until the last observed data point is hours late (though still received by the backend). This affects every receiver configured on the instance, including the prometheus/self receiver, which has no problem scraping its target.

This behavior only happens when the workload_prometheus receiver is enabled; no other instance suffers from this problem or from any performance or limit issues.

Steps to Reproduce

  • Create an AKS cluster with several namespaces protected by network policies (a minimal example policy is sketched after this list);
  • Install the OpenTelemetry Collector configured to scrape Prometheus endpoints.
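
For reference, a namespace can be isolated from the scraper with a default-deny ingress NetworkPolicy along these lines. The policy and namespace names below are illustrative and not taken from the report:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress       # illustrative name
  namespace: protected-namespace   # illustrative namespace
spec:
  podSelector: {}    # selects every pod in the namespace
  policyTypes:
  - Ingress          # no ingress rules are listed, so all ingress (including scrapes) is denied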

Expected Result

The receiver scrapes metrics from the endpoints it can reach, and those metrics are sent through the Prometheus Remote Write exporter reasonably quickly.

Actual Result

Memory consumption increases over time, and the delay between a metric being "seen" by the collector and being sent to the backend slowly grows until the last observed data point is hours late.

Collector version

0.105.0

Environment information

Environment

OS: AKSUbuntu-2204gen2containerd-202407.03.0
Kubernetes: Azure AKS v1.29.2

OpenTelemetry Collector configuration

apiVersion: v1
data:
  relay: |
    exporters:
      debug:
        sampling_initial: 100
        sampling_thereafter: 10
        verbosity: detailed
      prometheusremotewrite/grafana_cloud:
        auth:
          authenticator: basicauth/grafana_cloud_prometheus
        endpoint: ${env:GRAFANA_CLOUD_PROMETHEUS_URL}
        external_labels:
          cluster: ${env:CLUSTER_NAME}
          deployment.environment: ${env:DEPLOYMENT_ENVIRONMENT}
          k8s.cluster.name: ${env:CLUSTER_NAME}
        max_batch_size_bytes: 3000000
        remote_write_queue:
          num_consumers: 10
          queue_size: 20000
    extensions:
      basicauth/grafana_cloud_prometheus:
        client_auth:
          password: ${env:GRAFANA_COM_API_TOKEN}
          username: "1277373"
      health_check:
        endpoint: ${env:MY_POD_IP}:13133
    processors:
      batch: {}
      k8sattributes/extract_app_kubernetes_io:
        extract:
          labels:
          - from: pod
            key: app.kubernetes.io/name
            tag_name: service.name
          - from: pod
            key: app.kubernetes.io/version
            tag_name: service.version
          metadata:
          - k8s.pod.uid
        filter:
          node_from_env_var: KUBE_NODE_NAME
        passthrough: false
        pod_association:
        - sources:
          - from: resource_attribute
            name: k8s.pod.ip
        - sources:
          - from: resource_attribute
            name: k8s.pod.uid
        - sources:
          - from: connection
      memory_limiter:
        check_interval: 5s
        limit_percentage: 80
        spike_limit_percentage: 25
      resourcedetection/aks:
        detectors:
        - env
        - azure
        - aks
        override: false
        timeout: 2s
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: ${env:MY_POD_IP}:4317
          http:
            endpoint: ${env:MY_POD_IP}:4318
      prometheus:
        config:
          scrape_configs:
          - job_name: opentelemetry-collector
            scrape_interval: 10s
            static_configs:
            - targets:
              - ${env:MY_POD_IP}:8888
      prometheus/self:
        config:
          scrape_configs:
          - job_name: opentelemetry-collector
            scrape_interval: 10s
            static_configs:
            - labels:
                instance: ${env:MY_POD_NAME}
              targets:
              - ${env:MY_POD_IP}:8888
      prometheus/workload_prometheus:
        config:
          global:
            scrape_protocols:
            - PrometheusProto
            - OpenMetricsText1.0.0
            - OpenMetricsText0.0.1
            - PrometheusText0.0.4
            scrape_timeout: 200ms
          scrape_configs:
          - job_name: integrations/kubernetes/workload-prometheus
            kubernetes_sd_configs:
            - role: pod
            relabel_configs:
            - action: keep
              regex: "true"
              source_labels:
              - __meta_kubernetes_pod_annotation_prometheus_io_scrape
            - action: replace
              regex: (https?)
              source_labels:
              - __meta_kubernetes_pod_annotation_prometheus_io_scheme
              target_label: __scheme__
            - action: replace
              regex: (.+)
              source_labels:
              - __meta_kubernetes_pod_annotation_prometheus_io_path
              target_label: __metrics_path__
            - action: replace
              regex: ([^:]+)(?::\d+)?;(\d+)
              replacement: $${1}:$${2}
              source_labels:
              - __address__
              - __meta_kubernetes_pod_annotation_prometheus_io_port
              target_label: __address__
            - action: replace
              source_labels:
              - __meta_kubernetes_namespace
              target_label: k8s_namespace_name
            - action: replace
              source_labels:
              - __meta_kubernetes_pod_name
              target_label: k8s_pod_name
    service:
      extensions:
      - health_check
      - basicauth/grafana_cloud_prometheus
      pipelines:
        metrics:
          exporters:
          - prometheusremotewrite/grafana_cloud
          processors:
          - memory_limiter
          - batch
          - k8sattributes/extract_app_kubernetes_io
          - resourcedetection/aks
          receivers:
          - prometheus/self
          - prometheus/workload_prometheus
      telemetry:
        logs:
          level: debug
        metrics:
          address: ${env:MY_POD_IP}:8888
kind: ConfigMap
metadata:
  creationTimestamp: "2024-07-22T13:19:32Z"
  labels:
    app.kubernetes.io/instance: otel-scraper-otel-collectors
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: opentelemetry-collector
    app.kubernetes.io/version: 0.104.0
    argocd.argoproj.io/instance: otel-scraper-otel-collectors
    helm.sh/chart: opentelemetry-collector-0.97.2
  name: otel-scraper-otel-collectors-opentelemetry-collector
  namespace: otel-collectors
  resourceVersion: "1333804215"
  uid: dfb34eec-b6da-4d8d-acf1-f08ee066a6b6

Log output

2024-07-23T16:48:25.892Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.51.54:15020/stats/prometheus", "error": "Get \"http://172.24.51.54:15020/stats/prometheus\": context deadline exceeded"}
2024-07-23T16:48:25.951Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.32.221:15020/stats/prometheus", "error": "Get \"http://172.24.32.221:15020/stats/prometheus\": context deadline exceeded"}
2024-07-23T16:48:26.462Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.23.186:15020/stats/prometheus", "error": "Get \"http://172.24.23.186:15020/stats/prometheus\": context deadline exceeded"}
2024-07-23T16:48:26.490Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.25.53:15020/stats/prometheus", "error": "Get \"http://172.24.25.53:15020/stats/prometheus\": context deadline exceeded"}
2024-07-23T16:48:26.504Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.50.246:15020/stats/prometheus", "error": "Get \"http://172.24.50.246:15020/stats/prometheus\": context deadline exceeded"}
2024-07-23T16:48:26.654Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.30.59:15020/stats/prometheus", "error": "Get \"http://172.24.30.59:15020/stats/prometheus\": context deadline exceeded"}
2024-07-23T16:48:26.701Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.50.164:15020/stats/prometheus", "error": "Get \"http://172.24.50.164:15020/stats/prometheus\": context deadline exceeded"}
2024-07-23T16:48:30.049Z    debug    memorylimiter/memorylimiter.go:200    Currently used memory.    {"kind": "processor", "name": "memory_limiter", "pipeline": "metrics", "cur_mem_mib": 419}
2024-07-23T16:48:34.147Z    debug    k8sattributesprocessor@v0.105.0/processor.go:122    evaluating pod identifier    {"kind": "processor", "name": "k8sattributes/extract_app_kubernetes_io", "pipeline": "metrics", "value": [{"Source":{"From":"resource_attribute","Name":"k8s.pod.uid"},"Value":"c3192582-53a8-46e0-bc2b-caaf6e394ab8"},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2024-07-23T16:48:34.147Z    debug    k8sattributesprocessor@v0.105.0/processor.go:140    getting the pod    {"kind": "processor", "name": "k8sattributes/extract_app_kubernetes_io", "pipeline": "metrics", "pod": {"Name":"namespace-dotnet-test-6-76965ffc96-2wmt2","Address":"172.24.51.3","PodUID":"c3192582-53a8-46e0-bc2b-caaf6e394ab8","Attributes":{"k8s.pod.uid":"c3192582-53a8-46e0-bc2b-caaf6e394ab8"},"StartTime":"2024-07-17T12:31:13Z","Ignore":false,"Namespace":"namespace-dotnet-test-6","NodeName":"","HostNetwork":false,"Containers":{"ByID":null,"ByName":null},"DeletedAt":"0001-01-01T00:00:00Z"}}
2024-07-23T16:48:34.147Z    debug    k8sattributesprocessor@v0.105.0/processor.go:122    evaluating pod identifier    {"kind": "processor", "name": "k8sattributes/extract_app_kubernetes_io", "pipeline": "metrics", "value": [{"Source":{"From":"resource_attribute","Name":"k8s.pod.uid"},"Value":"1b444434-4035-4102-8835-79ff20c2c9bc"},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2024-07-23T16:48:34.147Z    debug    k8sattributesprocessor@v0.105.0/processor.go:140    getting the pod    {"kind": "processor", "name": "k8sattributes/extract_app_kubernetes_io", "pipeline": "metrics", "pod": {"Name":"namespace-2-684876b69c-6ksq5","Address":"172.24.52.18","PodUID":"1b444434-4035-4102-8835-79ff20c2c9bc","Attributes":{"k8s.pod.uid":"1b444434-4035-4102-8835-79ff20c2c9bc"},"StartTime":"2024-07-19T11:03:36Z","Ignore":false,"Namespace":"namespace-test-2","NodeName":"","HostNetwork":false,"Containers":{"ByID":null,"ByName":null},"DeletedAt":"0001-01-01T00:00:00Z"}}
2024-07-23T16:48:34.603Z    debug    scrape/scrape.go:1352    Append failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.52.99:15020/stats/prometheus", "error": "expected a valid start token, got \"\\xb0\" (\"INVALID\") while parsing: \"\\xb0\""}
2024-07-23T16:48:34.604Z    warn    internal/transaction.go:125    Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_timestamp": 1721753314573, "target_labels": "{__name__=\"up\", instance=\"172.24.52.99:15020\", job=\"integrations/kubernetes/workload-prometheus\", k8s_namespace_name=\"namespace-1\", k8s_pod_name=\"pod-1\"}"}
2024-07-23T16:48:34.666Z    warn    internal/transaction.go:125    Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_timestamp": 1721753314465, "target_labels": "{__name__=\"up\", instance=\"172.24.51.51:15020\", job=\"integrations/kubernetes/workload-prometheus\", k8s_namespace_name=\"namespace-12\", k8s_pod_name=\"pod-12\"}"}
2024-07-23T16:48:34.673Z    warn    internal/transaction.go:125    Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_timestamp": 1721753314473, "target_labels": "{__name__=\"up\", instance=\"172.24.52.188:15020\", job=\"integrations/kubernetes/workload-prometheus\", k8s_namespace_name=\"namespace-13\", k8s_pod_name=\"pod-13\"}"}
2024-07-23T16:48:34.689Z    warn    internal/transaction.go:125    Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_timestamp": 1721753314484, "target_labels": "{__name__=\"up\", instance=\"172.24.52.193:15020\", job=\"integrations/kubernetes/workload-prometheus\", k8s_namespace_name=\"namespace-4\", k8s_pod_name=\"pod-4\"}"}
2024-07-23T16:48:34.858Z    warn    internal/transaction.go:125    Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_timestamp": 1721753314656, "target_labels": "{__name__=\"up\", instance=\"172.24.15.122:15020\", job=\"integrations/kubernetes/workload-prometheus\", k8s_namespace_name=\"namespace-14\", k8s_pod_name=\"pod-14\"}"}
2024-07-23T16:48:34.874Z    warn    internal/transaction.go:125    Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_timestamp": 1721753314673, "target_labels": "{__name__=\"up\", instance=\"172.24.27.189:15020\", job=\"integrations/kubernetes/workload-prometheus\", k8s_namespace_name=\"namespace-15\", k8s_pod_name=\"pod-15\"}"}
2024-07-23T16:48:34.893Z    warn    internal/transaction.go:125    Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_timestamp": 1721753314688, "target_labels": "{__name__=\"up\", instance=\"172.24.50.81:15020\", job=\"integrations/kubernetes/workload-prometheus\", k8s_namespace_name=\"namespace-17\", k8s_pod_name=\"pod-17\"}"}
2024-07-23T16:48:35.048Z    debug    memorylimiter/memorylimiter.go:200    Currently used memory.    {"kind": "processor", "name": "memory_limiter", "pipeline": "metrics", "cur_mem_mib": 465}
2024-07-23T16:48:35.049Z    warn    internal/transaction.go:125    Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_timestamp": 1721753314847, "target_labels": "{__name__=\"up\", instance=\"172.24.30.98:15020\", job=\"integrations/kubernetes/workload-prometheus\", k8s_namespace_name=\"namespace-20\", k8s_pod_name=\"pod-20\"}"}
2024-07-23T16:48:35.109Z    warn    internal/transaction.go:125    Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_timestamp": 1721753314908, "target_labels": "{__name__=\"up\", instance=\"172.24.62.137:15020\", job=\"integrations/kubernetes/workload-prometheus\", k8s_namespace_name=\"namespace-16\", k8s_pod_name=\"pod-16\"}"}
2024-07-23T16:48:35.221Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.30.87:15020/stats/prometheus", "error": "Get \"http://172.24.30.87:15020/stats/prometheus\": context deadline exceeded"}
2024-07-23T16:48:35.221Z    warn    internal/transaction.go:125    Failed to scrape Prometheus endpoint    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_timestamp": 1721753315019, "target_labels": "{__name__=\"up\", instance=\"172.24.30.87:15020\", job=\"integrations/kubernetes/workload-prometheus\", k8s_namespace_name=\"namespace-14\", k8s_pod_name=\"pod-14\"}"}
2024-07-23T16:48:35.229Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.10.220:15020/stats/prometheus", "error": "Get \"http://172.24.10.220:15020/stats/prometheus\": context deadline exceeded"}
2024-07-23T16:48:35.322Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.62.204:15020/stats/prometheus", "error": "Get \"http://172.24.62.204:15020/stats/prometheus\": context deadline exceeded"}
2024-07-23T16:48:35.354Z    debug    scrape/scrape.go:1337    Scrape failed    {"kind": "receiver", "name": "prometheus/workload_prometheus", "data_type": "metrics", "scrape_pool": "integrations/kubernetes/workload-prometheus", "target": "http://172.24.52.80:15020/stats/prometheus", "error": "Get \"http://172.24.52.80:15020/stats/prometheus\": context deadline exceeded"}

Additional context

The resource utilization is low, but memory grows over time up to 50% of the configured limit, specified below.

resources:
  limits:
    cpu: 2
    memory: 8Gi
  requests:
    cpu: 500m
    memory: 1024Mi

Screenshot with metrics from this scraper instance (captured 2024-07-23 19:07).

gbxavier added the bug (Something isn't working) and needs triage (New item requiring triage) labels on Jul 23, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

dashpole removed the receiver/prometheus (Prometheus receiver) label on Jul 29, 2024
@dashpole (Contributor)

This doesn't look like an issue with the receiver. Does increasing the number of consumers help?

@gbxavier (Author) commented Aug 2, 2024

No, it does not help. I think it's a performance issue with the exporter. After further experimentation, increasing the batch size to 50000 stopped the queue problems.
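
The comment doesn't include the exact configuration change, but assuming the batch processor was the setting being tuned (rather than the exporter's max_batch_size_bytes), a sketch of it would look roughly like this; the values are illustrative:

processors:
  batch:
    send_batch_size: 50000      # number of items that triggers a flush
    send_batch_max_size: 50000  # upper bound on a single batch (0 means unlimited)
    timeout: 5s                 # flush after this interval even if the batch is not full

Larger batches mean fewer remote-write requests per interval, which matters here because the exporter drains its queue with a single consumer, as discussed below.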

@dashpole (Contributor)

Looks like the number of consumers is hard-coded to 1, which would explain why bumping that up doesn't help...

// Don't allow users to configure the queue.
// See https://github.com/open-telemetry/opentelemetry-collector/issues/2949.
// Prometheus remote write samples needs to be in chronological
// order for each timeseries. If we shard the incoming metrics
// without considering this limitation, we experience
// "out of order samples" errors.
exporter, err := exporterhelper.NewMetricsExporter(
    ctx,
    set,
    cfg,
    prwe.PushMetrics,
    exporterhelper.WithTimeout(prwCfg.TimeoutSettings),
    exporterhelper.WithQueue(exporterhelper.QueueSettings{
        Enabled:      prwCfg.RemoteWriteQueue.Enabled,
        NumConsumers: 1,
        QueueSize:    prwCfg.RemoteWriteQueue.QueueSize,
    }),
    exporterhelper.WithStart(prwe.Start),
    exporterhelper.WithShutdown(prwe.Shutdown),
)

We should probably emit a warning for people who try to configure this.

@dashpole (Contributor)

So increasing the batch size is probably the only resolution we can offer, which you found.

Action items:

  • Comment in the docs
  • Warning in the logs when people try to set num_consumers

@dashpole (Contributor)

@ArthurSens

@ArthurSens (Contributor)

I'm struggling a bit to understand how to log a warning. Which object has a logger available during exporter creation? I can't find any 🤔

@dashpole (Contributor) commented Sep 3, 2024

You can use exporter.Settings:

func createMetricsExporter(ctx context.Context, set exporter.Settings,

That is https://pkg.go.dev/go.opentelemetry.io/collector/exporter@v0.108.1/internal#Settings

It includes a Logger: https://pkg.go.dev/go.opentelemetry.io/collector/component#TelemetrySettings

So you can do set.Logger.Info(...)
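
Building on that, a minimal sketch of such a warning, assuming a hypothetical helper called from createMetricsExporter and the exporter's existing Config/RemoteWriteQueue fields; the exact code merged later in #34993 may differ:

// Hedged sketch, not necessarily the code merged in #34993: warn when
// remote_write_queue.num_consumers is set, because the exporter always runs a
// single queue consumer to keep remote-write samples in chronological order.
// warnOnIgnoredQueueSettings is a hypothetical helper name.
func warnOnIgnoredQueueSettings(set exporter.Settings, prwCfg *Config) {
    if prwCfg.RemoteWriteQueue.NumConsumers != 0 {
        set.Logger.Warn(
            "remote_write_queue.num_consumers is a no-op: the exporter uses " +
                "exactly one consumer so samples stay in chronological order per series",
        )
    }
}

set.Logger works because exporter.Settings embeds component.TelemetrySettings, which carries the *zap.Logger.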

codeboten pushed a commit that referenced this issue Sep 5, 2024
…te_write_queue.num_consumers being a no-op (#34993)

**Description:** This PR documents and adds a warning log if
remote_write_queue.num_consumers is set in the
prometheusremotewriteexporter's config.

Current behavior already doesn't use the configuration for anything,
more information can be found in
open-telemetry/opentelemetry-collector#2949

**Link to tracking Issue:** Related to #34229 (not a fix)

Should we skip changelog here?

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this issue Sep 12, 2024
…te_write_queue.num_consumers being a no-op (open-telemetry#34993)