
sidecar: Greatly increased Thanos sidecar memory usage from 0.32.2 to 0.32.3, still exists in 0.35.0 #7395

Open
mkrull opened this issue May 28, 2024 · 4 comments

Comments

@mkrull commented May 28, 2024

Thanos, Prometheus and Golang version used:

thanos, version 0.32.3 (branch: HEAD, revision: 3d98d7ce7a254b893e4c8ee8122f7f6edd3174bd)
  build user:       root@0b3c549e9dae
  build date:       20230920-07:27:32
  go version:       go1.20.8
  platform:         linux/amd64
  tags:             netgo

Object Storage Provider:

AWS S3

What happened:

After upgrading from 0.31.0 to 0.35.0 we saw greatly increased sidecar memory usage and narrowed it down to a change between 0.32.2 and 0.32.3 (the Prometheus update maybe?).

Memory usage shoots up for certain queries, in our case most likely the recording rules evaluated by the ruler, so we observed constantly high usage.

What you expected to happen:

No significant change in memory usage.

How to reproduce it (as minimally and precisely as possible):

Run the query {job=~".+"} against a Prometheus instance with some metrics, on either sidecar version, and compare memory usage (see the sketch below).
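
A minimal sketch of that reproduction step, assuming a Thanos Querier (or Prometheus) HTTP endpoint at http://localhost:9090; the address is an assumption about the deployment, and the client library is used here only for illustration. Run it while watching the sidecar container's memory.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/prometheus/client_golang/api"
	v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
	// Endpoint is an assumption; point it at the Querier that fans out to the sidecar.
	client, err := api.NewClient(api.Config{Address: "http://localhost:9090"})
	if err != nil {
		log.Fatal(err)
	}
	promAPI := v1.NewAPI(client)

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
	defer cancel()

	// The broad matcher from the report; it selects every series.
	result, warnings, err := promAPI.Query(ctx, `{job=~".+"}`, time.Now())
	if err != nil {
		log.Fatal(err)
	}
	if len(warnings) > 0 {
		fmt.Println("warnings:", warnings)
	}
	fmt.Println("result type:", result.Type())
}
```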

Full logs to relevant components:

Anything else we need to know:

Heap profiles for 0.32.2 and 0.32.3 with the same query on the same Prometheus node:

[Heap profile screenshot: thanos-0.32.2-heap]

[Heap profile screenshot: thanos-0.32.3-heap]
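
For context, heap profiles like the ones above can be pulled from the sidecar's HTTP port, which exposes the standard Go pprof endpoints; the sketch below assumes the default port 10902, and the output file name is illustrative. Inspect the result with `go tool pprof thanos-sidecar-heap.pb.gz`.

```go
package main

import (
	"io"
	"log"
	"net/http"
	"os"
)

func main() {
	// Address is an assumption about the deployment.
	resp, err := http.Get("http://localhost:10902/debug/pprof/heap")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	out, err := os.Create("thanos-sidecar-heap.pb.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	// Save the raw profile for later analysis with `go tool pprof`.
	if _, err := io.Copy(out, resp.Body); err != nil {
		log.Fatal(err)
	}
	log.Println("heap profile written to thanos-sidecar-heap.pb.gz")
}
```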

@mkrull (Author) commented May 28, 2024

This comment probably refers to the same issue: #6744 (comment)

@GiedriusS (Member) commented
I think it's a consequence of #6706. We had to fix a correctness bug, and as a consequence responses now need to be sorted in memory before being sent off. Unfortunately, Prometheus sometimes produces an unsorted response, and that needs to be fixed upstream, or the external labels functionality has to be completely reworked. See prometheus/prometheus#12605
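
To make the trade-off concrete, here is a minimal, hypothetical sketch (not Thanos's actual code from #6706) of buffering series and re-sorting them by labels before sending. Holding the whole response in memory instead of streaming it is what drives the extra heap usage for broad queries such as {job=~".+"}.

```go
package main

import (
	"fmt"
	"sort"

	"github.com/prometheus/prometheus/model/labels"
)

// series stands in for a Store API series frame; chunks are omitted for brevity.
type series struct {
	lset labels.Labels
}

// sortAndSend buffers every series in memory, sorts by label set, then "sends".
// The buffer holds the full response, so memory grows with the number of
// matched series rather than staying constant as it would when streaming.
func sortAndSend(in []series, send func(series)) {
	buf := make([]series, 0, len(in))
	buf = append(buf, in...)

	sort.Slice(buf, func(i, j int) bool {
		return labels.Compare(buf[i].lset, buf[j].lset) < 0
	})

	for _, s := range buf {
		send(s)
	}
}

func main() {
	in := []series{
		{lset: labels.FromStrings("__name__", "up", "job", "node")},
		{lset: labels.FromStrings("__name__", "up", "job", "api")},
	}
	sortAndSend(in, func(s series) { fmt.Println(s.lset.String()) })
}
```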

@mkrull (Author) commented May 28, 2024

Ouch, I see. Upgrading in environments like Kubernetes now comes with a considerable new risk of OOMs for pods running Prometheus with the Thanos sidecar, because it becomes really hard to estimate the maximum memory requirements for the sidecar containers 🤔

@mazad01 commented Aug 5, 2024

Still happening in 0.36.0.

Substantially higher memory usage after going from 0.28.1 -> 0.36.0

[Memory usage screenshot]
