Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update vendored prometheus #8295

Merged
merged 5 commits into from
Jun 7, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
* [CHANGE] Clamp [`GOMAXPROCS`](https://pkg.go.dev/runtime#GOMAXPROCS) to [`runtime.NumCPU`](https://pkg.go.dev/runtime#NumCPU). #8201
* [CHANGE] Added new metric `cortex_compactor_disk_out_of_space_errors_total` which counts how many times a compaction failed due to the compactor being out of disk. #8237
* [CHANGE] Anonymous usage statistics tracking: report active series in addition to in-memory series. #8279
* [CHANGE] Ruler: `evaluation_delay` field in the rule group configuration has been deprecated. Please use `query_offset` instead (it has the same exact meaning and behaviour). #8295
* [FEATURE] Continuous-test: now runable as a module with `mimir -target=continuous-test`. #7747
* [FEATURE] Store-gateway: Allow specific tenants to be enabled or disabled via `-store-gateway.enabled-tenants` or `-store-gateway.disabled-tenants` CLI flags or their corresponding YAML settings. #7653
* [FEATURE] New `-<prefix>.s3.bucket-lookup-type` flag configures lookup style type, used to access bucket in s3 compatible providers. #7684
Expand Down Expand Up @@ -82,6 +83,7 @@
* [BUGFIX] Store-gateway: Allow long-running index scans to be interrupted. #8154
* [BUGFIX] Query-frontend: fix splitting of queries using `@ start()` and `@end()` modifiers on a subquery. Previously the `start()` and `end()` would be evaluated using the start end end of the split query instead of the original query. #8162
* [BUGFIX] Distributor: Don't discard time series with invalid exemplars, just drop affected exemplars. #8224
* [BUGFIX] Ingester: fixed in-memory series count when replaying a corrupted WAL. #8295
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers: this is related to prometheus/prometheus#14079.


### Mixin

Expand Down
5 changes: 5 additions & 0 deletions docs/sources/mimir/configure/about-versioning.md
Original file line number Diff line number Diff line change
Expand Up @@ -211,3 +211,8 @@ The following features or configuration parameters are currently deprecated and
- `-ingester.client.report-grpc-codes-in-instrumentation-label-enabled`
- Mimirtool
- the flag `--rule-files`

The following features or configuration parameters are currently deprecated and will be **removed in a future release (to be announced)**:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers: I want to keep evaluation_delay support for longer.


- Rule group configuration file
- `evaluation_delay` field: use `query_offset` instead
2 changes: 1 addition & 1 deletion go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -257,7 +257,7 @@ require (
)

// Using a fork of Prometheus with Mimir-specific changes.
replace github.com/prometheus/prometheus => github.com/grafana/mimir-prometheus v0.0.0-20240515135245-e5b85c151ba8
replace github.com/prometheus/prometheus => github.com/grafana/mimir-prometheus v0.0.0-20240606164718-ef8f745d5a38

// Replace memberlist with our fork which includes some fixes that haven't been
// merged upstream yet:
Expand Down
4 changes: 2 additions & 2 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -517,8 +517,8 @@ github.com/grafana/gomemcache v0.0.0-20240229205252-cd6a66d6fb56 h1:X8IKQ0wu40wp
github.com/grafana/gomemcache v0.0.0-20240229205252-cd6a66d6fb56/go.mod h1:PGk3RjYHpxMM8HFPhKKo+vve3DdlPUELZLSDEFehPuU=
github.com/grafana/memberlist v0.3.1-0.20220714140823-09ffed8adbbe h1:yIXAAbLswn7VNWBIvM71O2QsgfgW9fRXZNR0DXe6pDU=
github.com/grafana/memberlist v0.3.1-0.20220714140823-09ffed8adbbe/go.mod h1:MS2lj3INKhZjWNqd3N0m3J+Jxf3DAOnAH9VT3Sh9MUE=
github.com/grafana/mimir-prometheus v0.0.0-20240515135245-e5b85c151ba8 h1:XmqfG3buFH0G/ns/DXnyMx46LITUEENXjZ84f7YAUy0=
github.com/grafana/mimir-prometheus v0.0.0-20240515135245-e5b85c151ba8/go.mod h1:ZlD3SoAHSwXK5VGLHv78Jh5kOpgSLaQAzt9gxq76fLM=
github.com/grafana/mimir-prometheus v0.0.0-20240606164718-ef8f745d5a38 h1:5alk/jNBusfSJxVXQBnIkE7dPzjnVKIAtt3QOW7bw40=
github.com/grafana/mimir-prometheus v0.0.0-20240606164718-ef8f745d5a38/go.mod h1:Z5gyidU1bBNK7wJ3wPPcP8Fh3mgiHkAmnlATaqcdDak=
github.com/grafana/opentracing-contrib-go-stdlib v0.0.0-20230509071955-f410e79da956 h1:em1oddjXL8c1tL0iFdtVtPloq2hRPen2MJQKoAWpxu0=
github.com/grafana/opentracing-contrib-go-stdlib v0.0.0-20230509071955-f410e79da956/go.mod h1:qtI1ogk+2JhVPIXVc6q+NHziSmy2W5GbdQZFUHADCBU=
github.com/grafana/prometheus-alertmanager v0.25.1-0.20240605141526-70d9d63f74fc h1:VoEf4wNiS3hCPxxmFdEvyeZJA3eI4Wb4gAlzYZwh52A=
Expand Down
2 changes: 1 addition & 1 deletion pkg/mimirtool/backfill/backfill.go
Original file line number Diff line number Diff line change
Expand Up @@ -77,7 +77,7 @@ func CreateBlocks(input IteratorCreator, mint, maxt int64, maxSamplesInAppender
blockDuration := tsdb.DefaultBlockDuration
mint = blockDuration * (mint / blockDuration)

db, err := tsdb.OpenDBReadOnly(outputDir, nil)
db, err := tsdb.OpenDBReadOnly(outputDir, "", nil)
if err != nil {
return err
}
Expand Down
59 changes: 56 additions & 3 deletions pkg/ruler/api_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -960,7 +960,6 @@ func TestAPI_CreateRuleGroup(t *testing.T) {
name string
cfg Config
input string
output string
err error
status int
}{
Expand Down Expand Up @@ -1034,7 +1033,61 @@ rules:
labels:
test: test
`,
output: "name: test\ninterval: 15s\nsource_tenants: [t1, t2]\nrules:\n - record: up_rule\n expr: up{}\n - alert: up_alert\n expr: sum(up{}) > 1\n for: 30s\n labels:\n test: test\n annotations:\n test: test\n",
},
{
name: "with valid rules and evaluation delay",
cfg: defaultCfg,
input: `
name: test
interval: 15s
evaluation_delay: 5m
rules:
- record: up_rule
expr: up{}
`,
status: 202,
},
{
name: "with valid rules and query offset",
cfg: defaultCfg,
input: `
name: test
interval: 15s
query_offset: 2m
rules:
- record: up_rule
expr: up{}
`,
status: 202,
},
{
name: "with valid rules and both evaluation delay and query offset set to the same value",
cfg: defaultCfg,
input: `
name: test
interval: 15s
evaluation_delay: 5m
query_offset: 5m
rules:
- record: up_rule
expr: up{}
`,
status: 202,
},
{
name: "with valid rules but evaluation delay and query offset set to different values",
cfg: defaultCfg,
input: `
name: test
interval: 15s
evaluation_delay: 2m
query_offset: 5m
rules:
- record: up_rule
expr: up{}
`,
status: 400,
err: errors.New("invalid rules configuration: rule group 'test' has both query_offset and (deprecated) evaluation_delay set, but to different values; please remove the deprecated evaluation_delay and use query_offset instead"),
},
}

Expand Down Expand Up @@ -1071,7 +1124,7 @@ rules:

router.ServeHTTP(w, req)
require.Equal(t, 200, w.Code)
require.YAMLEq(t, tt.output, w.Body.String())
require.YAMLEq(t, tt.input, w.Body.String())
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers: we can simplify the test. We expect to receive the same YAML we uploaded.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, although isn't that because we don't test 0 value here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. If we would test the 0 value the output would be different


// Ensure it triggered a rules sync notification.
verifySyncRulesMetric(t, reg, 1, 1)
Expand Down
2 changes: 1 addition & 1 deletion pkg/ruler/compat.go
Original file line number Diff line number Diff line change
Expand Up @@ -330,7 +330,7 @@ func DefaultTenantManagerFactory(
ForGracePeriod: cfg.ForGracePeriod,
ResendDelay: cfg.ResendDelay,
AlwaysRestoreAlertState: true,
DefaultEvaluationDelay: func() time.Duration {
DefaultRuleQueryOffset: func() time.Duration {
// Delay the evaluation of all rules by a set interval to give a buffer
// to metric that haven't been forwarded to Mimir yet.
return overrides.EvaluationDelay(userID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit [non-blocking]: At some point, we probably want to create a new limit so that it matches the description a bit more accurately. Happy to create an issue for it if you think it's worth it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should but looks very low priority. I'm fine creating an issue for now.

Expand Down
5 changes: 2 additions & 3 deletions pkg/ruler/compat_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -659,9 +659,8 @@ func TestDefaultManagerFactory_ShouldInjectStrongReadConsistencyToContextWhenQue

// Create a test alerting rule with a "for" duration greater than the "grace period".
ruleGroup = rulespb.RuleGroupDesc{
Name: "test",
Interval: cfg.EvaluationInterval,
EvaluationDelay: 0,
Name: "test",
Interval: cfg.EvaluationInterval,
Rules: []*rulespb.RuleDesc{{
Expr: fmt.Sprintf("%s > 0", metricName),
Alert: "test",
Expand Down
5 changes: 5 additions & 0 deletions pkg/ruler/manager.go
Original file line number Diff line number Diff line change
Expand Up @@ -413,6 +413,11 @@ func (r *DefaultMultiTenantManager) ValidateRuleGroup(g rulefmt.RuleGroup) []err
"but rules federation is disabled; please contact your service administrator to have it enabled", g.Name))
}

//nolint:staticcheck // We want to intentionally access a deprecated field
if g.EvaluationDelay != nil && g.QueryOffset != nil && *g.EvaluationDelay != *g.QueryOffset {
errs = append(errs, fmt.Errorf("invalid rules configuration: rule group '%s' has both query_offset and (deprecated) evaluation_delay set, but to different values; please remove the deprecated evaluation_delay and use query_offset instead", g.Name))
}

for i, r := range g.Rules {
for _, err := range r.Validate() {
var ruleName string
Expand Down
29 changes: 25 additions & 4 deletions pkg/ruler/rulespb/compat.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,8 @@ import (
"github.com/grafana/mimir/pkg/mimirpb" //lint:ignore faillint allowed to import other protobuf
)

// ToProto transforms a formatted prometheus rulegroup to a rule group protobuf
// ToProto transforms a formatted prometheus rulegroup to a rule group protobuf.
// This function does a 1:1 mapping between the two data models.
func ToProto(user string, namespace string, rl rulefmt.RuleGroup) *RuleGroupDesc {
rg := RuleGroupDesc{
Name: rl.Name,
Expand All @@ -27,9 +28,19 @@ func ToProto(user string, namespace string, rl rulefmt.RuleGroup) *RuleGroupDesc
SourceTenants: rl.SourceTenants,
AlignEvaluationTimeOnInterval: rl.AlignEvaluationTimeOnInterval,
}

// This function is designed to do a 1:1 mapping between the data models, so we
// preserve QueryOffset and EvaluationDelay as they've been set in input. This
// guarantees that when we'll convert RuleGroupDesc back into rulefmt.RuleGroup
// we'll get the same model we originally had in input.
if rl.QueryOffset != nil && *rl.QueryOffset > 0 {
rg.QueryOffset = time.Duration(*rl.QueryOffset)
}
//nolint:staticcheck // We want to intentionally access a deprecated field
if rl.EvaluationDelay != nil && *rl.EvaluationDelay > 0 {
rg.EvaluationDelay = time.Duration(*rl.EvaluationDelay)
}

return &rg
}

Expand All @@ -50,7 +61,7 @@ func formattedRuleToProto(rls []rulefmt.RuleNode) []*RuleDesc {
return rules
}

// FromProto generates a rulefmt RuleGroup
// FromProto generates a rulefmt RuleGroup. This function does a 1:1 mapping between the two data models.
func FromProto(rg *RuleGroupDesc) rulefmt.RuleGroup {
formattedRuleGroup := rulefmt.RuleGroup{
Name: rg.GetName(),
Expand All @@ -59,9 +70,15 @@ func FromProto(rg *RuleGroupDesc) rulefmt.RuleGroup {
SourceTenants: rg.GetSourceTenants(),
AlignEvaluationTimeOnInterval: rg.GetAlignEvaluationTimeOnInterval(),
}

// This function is designed to do a 1:1 mapping between the data models, so we
// preserve QueryOffset and EvaluationDelay as they've been set in input.
if rg.QueryOffset > 0 {
formattedRuleGroup.QueryOffset = pointerOf[model.Duration](model.Duration(rg.QueryOffset))
}
//nolint:staticcheck // We want to intentionally access a deprecated field
if rg.EvaluationDelay > 0 {
formattedRuleGroup.EvaluationDelay = new(model.Duration)
*formattedRuleGroup.EvaluationDelay = model.Duration(rg.EvaluationDelay)
formattedRuleGroup.EvaluationDelay = pointerOf[model.Duration](model.Duration(rg.EvaluationDelay))
}

for i, rl := range rg.GetRules() {
Expand Down Expand Up @@ -91,3 +108,7 @@ func FromProto(rg *RuleGroupDesc) rulefmt.RuleGroup {

return formattedRuleGroup
}

func pointerOf[T any](value T) *T {
return &value
}
Loading
Loading