Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve compaction of sporadic blocks #7329

Merged
merged 1 commit into from
Feb 16, 2024

Conversation

jhalterman
Copy link
Member

@jhalterman jhalterman commented Feb 8, 2024

What this PR does

This PR changes the split_merge_grouper's check that's meant to filter recent blocks based on the highestMaxTime, to allow jobs where either the:

  • maxTime is at least 1 full job range in the past, since users may not have uploaded recent blocks (described in #7265), but we don't want that to stop compaction from happening.
  • max compaction level is 1, since filtering level 1 jobs is handled separately. This condition allows compaction to proceed more quickly than waiting on the above condition to be met for L1 blocks.

Together, these conditions address users who upload blocks sporadically, where currently compaction does not occur as expected (as described in #7265).

Which issue(s) this PR fixes or relates to

Fixes #7265

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@jhalterman jhalterman force-pushed the sporadic-block-compaction branch 2 times, most recently from 189a256 to 1f2317b Compare February 10, 2024 03:23
@@ -142,10 +142,6 @@ func TestMultitenantCompactor_ShouldSupportSplitAndMergeCompactor(t *testing.T)
block1 := createTSDBBlock(t, bkt, userID, 0, (5 * time.Minute).Milliseconds(), numSeries, externalLabels(""))
block2 := createTSDBBlock(t, bkt, userID, time.Minute.Milliseconds(), (7 * time.Minute).Milliseconds(), numSeries, externalLabels(""))

// Add another block as "most recent one" otherwise the previous blocks are not compacted
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These additional blocks are no longer needed since the new split_merge_grouper conditions allow compaction to proceed based on the job's time being < now.

@jhalterman jhalterman force-pushed the sporadic-block-compaction branch 5 times, most recently from 1f671e4 to 4dfe3c1 Compare February 14, 2024 20:57
@jhalterman jhalterman marked this pull request as ready for review February 14, 2024 23:58
@jhalterman jhalterman requested a review from a team as a code owner February 14, 2024 23:58
@@ -549,7 +550,6 @@ func testMetadataQueriesWithBlocksStorage(
resp []prompb.Label
}
type labelValuesTest struct {
label string
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this field since it had the same value in all test cases.

Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. I'd like to hear what @pracucci thinks, as he introduced the check in the first place and may better remember the reasons.

@pracucci pracucci self-requested a review February 15, 2024 18:12
Copy link
Collaborator

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! LGTM. I think this is a nice improvement and safe change.

I'd like to hear what @pracucci thinks, as he introduced the check in the first place and may better remember the reasons

The check was an old one. It was originally inherited by Prometheus (which does something similar), and was introduce in a era where blocks backfilling, OOO ingestion and -compactor.first-level-compaction-wait-period were not a thing.

@@ -42,6 +42,7 @@ func Test_MaxSeriesAndChunksPerQueryLimitHit(t *testing.T) {
"-blocks-storage.tsdb.retention-period": blockRangePeriod.String(), // We want blocks to be immediately deleted from ingesters.
"-blocks-storage.tsdb.ship-interval": "1s",
"-blocks-storage.tsdb.head-compaction-interval": "500ms",
"-compactor.first-level-compaction-wait-period": "1m", // Do not compact aggressively
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the default you want for integration tests, then it's better to do the change in BlocksStorageFlags (defined in integration/configs.go).

Copy link
Member Author

@jhalterman jhalterman Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this the default causes some other ITs, which expect immediate L1 compaction, to fail. I'll leave the current default for now and just override for these 2 tests.

This PR improves compaction of blocks that are created sporadically, by relaxing the highestMaxTime check to exclude jobs where the max compaction level is 1, or where the job is at least 1 job range in the past.
@jhalterman jhalterman force-pushed the sporadic-block-compaction branch 2 times, most recently from 2288406 to 8d17aca Compare February 15, 2024 21:04
@jhalterman jhalterman merged commit 74de4f6 into grafana:main Feb 16, 2024
28 checks passed
@jhalterman jhalterman deleted the sporadic-block-compaction branch February 16, 2024 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compactor does not compact blocks when samples are pushed sporadically
3 participants