Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store-gateway: clean up chunks fetching, deprecate bucketed chunks pool #4996

Merged
merged 25 commits into from
May 16, 2023

Conversation

dimitarvdimitrov
Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov commented May 12, 2023

What this PR does

I started this as an internal refactor to make use of the existing offsetTrackingReader that #4926 introduced.

But it turned out that I can slightly refactor the chunks refetching logic and get rid of a lot of code (~800 LOC). I also covered it with some tests because it was untested until now. If it's easier to review, I can split this into multiple PRs.

This allowed to remove the method for fetching a single chunk range and all the related code blocks

  • pool.BucketedBytes
  • storegateway.byteRange
  • bucketBlock.readChunkRange

Because of this the following config options are no longer used and can be removed

  • -blocks-storage.bucket-store.max-chunk-pool-bytes
  • -blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes
  • -blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes

I marked them as deprecated but they are also unused. I think this falls out of line with our deprecation policy, but presently the options were only used when refetching chunks, which is a very rare scenario and weren't being respected in the regular case.

related to #3939

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@@ -65,7 +67,7 @@ func (r *bucketChunkReader) addLoad(id chunks.ChunkRef, seriesEntry, chunkEntry
}
r.toLoad[seq] = append(r.toLoad[seq], loadIdx{
offset: off,
length: length,
length: util_math.Max(varint.MaxLen32, length), // If the length is 0, we need to at least fetch the length of the chunk.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After adding tests, I discovered that the bucketChunkReader cannot handle 0 estimations even though 0 is a valid value according to this godoc

// length will be 0 when the length of the chunk isn't known

This wasn't causing bugs because length never would have been 0 as it is always set to a non-zero value here

var chunkLen uint32
// We can only calculate the length of this chunk, if we know the ref of the next chunk
// and the two chunks are in the same segment file.
// We do that by taking the difference between the chunk references. This works since the chunk references are offsets in a file.
// If the chunks are in different segment files (unlikely, but possible),
// then this chunk ends at the end of the segment file, and we don't know how big the segment file is.
if nextRef, ok := nextChunkRef(partitions, pIdx, cIdx); ok && chunkSegmentFile(nextRef) == chunkSegmentFile(c.Ref) {
chunkLen = chunkOffset(nextRef) - chunkOffset(c.Ref)
if chunkLen > tsdb.EstimatedMaxChunkSize {
// Clamp the length in case chunks are scattered across a segment file. This should never happen,
// but if it does, we don't want to have an erroneously large length.
chunkLen = tsdb.EstimatedMaxChunkSize
}
} else {
chunkLen = tsdb.EstimatedMaxChunkSize
}

@pracucci pracucci self-requested a review May 15, 2023 04:55
Copy link
Collaborator

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job, LGTM! The bucketChunkReader.loadChunks() is easier to follow now 👍 I just left few minor comments.

I marked them as deprecated but they are also unused. I think this falls out of line with our deprecation policy, but presently the options were only used when refetching chunks, which is a very rare scenario and weren't being respected in the regular case.

Not a problem to me.

pkg/storage/tsdb/config.go Outdated Show resolved Hide resolved
pkg/storegateway/chunk_bytes_pool.go Show resolved Hide resolved
pkg/storegateway/bucket_chunk_reader.go Show resolved Hide resolved
pkg/storegateway/bucket_chunk_reader.go Outdated Show resolved Hide resolved
pkg/storegateway/bucket_chunk_reader.go Show resolved Hide resolved
pkg/storegateway/bucket_chunk_reader.go Outdated Show resolved Hide resolved
return errors.Wrap(err, "populate chunk")
}
localStats.chunksTouched++
localStats.chunksTouchedSizeSum += chunkLen + crc32.Size
localStats.chunksTouchedSizeSum += chunkEncDataLen
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Previously we were also counting the varbit length and crc32.Size.

Copy link
Contributor Author

@dimitarvdimitrov dimitarvdimitrov May 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether we should actually. It's bytes we throw away directly from the io.Reader. Is it different than discarding bytes of a different chunk?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vaguely remember something: was the CRC added by you because you found out that for some special chunks the 4 bytes of the CRC was a significant % of the whole chunk data? 🤔

If we don't count the varbit and CRC length, could this be misleading when computing the "overfetching"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that makes sense. The crc is also accounted for here

total += varint.UvarintSize(uint64(dataLen)) + 1 + dataLen + crc32.Size

so it makes sense to also count it in loadChunks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we also add varint.UvarintSize(uint64(dataLen)) for the same reason?

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
@dimitarvdimitrov dimitarvdimitrov force-pushed the dimitar/st-gw/clean-up-chunks-fetching branch from f0d9c4e to ef21749 Compare May 15, 2023 09:16
@dimitarvdimitrov
Copy link
Contributor Author

thanks for the review @pracucci. I rebased after merging #4995 and addressed most of your comments. The one left is for whether or not we include the crc32 in the touched metrics.

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Copy link
Collaborator

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments. New commits LGTM. I also replied to your comment with another question.

docs/sources/mimir/configure/about-versioning.md Outdated Show resolved Hide resolved
return errors.Wrap(err, "populate chunk")
}
localStats.chunksTouched++
localStats.chunksTouchedSizeSum += chunkLen + crc32.Size
localStats.chunksTouchedSizeSum += chunkEncDataLen
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vaguely remember something: was the CRC added by you because you found out that for some special chunks the 4 bytes of the CRC was a significant % of the whole chunk data? 🤔

If we don't count the varbit and CRC length, could this be misleading when computing the "overfetching"?

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
@dimitarvdimitrov
Copy link
Contributor Author

dimitarvdimitrov commented May 16, 2023

with this I will also remove the flag for chunks pool from the mimir jsonnet and helm. I pushed this commit to do it ebe56c5

@dimitarvdimitrov
Copy link
Contributor Author

there is a strange 1k loc diff in the helm manifests. I will take a look later

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
@dimitarvdimitrov
Copy link
Contributor Author

fixed. It was a matter of removing a local file and regenerating helm tests

@pracucci
Copy link
Collaborator

Still LGTM. I just left a final comment: #4996 (comment)

Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
@dimitarvdimitrov dimitarvdimitrov merged commit 942e16c into main May 16, 2023
@dimitarvdimitrov dimitarvdimitrov deleted the dimitar/st-gw/clean-up-chunks-fetching branch May 16, 2023 16:44
leizor added a commit that referenced this pull request Nov 15, 2023
The following deprecated flags are removed:
  - `-blocks-storage.bucket-store.max-chunk-pool-bytes`
  - `-blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes`
  - `-blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes`

See #4996 for more details.
leizor added a commit that referenced this pull request Nov 17, 2023
* Remove -querier.query-ingesters-within config

The `-querier.query-ingesters-within` config has been moved from a
global config to a per-tenant limit config.

See #4287 for more details.

* Remove -querier.iterators and -querier.batch-iterators

The `-querier.iterators` and `-querier.batch-iterators` configuration
parameters have been removed.

See #5114 for more details.

* Remove deprecated bucket store flags

The following deprecated flags are removed:
  - `-blocks-storage.bucket-store.max-chunk-pool-bytes`
  - `-blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes`
  - `-blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes`

See #4996 for more details.

* Remove -blocks-storage.bucket-store.bucket-index.enabled config

This configuration parameter has been removed. Mimir is running with
bucket index enabled by default since 2.0 and it is now not possible to
disable it.

See #5051 for more details.

* Update CHANGELOG.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants