Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store-gateway: clean up chunks fetching, deprecate bucketed chunks pool #4996

Merged
merged 25 commits into from
May 16, 2023
Merged
Show file tree
Hide file tree
Changes from 19 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
8acc7eb
Add test for refetching chunks
dimitarvdimitrov May 12, 2023
b232842
Clean up chunks fetching
dimitarvdimitrov May 5, 2023
f6e8ab1
Remove more unused structs from pool
dimitarvdimitrov May 5, 2023
d78c803
Fix error handling
dimitarvdimitrov May 12, 2023
9de2d1a
Remove io
dimitarvdimitrov May 12, 2023
70ff43e
nit on an error message
dimitarvdimitrov May 12, 2023
4682075
Handle small estimations for chunk length
dimitarvdimitrov May 12, 2023
5c456e1
Add CHANGELOG.md entry
dimitarvdimitrov May 12, 2023
3a9ce79
Update CHANGELOG.md
dimitarvdimitrov May 12, 2023
b40fbee
Remove more code
dimitarvdimitrov May 12, 2023
ca98732
Remove more code
dimitarvdimitrov May 12, 2023
3fe5dfc
Add license
dimitarvdimitrov May 12, 2023
4b04282
Update Update deprecation version
dimitarvdimitrov May 15, 2023
4bbe45e
Mention removed metrics in CHANGELOG.md
dimitarvdimitrov May 15, 2023
4f3d26b
Clarify retaining bytes
dimitarvdimitrov May 15, 2023
116dafe
Update error message
dimitarvdimitrov May 15, 2023
645844a
Add sanity check to fetchChunkRemainder
dimitarvdimitrov May 15, 2023
ef21749
Reorder chunkEncDataLen components
dimitarvdimitrov May 15, 2023
3c950a9
Replace rename function usage
dimitarvdimitrov May 15, 2023
70e8c77
Correct about-versioning.md
dimitarvdimitrov May 15, 2023
19c0f6d
Clarify version of removal in CHANGELOG.md
dimitarvdimitrov May 15, 2023
0e42fa2
Add crc32 to touched size
dimitarvdimitrov May 15, 2023
ebe56c5
Remove chunk pool size limit from deployment configs
dimitarvdimitrov May 16, 2023
48067d8
revert faulty change
dimitarvdimitrov May 16, 2023
cff1855
Also include chunk length varint size
dimitarvdimitrov May 16, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,11 @@
* [CHANGE] Querier: `-querier.query-ingesters-within` has been moved from a global flag to a per-tenant override. #4287
* [CHANGE] Querier: Use `-blocks-storage.tsdb.retention-period` instead of `-querier.query-ingesters-within` for calculating the lookback period for shuffle sharded ingesters. Setting `-querier.query-ingesters-within=0` no longer disables shuffle sharding on the read path. #4287
* [CHANGE] Block upload: `/api/v1/upload/block/{block}/files` endpoint now allows file uploads with no `Content-Length`. #4956
* [CHANGE] Store-gateway: deprecate configuration parameters for chunk pooling. The following options are now also ignored: #4996
* `-blocks-storage.bucket-store.max-chunk-pool-bytes`
* `-blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes`
* `-blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes`
* [CHANGE] Store-gateway: remove metrics `cortex_bucket_store_chunk_pool_requested_bytes_total` and `cortex_bucket_store_chunk_pool_returned_bytes_total`. #4996
* [ENHANCEMENT] Add per-tenant limit `-validation.max-native-histogram-buckets` to be able to ignore native histogram samples that have too many buckets. #4765
* [ENHANCEMENT] Store-gateway: reduce memory usage in some LabelValues calls. #4789
* [ENHANCEMENT] Store-gateway: add a `stage` label to the metric `cortex_bucket_store_series_data_touched`. This label now applies to `data_type="chunks"` and `data_type="series"`. The `stage` label has 2 values: `processed` - the number of series that parsed - and `returned` - the number of series selected from the processed bytes to satisfy the query. #4797 #4830
Expand Down
6 changes: 3 additions & 3 deletions cmd/mimir/config-descriptor.json
Original file line number Diff line number Diff line change
Expand Up @@ -7057,7 +7057,7 @@
"fieldDefaultValue": 2147483648,
"fieldFlag": "blocks-storage.bucket-store.max-chunk-pool-bytes",
"fieldType": "int",
"fieldCategory": "advanced"
"fieldCategory": "deprecated"
},
{
"kind": "field",
Expand All @@ -7068,7 +7068,7 @@
"fieldDefaultValue": 16000,
"fieldFlag": "blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes",
"fieldType": "int",
"fieldCategory": "advanced"
"fieldCategory": "deprecated"
},
{
"kind": "field",
Expand All @@ -7079,7 +7079,7 @@
"fieldDefaultValue": 50000000,
"fieldFlag": "blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes",
"fieldType": "int",
"fieldCategory": "advanced"
"fieldCategory": "deprecated"
},
{
"kind": "field",
Expand Down
6 changes: 3 additions & 3 deletions cmd/mimir/help-all.txt.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -282,9 +282,9 @@ Usage of ./cmd/mimir/mimir:
-blocks-storage.bucket-store.bucket-index.update-on-error-interval duration
How frequently a bucket index, which previously failed to load, should be tried to load again. This option is used only by querier. (default 1m0s)
-blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes int
Size - in bytes - of the largest chunks pool bucket. (default 50000000)
[deprecated] Size - in bytes - of the largest chunks pool bucket. (default 50000000)
-blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes int
Size - in bytes - of the smallest chunks pool bucket. (default 16000)
[deprecated] Size - in bytes - of the smallest chunks pool bucket. (default 16000)
-blocks-storage.bucket-store.chunks-cache.attributes-in-memory-max-items int
Maximum number of object attribute items to keep in a first level in-memory LRU cache. Metadata will be stored and fetched in-memory before hitting the cache backend. 0 to disable the in-memory cache. (default 50000)
-blocks-storage.bucket-store.chunks-cache.attributes-ttl duration
Expand Down Expand Up @@ -488,7 +488,7 @@ Usage of ./cmd/mimir/mimir:
-blocks-storage.bucket-store.index-header.max-idle-file-handles uint
Maximum number of idle file handles the store-gateway keeps open for each index-header file. (default 1)
-blocks-storage.bucket-store.max-chunk-pool-bytes uint
Max size - in bytes - of a chunks pool, used to reduce memory allocations. The pool is shared across all tenants. 0 to disable the limit. (default 2147483648)
[deprecated] Max size - in bytes - of a chunks pool, used to reduce memory allocations. The pool is shared across all tenants. 0 to disable the limit. (default 2147483648)
-blocks-storage.bucket-store.max-concurrent int
Max number of concurrent queries to execute against the long-term storage. The limit is shared across all tenants. (default 100)
-blocks-storage.bucket-store.meta-sync-concurrency int
Expand Down
6 changes: 5 additions & 1 deletion docs/sources/mimir/configure/about-versioning.md
Original file line number Diff line number Diff line change
Expand Up @@ -137,7 +137,11 @@ The following features are currently deprecated and will be **removed in Mimir 2
- Ingester
- `-ingester.ring.readiness-check-ring-health`

The following features are currently deprecated and will be **removed in Mimir 2.10**:
The following features are currently deprecated and will be **removed in Mimir 2.11**:

- Store-gateway
- `-blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes`
- `-blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes`
- `-blocks-storage.bucket-store.max-chunk-pool-bytes`
dimitarvdimitrov marked this conversation as resolved.
Show resolved Hide resolved
- Ingester
- `-blocks-storage.tsdb.max-tsdb-opening-concurrency-on-startup`
dimitarvdimitrov marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
Expand Up @@ -3199,16 +3199,16 @@ bucket_store:
# CLI flag: -blocks-storage.bucket-store.ignore-blocks-within
[ignore_blocks_within: <duration> | default = 10h]

# (advanced) Max size - in bytes - of a chunks pool, used to reduce memory
# (deprecated) Max size - in bytes - of a chunks pool, used to reduce memory
# allocations. The pool is shared across all tenants. 0 to disable the limit.
# CLI flag: -blocks-storage.bucket-store.max-chunk-pool-bytes
[max_chunk_pool_bytes: <int> | default = 2147483648]

# (advanced) Size - in bytes - of the smallest chunks pool bucket.
# (deprecated) Size - in bytes - of the smallest chunks pool bucket.
# CLI flag: -blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes
[chunk_pool_min_bucket_size_bytes: <int> | default = 16000]

# (advanced) Size - in bytes - of the largest chunks pool bucket.
# (deprecated) Size - in bytes - of the largest chunks pool bucket.
# CLI flag: -blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes
[chunk_pool_max_bucket_size_bytes: <int> | default = 50000000]

Expand Down
2 changes: 0 additions & 2 deletions pkg/mimir/mimir_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -88,8 +88,6 @@ func TestMimir(t *testing.T) {
},
},
BucketStore: tsdb.BucketStoreConfig{
ChunkPoolMinBucketSizeBytes: tsdb.ChunkPoolDefaultMinBucketSize,
ChunkPoolMaxBucketSizeBytes: tsdb.ChunkPoolDefaultMaxBucketSize,
IndexCache: tsdb.IndexCacheConfig{
BackendConfig: cache.BackendConfig{
Backend: tsdb.IndexCacheBackendInMemory,
Expand Down
29 changes: 21 additions & 8 deletions pkg/storage/tsdb/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -74,10 +74,10 @@ const (
BytesPerPostingInAPostingList = 4

// ChunkPoolDefaultMinBucketSize is the default minimum bucket size (bytes) of the chunk pool.
ChunkPoolDefaultMinBucketSize = EstimatedMaxChunkSize
ChunkPoolDefaultMinBucketSize = EstimatedMaxChunkSize // Deprecated. TODO: Remove in Mimir 2.11.

// ChunkPoolDefaultMaxBucketSize is the default maximum bucket size (bytes) of the chunk pool.
ChunkPoolDefaultMaxBucketSize = 50e6
ChunkPoolDefaultMaxBucketSize = 50e6 // Deprecated. TODO: Remove in Mimir 2.11.

// DefaultPostingOffsetInMemorySampling represents default value for --store.index-header-posting-offsets-in-mem-sampling.
// 32 value is chosen as it's a good balance for common setups. Sampling that is not too large (too many CPU cycles) and
Expand All @@ -96,6 +96,10 @@ const (
consistencyDelayFlag = "blocks-storage.bucket-store.consistency-delay"
maxTSDBOpeningConcurrencyOnStartupFlag = "blocks-storage.tsdb.max-tsdb-opening-concurrency-on-startup"
defaultMaxTSDBOpeningConcurrencyOnStartup = 10

maxChunksBytesPoolFlag = "blocks-storage.bucket-store.max-chunk-pool-bytes"
minBucketSizeBytesFlag = "blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes"
maxBucketSizeBytesFlag = "blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes"
)

// Validation errors
Expand Down Expand Up @@ -345,9 +349,9 @@ type BucketStoreConfig struct {
IgnoreBlocksWithin time.Duration `yaml:"ignore_blocks_within" category:"advanced"`

// Chunk pool.
MaxChunkPoolBytes uint64 `yaml:"max_chunk_pool_bytes" category:"advanced"`
ChunkPoolMinBucketSizeBytes int `yaml:"chunk_pool_min_bucket_size_bytes" category:"advanced"`
ChunkPoolMaxBucketSizeBytes int `yaml:"chunk_pool_max_bucket_size_bytes" category:"advanced"`
DeprecatedMaxChunkPoolBytes uint64 `yaml:"max_chunk_pool_bytes" category:"deprecated"` // Deprecated. TODO: Remove in Mimir 2.11.
DeprecatedChunkPoolMinBucketSizeBytes int `yaml:"chunk_pool_min_bucket_size_bytes" category:"deprecated"` // Deprecated. TODO: Remove in Mimir 2.11.
DeprecatedChunkPoolMaxBucketSizeBytes int `yaml:"chunk_pool_max_bucket_size_bytes" category:"deprecated"` // Deprecated. TODO: Remove in Mimir 2.11.

// Series hash cache.
SeriesHashCacheMaxBytes uint64 `yaml:"series_hash_cache_max_size_bytes" category:"advanced"`
Expand Down Expand Up @@ -398,9 +402,9 @@ func (cfg *BucketStoreConfig) RegisterFlags(f *flag.FlagSet, logger log.Logger)

f.StringVar(&cfg.SyncDir, "blocks-storage.bucket-store.sync-dir", "./tsdb-sync/", "Directory to store synchronized TSDB index headers. This directory is not required to be persisted between restarts, but it's highly recommended in order to improve the store-gateway startup time.")
f.DurationVar(&cfg.SyncInterval, "blocks-storage.bucket-store.sync-interval", 15*time.Minute, "How frequently to scan the bucket, or to refresh the bucket index (if enabled), in order to look for changes (new blocks shipped by ingesters and blocks deleted by retention or compaction).")
f.Uint64Var(&cfg.MaxChunkPoolBytes, "blocks-storage.bucket-store.max-chunk-pool-bytes", uint64(2*units.Gibibyte), "Max size - in bytes - of a chunks pool, used to reduce memory allocations. The pool is shared across all tenants. 0 to disable the limit.")
f.IntVar(&cfg.ChunkPoolMinBucketSizeBytes, "blocks-storage.bucket-store.chunk-pool-min-bucket-size-bytes", ChunkPoolDefaultMinBucketSize, "Size - in bytes - of the smallest chunks pool bucket.")
f.IntVar(&cfg.ChunkPoolMaxBucketSizeBytes, "blocks-storage.bucket-store.chunk-pool-max-bucket-size-bytes", ChunkPoolDefaultMaxBucketSize, "Size - in bytes - of the largest chunks pool bucket.")
f.Uint64Var(&cfg.DeprecatedMaxChunkPoolBytes, maxChunksBytesPoolFlag, uint64(2*units.Gibibyte), "Max size - in bytes - of a chunks pool, used to reduce memory allocations. The pool is shared across all tenants. 0 to disable the limit.")
f.IntVar(&cfg.DeprecatedChunkPoolMinBucketSizeBytes, minBucketSizeBytesFlag, ChunkPoolDefaultMinBucketSize, "Size - in bytes - of the smallest chunks pool bucket.")
f.IntVar(&cfg.DeprecatedChunkPoolMaxBucketSizeBytes, maxBucketSizeBytesFlag, ChunkPoolDefaultMaxBucketSize, "Size - in bytes - of the largest chunks pool bucket.")
f.Uint64Var(&cfg.SeriesHashCacheMaxBytes, "blocks-storage.bucket-store.series-hash-cache-max-size-bytes", uint64(1*units.Gibibyte), "Max size - in bytes - of the in-memory series hash cache. The cache is shared across all tenants and it's used only when query sharding is enabled.")
f.IntVar(&cfg.MaxConcurrent, "blocks-storage.bucket-store.max-concurrent", 100, "Max number of concurrent queries to execute against the long-term storage. The limit is shared across all tenants.")
f.IntVar(&cfg.TenantSyncConcurrency, "blocks-storage.bucket-store.tenant-sync-concurrency", 10, "Maximum number of concurrent tenants synching blocks.")
Expand Down Expand Up @@ -436,6 +440,15 @@ func (cfg *BucketStoreConfig) Validate(logger log.Logger) error {
if cfg.DeprecatedConsistencyDelay > 0 {
util.WarnDeprecatedConfig(consistencyDelayFlag, logger)
}
if cfg.DeprecatedMaxChunkPoolBytes != uint64(2*units.Gibibyte) {
util.WarnDeprecatedConfig(maxChunksBytesPoolFlag, logger)
}
if cfg.DeprecatedChunkPoolMinBucketSizeBytes != ChunkPoolDefaultMinBucketSize {
util.WarnDeprecatedConfig(minBucketSizeBytesFlag, logger)
}
if cfg.DeprecatedChunkPoolMaxBucketSizeBytes != ChunkPoolDefaultMaxBucketSize {
util.WarnDeprecatedConfig(maxBucketSizeBytesFlag, logger)
}
if !util.StringsContain(validSeriesSelectionStrategies, cfg.SeriesSelectionStrategyName) {
return errors.New("invalid series-selection-strategy, set one of " + strings.Join(validSeriesSelectionStrategies, ", "))
}
Expand Down
44 changes: 0 additions & 44 deletions pkg/storegateway/bucket.go
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,6 @@ import (
"github.com/grafana/mimir/pkg/storegateway/storepb"
"github.com/grafana/mimir/pkg/util"
util_math "github.com/grafana/mimir/pkg/util/math"
"github.com/grafana/mimir/pkg/util/pool"
"github.com/grafana/mimir/pkg/util/spanlogger"
)

Expand All @@ -66,10 +65,6 @@ const (
// Take a look at Figure 6 in this whitepaper http://www.vldb.org/pvldb/vol8/p1816-teller.pdf.
MaxSamplesPerChunk = 120

// Relatively large in order to reduce memory waste, yet small enough to avoid excessive allocations.
chunkBytesPoolMinSize = 64 * 1024 // 64 KiB
chunkBytesPoolMaxSize = 64 * 1024 * 1024 // 64 MiB

// Labels for metrics.
labelEncode = "encode"
labelDecode = "decode"
Expand All @@ -96,7 +91,6 @@ type BucketStore struct {
indexCache indexcache.IndexCache
chunksCache chunkscache.Cache
indexReaderPool *indexheader.ReaderPool
chunkPool pool.Bytes
seriesHashCache *hashcache.SeriesHashCache

// Sets of blocks that have the same labels. They are indexed by a hash over their label set.
Expand Down Expand Up @@ -208,13 +202,6 @@ func WithQueryGate(queryGate gate.Gate) BucketStoreOption {
}
}

// WithChunkPool sets a pool.Bytes to use for chunks.
func WithChunkPool(chunkPool pool.Bytes) BucketStoreOption {
return func(s *BucketStore) {
s.chunkPool = chunkPool
}
}

func WithFineGrainedChunksCaching(enabled bool) BucketStoreOption {
return func(s *BucketStore) {
s.fineGrainedChunksCachingEnabled = enabled
Expand Down Expand Up @@ -250,7 +237,6 @@ func NewBucketStore(
dir: dir,
indexCache: noopCache{},
chunksCache: chunkscache.NoopCache{},
chunkPool: pool.NoopBytes{},
blocks: map[ulid.ULID]*bucketBlock{},
blockSet: newBucketBlockSet(),
blockSyncConcurrency: blockSyncConcurrency,
Expand Down Expand Up @@ -443,7 +429,6 @@ func (s *BucketStore) addBlock(ctx context.Context, meta *metadata.Meta) (err er
s.bkt,
dir,
s.indexCache,
s.chunkPool,
indexHeaderReader,
s.partitioners,
)
Expand Down Expand Up @@ -1481,7 +1466,6 @@ type bucketBlock struct {
meta *metadata.Meta
dir string
indexCache indexcache.IndexCache
chunkPool pool.Bytes

indexHeaderReader indexheader.Reader

Expand All @@ -1507,7 +1491,6 @@ func newBucketBlock(
bkt objstore.BucketReader,
dir string,
indexCache indexcache.IndexCache,
chunkPool pool.Bytes,
indexHeadReader indexheader.Reader,
p blockPartitioners,
) (b *bucketBlock, err error) {
Expand All @@ -1517,7 +1500,6 @@ func newBucketBlock(
metrics: metrics,
bkt: bkt,
indexCache: indexCache,
chunkPool: chunkPool,
dir: dir,
partitioners: p,
meta: meta,
Expand Down Expand Up @@ -1576,32 +1558,6 @@ func (b *bucketBlock) readIndexRange(ctx context.Context, off, length int64) ([]
return buf.Bytes(), nil
}

func (b *bucketBlock) readChunkRange(ctx context.Context, seq int, off, length int64, chunkRanges byteRanges) (*[]byte, error) {
if seq < 0 || seq >= len(b.chunkObjs) {
return nil, errors.Errorf("unknown segment file for index %d", seq)
}

ctx = bucketcache.WithMemoryPool(ctx, chunkBytesSlicePool, chunkBytesSlabSize)
reader, err := b.bkt.GetRange(ctx, b.chunkObjs[seq], off, length)
if err != nil {
return nil, errors.Wrap(err, "get range reader")
}
defer runutil.CloseWithLogOnErr(b.logger, reader, "readChunkRange close range reader")

// Get a buffer from the pool.
chunkBuffer, err := b.chunkPool.Get(chunkRanges.size())
if err != nil {
return nil, errors.Wrap(err, "allocate chunk bytes")
}

*chunkBuffer, err = readByteRanges(reader, *chunkBuffer, chunkRanges)
if err != nil {
return nil, err
}

return chunkBuffer, nil
}

func (b *bucketBlock) chunkRangeReader(ctx context.Context, seq int, off, length int64) (io.ReadCloser, error) {
if seq < 0 || seq >= len(b.chunkObjs) {
return nil, errors.Errorf("unknown segment file for index %d", seq)
Expand Down
Loading