-
Notifications
You must be signed in to change notification settings - Fork 512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reducing the memory footprint of active series custom trackers #2568
Conversation
… memory performance
Can we see benchmarks? I would expect accessing the map being way slower than accessing the slice. |
I've run the existing benchmarks this way:
No significant difference:
Then I checked the benchmarks (I should have done it earlier) and I've realized no benchmark set the matchers, so obviously there's no difference. Please @gubjanos add benchmark(s) using matchers too. |
matches[i] = sm.Matches(series) | ||
if sm.Matches(series) { | ||
matches = append(matches, i) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we see benchmarks of this? :)
This will have to allocate a new slice every 2^n
matchers, which is probably more expensive than just allocating a slice of bools since it will push unnecessary pressure on the GC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're worried about memory consumption here, then IMO we should keep the previous approach and change []bool
to a bitarray backed by []uint64
I have created a benchmark for testing Matchers.
|
This result looks good to me, however, if we want to optimize here and we know that most of the series match just once, we can do better. I made a quick PoC replacing your
Ran your benchmarks, and it's faster:
And the best thing, it does zero allocations for your test case (each series matching just one matcher):
IMO, reducing one allocation per active series is a pretty big deal (that's 1.5M less stuff in the heap for GC to iterate on, in the worst case, because not all series match something), but I would like to hear @bboreham's opinion on this. OTOH, I'm saying it's a quick PoC because I'd fine-tune a few things if we used this for prod:
|
The CHANGELOG has just been cut to prepare for the next Mimir release. Please rebase |
I think it's interesting to ship this improvements, then we can talk about how trackers can be further improved. |
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Sorry for the delay, changes lgtm with a few nitpicks and a missing changelog entry. Can you also please post the latest benchmark comparison? Thank you. |
Here is the comparison between your PoC version and the one with the improvements.
|
I've run a benchmark comparison between the current Click here to see the benchmark I've ported to `main` to comparefunc BenchmarkMatchesSeries(b *testing.B) {
trackerCounts := []int{10, 100, 1000}
asms := make([]*Matchers, len(trackerCounts))
for i, matcherCount := range trackerCounts {
configMap := map[string]string{}
for j := 0; j < matcherCount; j++ {
configMap[strconv.Itoa(j)] = fmt.Sprintf(`{this_will_match_%d="true"}`, j)
}
config, err := NewCustomTrackersConfig(configMap)
require.NoError(b, err)
asms[i] = NewMatchers(config)
}
makeLabels := func(total, matching int) labels.Labels {
if total < matching {
b.Fatal("wrong test setup, total < matching")
}
lbs := make(labels.Labels, 0, total)
for i := 0; i < matching; i++ {
lbs = append(lbs, labels.Label{Name: fmt.Sprintf(`this_will_match_%d`, i), Value: "true"})
}
for i := matching; i < total; i++ {
lbs = append(lbs, labels.Label{Name: fmt.Sprintf("something_else_%d", i), Value: "true"})
}
return lbs
}
for i, trackerCount := range trackerCounts {
for _, bc := range []struct {
total, matching int
}{
{1, 0},
{1, 1},
{10, 1},
{10, 2},
{10, 5},
{25, 1},
{25, 2},
{25, 5},
{100, 1},
{100, 2},
{100, 5},
} {
series := makeLabels(bc.total, bc.matching)
b.Run(fmt.Sprintf("TrackerCount: %d, Labels: %d, Matching: %d", trackerCount, bc.total, bc.matching), func(b *testing.B) {
for x := 0; x < b.N; x++ {
got := asms[i].Matches(series)
matched := 0
for _, v := range got {
if v {
matched++
}
}
require.Equal(b, bc.matching, matched)
}
})
}
}
} (I'll post a suggestion with some changes to your benchmark too in a separate comment, so it will match the one I ran here) The results look great and kind of as the expected ones:
There's a similar performance (+/- 5% everywhere) with zero allocations except for 5 matching trackers. |
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @gubjanos, great work!
…na#2568) * Changing active series matches from []bool to map[int]bool for better memory performance * Changing map[int]bool to int slice * Changing slice allocation based on Oleg's suggestion * Benchmark * PoC: fixed size backed array slice for matchers Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Adding Oleg's improvements as well * Limiting the maximum number of trackers as 64k * Making linter happy * Making preAllocDynamicSlice and also activeseries.mathches unexported * Commenting and cleaning up preAllocDynamicSlice * Failing the benchmark if the config fails to initizalize * Updating changelog * Update CHANGELOG.md Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com> * Update pkg/ingester/activeseries/matchers_test.go Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com> Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
…na#2568) * Changing active series matches from []bool to map[int]bool for better memory performance * Changing map[int]bool to int slice * Changing slice allocation based on Oleg's suggestion * Benchmark * PoC: fixed size backed array slice for matchers Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Adding Oleg's improvements as well * Limiting the maximum number of trackers as 64k * Making linter happy * Making preAllocDynamicSlice and also activeseries.mathches unexported * Commenting and cleaning up preAllocDynamicSlice * Failing the benchmark if the config fails to initizalize * Updating changelog * Update CHANGELOG.md Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com> * Update pkg/ingester/activeseries/matchers_test.go Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com> Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
What this PR does
This PR is replacing the bool slice tracking matches for each matcher configured for each series reference with a preAllocDynamicSlice tracking only matching matchers' indexes. The usage pattern of the custom trackers implies that in most usecases only a fraction of matchers are matching, thus this design saves on the stored values. The new data structure preAllocDynamicSlice is optimized for this.
Which issue(s) this PR fixes or relates to
Fixes #
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]