
fix[mimirtool]: order samples when performing export #5700

Merged
merged 2 commits into from
Sep 1, 2023
Conversation

@theSuess theSuess (Member) commented Aug 8, 2023

What this PR does

When iterating over the samples returned by the remote read endpoint, the samples arrive grouped by series, e.g.:

{__name__="agent_build_info"} 1 1691474422539
{__name__="agent_build_info"} 1 1691474542539
{__name__="agent_build_info"} 1 1691474602539
{__name__="up"} 1 1691474422539
{__name__="up"} 1 1691474542539
{__name__="up"} 1 1691474602539

As long as all samples fall within the same block timespan (the default is 2h and is not configurable), this does not cause any issues. The same is true if all series start at the same timestamp.

However, if a block gets committed, the new minimum timestamp is set to the lowest timestamp of the samples ingested so far. If a series with an earlier starting timestamp exists in the remote_read response, this causes mimirtool to crash with an out-of-order error.

This PR fixes this by using a heap-based iterator that iterates over the samples in timestamp order.
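The core idea can be sketched as a k-way merge: keep one cursor per series and a min-heap of cursors keyed on the timestamp of each cursor's next sample. This is a simplified, hypothetical sketch (the `sample`, `cursor`, and `mergeByTimestamp` names are made up for illustration; the real iterator in the PR works over prompb types), but the heap mechanics are the same:

```go
package main

import (
	"container/heap"
	"fmt"
)

// sample is a simplified stand-in for a remote-read sample.
type sample struct {
	value float64
	ts    int64
}

// cursor points at the next unread sample of one series.
type cursor struct {
	samples []sample
	pos     int
}

// cursorHeap is a min-heap over series cursors, keyed on the
// timestamp of each cursor's next sample.
type cursorHeap []*cursor

func (h cursorHeap) Len() int { return len(h) }
func (h cursorHeap) Less(i, j int) bool {
	return h[i].samples[h[i].pos].ts < h[j].samples[h[j].pos].ts
}
func (h cursorHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x any)   { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() any {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

// mergeByTimestamp yields all samples across series in timestamp order,
// assuming each individual series is already sorted.
func mergeByTimestamp(series [][]sample) []sample {
	h := &cursorHeap{}
	for _, s := range series {
		if len(s) > 0 {
			heap.Push(h, &cursor{samples: s})
		}
	}
	var out []sample
	for h.Len() > 0 {
		c := (*h)[0] // cursor with the smallest next timestamp
		out = append(out, c.samples[c.pos])
		c.pos++
		if c.pos == len(c.samples) {
			heap.Pop(h) // series exhausted
		} else {
			heap.Fix(h, 0) // re-order on the cursor's new timestamp
		}
	}
	return out
}

func main() {
	series := [][]sample{
		{{1, 1691474422539}, {1, 1691474542539}, {1, 1691474602539}}, // agent_build_info
		{{1, 1691474422539}, {1, 1691474542539}, {1, 1691474602539}}, // up
	}
	for _, s := range mergeByTimestamp(series) {
		fmt.Println(s.ts)
	}
}
```

With the series-grouped input above, the merge interleaves the two series so timestamps come out in non-decreasing order instead of series by series.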

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@theSuess theSuess marked this pull request as ready for review August 8, 2023 11:42
@theSuess theSuess requested a review from a team as a code owner August 8, 2023 11:42
@dimitarvdimitrov (Contributor)

Hi, you say that

if a block gets committed, the new minimum timestamp gets set to the lowest timestamp of the ingested samples up to now.

Can you point to the place where this happens?

@theSuess (Member, author)

The block gets committed here: https://github.com/grafana/mimir/blob/main/pkg/mimirtool/backfill/backfill.go#L138

This in turn calls this massive function https://github.com/prometheus/prometheus/blob/651b5a049a3d2ca877f8cf960a8add1a366f6da0/tsdb/head_append.go#L832

which commits writes to the WAL and keeps track of the minimum timestamp in inOrderMint.

At the end, this value gets used to create the new head https://github.com/prometheus/prometheus/blob/main/tsdb/head_append.go#L1083

@dimitarvdimitrov (Contributor)

I think I see the problem now. It's not when cutting new blocks, rather when committing some appended samples.

The block gets committed here: main/pkg/mimirtool/backfill/backfill.go#L138

This only commits an appender; it doesn't cut a new block. Instead, this is the place that triggers a block flush:

block, err := w.Flush(ctx)

So what I suspect is happening is that the first series in the remote-read response has very late timestamps. A flush is triggered, which sets the min time of the head. The next series, whose timestamps are closer to the requested min time, then fails to append. Am I on the right track?

Would be interesting to have this reproduced with a unit test. Do you think that's doable?

@theSuess (Member, author)

Yeah, sorry for mixing up the terms, still new to the codebase 😅

I'll try to build a unit-test and ping you once it's done!

@theSuess (Member, author)

Alright, I've managed to condense the Mimir data I was getting from my instance into a test case. To reproduce the issue with the old implementation, simply switch out newOrderedIterator(ts) for newTimeSeriesIterator(ts).

While creating this test case, I encountered some very strange conditions for this bug. The issue depends not only on the number of samples and their timestamps, but also on the start/end time and the interval between sample timestamps. The number of samples per block affects it as well.

I'm now even more unsure what is happening here 😅; all I know is that the issue can be fixed by ordering the samples.

@dimitarvdimitrov (Contributor)

I think I understand now why this problem is happening. The TSDB head (in-memory series storage) only allows appending samples up to 1 hour before the most recent sample. Compactions run on [now-3h, now-2h]. This 1h validation allows the head to compact and append concurrently without worrying about a race between the two.

In this test we append by series (if we use the original newTimeSeriesIterator) and commit in groups of 1000 samples. The 1h limit is recalculated after every group of 1000 samples (i.e. after every commit). So it's possible that while iterating one series we get 2 hours ahead of the samples of the next series. If at that point we also hit the 1000-sample threshold, we commit and recalculate the 1h limit. When we then move on to the next series soon after the commit, its samples are already 2h old, and the append fails.
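The failure sequence above can be mimicked with a toy model. This is not the real TSDB code; appendAll and the names in it are hypothetical, and the model only captures the one rule described here: after every commit, samples older than (newest committed sample - 1h) are rejected as out-of-order.

```go
package main

import "fmt"

const hour = int64(3600 * 1000) // 1h in milliseconds

// appendAll is a toy model of the behaviour described above: samples are
// appended one by one and committed every commitEvery samples; each commit
// recalculates the minimum valid timestamp as (newest sample - 1h).
func appendAll(samples []int64, commitEvery int) error {
	var minValid int64 = -1 << 62 // effectively no limit before the first commit
	var newest int64
	for i, ts := range samples {
		if ts < minValid {
			return fmt.Errorf("out of order: sample %d is older than limit %d", ts, minValid)
		}
		if ts > newest {
			newest = ts
		}
		if (i+1)%commitEvery == 0 {
			minValid = newest - hour // 1h limit recalculated on commit
		}
	}
	return nil
}

func main() {
	// Series A covers [0, 3h]; series B then starts back at timestamp 0,
	// mirroring the series-grouped remote-read response.
	var in []int64
	for t := int64(0); t <= 3*hour; t += hour / 2 {
		in = append(in, t) // series A, in order
	}
	in = append(in, 0) // first sample of series B

	// A commit happens mid-way through series A, so by the time series B's
	// first sample arrives it is more than 1h behind the limit.
	fmt.Println(appendAll(in, 2))
}
```

Running this reproduces the shape of the bug: series A alone appends fine, but the first sample of series B trips the recalculated 1h limit.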

Each group of 1000 committed samples is bound to a 2-hour period. This allows appending samples of one series that are up to 2h ahead of the samples of the next series. One easy way to fix this is to reduce the period to 1 hour; that way we won't commit samples that are more than 1h ahead of other samples we have. The period is defined in the CreateBlocks function. I was able to get TestEarlyCommit to pass with newTimeSeriesIterator and blockDuration := tsdb.DefaultBlockDuration / 2.

The alternative is the ordered iterator that you added. I think I prefer changing the append period over the iterator because it's simpler and you've already added a test for the change (TestEarlyCommit). What's your take?

@theSuess (Member, author)

I've tried your suggestion of setting blockDuration := tsdb.DefaultBlockDuration / 2. It fixes the specific test case I constructed, but increasing the series count to a value >= 18 triggers the same issue again. I think this would also fail with only very few data points per hour (if I understood the root cause correctly).

@dimitarvdimitrov (Contributor)

I had overlooked all the usages of blockDuration. What I thought it would do is append to the 2-hour block in 1-hour segments; however, it also changed the duration of the block itself (as the variable name suggests). With the following diff I get the tests passing with different configs:

diff --git a/pkg/mimirtool/backfill/backfill.go b/pkg/mimirtool/backfill/backfill.go
index c3320c9b3..032540b21 100644
--- a/pkg/mimirtool/backfill/backfill.go
+++ b/pkg/mimirtool/backfill/backfill.go
@@ -90,7 +90,7 @@ func CreateBlocks(input IteratorCreator, mint, maxt int64, maxSamplesInAppender
 
 	var wroteHeader bool
 
-	for t := mint; t <= maxt; t = t + blockDuration {
+	for t := mint; t <= maxt; t += blockDuration / 2 {
 		err := func() error {
 			w, err := tsdb.NewBlockWriter(log.NewNopLogger(), outputDir, blockDuration)
 			if err != nil {
@@ -106,7 +106,7 @@ func CreateBlocks(input IteratorCreator, mint, maxt int64, maxSamplesInAppender
 			ctx := context.Background()
 			app := w.Appender(ctx)
 			i := input()
-			tsUpper := t + blockDuration
+			tsUpper := t + blockDuration/2
 			samplesCount := 0
 			for {
 				err := i.Next()
diff --git a/pkg/mimirtool/commands/remote_read_test.go b/pkg/mimirtool/commands/remote_read_test.go
index 433bf8011..9371157a0 100644
--- a/pkg/mimirtool/commands/remote_read_test.go
+++ b/pkg/mimirtool/commands/remote_read_test.go
@@ -10,10 +10,11 @@ import (
 	"io"
 	"testing"
 
-	"github.com/grafana/mimir/pkg/mimirtool/backfill"
 	"github.com/pkg/errors"
 	"github.com/prometheus/prometheus/prompb"
 	"github.com/stretchr/testify/assert"
+
+	"github.com/grafana/mimir/pkg/mimirtool/backfill"
 )
 
 func TestTimeSeriesIterator(t *testing.T) {
@@ -154,9 +155,9 @@ func TestTimeSeriesIterator(t *testing.T) {
 }
 
 func TestEarlyCommit(t *testing.T) {
-	maxSamplesPerBlock := 1000
-	series := 9
-	samples := 140
+	maxSamplesPerBlock := 10
+	series := 18
+	samples := 1400
 
 	start := int64(1691408300722)
 	inc := int64(60000)
@@ -181,7 +182,7 @@ func TestEarlyCommit(t *testing.T) {
 		ts[i] = s
 	}
 	iterator := func() backfill.Iterator {
-		return newOrderedIterator(ts)
+		return newTimeSeriesIterator(ts)
 	}
 	err := backfill.CreateBlocks(iterator, start, end, maxSamplesPerBlock, t.TempDir(), true, io.Discard)
 	assert.NoError(t, err)
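The key property of this diff is that each committed group now comes from a half-block window, so no two samples committed together can be more than 1h apart, which is exactly the head's out-of-order tolerance. A toy model of that windowed loop (windowedAppend is a made-up name, not the real mimirtool code, and it only models the grouping, not block writing):

```go
package main

import "fmt"

const (
	hour          = int64(3600 * 1000) // 1h in milliseconds
	blockDuration = 2 * hour           // default TSDB block duration
)

// windowedAppend is a toy model of the fixed loop: each pass only picks up
// samples falling into a half-block window [t, t+blockDuration/2), so the
// timestamps within one committed group span less than 1h.
func windowedAppend(samples []int64, mint, maxt int64) [][]int64 {
	var groups [][]int64
	for t := mint; t <= maxt; t += blockDuration / 2 {
		tsUpper := t + blockDuration/2
		var group []int64
		for _, ts := range samples {
			if ts >= t && ts < tsUpper {
				group = append(group, ts)
			}
		}
		if len(group) > 0 {
			groups = append(groups, group)
		}
	}
	return groups
}

func main() {
	// Two interleaved series covering 3h; regardless of series order in the
	// input, samples end up grouped by window, not by series.
	samples := []int64{0, hour, 2 * hour, 3 * hour, 0, hour / 2}
	for _, g := range windowedAppend(samples, 0, 3*hour) {
		fmt.Println(g)
	}
}
```

Because every group stays inside a 1h window, a commit can never push the head's minimum valid timestamp past samples that are still waiting in another series.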

@theSuess (Member, author)

That makes sense! I've updated this PR to only include these changes. Thanks for the detailed explanation - I've learned a lot!

@dimitarvdimitrov dimitarvdimitrov (Contributor) left a comment
LGTM, I left only two minor comments about the test. Happy to merge the PR afterwards

(Resolved review threads on pkg/mimirtool/commands/remote_read.go and pkg/mimirtool/commands/remote_read_test.go)
@colega (Contributor) commented Aug 29, 2023

The CHANGELOG has just been cut to prepare for the next Mimir release. Please rebase onto main and, if necessary, move the CHANGELOG entry added/updated in this PR to the top of the CHANGELOG document. Thanks!

@theSuess (Member, author)

Alright, should be resolved now! I hope I understood the changelog instructions correctly 😬

@dimitarvdimitrov dimitarvdimitrov enabled auto-merge (squash) September 1, 2023 08:29
@dimitarvdimitrov dimitarvdimitrov merged commit cf87d8f into grafana:main Sep 1, 2023
28 checks passed