cloud_storage/tests: adjust delta for restored partitions #5372
Conversation
Force-pushed from 54d5da9 to 686ac05
@@ -164,10 +186,10 @@ def get_ntp_sizes(fdata_per_host, hosts_can_vary=True):
        rest_ntp_size = restored_ntps[ntp]
        assert rest_ntp_size <= orig_ntp_size, f"NTP {ntp} the restored partition is larger {rest_ntp_size} than the original one {orig_ntp_size}."
        delta = orig_ntp_size - rest_ntp_size
        tolerance = int(1.5 * default_log_segment_size)
        tolerance = max(tolerance, baseline_last_segment_sizes.get(ntp, 0))
Now that we have the exact last segment size, do we still need the 1.5 tolerance?
Maybe there is still some tolerance needed (if the restored system has e.g. created an extra empty segment), but probably not the full 1.5?
We probably don't need the default any more because we filter out the empty segments when getting ntp sizes. I had added it as a fallback but I think we should be okay removing it completely.
yep, that should be OK
It looks like this test is consistently failing on CI when we rely on the last segment size. Sometimes the 1.5x default segment size is greater than the last segment. Will look through the logs.
I'm guessing the reason we're overshooting so much, and nondeterministically, is that the batch size is large relative to the segment size, and we're getting different batch sizes depending on client timing -- is that right?
when asserting that the size of a restored partition is close to the original partition in topic recovery tests, we use the default segment size and a multiplier of 1.5 to tolerate errors where the last segment of the ntp is not uploaded to tiered storage because it is open. sometimes the segment is larger than 1.5x the default size causing an assertion failure in the test. this change stores the last segment size for each partition and uses it as the tolerance size in assertions.
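A minimal sketch of the tolerance change described in this commit message; the function name and signature are illustrative, not the actual test code:

```python
def compute_tolerance(ntp, default_log_segment_size, baseline_last_segment_sizes):
    # Old behaviour: a fixed 1.5x multiplier on the default segment size.
    tolerance = int(1.5 * default_log_segment_size)
    # New behaviour: widen the tolerance to the recorded size of the last
    # (possibly still open, hence not-yet-uploaded) segment for this ntp.
    return max(tolerance, baseline_last_segment_sizes.get(ntp, 0))
```

The recorded last-segment size only takes over when it exceeds the 1.5x default, so the old bound remains a floor.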
Force-pushed from 686ac05 to 7fbda0e
This issue became much more frequent after switching to the rpk producer, so it might be related to the client. I have not yet been able to look into the exact reason why we generate segments much larger than the default size, but what you mentioned sounds plausible; I will look into why this could be happening.
@abhijat latest push is failing test_fast1+test_fast3
Looking at one instance of failure:
Original segment sizes for this ntp (we skip segments < 4096, so effectively): 1947153 + 1946694 + 1367094 = 5260941
Restored sizes (again skipping < 4096): 1946694 + 1946694 + 1367094 = 5260482
diff = 459 bytes. This is mostly due to the very first segment: the original was 1947153 bytes and the restored is 1946694 bytes. Looking at the segment download log during recovery for this segment:
Offset translation skips a 459-byte batch, which accounts for the diff in sizes. But this is as expected, since we remove the config batches from the restored segment? So using the exact last segment size does not cover all cases; we also need to account for the config batches being removed. cc @Lazin Notably, recording the last segment size did not help in this case, as it was too small to account for the difference.
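The size arithmetic in this failure analysis can be reproduced with a small sketch; the helper name is illustrative, while the 4096-byte cutoff and the segment sizes come from the comment itself:

```python
def effective_size(segment_sizes, min_size=4096):
    # Segments smaller than min_size (e.g. empty or open segments) are
    # skipped, mirroring the filtering described in the analysis above.
    return sum(s for s in segment_sizes if s >= min_size)

original = effective_size([1947153, 1946694, 1367094])  # 5260941
restored = effective_size([1946694, 1946694, 1367094])  # 5260482
diff = original - restored  # 459 bytes, the skipped config batch
```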
Trying a new approach here: count the delta offset in the partition manifest and use that to derive tolerable error limits.
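A heavily hedged sketch of that idea (the helper and the per-batch bound are assumptions, not the PR's actual code): each non-data batch removed during offset translation advances the delta offset by one, so the delta offset times an upper bound on config-batch size yields a tolerable size difference.

```python
def tolerance_from_delta_offset(delta_offset, max_config_batch_size):
    # delta_offset counts non-data (config) batches removed during offset
    # translation; each removed batch contributes at most
    # max_config_batch_size bytes to the restored-vs-original size gap.
    return delta_offset * max_config_batch_size
```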
Force-pushed from 6bda7f3 to 4b52dd0
Force-pushed from 3523993 to fd40c3a
files on disk can be larger than the actual data uploaded to cloud storage. it appears this is due to block allocation of segments; in this case the maximum diff can be 4095 bytes, if an extra block is allocated for one byte on disk. so we adjust the margin of error to 4096 bytes to account for these differences.
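The block-allocation reasoning above can be checked with a quick sketch, assuming 4096-byte filesystem blocks:

```python
BLOCK = 4096

def on_disk_size(logical_size):
    # Files occupy whole filesystem blocks, so the on-disk size is the
    # logical size rounded up to the next block boundary.
    return -(-logical_size // BLOCK) * BLOCK

# One byte spilling into a new block wastes at most BLOCK - 1 = 4095 bytes,
# hence the 4096-byte margin of error chosen in the commit above.
worst_case_waste = on_disk_size(BLOCK + 1) - (BLOCK + 1)  # 4095
```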
Force-pushed from 2bc743e to 2d193b4
Force-pushed from e1aa4cf to 7aec931
Force-pushed from 7aec931 to 9f04de0
It was tricky to predict the difference between baseline and S3 accurately; for now I have used the following:
Tried several iterations locally and it looks good.
non_data_batches_per_ntp = defaultdict(lambda: 0)
segments = [
    item for item in self._s3.list_objects(self._bucket)
    if not item.Key.endswith('manifest.json')
]
After filtering out all manifests you will have log segments and also transaction manifests (*.tx extension).
But here, if you don't use transactions, no tx manifests will be created.
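Per the review comment, filtering only `manifest.json` can still leave `*.tx` transaction manifests in the listing. A sketch of the stricter filter; the `Obj` stand-in is illustrative, with the `Key` attribute matching the snippet above:

```python
from collections import namedtuple

# Minimal stand-in for an S3 listing entry; the real test iterates
# bucket objects exposing a Key attribute.
Obj = namedtuple('Obj', ['Key'])

def list_log_segments(objects):
    # Drop both topic/partition manifests and *.tx transaction manifests,
    # leaving only log segment objects.
    return [
        item for item in objects
        if not item.Key.endswith('manifest.json')
        and not item.Key.endswith('.tx')
    ]
```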
/backport v22.1.x
Failed to run cherry-pick command. I executed the below command:
test_missing_segment was ok_to_fail; recent changes in pr redpanda-data#5372 should improve this test by frequently uploading segments to s3. the test failure had been because the sole segment in s3 was deleted, resulting in an empty restored topic. however, looking at pandaresults reveals this test has been OPASS in the last 30 days with no failures. unmarking this test - the issue will be closed in a few days if no further failures are seen.
Need to backport a previous PR for the rpk producer before this one can be backported.
Let's resolve #5711 before backporting this
fixes #4972