Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Topic recovery download with capped size. Download not more than retention.policy #6797

Merged
merged 2 commits into from
Oct 19, 2022

Conversation

ZeDRoman
Copy link
Contributor

Cover letter

In Shadow Indexing we have option to recover size more or equal to retention.bytes . So Shadow Indexing would download segments until sum of their sizes become more or equal to retention.bytes property. (partition_recovery_manager.cc download_log_with_capped_size)

In Disk log GC we start to delete segments if their total size more than retention.bytes . So after GC we would have total size less or equal to retention.bytes . (disk_log_impl.cc size_based_gc_max_offset)

So when they are working together we have such behavior: SI downloads segments more than retention.bytes then Disk log GC removes one segment because total size more than retention.bytes .

It turned out in TopicRecoveryTest.test_size_based_retention. SI downloads segments, then segments are automatically deleted by Disk log GC, then we check that SI downloaded more than retention.bytes and test fails (because segment was deleted).

Fixes #4887

Backport Required

  • not a bug fix
  • issue does not exist in previous branches
  • papercut/not impactful enough to backport
  • v22.2.x
  • v22.1.x
  • v21.11.x

@mmedenjak mmedenjak added kind/bug Something isn't working area/tests ci-failure area/cloud-storage Shadow indexing subsystem and removed area/redpanda labels Oct 18, 2022
@@ -364,7 +364,7 @@ partition_downloader::download_log_with_capped_size(
model::offset_delta start_delta{0};
for (auto it = offset_map.rbegin(); it != offset_map.rend(); it++) {
const auto& meta = it->second.meta;
if (total_size > max_size) {
if (total_size + meta.size_bytes > max_size) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs a special case to download at least one segment. Otherwise if someone has e.g. 1GB segments, but sets retention to 100MB, then they will recover nothing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you are right

"""
size_bytes_per_ntp = {}
segments_sizes_per_ntp = {}
for _, data in chk.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hard to follow, names like data and chk are quite obscure

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As well as nicer names, you could declare types in the function definition (chk: SometType, retention_policy: SomeType) to help the reader understand

Initially Shadow Indexing downloaded segments more or equal to
retention.policy
But in Disk log implementation, GC removes segments until total size
will be less or equal to retention.policy
So downloading segments to size more than retention.policy is
unnecessary because they will be deleted by GC
This change makes CI to download less to retention.policy
"""
size_bytes_per_ntp = {}
segments_sizes_per_ntp = {}
for _node, node_segments_reports in nodes_segments_report.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node_segments_report shadows the parameter and should probably be renamed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parameter has data of multiple nodes, so its name is
nodes_segments_report

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe for _node, report in nodes_segments_report.items()

Even if the variables don't literally shadow each other, it's a bit too subtle to have them differ by just one s in the middle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to report

Now tests checks that SI recover not more than retention.policy size
@ZeDRoman ZeDRoman marked this pull request as ready for review October 18, 2022 15:19
@ZeDRoman ZeDRoman requested review from Lazin and jcsp October 19, 2022 08:26
@mmedenjak mmedenjak merged commit fdc9b57 into redpanda-data:dev Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cloud-storage Shadow indexing subsystem area/redpanda area/tests ci-failure kind/bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failure in TopicRecoveryTest.test_size_based_retention
4 participants