Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: use segment fallocate size from config #3288

Merged
merged 1 commit into from
Jan 19, 2022

Conversation

ZeDRoman
Copy link
Contributor

@ZeDRoman ZeDRoman commented Dec 16, 2021

Cover letter

"Today, redpanda fallocates 32M whenever necessary. We currently inject one raft message into every partition when a topic is created. If I create a topic with 1000 partitions, 32G of my disk is eaten.
The fallocate was chosen to be 32M, and large sizes are beneficial, but it may be useful to have this size be a configuration option" (#2876)

Fixes #2876

Release notes

Features

  • The minimum disk space allocation size is now configurable via the segment_fallocation_step property. The default is unchanged from the previous behavior (32MB). You may wish to decrease this property if creating large numbers of partitions on systems with limited disk space.

Copy link
Member

@dotnwat dotnwat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs some unit tests that run and validate a workload against the segment appender while varying the fallocation step.

@@ -49,7 +51,8 @@ class segment_appender {

ss::io_priority_class priority;
size_t number_of_chunks;
size_t falloc_step{fallocation_step};
size_t falloc_step{
config::shard_local_cfg().segment_fallocation_step()};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to stop using the thread-local for this stuff but pass it as an option on the caller, all the way to main.

testing core-local configs are super hard. generally not a good practice to use it for tests in particular.

@dotnwat
Copy link
Member

dotnwat commented Dec 21, 2021

This needs some unit tests that run and validate a workload against the segment appender while varying the fallocation step.

to clarify @ZeDRoman we should run some workload (e.g. appends/reads in the simple case, or generally a simple fuzzer like opfuzz) that checks that the segment appender appears to be working ok.

@ZeDRoman ZeDRoman force-pushed the issues-2876 branch 2 times, most recently from 3b80e85 to bff4782 Compare December 21, 2021 14:31
Copy link
Member

@dotnwat dotnwat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great thanks for adding the tests. let's expand the tests just a little bit.

@@ -31,7 +31,7 @@ SEASTAR_THREAD_TEST_CASE(test_can_append_multiple_flushes) {
| ss::open_flags::truncate)
.get0();
auto appender = segment_appender(
f, segment_appender::options(ss::default_priority_class(), 1));
f, segment_appender::options(ss::default_priority_class(), 1, 16_KiB));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks good, but can we parameter all the existing tests, not just a new test that is added, because presumably all of the existing functionality should continue to be preserved independent of the allocation size.

I think factoring out the body of the tests into helper functions will make this straight forward. Something like:

static void test_can_append_multiple_flushes(size_t size) {
   ...
}

SEASTAR_THREAD_TEST_CASE(test_can_append_multiple_flushes) {
   run_test_can_append_multiple_flushes(16_KiB);
   run_test_can_append_multiple_flushes(32_MiB);

@@ -217,3 +217,40 @@ SEASTAR_THREAD_TEST_CASE(test_can_append_little_data) {
BOOST_REQUIRE_EQUAL(appender.file_byte_offset(), data.size());
appender.close().get();
}

SEASTAR_THREAD_TEST_CASE(test_fallocate_size) {
for (const size_t fallocate_size : {1ul, 16_KiB, 32_MiB}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what validation is applied to this new parameter? can I use 0 here or non-power-of-two values? I think it is a good idea to have validation (if we don't already have that--perhaps I missed it), and make sure the tests here have a few edge cases in the space of allowed values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use any value > 0. But in segment_appender fallocation_offset will be aligned to 4096.
So I think that here will be better to have 4096 divisor in order to have page aligned fallocations
I'll add validator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may find the existing storage::internal::chunk_cache::validate_chunk_size useful as an example (currently used for the append_chunk_size setting).

It would be good to have some upper and lower bounds on this to prevent craziness if the user types it wrong: minimum should be the append_chunk_size (if we can't get at the live value from the validator then use the default 16kib) and max should be something arbitrary, maybe the default segment size (1gib). Arbitrary limits are kind of annoying, but it's better than enabling the user to instantly fill their disk by typing too many zeros in a config option.

Copy link
Contributor

@jcsp jcsp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, just suggestions for making the setting adjustable without restart, and validation.

*this,
"segment_fallocation_step",
"Size for segments fallocation",
{.example = "32768", .visibility = visibility::tunable},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be relatively easy to make live-adjustable:

@@ -217,3 +217,40 @@ SEASTAR_THREAD_TEST_CASE(test_can_append_little_data) {
BOOST_REQUIRE_EQUAL(appender.file_byte_offset(), data.size());
appender.close().get();
}

SEASTAR_THREAD_TEST_CASE(test_fallocate_size) {
for (const size_t fallocate_size : {1ul, 16_KiB, 32_MiB}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may find the existing storage::internal::chunk_cache::validate_chunk_size useful as an example (currently used for the append_chunk_size setting).

It would be good to have some upper and lower bounds on this to prevent craziness if the user types it wrong: minimum should be the append_chunk_size (if we can't get at the live value from the validator then use the default 16kib) and max should be something arbitrary, maybe the default segment size (1gib). Arbitrary limits are kind of annoying, but it's better than enabling the user to instantly fill their disk by typing too many zeros in a config option.

Add segment_fallocate size to config properties
and set segment_appender fallocate size from config
@jcsp
Copy link
Contributor

jcsp commented Jan 18, 2022

LGTM, needs Noah's earlier request for changes to be dismissed or replaced with a +1 before github will let us merge.

Copy link
Member

@dotnwat dotnwat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work!

@@ -77,6 +77,20 @@ class segment_appender {

void set_callbacks(callbacks* callbacks) { _callbacks = callbacks; }

/** Validator for fallocation step configuration setting */
static std::optional<ss::sstring>
validate_fallocation_step(const size_t& value) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: there is not any benefit to taking a size_t by const&. I wonder if config:: is making this requirement?

@dotnwat dotnwat merged commit 4a10444 into redpanda-data:dev Jan 19, 2022
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

allow fallocate size to be configurable
4 participants