Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: wire up validation & add bounded_property #3637

Merged
merged 16 commits into from
Feb 1, 2022

Conversation

jcsp
Copy link
Contributor

@jcsp jcsp commented Jan 28, 2022

Cover letter

Introduce bounded_property:

  • For any scalar that wants to specify one or more of a min, max or alignment.
  • Handles optional properties, a nullopt is always valid, and a has_value optional has bounds applied to its inner value.
  • Does both validation and clamping. Means that if set_value is applied with an invalid option, consumers of the property can still count on the value being within bounds. This is important for upgrade handling, where an invalid value (which was valid for an earlier redpanda version) might have been stored persistently.

Fix up how validation works generally:

  • Add a base_property::validate(Yaml::Node) function for validating inputs, rather than the existing validate() property that checked whether an already-set value was valid.
  • Hook up validation in central config: a user friendly check in the admin API to give early feedback to clients/users, plus another check when applying a config value locally in config_manager to flag up any invalid values that made it through.
  • Return a map of validation errors from config_store::read_yaml, and handle that in application.cc.

Opportunistic improvements while looking at the code:

  • Add bounds to a number of existing properties. In some cases these are "better than nothing" bounds such as setting minimums of 1 for properties that would make no sense at zero, in others they're more substantial.
  • Update kafka quota manager configuration to use config::binding for live-settable properties, as it was fairly low hanging fruit.
  • Add a debug-mode safety check on config::binding to ensure we aren't ever transferring these across cores -- I worry that we might accidentally do that if calling bind() during application setup and passing the binding to a sharded service, and want to preemptively protect against that.

Release notes

Improvements

  • Several numeric configuration properties have improved bounds checking, preventing cluster instability resulting from invalid property values.

This is a generalization of clamped_property, that
provides a validator as well as clamping the value
to safe bounds.

The clamping is redundant as long as validation
was properly applied before storing value, but
is useful if an invalid value has made it into
the system somehow, for example while upgrading
from an earlier version that had different validity
bounds on the property, or if loading a redpanda.yaml
while central config is disabled.
Previously this only allowed validating a property's
value _after_ setting it: not so useful if we're
trying to reject invalid values.

Add a validate function that takes some YAML as
an argument, so that we can validate incoming
config values much earlier in the process.
Same `oncore` helper as in iobuf: make sure that we
aren't accidentally constructing the binding on
a different shard than it is used in.  This is a
relatively easy mistake to make with bindings, because
one might call `bind()` very early in application
startup and then pass the result into a sharded
service start call.
This was only being used from a unit test, and
even there it was not testing any validation
of invalid values.

This test coverage will be replaced shortly
with something more meaningful.
bounded_property is a generalization that
also provides validation.
At this stage the config has already made it into the
persistent store, but it is still useful to detect
the issue and add the property to the list of those
with invalid values for the configuration status.
Previously we were only detecting exceptions from
YAML parsing, or YAML values that did not convert
cleanly to the value type of the property.

Now we will apply any validator hook that was declared
in the property (unless `force` is true, in which case
invalid values will be accepted and we rely on
properties to clamp input to safe values in set_value).
Aside from validation being a generally good thing,
this is necessary to prevent divisions by zero when
applying the settings in partition_allocator.
- Buffer-size type properties get some "sanity" limits
  to avoid pathologically high or low values.
- Properties which do not make sense at zero (incl. when
  used as a divisor) get a min=1
...to enable live adjustment of properties.
- needs_restart::no for all but the GC interval, which
  is used at start with arm_periodic so won't reflect changes until
  next start.
- Adjust target_quota_byte_rate to user-level visibility: this
  is something behavioural that an end user could legitimately
  want to control.
Includes validator call as well as YAML syntax/type
errors.  Also much improves output for syntax
errors by naming the problematic property.
Any errors in node_config are fatal: these properties
are key to redpanda starting up at all.  This is the
same as existing behaviour for these properties, but
with clearer output.

Validation errors in cluster config are non-fatal, values
will be whatever set_value for the property accepts (and
clamps, for bounded values).  The YAML load path for cluster
config settings is only in use until we cut over to central
config by default.
@jcsp
Copy link
Contributor Author

jcsp commented Jan 31, 2022

@jcsp jcsp marked this pull request as ready for review January 31, 2022 10:19
@jcsp jcsp requested a review from ztlpn as a code owner January 31, 2022 10:19
@jcsp
Copy link
Contributor Author

jcsp commented Jan 31, 2022

Not claiming to have exhausted all possible validations we could be doing on config properties here, but added a bunch of low hanging fruit & the infrastructure should make it much easier to do validation in future.

Copy link
Member

@mmaslankaprv mmaslankaprv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks great

@jcsp jcsp merged commit f4123ce into redpanda-data:dev Feb 1, 2022
@jcsp jcsp deleted the config-bounded-property branch February 1, 2022 22:19
Comment on lines +26 to +30
concept numeric = requires(const T& x) {
{x % x};
{ x < x } -> std::same_as<bool>;
{ x > x } -> std::same_as<bool>;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

Comment on lines +75 to +86
if (align.has_value()) {
auto remainder = result % align.value();
result -= remainder;
}

if (min.has_value()) {
result = std::max(min.value(), result);
}

if (max.has_value()) {
result = std::min(max.value(), result);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcsp should alignment be verified after min/max have been applied (or verify that min/max are also aligned when they change the result)?

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants