-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(tsm): Allow for deletion of series outside default rp #25312
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your shard copying algorithm is wrong, or I am misunderstanding it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments, and a large concern related to filtering by retention policy.
You are correct, revisiting this there is a bug in the shard filtering logic. It will effectively throw away the "database" filtered shards using the Furthermore, If I were to write another test where I have two or more databases with the same retention policy name this would effectively delete series across multiple databases which is not the desired functionality or current way this works. I suspect that what ever database is the currently selected database should be processing the deletes. I'll go ahead and adjust this today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A suggestion
tsdb/store.go
Outdated
if len(sources) != 0 { | ||
if measurement, ok := sources[0].(*influxql.Measurement); ok { | ||
if measurement.RetentionPolicy != "" { | ||
shardFilter = func(sh *Shard) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be cleaner here to actually make a composed function that calls the first shardFilter function. That way we can change byDatabase
to do more or less or different things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this:
func ComposeFilter(original func(i int) bool, addition func(i int) bool) func(i int) bool {
return func(i int) bool { return original(i) && addition(i) }
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do like that suggestion, I think it looks really clean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ComposeFilter
seems needlessly complicated, especially since it would get used one place.
You could just make use byDatabase
in the new shardFilter
definition:
shardFilter := func(sh *Shard) bool { return byDatabase(sh) && sh.retentionPolicy == measurement.RetentionPolicy }
You could then pull sh.retentionPolicy == measurement.RetentionPolicy
into a byRetentionPolicy
function if you wanted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably change the name of shardFilter
to something else because it is too similar to filterShards
. My brain crossed them together the first time I read the code. Maybe change to shardFilter
to sf
or filter
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do support wildcards for measurements in DELETE. Please verify the syntax to use, and add tests including them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do support wildcards for measurements in DELETE. Please verify the syntax to use, and add tests including them.
Sounds good 🫡
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devanbenz You could do byDatabase(database)(sh) && byRetentionPolicy(retention)(sh)
, but it's probably better to call byDatabase()
and byRetentionPolicy()
outside the lambda and assign them to variables, then use them inside the lambda. Or use @davidby-influx's ComposeFilter
idea. Another way to do the compose filter would be something like:
func ComposeFilters(fns ...func(sh *Shard) bool) func(sh *Shard) bool {
return func(sh *Shard) bool {
for _, f := range fns {
if !f(sh) {
return false
}
}
return true
}
}
....
shards := s.filterShards(ComposeFilters(byDatabase(db), byRetentionPolicy(rp)))
I find that way more readable, but that's personal preference.
I'd also be tempted to add a type ShardPredicate func(sh *Shard) bool
😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes! @gwossum this is similar to what I was attempting to go for with my earlier design of passing in multiple functions but much much more elegant 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay modifications have been made -- testing of wildcards as well as adding the composable function for shard filtering + I refactored other areas that is calling the function to include the composition fn
65f3d72
to
822a750
Compare
51c70bb
to
9d116f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! I still think we should handle more than one source, lest we regret it in the future.
tsdb/store.go
Outdated
if len(sources) != 0 { | ||
if measurement, ok := sources[0].(*influxql.Measurement); ok { | ||
if measurement.RetentionPolicy != "" { | ||
shardFilterFn = ComposeShardFilter(shardFilterFn, byRetentionPolicy(measurement.RetentionPolicy)) | ||
} | ||
} else { | ||
return fmt.Errorf("unsupported source type in delete %v", sources[0]) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we decided the parser wouldn't allow more than a single source, but I think it would still be good to check for that here and return an error. There might be another way to get more than measurement (Flux? Probably not, but are you sure?). There also might be a change made to the parser that allows more than one source, and if we don't check for that possibility here that could lead to strange issues in the future that we spend weeks tracking down. Returning an error instead what make it clear what happened.
9d116f6 This PR adds the ability for deletion of series that are outside of the default retention policy. This updates InfluxQL to include changes from: influxdata/influxql#71 closes: influxdata/feature-requests#175
756b0c8
to
cbb1ea9
Compare
9d116f6 This PR adds the ability for deletion of series that are outside of the default retention policy. This updates InfluxQL to include changes from: influxdata/influxql#71 closes: influxdata/feature-requests#175
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Modifies logic of TSDB deletion to include filtering by retention policy
Part one of this FR is located here: influxdata/influxql#71
This is part two, InfluxQL is updated here and this adds filtering by shard's retention policy within the delete series method.
Closes https://github.com/influxdata/feature-requests/issues/175