storage: fix integer overflow in max_timestamp #4010
Conversation
nice catch. there is something here that needs to be fixed. two things I'm thinking about, but not sure of:
- iirc the only query we do on time is related to finding a lower bound, so if a timestamp was artificially further ahead in the future, the effect should merely be to return more data than necessary, which seems like compliant behavior. I'm less certain about the interaction with the max_timestamp calculation. what is the bug that results from "Hence max_timestamp shooting about 6 weeks into the future."?
- iirc kafka also permits clients to send in arbitrary timestamps, so presumably kafka has to tackle a similar scenario. i wonder if we should poke around and see what the upstream behavior is in this case?
```diff
@@ -64,7 +64,7 @@ bool index_state::maybe_index(
     // We know that a segment cannot be > 4GB
     add_entry(
       batch_base_offset() - base_offset(),
-      last_timestamp() - base_timestamp(),
+      std::max(last_timestamp() - base_timestamp(), int64_t{0}),
```
subtle. I was thinking of base_timestamp() as the bottom time, but you are right, it has to be 0.
That bogus timestamp is used for retention.ms enforcement, so we end up failing to delete segments during gc. I went with a minimal fix here because this is a relatively frequent CI failure that I'd like to get fixed quickly. The general case, where clients can send arbitrary timestamps (they can go backward, and they can differ by more than 2**32), is incompatible with the way we store relative timestamps in a uint32_t.

As for why we're ending up with timestamps <= base_timestamp in our tests, I'm not quite sure, but I don't think we really prevent it: the timestamps are generated in produce.cc before any ordering rules are in effect, so it's entirely possible for incoming batches to be stamped in a different order than they land in the segment. Maybe we should be applying timestamps later, once the order of storage is established.
I checked how we were getting base_timestamps in the future wrt subsequent messages: it's raft configuration messages, which can easily be timestamped after the actual kafka batches, since the kafka batches are held up waiting for the consensus group to be ready (after the configuration lands). afaict we don't implement the timestamp->offsets Kafka feature (
Cover letter
Batches with timestamps behind the base_timestamp of the segment could have a negative relative timestamp, which overflows when cast to uint32_t for insertion into index_state::relative_time_index.
index_state::relative_time_index is later used to update max_timestamp when truncating the log. Hence max_timestamp shooting about 6 weeks into the future.
Fixes #3924
Release notes