Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[V22.1.x] backport of #4901 #5068

Merged
merged 4 commits into from
Jun 14, 2022
Merged

Conversation

mmaslankaprv
Copy link
Member

@mmaslankaprv mmaslankaprv commented Jun 8, 2022

Cover letter

Group metadata key serializer returns an iobuf. Wrapping it with another serialization layer is incorrect as it introduces additional size field in the binary format that deserializer does not expect. This leads to deserialization error and exception is being thrown.

Fixes: #4900
Fixes: #5008

Proof for iobuf unwrapping approach

import struct

# Case 1: old serializer, group_metadata, no wrapper
#group_metadata_serializer::key_value ret;
#ret.key = reflection::to_iobuf(old::group_log_record_key{
#  .record_type = old::group_log_record_key::type::group_metadata,
#  .key = reflection::to_iobuf(md.key.group_id),
#});
def group_metadata_old_key_unwrapped():
    for record_type in (0, 1, 2):
        for group_id_len in range(2**15):
            # the group id is serialized inside an iobuf
            # so the length after the record type is the
            # group id bytes length plus the string length
            # prefix which is an extra 4 bytes
            iobuf_len = group_id_len + 4
            data = struct.pack("<bi", record_type, iobuf_len)
            assert len(data) == 5
            # content plus 4 bytes for length plus 1 byte for type
            yield data[:4], iobuf_len + 4 + 1, group_id_len, 1


# Case 2: old serializer, offset_metadata, no wrapper
#group_metadata_serializer::key_value ret;
#ret.key = reflection::to_iobuf(old::group_log_record_key{
#  .record_type = old::group_log_record_key::type::offset_commit,
#  .key = reflection::to_iobuf(old::group_log_offset_key{
#    std::move(md.key.group_id),
#    std::move(md.key.topic),
#    md.key.partition,
#  }),
def offset_metadata_old_key_unwrapped():
    for record_type in (0, 1, 2):
        for strings_len in range(2 * 2**15):
            # partition id (4) two 4 byte string lengths plus string bytes
            iobuf_len = strings_len + 4 + 4 + 4
            data = struct.pack("<bi", record_type, iobuf_len)
            assert len(data) == 5
            # content plus 4 bytes for length plus 1 byte for type
            yield data[:4], iobuf_len + 4 + 1, strings_len, 1

# version: int16
# group_id: string (length prefix int16)
#void group_metadata_key::encode(
#  response_writer& writer, const group_metadata_key& v) {
#    writer.write(v.version);
#    writer.write(v.group_id);
#}
def group_metadata_new_key_unwrapped():
    for version in (0, 1, 2):
        for group_id_len in range(2**15):
            data = struct.pack(">hh", version, group_id_len)
            assert len(data) == 4
            # group id bytes plus (2) version and (2) group id len
            yield data, group_id_len + 2 + 2, group_id_len, 1

# same as group_metadata_new_key_unwrapped
# except that the topic makes the remainder
# of the serialized size variable length.
#void offset_metadata_key::encode(
#  response_writer& writer, const offset_metadata_key& v) {
#    writer.write(v.version);
#    writer.write(v.group_id);
#    writer.write(v.topic);
#    writer.write(v.partition);
#}
def offset_metadata_new_key_unwrapped():
    for version in (0, 1, 2):
        for group_id_len in range(2**15):
            for topic_len in range(2**15):
                data = struct.pack(">hh", version, group_id_len)
                assert len(data) == 4
                # group/topic bytes, 2ver, 2group len, 2topic len, partition id
                yield data, group_id_len + topic_len + 2 + 2 + 2 + 4, group_id_len, topic_len

unwrapped = [group_metadata_old_key_unwrapped,
        offset_metadata_old_key_unwrapped,
        group_metadata_new_key_unwrapped,
        offset_metadata_new_key_unwrapped]

# unwrapped cases
# data that was written as unwrapped should never be unwrapped
for f in unwrapped:
    for prefix, size, invariant1, invariant2 in f():
        length = struct.unpack("<i", prefix)[0]
        # invariant == 0 would be like a zero-length group-id or topic-id
        # so we might trigger the condition in which we incorrect unwrap
        # but as long as the invariant says it won't happen in practice
        # then ok. so this would be like a group-id length == 0
        if size == (length + 4):
            if invariant1 == 0 or invariant2 == 0:
                print("{}, {}, i={}, i={}, s={}, l={}".format(
                        f.__name__, prefix, invariant1, invariant2, size,
                        length))
                continue
            assert False

Release notes

Signed-off-by: Michal Maslanka <michal@vectorized.io>
(cherry picked from commit d32ac70)
When serialized a key is already an iobuf. Wrapping it with another
serialization layer is incorrect as it introduces additional size field
in the binary format that deserializer doesn't expect.

Signed-off-by: Michal Maslanka <michal@vectorized.io>
(cherry picked from commit 412dbcd)
Previously in redpanda we had an error leading to wrapping tombstone
record keys with an iobuf serialization envelope. In order to provide
compatibility and be able to read keys wrapped with an additional iobuf
we introduce this check. The check validates the key version being in
allowed range. Otherwise we try to unwrap a key from iobuf.

Signed-off-by: Michal Maslanka <michal@vectorized.io>
(cherry picked from commit 19ee4eb)
Added test validating a compatibility of decoding group metadata keys
wrapped in iobuf.

Signed-off-by: Michal Maslanka <michal@vectorized.io>
(cherry picked from commit a352a17)
@dotnwat
Copy link
Member

dotnwat commented Jun 8, 2022

the python in the cover letter looks outdated. perhaps link to the updated one i posted on the original issue?

@mmaslankaprv
Copy link
Member Author

the python in the cover letter looks outdated. perhaps link to the updated one i posted on the original issue?

done

@andrwng
Copy link
Contributor

andrwng commented Jun 9, 2022

Since this didn't backport cleanly, were all the merge conflicts trivial / non-functional?

Otherwise LGTM assuming the cover letter is the right one.

@mmaslankaprv
Copy link
Member Author

Since this didn't backport cleanly, were all the merge conflicts trivial / non-functional?

Otherwise LGTM assuming the cover letter is the right one.

yes, they were, related to tests

@mmaslankaprv mmaslankaprv merged commit 25d8e27 into redpanda-data:v22.1.x Jun 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants