Schema registry failed to initialize internal topic: kafka::exception (unknown_topic_or_partition) #5725

Closed
VadimPlh opened this issue Jul 29, 2022 · 2 comments · Fixed by #5729

VadimPlh commented Jul 29, 2022

This error manifests in different ways: essentially, any request to the schema registry can fail if it fails to initialise its internal topic.

SchemaRegistryTest.test_post_subjects_subject_versions
https://buildkite.com/redpanda/redpanda/builds/13286#018246cc-6232-43bb-9134-c81a6ad4b946

Summary: AssertionError()
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/services/cluster.py", line 35, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/schema_registry_test.py", line 377, in test_post_subjects_subject_versions
    assert result_raw.json() == []
AssertionError

SchemaRegistryTest.test_protobuf
https://buildkite.com/redpanda/redpanda/builds/13300#018248a7-cb3a-4031-8a0e-8f4c7187645e

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/services/cluster.py", line 35, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/schema_registry_test.py", line 999, in test_protobuf
    assert result_raw.status_code == requests.codes.unprocessable_entity
AssertionError
@BenPope BenPope self-assigned this Jul 29, 2022

BenPope commented Jul 29, 2022

Schema registry tries to create the topic (it always does this on startup; it's easier than querying for it first):

DEBUG 2022-07-29 08:13:29,627 [shard 0] pandaproxy - service.cc:151 - Schema registry: attempting to create internal topic (replication=1)
WARN  2022-07-29 08:13:29,827 [shard 0] kafka/client - client.cc:156 - broker_error: kafka::client::broker_error ({ node: -1 }, { error_code: broker_not_available [8] })

But the broker wasn't available. That's ok, it hasn't started yet.
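
For reference, a minimal sketch of that create-on-startup pattern (hypothetical names and a stubbed client, not the real code in pandaproxy's service.cc): create the topic unconditionally, treat topic_already_exists as success, and retry transient errors such as broker_not_available.

```cpp
// Minimal sketch only -- hypothetical stand-ins, not the real redpanda code.
#include <chrono>
#include <string>
#include <thread>

enum class error_code { none, topic_already_exists, broker_not_available };

// Stand-in for the kafka client's create_topics request.
error_code create_topic(const std::string& /*name*/, int /*replication*/) {
    return error_code::none;
}

bool ensure_internal_topic(const std::string& name, int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        switch (create_topic(name, 1)) {
        case error_code::none:
        case error_code::topic_already_exists:
            // Creating a topic that already exists is fine: cheaper than
            // querying for it first, and immune to check-then-act races.
            return true;
        case error_code::broker_not_available:
            // Transient: the broker simply hasn't started yet. Back off.
            std::this_thread::sleep_for(
              std::chrono::milliseconds(200 * attempt));
            break;
        }
    }
    return false;
}

int main() { return ensure_internal_topic("_schemas", 5) ? 0 : 1; }
```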

It retries the connection:

INFO  2022-07-29 08:13:29,827 [shard 0] kafka/client - broker.cc:41 - connected to broker:-1 - docker-rp-22:9092
DEBUG 2022-07-29 08:13:29,828 [shard 0] kafka/client - sasl_client.cc:20 - Connecting to broker without authentication
DEBUG 2022-07-29 08:13:29,828 [shard 0] kafka/client - broker.h:63 - Dispatch: metadata req: {topics={nullopt} allow_auto_topic_creation=true include_cluster_authorized_operations=false include_topic_authorized_operations=false}
TRACE 2022-07-29 08:13:29,828 [shard 0] kafka - requests.cc:97 - [172.16.16.5:52604] processing name:metadata, key:3, version:7 for test_client, mem_units: 29588
DEBUG 2022-07-29 08:13:29,828 [shard 0] cluster - health_monitor_backend.cc:431 - requesing cluster state report with filter: {per_node_filter: {include_partitions: false, ntp_filters: {}}, nodes: {}}, force refresh: false
TRACE 2022-07-29 08:13:29,828 [shard 0] kafka - request_context.h:160 - [172.16.16.5:52604] sending 3:metadata response {throttle_time_ms=0 brokers={{node_id=2 host=docker-rp-14 port=9092 rack={nullopt}}, {node_id=1 host=docker-rp-6 port=9092 rack={nullopt}}, {node_id=3 host=docker-rp-22 port=9092 rack={nullopt}}} cluster_id={redpanda.initializing} controller_id=1 topics={} cluster_authorized_operations=-2147483648}
DEBUG 2022-07-29 08:13:29,828 [shard 0] kafka/client - broker.h:65 - Dispatch: metadata res: {throttle_time_ms=0 brokers={{node_id=2 host=docker-rp-14 port=9092 rack={nullopt}}, {node_id=1 host=docker-rp-6 port=9092 rack={nullopt}}, {node_id=3 host=docker-rp-22 port=9092 rack={nullopt}}} cluster_id={redpanda.initializing} controller_id=1 topics={} cluster_authorized_operations=-2147483648}

It got the list of brokers, good.

It connects to them and re-requests topic creation:

INFO  2022-07-29 08:13:29,828 [shard 0] kafka/client - broker.cc:41 - connected to broker:2 - docker-rp-14:9092
DEBUG 2022-07-29 08:13:29,828 [shard 0] kafka/client - sasl_client.cc:20 - Connecting to broker without authentication
INFO  2022-07-29 08:13:29,828 [shard 0] kafka/client - broker.cc:41 - connected to broker:1 - docker-rp-6:9092
DEBUG 2022-07-29 08:13:29,828 [shard 0] kafka/client - sasl_client.cc:20 - Connecting to broker without authentication
INFO  2022-07-29 08:13:29,828 [shard 0] kafka/client - broker.cc:41 - connected to broker:3 - docker-rp-22:9092
DEBUG 2022-07-29 08:13:29,828 [shard 0] kafka/client - sasl_client.cc:20 - Connecting to broker without authentication
DEBUG 2022-07-29 08:13:29,828 [shard 0] kafka/client - broker.h:63 - Dispatch: create_topics req: {topics={{name={_schemas} num_partitions=1 replication_factor=1 assignments={} configs={{name=cleanup.policy value={compact}}}}} timeout_ms=60000 validate_only=false}
DEBUG 2022-07-29 08:13:29,831 [shard 0] cluster - topic_updates_dispatcher.cc:228 - update_leaders_with_estimates: new NTP {kafka/_schemas/0} leader 3
TRACE 2022-07-29 08:13:29,831 [shard 0] cluster - partition_leaders_table.cc:160 - updated partition: {kafka/_schemas/0} leader: {term: 1, current leader: {3}, previous leader: {nullopt}, revision: -9223372036854775808}
DEBUG 2022-07-29 08:13:29,831 [shard 0] kafka/client - broker.h:65 - Dispatch: create_topics res: {throttle_time_ms=0 topics={{name={_schemas} error_code={ error_code: none [0] } error_message={nullopt} num_partitions=-1 replication_factor=-1 configs={nullopt}}}}
DEBUG 2022-07-29 08:13:29,831 [shard 0] pandaproxy - service.cc:170 - Schema registry: created internal topic
DEBUG 2022-07-29 08:13:29,831 [shard 0] pandaproxy - service.cc:186 - Schema registry: loading internal topic

The _schemas topic now exists (the leader is broker 3, but Schema Registry doesn't know this yet).

It requests metadata:

DEBUG 2022-07-29 08:13:30,031 [shard 0] kafka/client - client.cc:178 - partition_error: kafka::client::partition_error ({topic_partition: {_schemas}:0}, { error_code: unknown_topic_or_partition [3] })
DEBUG 2022-07-29 08:13:30,031 [shard 0] kafka/client - client.cc:119 - updating metadata
DEBUG 2022-07-29 08:13:30,031 [shard 0] kafka/client - broker.h:63 - Dispatch: metadata req: {topics={nullopt} allow_auto_topic_creation=true include_cluster_authorized_operations=false include_topic_authorized_operations=false}
DEBUG 2022-07-29 08:13:30,031 [shard 0] kafka/client - broker.h:65 - Dispatch: metadata res: {throttle_time_ms=0 brokers={{node_id=2 host=docker-rp-14 port=9092 rack={nullopt}}, {node_id=1 host=docker-rp-6 port=9092 rack={nullopt}}, {node_id=3 host=docker-rp-22 port=9092 rack={nullopt}}} cluster_id={redpanda.initializing} controller_id=1 topics={{error_code={ error_code: none [0] } name={_schemas} is_internal=false partitions={{error_code={ error_code: none [0] } partition_index=0 leader_id=3 leader_epoch=1 replica_nodes={{3}} isr_nodes={{3}} offline_replicas={}}} topic_authorized_operations=-2147483648}} cluster_authorized_operations=-2147483648}
TRACE 2022-07-29 08:13:30,031 [shard 0] kafka/client - client.cc:137 - updated metadata

The leader of _schemas is 3. Great.

It tries to get the offsets from broker 3 which, conveniently, is the same node:

DEBUG 2022-07-29 08:13:30,031 [shard 0] kafka/client - broker.h:63 - Dispatch: list_offsets req: {replica_id=-2147483648 isolation_level=0 topics={{name={_schemas} partitions={{partition_index=0 current_leader_epoch=-2147483648 timestamp={timestamp: missing} max_num_offsets=1}}}}}
TRACE 2022-07-29 08:13:30,031 [shard 0] kafka - requests.cc:97 - [172.16.16.5:57312] processing name:list_offsets, key:2, version:3 for test_client, mem_units: 8112
TRACE 2022-07-29 08:13:30,031 [shard 0] kafka - list_offsets.cc:250 - Handling request {replica_id=-2147483648 isolation_level=0 topics={{name={_schemas} partitions={{partition_index=0 current_leader_epoch=-2147483648 timestamp={timestamp: missing} max_num_offsets=0}}}}}
TRACE 2022-07-29 08:13:30,031 [shard 0] kafka - request_context.h:160 - [172.16.16.5:57312] sending 2:list_offsets response {throttle_time_ms=0 topics={{name={_schemas} partitions={{partition_index=0 error_code={ error_code: unknown_topic_or_partition [3] } old_style_offsets={} timestamp={timestamp: missing} offset=-1 leader_epoch=-1}}}}}
DEBUG 2022-07-29 08:13:30,031 [shard 0] kafka/client - broker.h:65 - Dispatch: list_offsets res: {throttle_time_ms=0 topics={{name={_schemas} partitions={{partition_index=0 error_code={ error_code: unknown_topic_or_partition [3] } old_style_offsets={} timestamp={timestamp: missing} offset=-1 leader_epoch=-2147483648}}}}}

But broker 3 denies all knowledge of the topic, let alone being its leader. Oops. The create_topics response came back as soon as the controller committed the topic, but the partition hasn't actually been started on broker 3 yet (its raft group only comes up ~20ms later, as the logs further down show).

And then schema registry gives up:

ERROR 2022-07-29 08:13:30,031 [shard 0] pandaproxy - service.cc:136 - Schema registry failed to initialize internal topic: kafka::exception (unknown_topic_or_partition)

Which is a shame, because I thought it would retry.
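
Something like this minimal sketch is the retry I'd expect (hypothetical names and a stubbed client; this is not the actual fix in #5729): treat unknown_topic_or_partition as retriable while loading the topic, since the partition can lag the create_topics response.

```cpp
// Minimal sketch only -- hypothetical stand-ins, not the actual fix in #5729.
#include <chrono>
#include <string>
#include <thread>

enum class error_code { none, unknown_topic_or_partition, other };

// Stand-in for the client's list_offsets request to the partition leader.
error_code list_offsets(const std::string& /*topic*/, int /*partition*/) {
    return error_code::none;
}

bool load_internal_topic(const std::string& topic, int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        switch (list_offsets(topic, 0)) {
        case error_code::none:
            return true;
        case error_code::unknown_topic_or_partition:
            // Transient right after creation: create_topics has returned,
            // but the leader's raft group may not have started yet.
            std::this_thread::sleep_for(
              std::chrono::milliseconds(100 * attempt));
            break;
        default:
            return false; // anything else is fatal during initialization
        }
    }
    return false;
}

int main() { return load_internal_topic("_schemas", 5) ? 0 : 1; }
```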

And then Redpanda figures out the leader:

TRACE 2022-07-29 08:13:30,052 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] consensus.cc:2542 - Updating follower stats with config {current: {voters: {{id: {3}, revision: {16}}}, learners: {}}, old:{nullopt}, revision: 16, update: {nullopt}, version: 4, brokers: {{id: 3, kafka_advertised_listeners: {{dnslistener:{host: docker-rp-22, port: 9092}}, {iplistener:{host: 172.16.16.5, port: 9093}}}, rpc_address: {host: docker-rp-22, port: 33145}, rack: {nullopt}, properties: {cores 1, mem_available 2, disk_available 63}, membership_state: active}}}
DEBUG 2022-07-29 08:13:30,052 [shard 0] offset_translator - ntp: {kafka/_schemas/0} - offset_translator.cc:197 - starting sync with log, state: {base offset/delta: {-9223372036854775808}/0, map size: 1, last delta: 0}, reading offsets -9223372036854775808--9223372036854775808
INFO  2022-07-29 08:13:30,052 [shard 0] offset_translator - ntp: {kafka/_schemas/0} - offset_translator.cc:232 - synced with log, state: {base offset/delta: {-9223372036854775808}/0, map size: 1, last delta: 0}, highest_known_offset: -9223372036854775808
INFO  2022-07-29 08:13:30,052 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] consensus.cc:1295 - started raft, log offsets: {start_offset:-9223372036854775808, committed_offset:-9223372036854775808, committed_offset_term:-9223372036854775808, dirty_offset:-9223372036854775808, dirty_offset_term:-9223372036854775808, last_term_start_offset:-9223372036854775808}, term: 0, configuration: {current: {voters: {{id: {3}, revision: {16}}}, learners: {}}, old:{nullopt}, revision: 16, update: {nullopt}, version: 4, brokers: {{id: 3, kafka_advertised_listeners: {{dnslistener:{host: docker-rp-22, port: 9092}}, {iplistener:{host: 172.16.16.5, port: 9093}}}, rpc_address: {host: docker-rp-22, port: 33145}, rack: {nullopt}, properties: {cores 1, mem_available 2, disk_available 63}, membership_state: active}}}
TRACE 2022-07-29 08:13:30,052 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] vote_stm.cc:302 - Voting for self in term 1
DEBUG 2022-07-29 08:13:30,052 [shard 0] cluster - state_machine.cc:29 - Starting state machine for ntp={kafka/_schemas/0}
TRACE 2022-07-29 08:13:30,052 [shard 0] cluster - controller_backend.cc:1396 - adding 16 to shard table at {kafka/_schemas/0}
DEBUG 2022-07-29 08:13:30,052 [shard 0] cluster - controller_backend.cc:678 - partition operation {type: addition, ntp: {kafka/_schemas/0}, offset: 16, new_assignment: { id: 0, group_id: 1, replicas: {{node_id: 3, shard: 0}} }, previous_replica_set: {nullopt}} finished
TRACE 2022-07-29 08:13:30,062 [shard 0] kvstore - kvstore.cc:193 - Apply op: insert: key={bytes:10} value={{bytes=20, fragments=1}}
TRACE 2022-07-29 08:13:30,062 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] vote_stm.cc:246 - vote acks in term 1 from: {{id: {3}, revision: {16}}}
INFO  2022-07-29 08:13:30,062 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] vote_stm.cc:255 - becoming the leader term:1
DEBUG 2022-07-29 08:13:30,062 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] consensus.cc:2552 - triggering leadership notification with term: 1, new leader: {{id: {3}, revision: {16}}}
TRACE 2022-07-29 08:13:30,062 [shard 0] cluster - partition_leaders_table.cc:160 - updated partition: {kafka/_schemas/0} leader: {term: 1, current leader: {3}, previous leader: {3}, revision: 16}
TRACE 2022-07-29 08:13:30,062 [shard 0] cluster - metadata_dissemination_service.cc:88 - Dissemination request for {kafka/_schemas/0}, leader 3
DEBUG 2022-07-29 08:13:30,062 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] consensus.cc:2091 - Replicating group configuration {current: {voters: {{id: {3}, revision: {16}}}, learners: {}}, old:{nullopt}, revision: 16, update: {nullopt}, version: 4, brokers: {{id: 3, kafka_advertised_listeners: {{dnslistener:{host: docker-rp-22, port: 9092}}, {iplistener:{host: 172.16.16.5, port: 9093}}}, rpc_address: {host: docker-rp-22, port: 33145}, rack: {nullopt}, properties: {cores 1, mem_available 2, disk_available 63}, membership_state: active}}}
TRACE 2022-07-29 08:13:30,062 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] replicate_entries_stm.cc:186 - Self append entries - {raft_group:{1}, commit_index:{-9223372036854775808}, term:{1}, prev_log_index:{-9223372036854775808}, prev_log_term:{-9223372036854775808}}
INFO  2022-07-29 08:13:30,062 [shard 0] storage - segment.cc:614 - Creating new segment /var/lib/redpanda/data/kafka/_schemas/0_16/0-1-v1.log
TRACE 2022-07-29 08:13:30,062 [shard 0] storage - segment_reader.cc:106 - ::get segment file /var/lib/redpanda/data/kafka/_schemas/0_16/0-1-v1.log, refcount=0
DEBUG 2022-07-29 08:13:30,062 [shard 0] storage - segment_reader.cc:110 - Opening segment file /var/lib/redpanda/data/kafka/_schemas/0_16/0-1-v1.log
TRACE 2022-07-29 08:13:30,062 [shard 0] storage - segment_reader.cc:133 - ::put segment file /var/lib/redpanda/data/kafka/_schemas/0_16/0-1-v1.log, refcount=1
DEBUG 2022-07-29 08:13:30,062 [shard 0] storage - segment_reader.cc:137 - Closing segment file /var/lib/redpanda/data/kafka/_schemas/0_16/0-1-v1.log
TRACE 2022-07-29 08:13:30,063 [shard 0] offset_translator - ntp: {kafka/_schemas/0} - offset_translator.cc:66 - adding batch, offsets: [0,0], delta: 1
TRACE 2022-07-29 08:13:30,063 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] consensus.cc:2542 - Updating follower stats with config {current: {voters: {{id: {3}, revision: {16}}}, learners: {}}, old:{nullopt}, revision: 16, update: {nullopt}, version: 4, brokers: {{id: 3, kafka_advertised_listeners: {{dnslistener:{host: docker-rp-22, port: 9092}}, {iplistener:{host: 172.16.16.5, port: 9093}}}, rpc_address: {host: docker-rp-22, port: 33145}, rack: {nullopt}, properties: {cores 1, mem_available 2, disk_available 63}, membership_state: active}}}
TRACE 2022-07-29 08:13:30,063 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] configuration_manager.cc:154 - Adding configuration at offset 0 index 1: {current: {voters: {{id: {3}, revision: {16}}}, learners: {}}, old:{nullopt}, revision: 16, update: {nullopt}, version: 4, brokers: {{id: 3, kafka_advertised_listeners: {{dnslistener:{host: docker-rp-22, port: 9092}}, {iplistener:{host: 172.16.16.5, port: 9093}}}, rpc_address: {host: docker-rp-22, port: 33145}, rack: {nullopt}, properties: {cores 1, mem_available 2, disk_available 63}, membership_state: active}}}
TRACE 2022-07-29 08:13:30,083 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] replicate_entries_stm.cc:196 - Leader append result: {append_time:6044337, base_offset:{0}, last_offset:{0}, byte_size:216}
TRACE 2022-07-29 08:13:30,084 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] consensus.cc:2196 - flushed offset updated: 0
TRACE 2022-07-29 08:13:30,084 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] consensus.cc:2504 - Leader commit index updated 0
TRACE 2022-07-29 08:13:30,084 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] replicate_entries_stm.cc:374 - Replication result [offset: 0, term: 1, commit_idx: 0, current_term: 1]
TRACE 2022-07-29 08:13:30,084 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] replicate_entries_stm.cc:407 - Replication success, last offset: 0, term: 1
TRACE 2022-07-29 08:13:30,084 [shard 0] storage - readers_cache.cc:101 - {kafka/_schemas/0} - trying to get reader for: {start_offset:{0}, max_offset:{0}, min_bytes:0, max_bytes:18446744073709551615, type_filter:nullopt, first_timestamp:nullopt}
TRACE 2022-07-29 08:13:30,084 [shard 0] storage - readers_cache.cc:130 - {kafka/_schemas/0} - reader cache miss for: {start_offset:{0}, max_offset:{0}, min_bytes:0, max_bytes:18446744073709551615, type_filter:nullopt, first_timestamp:nullopt}
TRACE 2022-07-29 08:13:30,084 [shard 0] storage - readers_cache.cc:75 - {kafka/_schemas/0} - adding reader [0,0]
TRACE 2022-07-29 08:13:30,084 [shard 0] storage - readers_cache.cc:322 - {kafka/_schemas/0} - removing reader: [0,0] lower_bound: 1
INFO  2022-07-29 08:13:30,084 [shard 0] raft - [group_id:1, {kafka/_schemas/0}] vote_stm.cc:270 - became the leader term:1

@mmedenjak mmedenjak added kind/bug Something isn't working area/schema-registry Schema Registry service within Redpanda area/tests labels Jul 29, 2022
@BenPope BenPope changed the title SchemaRegistryTest.test_post_subjects_subject_versions Schema registry failed to initialize internal topic: kafka::exception (unknown_topic_or_partition) Aug 1, 2022

BenPope commented Aug 1, 2022

Also seen:
