Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AddressSanitizer: SEGV on unknown address #5575

Closed
dotnwat opened this issue Jul 22, 2022 · 11 comments
Closed

AddressSanitizer: SEGV on unknown address #5575

dotnwat opened this issue Jul 22, 2022 · 11 comments
Assignees
Labels

Comments

@dotnwat
Copy link
Member

dotnwat commented Jul 22, 2022

https://buildkite.com/redpanda/redpanda/builds/12919#018225ac-ab26-48df-92dd-dc5ab24fa10f/1500-9014

TRACE 2022-07-22 12:44:06,694 [shard 0] kafka - request_context.h:160 - [172.16.16.27:43258] sending 18:api_versions response {error_code={ error_code: none [0] } api_keys={{api_key=0 min_version=0 max_version=7}, {api_key=1 min_version=4 max_version=11}, {api_key=2 min_version=0 max_version=4}, {api_key=3 min_version=0 max_version=7}, {api_key=9 min_version=1 max_version=7}, {api_key=10 min_version=0 max_version=2}, {api_key=16 min_version=0 max_version=2}, {api_key=18 min_version=0 max_version=3}, {api_key=11 min_version=0 max_version=5}, {api_key=12 min_version=0 max_version=3}, {api_key=13 min_version=0 max_version=3}, {api_key=14 min_version=0 max_version=3}, {api_key=19 min_version=0 max_version=5}, {api_key=8 min_version=1 max_version=7}, {api_key=32 min_version=0 max_version=2}, {api_key=33 min_version=0 max_version=1}, {api_key=20 min_version=0 max_version=3}, {api_key=15 min_version=0 max_version=4}, {api_key=17 min_version=0 max_version=1}, {api_key=36 min_version=0 max_version=1}, {api_key=44 min_version=0 max_version=0}, {api_key=42 min_version=0 max_version=1}, {api_key=29 min_version=0 max_version=1}, {api_key=35 min_version=0 max_version=1}, {api_key=30 min_version=0 max_version=1}, {api_key=31 min_version=0 max_version=1}, {api_key=22 min_version=0 max_version=1}, {api_key=37 min_version=0 max_version=1}, {api_key=23 min_version=0 max_version=3}} throttle_time_ms=0}
TRACE 2022-07-22 12:44:06,697 [shard 0] kafka - requests.cc:97 - [172.16.16.27:43258] processing name:metadata, key:3, version:4 for rdkafka, mem_units: 29580
DEBUG 2022-07-22 12:44:06,697 [shard 0] cluster - health_monitor_backend.cc:430 - requesing cluster state report with filter: {per_node_filter: {include_partitions: false, ntp_filters: {}}, nodes: {}}, force refresh: false
TRACE 2022-07-22 12:44:06,698 [shard 0] kafka - request_context.h:160 - [172.16.16.27:43258] sending 3:metadata response {throttle_time_ms=0 brokers={{node_id=3 host=docker-rp-7 port=9092 rack={nullopt}}, {node_id=2 host=docker-rp-20 port=9092 rack={nullopt}}, {node_id=1 host=docker-rp-10 port=9092 rack={nullopt}}} cluster_id={redpanda.initializing} controller_id=1 topics={} cluster_authorized_operations=-2147483648}
TRACE 2022-07-22 12:44:06,700 [shard 0] kafka - requests.cc:97 - [172.16.16.27:43258] processing name:metadata, key:3, version:4 for rdkafka, mem_units: 29580
DEBUG 2022-07-22 12:44:06,700 [shard 0] cluster - health_monitor_backend.cc:430 - requesing cluster state report with filter: {per_node_filter: {include_partitions: false, ntp_filters: {}}, nodes: {}}, force refresh: false
TRACE 2022-07-22 12:44:06,701 [shard 0] kafka - request_context.h:160 - [172.16.16.27:43258] sending 3:metadata response {throttle_time_ms=0 brokers={{node_id=3 host=docker-rp-7 port=9092 rack={nullopt}}, {node_id=2 host=docker-rp-20 port=9092 rack={nullopt}}, {node_id=1 host=docker-rp-10 port=9092 rack={nullopt}}} cluster_id={redpanda.initializing} controller_id=1 topics={} cluster_authorized_operations=-2147483648}
TRACE 2022-07-22 12:44:06,702 [shard 0] rpc - conn_quota.cc:118 - put(172.16.16.27)
AddressSanitizer:DEADLYSIGNAL
=================================================================
==2960==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x7f48a8dc018f bp 0x7fff2b519840 sp 0x7fff2b5195a0 T0)
==2960==The signal is caused by a READ memory access.
==2960==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.
@dotnwat dotnwat added kind/bug Something isn't working ci-failure labels Jul 22, 2022
@dotnwat
Copy link
Member Author

dotnwat commented Jul 22, 2022

@abhijat i noticed you reported an address sanitizer issue a couple days ago related to config. #5507

This may or may not be related, but immediately after the last log line here (conn:quota.cc:118 put) we access a configuration value if (_cfg.max_connections()) {

@mmaslankaprv
Copy link
Member

This may be related:

TRACE 2022-07-27 07:49:05,273 [shard 0] rpc - conn_quota.cc:140 - get(172.16.16.27)
TRACE 2022-07-27 07:49:05,273 [shard 0] rpc - conn_quota.cc:147 - Global conn limit disabled
TRACE 2022-07-27 07:49:05,273 [shard 0] rpc - server.cc:230 - kafka rpc protocol - Incoming connection from 172.16.16.27:39778 on "dnslistener"
TRACE 2022-07-27 07:49:05,274 [shard 0] kafka - requests.cc:97 - [172.16.16.27:39778] processing name:api_versions, key:18, version:3 for rdkafka, mem_units: 8124
TRACE 2022-07-27 07:49:05,275 [shard 0] kafka - request_context.h:160 - [172.16.16.27:39778] sending 18:api_versions response {error_code={ error_code: none [0] } api_keys={{api_key=0 min_version=0 max_version=7}, {api_key=1 min_version=4 max_version=11}, {api_key=2 min_version=0 max_version=4}, {api_key=3 min_version=0 max_version=7}, {api_key=9 min_version=1 max_version=7}, {api_key=10 min_version=0 max_version=2}, {api_key=16 min_version=0 max_version=2}, {api_key=18 min_version=0 max_version=3}, {api_key=11 min_version=0 max_version=5}, {api_key=12 min_version=0 max_version=3}, {api_key=13 min_version=0 max_version=3}, {api_key=14 min_version=0 max_version=3}, {api_key=19 min_version=0 max_version=5}, {api_key=8 min_version=1 max_version=7}, {api_key=32 min_version=0 max_version=2}, {api_key=33 min_version=0 max_version=1}, {api_key=20 min_version=0 max_version=3}, {api_key=15 min_version=0 max_version=4}, {api_key=17 min_version=0 max_version=1}, {api_key=36 min_version=0 max_version=1}, {api_key=44 min_version=0 max_version=0}, {api_key=42 min_version=0 max_version=1}, {api_key=29 min_version=0 max_version=1}, {api_key=35 min_version=0 max_version=1}, {api_key=30 min_version=0 max_version=1}, {api_key=31 min_version=0 max_version=1}, {api_key=22 min_version=0 max_version=1}, {api_key=37 min_version=0 max_version=1}, {api_key=23 min_version=0 max_version=3}} throttle_time_ms=0}
TRACE 2022-07-27 07:49:05,278 [shard 0] kafka - requests.cc:97 - [172.16.16.27:39778] processing name:metadata, key:3, version:4 for rdkafka, mem_units: 31116
DEBUG 2022-07-27 07:49:05,278 [shard 0] cluster - health_monitor_backend.cc:430 - requesing cluster state report with filter: {per_node_filter: {include_partitions: false, ntp_filters: {}}, nodes: {}}, force refresh: false
TRACE 2022-07-27 07:49:05,279 [shard 0] kafka - request_context.h:160 - [172.16.16.27:39778] sending 3:metadata response {throttle_time_ms=0 brokers={{node_id=5 host=docker-rp-21 port=9092 rack={E}}, {node_id=3 host=docker-rp-17 port=9092 rack={C}}, {node_id=1 host=docker-rp-5 port=9092 rack={A}}, {node_id=4 host=docker-rp-8 port=9092 rack={D}}, {node_id=2 host=docker-rp-6 port=9092 rack={B}}, {node_id=6 host=docker-rp-22 port=9092 rack={F}}} cluster_id={redpanda.f66f4221-6103-4b9b-b93f-d2399cdd732f} controller_id=1 topics={} cluster_authorized_operations=-2147483648}
TRACE 2022-07-27 07:49:05,281 [shard 0] kafka - requests.cc:97 - [172.16.16.27:39778] processing name:metadata, key:3, version:4 for rdkafka, mem_units: 31116
DEBUG 2022-07-27 07:49:05,281 [shard 0] cluster - health_monitor_backend.cc:430 - requesing cluster state report with filter: {per_node_filter: {include_partitions: false, ntp_filters: {}}, nodes: {}}, force refresh: false
TRACE 2022-07-27 07:49:05,282 [shard 0] kafka - request_context.h:160 - [172.16.16.27:39778] sending 3:metadata response {throttle_time_ms=0 brokers={{node_id=5 host=docker-rp-21 port=9092 rack={E}}, {node_id=3 host=docker-rp-17 port=9092 rack={C}}, {node_id=1 host=docker-rp-5 port=9092 rack={A}}, {node_id=4 host=docker-rp-8 port=9092 rack={D}}, {node_id=2 host=docker-rp-6 port=9092 rack={B}}, {node_id=6 host=docker-rp-22 port=9092 rack={F}}} cluster_id={redpanda.f66f4221-6103-4b9b-b93f-d2399cdd732f} controller_id=1 topics={} cluster_authorized_operations=-2147483648}
TRACE 2022-07-27 07:49:05,284 [shard 0] rpc - conn_quota.cc:118 - put(172.16.16.27)
AddressSanitizer:DEADLYSIGNAL
=================================================================
==4025==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x7f287269218f bp 0x7fffaeb74c00 sp 0x7fffaeb74960 T0)
==4025==The signal is caused by a READ memory access.
==4025==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.
FAIL test: RackAwarePlacementTest.test_replica_placement.rack_layout_str=ABCDEF.num_partitions=400.replication_factor=5.num_topics=2 (1/28 runs)
  failure at 2022-07-27T08:36:30.473Z: NodeCrash([(<ducktape.cluster.cluster.ClusterNode object at 0x7f65fa248fa0>, '==4025==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x7f287269218f bp 0x7fffaeb74c00 sp 0x7fffaeb74960 T0)\n')])
      in job https://buildkite.com/redpanda/redpanda/builds/13167#01823e5c-0dc6-448f-8382-23ae5c2e2291

@mmaslankaprv
Copy link
Member

mmaslankaprv commented Jul 28, 2022

Another one like this:

TRACE 2022-07-28 08:08:18,744 [shard 0] rpc - conn_quota.cc:147 - Global conn limit disabled
TRACE 2022-07-28 08:08:18,744 [shard 0] rpc - server.cc:230 - kafka rpc protocol - Incoming connection from 172.16.16.27:49292 on "dnslistener"
TRACE 2022-07-28 08:08:18,746 [shard 0] kafka - requests.cc:97 - [172.16.16.27:49292] processing name:api_versions, key:18, version:3 for rdkafka, mem_units: 8124
TRACE 2022-07-28 08:08:18,746 [shard 0] kafka - request_context.h:160 - [172.16.16.27:49292] sending 18:api_versions response {error_code={ error_code: none [0] } api_keys={{api_key=0 min_version=0 max_version=7}, {api_key=1 min_version=4 max_version=11}, {api_key=2 min_version=0 max_version=4}, {api_key=3 min_version=0 max_version=7}, {api_key=9 min_version=1 max_version=7}, {api_key=10 min_version=0 max_version=2}, {api_key=16 min_version=0 max_version=2}, {api_key=18 min_version=0 max_version=3}, {api_key=11 min_version=0 max_version=5}, {api_key=12 min_version=0 max_version=3}, {api_key=13 min_version=0 max_version=3}, {api_key=14 min_version=0 max_version=3}, {api_key=19 min_version=0 max_version=5}, {api_key=8 min_version=1 max_version=7}, {api_key=32 min_version=0 max_version=2}, {api_key=33 min_version=0 max_version=1}, {api_key=20 min_version=0 max_version=3}, {api_key=15 min_version=0 max_version=4}, {api_key=17 min_version=0 max_version=1}, {api_key=36 min_version=0 max_version=1}, {api_key=44 min_version=0 max_version=0}, {api_key=42 min_version=0 max_version=1}, {api_key=29 min_version=0 max_version=1}, {api_key=35 min_version=0 max_version=1}, {api_key=30 min_version=0 max_version=1}, {api_key=31 min_version=0 max_version=1}, {api_key=22 min_version=0 max_version=1}, {api_key=37 min_version=0 max_version=1}, {api_key=23 min_version=0 max_version=3}} throttle_time_ms=0}
TRACE 2022-07-28 08:08:18,749 [shard 0] kafka - requests.cc:97 - [172.16.16.27:49292] processing name:metadata, key:3, version:4 for rdkafka, mem_units: 30092
DEBUG 2022-07-28 08:08:18,749 [shard 0] cluster - health_monitor_backend.cc:430 - requesing cluster state report with filter: {per_node_filter: {include_partitions: false, ntp_filters: {}}, nodes: {}}, force refresh: false
TRACE 2022-07-28 08:08:18,750 [shard 0] kafka - request_context.h:160 - [172.16.16.27:49292] sending 3:metadata response {throttle_time_ms=0 brokers={{node_id=3 host=docker-rp-23 port=9092 rack={nullopt}}, {node_id=2 host=docker-rp-22 port=9092 rack={nullopt}}, {node_id=4 host=docker-rp-24 port=9092 rack={nullopt}}, {node_id=1 host=docker-rp-21 port=9092 rack={nullopt}}} cluster_id={redpanda.f4c206e1-e25d-4621-888c-8088dd896827} controller_id=1 topics={} cluster_authorized_operations=-2147483648}
TRACE 2022-07-28 08:08:18,752 [shard 0] kafka - requests.cc:97 - [172.16.16.27:49292] processing name:metadata, key:3, version:4 for rdkafka, mem_units: 30092
DEBUG 2022-07-28 08:08:18,753 [shard 0] cluster - health_monitor_backend.cc:430 - requesing cluster state report with filter: {per_node_filter: {include_partitions: false, ntp_filters: {}}, nodes: {}}, force refresh: false
TRACE 2022-07-28 08:08:18,753 [shard 0] kafka - request_context.h:160 - [172.16.16.27:49292] sending 3:metadata response {throttle_time_ms=0 brokers={{node_id=3 host=docker-rp-23 port=9092 rack={nullopt}}, {node_id=2 host=docker-rp-22 port=9092 rack={nullopt}}, {node_id=4 host=docker-rp-24 port=9092 rack={nullopt}}, {node_id=1 host=docker-rp-21 port=9092 rack={nullopt}}} cluster_id={redpanda.f4c206e1-e25d-4621-888c-8088dd896827} controller_id=1 topics={} cluster_authorized_operations=-2147483648}
TRACE 2022-07-28 08:08:18,755 [shard 0] rpc - conn_quota.cc:118 - put(172.16.16.27)
AddressSanitizer:DEADLYSIGNAL
=================================================================
==6426==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x7fb6ed3ad18f bp 0x7ffd696b8900 sp 0x7ffd696b8660 T0)
==6426==The signal is caused by a READ memory access.
==6426==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.

https://ci-artifacts.dev.vectorized.cloud/redpanda/01824390-cb8d-42dc-85a8-d5109df02d76/vbuild/ducktape/results/2022-07-28--001/report.html

dotnwat added a commit to dotnwat/redpanda that referenced this issue Jul 28, 2022
Shot in the dark trying to debug
redpanda-data#5575

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
@ztlpn
Copy link
Contributor

ztlpn commented Aug 15, 2022

Happened again here

TRACE 2022-08-15 13:29:37,481 [shard 0] kafka - request_context.h:160 - [172.16.16.27:47064] sending 3:metadata response {throttle_time_ms=0 brokers={{node_id=3 host=docker-rp-23 port=9092 rack={nullopt}}, {node_id=1 host=docker-rp-21 port=9092 rack={nullopt}}, {node_id=2 host=docker-rp-22 port=9092 rack={nullopt}}} cluster_id={redpanda.initializing} controller_id=1 topics={} cluster_authorized_operations=-2147483648}
TRACE 2022-08-15 13:29:37,483 [shard 0] rpc - conn_quota.cc:118 - put(172.16.16.27)
TRACE 2022-08-15 13:29:37,483 [shard 0] rpc - conn_quota.cc:130 - leaving put(172.16.16.27)
AddressSanitizer:DEADLYSIGNAL
=================================================================
==1303==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x7f46237ed18f bp 0x7ffe69254d00 sp 0x7ffe69254a60 T0)
==1303==The signal is caused by a READ memory access.
==1303==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.

@ztlpn
Copy link
Contributor

ztlpn commented Aug 16, 2022

Another one here

@jcsp
Copy link
Contributor

jcsp commented Sep 20, 2022

An instance of "membership failed to stabilize" where one of the nodes had crashed with this backtrace-less segfault
https://buildkite.com/redpanda/redpanda/builds/15463#0183560e-d5c1-4cfe-ab92-9fa0149f700b

TRACE 2022-09-19 15:14:56,846 [shard 0] rpc - conn_quota.cc:118 - put(172.16.16.27)
TRACE 2022-09-19 15:14:56,846 [shard 0] rpc - conn_quota.cc:130 - leaving put(172.16.16.27)
AddressSanitizer:DEADLYSIGNAL
=================================================================
==15776==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x7f9c9e71b18f bp 0x7ffeee6add40 sp 0x7ffeee6adaa0 T0)
==15776==The signal is caused by a READ memory access.
==15776==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.

@mmaslankaprv
Copy link
Member

We asked devprod team to provide a way to store coredumps on Ci to help us investigate this failure

@jcsp
Copy link
Contributor

jcsp commented Sep 26, 2022

@dotnwat
Copy link
Member Author

dotnwat commented Sep 26, 2022

https://buildkite.com/redpanda/redpanda/builds/15755#01837ad0-b2b6-4a08-a088-b2d257defc38

this one looks different in that it doesn't seem to have this common log prefix ending with

TRACE 2022-09-19 15:14:56,846 [shard 0] rpc - conn_quota.cc:130 - leaving put(172.16.16.27)

?

@andrwng
Copy link
Contributor

andrwng commented Oct 7, 2022

Seen again in https://buildkite.com/redpanda/redpanda/builds/16208#0183a9c9-4cb9-4911-a107-536c79f8f3d5

https://ci-artifacts.dev.vectorized.cloud/redpanda/0183a9c9-4cb9-4911-a107-536c79f8f3d5/vbuild/ducktape/results/2022-10-05--001/report.html

TRACE 2022-10-05 21:21:59,819 [shard 0] rpc - conn_quota.cc:118 - put(172.16.16.27)
TRACE 2022-10-05 21:21:59,819 [shard 0] rpc - conn_quota.cc:130 - leaving put(172.16.16.27)
AddressSanitizer:DEADLYSIGNAL
=================================================================
==13054==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x7f8ae449d18f bp 0x7ffc12ec9540 sp 0x7ffc12ec92a0 T0)
==13054==The signal is caused by a READ memory access.
==13054==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.```

@dotnwat
Copy link
Member Author

dotnwat commented Nov 2, 2022

#6769 is the potential fix. Please re-open if you see this again.

@dotnwat dotnwat closed this as completed Nov 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants