TBS 6985 not detected after driver install #56

Open
utelisysadmin opened this issue Jun 14, 2017 · 6 comments

Comments

@utelisysadmin

The server contains 4 TBS 6991 cards and 2 TBS 6985 cards.

They all work with the proprietary driver, but the server freezes every 4.2 hours, so I decided to give the open-source driver a go.

uname -a
Linux tv-box 4.8.0-54-generic #57~16.04.1-Ubuntu SMP Wed May 24 16:22:28 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

lspci -v | grep --after-context=10 7160
0e:00.0 Multimedia controller: Philips Semiconductors SAA7160 (rev 03)
Subsystem: Device 6991:0002
Flags: bus master, fast devsel, latency 0, IRQ 47
Memory at de600000 (64-bit, non-prefetchable) [size=1M]
Capabilities: [40] MSI: Enable+ Count=1/32 Maskable- 64bit+
Capabilities: [50] Express Endpoint, MSI 00
Capabilities: [74] Power Management version 2
Capabilities: [80] Vendor Specific Information: Len=50
Capabilities: [100] Vendor Specific Information: ID=0000 Rev=0 Len=088
Kernel driver in use: SAA716x Budget
Kernel modules: saa716x_budget

0f:00.0 Multimedia controller: Philips Semiconductors SAA7160 (rev 03)
Subsystem: Device 6991:0002
Flags: bus master, fast devsel, latency 0, IRQ 49
Memory at de500000 (64-bit, non-prefetchable) [size=1M]
Capabilities: [40] MSI: Enable+ Count=1/32 Maskable- 64bit+
Capabilities: [50] Express Endpoint, MSI 00
Capabilities: [74] Power Management version 2
Capabilities: [80] Vendor Specific Information: Len=50
Capabilities: [100] Vendor Specific Information: ID=0000 Rev=0 Len=088
Kernel driver in use: SAA716x Budget
Kernel modules: saa716x_budget

12:00.0 Multimedia controller: Philips Semiconductors SAA7160 (rev 03)
Subsystem: Device 6991:0002
Flags: bus master, fast devsel, latency 0, IRQ 50
Memory at de800000 (64-bit, non-prefetchable) [size=1M]
Capabilities: [40] MSI: Enable+ Count=1/32 Maskable- 64bit+
Capabilities: [50] Express Endpoint, MSI 00
Capabilities: [74] Power Management version 2
Capabilities: [80] Vendor Specific Information: Len=50
Capabilities: [100] Vendor Specific Information: ID=0000 Rev=0 Len=088
Kernel driver in use: SAA716x Budget
Kernel modules: saa716x_budget

13:00.0 Multimedia controller: Philips Semiconductors SAA7160 (rev 03)
Subsystem: Device 6991:0002
Flags: bus master, fast devsel, latency 0, IRQ 52
Memory at de700000 (64-bit, non-prefetchable) [size=1M]
Capabilities: [40] MSI: Enable+ Count=1/32 Maskable- 64bit+
Capabilities: [50] Express Endpoint, MSI 00
Capabilities: [74] Power Management version 2
Capabilities: [80] Vendor Specific Information: Len=50
Capabilities: [100] Vendor Specific Information: ID=0000 Rev=0 Len=088
Kernel driver in use: SAA716x Budget
Kernel modules: saa716x_budget

15:00.0 Multimedia controller: Philips Semiconductors SAA7160 (rev 02)
Subsystem: Device 6985:0002
Physical Slot: 6
Flags: fast devsel, IRQ 16
Memory at de900000 (64-bit, non-prefetchable) [size=1M]
Capabilities: [40] MSI: Enable- Count=1/32 Maskable- 64bit+
Capabilities: [50] Express Endpoint, MSI 00
Capabilities: [74] Power Management version 2
Capabilities: [80] Vendor Specific Information: Len=50
Capabilities: [100] Vendor Specific Information: ID=0000 Rev=0 Len=088
Kernel modules: saa716x_budget

1a:00.0 Multimedia controller: Philips Semiconductors SAA7160 (rev 02)
Subsystem: Device 6985:0002
Flags: fast devsel, IRQ 19
Memory at de400000 (64-bit, non-prefetchable) [size=1M]
Capabilities: [40] MSI: Enable- Count=1/32 Maskable- 64bit+
Capabilities: [50] Express Endpoint, MSI 00
Capabilities: [74] Power Management version 2
Capabilities: [80] Vendor Specific Information: Len=50
Capabilities: [100] Vendor Specific Information: ID=0000 Rev=0 Len=088
Kernel modules: saa716x_budget
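
In the output above, the two 6985 cards (15:00.0 and 1a:00.0) have a "Kernel modules" line but no "Kernel driver in use" line. A minimal sketch for picking out such devices automatically, assuming `lspci -v` output without the PCI domain prefix; the function name `unbound_saa7160` is made up for illustration:

```shell
#!/bin/sh
# Read `lspci -v` output on stdin and print the PCI address of every
# SAA7160 device that has no "Kernel driver in use:" line.
unbound_saa7160() {
    awk '
        # A line starting with a PCI address (e.g. "15:00.0 ") opens a new device block.
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-9a-f] / {
            if (addr != "" && !bound) print addr
            addr = ($0 ~ /SAA7160/) ? $1 : ""
            bound = 0
            next
        }
        /Kernel driver in use:/ { bound = 1 }
        # Flush the last device block.
        END { if (addr != "" && !bound) print addr }
    '
}
```

It could be used as `lspci -v | unbound_saa7160`.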

kern.log

Jun 14 17:38:11 tv-box kernel: [ 1143.283435] media: loading out-of-tree module taints kernel.
Jun 14 17:38:11 tv-box kernel: [ 1143.283524] media: module verification failed: signature and/or required key missing - tainting kernel
Jun 14 17:38:11 tv-box kernel: [ 1143.285000] media: Linux media interface: v0.10
Jun 14 17:38:11 tv-box kernel: [ 1143.289672] WARNING: You are using an experimental version of the media stack.
Jun 14 17:38:11 tv-box kernel: [ 1143.289672] As the driver is backported to an older kernel, it doesn't offer
Jun 14 17:38:11 tv-box kernel: [ 1143.289672] enough quality for its usage in production.
Jun 14 17:38:11 tv-box kernel: [ 1143.289672] Use it with care.
Jun 14 17:38:11 tv-box kernel: [ 1143.289672] Latest git patches (needed if you report a bug to linux-media@vger.kernel.org):
Jun 14 17:38:11 tv-box kernel: [ 1143.289672] 989e2fe tbsecp3: Added TBS6281SE
Jun 14 17:38:11 tv-box kernel: [ 1143.320448] dvbdev: DVB: registering new adapter (SAA716x dvb adapter)
Jun 14 17:38:11 tv-box kernel: [ 1143.320967] i2c i2c-5: Added multiplexed i2c bus 6
Jun 14 17:38:11 tv-box kernel: [ 1143.321078] i2c i2c-5: Added multiplexed i2c bus 7
Jun 14 17:38:12 tv-box kernel: [ 1143.430955] i2c i2c-7: av201x: Airoha Technology AV201x successfully attached
Jun 14 17:38:12 tv-box kernel: [ 1143.430964] SAA716x Budget 0000:0e:00.0: DVB: registering adapter 0 frontend 0 (Tmax TAS2101)...
Jun 14 17:38:12 tv-box kernel: [ 1143.431187] dvbdev: DVB: registering new adapter (SAA716x dvb adapter)
Jun 14 17:38:12 tv-box kernel: [ 1143.432507] i2c i2c-4: Added multiplexed i2c bus 8
Jun 14 17:38:12 tv-box kernel: [ 1143.432660] i2c i2c-4: Added multiplexed i2c bus 9
Jun 14 17:38:12 tv-box kernel: [ 1143.540242] i2c i2c-9: av201x: Airoha Technology AV201x successfully attached
Jun 14 17:38:12 tv-box kernel: [ 1143.540248] SAA716x Budget 0000:0e:00.0: DVB: registering adapter 1 frontend 0 (Tmax TAS2101)...
Jun 14 17:38:12 tv-box kernel: [ 1143.568407] dvbdev: DVB: registering new adapter (SAA716x dvb adapter)
Jun 14 17:38:12 tv-box kernel: [ 1143.568979] i2c i2c-11: Added multiplexed i2c bus 12
Jun 14 17:38:12 tv-box kernel: [ 1143.569099] i2c i2c-11: Added multiplexed i2c bus 13
Jun 14 17:38:12 tv-box kernel: [ 1143.676248] i2c i2c-13: av201x: Airoha Technology AV201x successfully attached
Jun 14 17:38:12 tv-box kernel: [ 1143.676256] SAA716x Budget 0000:0f:00.0: DVB: registering adapter 2 frontend 0 (Tmax TAS2101)...
Jun 14 17:38:12 tv-box kernel: [ 1143.676457] dvbdev: DVB: registering new adapter (SAA716x dvb adapter)
Jun 14 17:38:12 tv-box kernel: [ 1143.677045] i2c i2c-10: Added multiplexed i2c bus 14
Jun 14 17:38:12 tv-box kernel: [ 1143.677173] i2c i2c-10: Added multiplexed i2c bus 15
Jun 14 17:38:12 tv-box kernel: [ 1143.784209] i2c i2c-15: av201x: Airoha Technology AV201x successfully attached
Jun 14 17:38:12 tv-box kernel: [ 1143.784215] SAA716x Budget 0000:0f:00.0: DVB: registering adapter 3 frontend 0 (Tmax TAS2101)...
Jun 14 17:38:12 tv-box kernel: [ 1143.812426] dvbdev: DVB: registering new adapter (SAA716x dvb adapter)
Jun 14 17:38:12 tv-box kernel: [ 1143.812996] i2c i2c-17: Added multiplexed i2c bus 18
Jun 14 17:38:12 tv-box kernel: [ 1143.813128] i2c i2c-17: Added multiplexed i2c bus 19
Jun 14 17:38:12 tv-box kernel: [ 1143.920233] i2c i2c-19: av201x: Airoha Technology AV201x successfully attached
Jun 14 17:38:12 tv-box kernel: [ 1143.920238] SAA716x Budget 0000:12:00.0: DVB: registering adapter 4 frontend 0 (Tmax TAS2101)...
Jun 14 17:38:12 tv-box kernel: [ 1143.920466] dvbdev: DVB: registering new adapter (SAA716x dvb adapter)
Jun 14 17:38:12 tv-box kernel: [ 1143.921037] i2c i2c-16: Added multiplexed i2c bus 20
Jun 14 17:38:12 tv-box kernel: [ 1143.921155] i2c i2c-16: Added multiplexed i2c bus 21
Jun 14 17:38:12 tv-box kernel: [ 1144.028257] i2c i2c-21: av201x: Airoha Technology AV201x successfully attached
Jun 14 17:38:12 tv-box kernel: [ 1144.028262] SAA716x Budget 0000:12:00.0: DVB: registering adapter 5 frontend 0 (Tmax TAS2101)...
Jun 14 17:38:12 tv-box kernel: [ 1144.056452] dvbdev: DVB: registering new adapter (SAA716x dvb adapter)
Jun 14 17:38:12 tv-box kernel: [ 1144.057018] i2c i2c-23: Added multiplexed i2c bus 24
Jun 14 17:38:12 tv-box kernel: [ 1144.057144] i2c i2c-23: Added multiplexed i2c bus 25
Jun 14 17:38:12 tv-box kernel: [ 1144.164242] i2c i2c-25: av201x: Airoha Technology AV201x successfully attached
Jun 14 17:38:12 tv-box kernel: [ 1144.164248] SAA716x Budget 0000:13:00.0: DVB: registering adapter 6 frontend 0 (Tmax TAS2101)...
Jun 14 17:38:12 tv-box kernel: [ 1144.164494] dvbdev: DVB: registering new adapter (SAA716x dvb adapter)
Jun 14 17:38:12 tv-box kernel: [ 1144.165078] i2c i2c-22: Added multiplexed i2c bus 26
Jun 14 17:38:12 tv-box kernel: [ 1144.165210] i2c i2c-22: Added multiplexed i2c bus 27
Jun 14 17:38:12 tv-box kernel: [ 1144.272230] i2c i2c-27: av201x: Airoha Technology AV201x successfully attached
Jun 14 17:38:12 tv-box kernel: [ 1144.272235] SAA716x Budget 0000:13:00.0: DVB: registering adapter 7 frontend 0 (Tmax TAS2101)...

I also have tbs.conf in /etc/modprobe.d/:
options tbs_pcie-dvb tbs_int_type=1
options saa716x_tbs-dvb int_type=1

lsmod | grep budget
saa716x_budget 40960 0
tas2101 24576 9 saa716x_budget
cx24117 36864 1 saa716x_budget
saa716x_core 73728 1 saa716x_budget
dvb_core 131072 2 saa716x_budget,saa716x_core
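
The option lines in tbs.conf name `tbs_pcie-dvb` and `saa716x_tbs-dvb`, while the module shown as loaded here is `saa716x_budget`. An `options` line only affects the module it names, so if those two names belong to the proprietary driver (an assumption), the settings would not apply to the open-source module. A rough sketch that lists modules mentioned in a modprobe.d file but absent from saved `lsmod` output; the function name is hypothetical, and dashes are normalised to underscores because modprobe treats them as interchangeable:

```shell
#!/bin/sh
# $1: a modprobe.d file, $2: a file holding `lsmod` output.
# Prints modules that have "options" lines but are not loaded.
unused_option_modules() {
    awk '
        # First file: collect module names from "options <module> ..." lines.
        NR == FNR {
            if ($1 == "options") { m = $2; gsub(/-/, "_", m); want[m] = 1 }
            next
        }
        # Second file: the first column of lsmod output is the module name.
        { loaded[$1] = 1 }
        END { for (m in want) if (!(m in loaded)) print m }
    ' "$1" "$2" | sort
}
```

For example: `lsmod > /tmp/lsmod.txt; unused_option_modules /etc/modprobe.d/tbs.conf /tmp/lsmod.txt`.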

I followed the instructions on the wiki to install the driver.

@crazycat69

8 adapters were detected, so maybe you need to increase the maximum number of adapters?
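
The count can be checked against the log: the kern.log excerpt above contains exactly eight "registering new adapter" lines, two per 6991 card. A minimal sketch for counting them; the function name is invented for illustration:

```shell
#!/bin/sh
# Count DVB adapter registrations in a kernel log.
# Reads the files given as arguments, or stdin if none are given.
count_dvb_adapters() {
    grep -c 'DVB: registering new adapter' "$@"
}
```

For example: `count_dvb_adapters /var/log/kern.log`, or `dmesg | count_dvb_adapters`.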

@utelisysadmin
Author

It was my understanding that this had been increased to 48, but I think I am mixing up the closed-source repo with the open-source one.

I increased it to 64 and all cards are detected now; however, none of the hardware CAMs are detected.
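
For anyone wanting to verify the limit a build was configured with: in the mainline kernel the adapter limit is the Kconfig symbol `CONFIG_DVB_MAX_ADAPTERS`. The sketch below reads it from a kernel-style config file. Which file applies to this driver tree is an assumption (possibly `v4l/.config` in a media_build checkout, or `/boot/config-$(uname -r)` for the distribution kernel), and the function name is hypothetical:

```shell
#!/bin/sh
# $1: path to a kernel-style config file.
# Prints the configured value of CONFIG_DVB_MAX_ADAPTERS, if present.
dvb_max_adapters() {
    sed -n 's/^CONFIG_DVB_MAX_ADAPTERS=\([0-9][0-9]*\)$/\1/p' "$1"
}
```

For example: `dvb_max_adapters "/boot/config-$(uname -r)"`.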

@utelisysadmin
Author

Actually getting kernel traces:

kernel.txt

@utelisysadmin
Author

Wiped the driver and re-did the installation. Now all cards and CAMs are detected, however when I start more than 4 cards they all start throwing discontinuities.

@crazycat69

Try disabling MSI interrupts.
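
Whether MSI is actually in use can be read from the `lspci -v` capability line, which shows `MSI: Enable+` or `MSI: Enable-` for each device (as in the output at the top of this issue). A small sketch that summarises this for the SAA7160 devices; the function name and the `msi-on`/`msi-off` labels are made up for illustration:

```shell
#!/bin/sh
# Read `lspci -v` output on stdin and print "<address> msi-on" or
# "<address> msi-off" for each SAA7160 device.
saa7160_msi_state() {
    awk '
        # A line starting with a PCI address opens a new device block.
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-9a-f] / {
            addr = ($0 ~ /SAA7160/) ? $1 : ""
            next
        }
        # Only the MSI capability line carries "MSI: Enable+" or "MSI: Enable-".
        addr != "" && /MSI: Enable[+-]/ {
            print addr, ($0 ~ /MSI: Enable\+/ ? "msi-on" : "msi-off")
        }
    '
}
```

Running `lspci -v | saa7160_msi_state` before and after changing the interrupt setting shows whether the change took effect.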

@utelisysadmin
Author

I did. It's the same situation. I went back to the closed-source driver to confirm. There I get discontinuities as well, but not as many.

crazycat69 pushed a commit that referenced this issue Sep 22, 2017
The following warning was triggered by missing srcu locks around
the storage key handling functions.

=============================
WARNING: suspicious RCU usage
4.12.0+ #56 Not tainted
-----------------------------
./include/linux/kvm_host.h:572 suspicious rcu_dereference_check() usage!
rcu_scheduler_active = 2, debug_locks = 1
 1 lock held by live_migration/4936:
  #0:  (&mm->mmap_sem){++++++}, at: [<0000000000141be0>]
kvm_arch_vm_ioctl+0x6b8/0x22d0

 CPU: 8 PID: 4936 Comm: live_migration Not tainted 4.12.0+ #56
 Hardware name: IBM 2964 NC9 704 (LPAR)
 Call Trace:
 ([<000000000011378a>] show_stack+0xea/0xf0)
  [<000000000055cc4c>] dump_stack+0x94/0xd8
  [<000000000012ee70>] gfn_to_memslot+0x1a0/0x1b8
  [<0000000000130b76>] gfn_to_hva+0x2e/0x48
  [<0000000000141c3c>] kvm_arch_vm_ioctl+0x714/0x22d0
  [<000000000013306c>] kvm_vm_ioctl+0x11c/0x7b8
  [<000000000037e2c0>] do_vfs_ioctl+0xa8/0x6c8
  [<000000000037e984>] SyS_ioctl+0xa4/0xb8
  [<00000000008b20a4>] system_call+0xc4/0x27c
 1 lock held by live_migration/4936:
  #0:  (&mm->mmap_sem){++++++}, at: [<0000000000141be0>]
kvm_arch_vm_ioctl+0x6b8/0x22d0

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Pierre Morel<pmorel@linux.vnet.ibm.com>
crazycat69 pushed a commit that referenced this issue Mar 14, 2018
The race is between lookup_get_idr_uobject and
uverbs_idr_remove_uobj -> uverbs_uobject_put.

We deliberately do not call sychronize_rcu after the idr_remove in
uverbs_idr_remove_uobj for performance reasons, instead we call
kfree_rcu() during uverbs_uobject_put.

However, this means we can obtain pointers to uobj's that have
already been released and must protect against krefing them
using kref_get_unless_zero.

==================================================================
BUG: KASAN: use-after-free in copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
Read of size 4 at addr ffff88005fda1ac8 by task syz-executor2/441

CPU: 1 PID: 441 Comm: syz-executor2 Not tainted 4.15.0-rc2+ #56
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xd4
print_address_description+0x73/0x290
kasan_report+0x25c/0x370
? copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
? uverbs_try_lock_object+0x68/0xc0
? modify_qp.isra.7+0xdc4/0x10e0
modify_qp.isra.7+0xdc4/0x10e0
ib_uverbs_modify_qp+0xfe/0x170
? ib_uverbs_query_qp+0x970/0x970
? __lock_acquire+0xa11/0x1da0
ib_uverbs_write+0x55a/0xad0
? ib_uverbs_query_qp+0x970/0x970
? ib_uverbs_query_qp+0x970/0x970
? ib_uverbs_open+0x760/0x760
? futex_wake+0x147/0x410
? sched_clock_cpu+0x18/0x180
? check_prev_add+0x1680/0x1680
? do_futex+0x3b6/0xa30
? sched_clock_cpu+0x18/0x180
__vfs_write+0xf7/0x5c0
? ib_uverbs_open+0x760/0x760
? kernel_read+0x110/0x110
? lock_acquire+0x370/0x370
? __fget+0x264/0x3b0
vfs_write+0x18a/0x460
SyS_write+0xc7/0x1a0
? SyS_read+0x1a0/0x1a0
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL_64_fastpath+0x18/0x85
RIP: 0033:0x448e29
RSP: 002b:00007f443fee0c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f443fee16bc RCX: 0000000000448e29
RDX: 0000000000000078 RSI: 00000000209f8000 RDI: 0000000000000012
RBP: 000000000070bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000008e98 R14: 00000000006ebf38 R15: 0000000000000000

Allocated by task 1:
kmem_cache_alloc_trace+0x16c/0x2f0
mlx5_alloc_cmd_msg+0x12e/0x670
cmd_exec+0x419/0x1810
mlx5_cmd_exec+0x40/0x70
mlx5_core_mad_ifc+0x187/0x220
mlx5_MAD_IFC+0xd7/0x1b0
mlx5_query_mad_ifc_gids+0x1f3/0x650
mlx5_ib_query_gid+0xa4/0xc0
ib_query_gid+0x152/0x1a0
ib_query_port+0x21e/0x290
mlx5_port_immutable+0x30f/0x490
ib_register_device+0x5dd/0x1130
mlx5_ib_add+0x3e7/0x700
mlx5_add_device+0x124/0x510
mlx5_register_interface+0x11f/0x1c0
mlx5_ib_init+0x56/0x61
do_one_initcall+0xa3/0x250
kernel_init_freeable+0x309/0x3b8
kernel_init+0x14/0x180
ret_from_fork+0x24/0x30

Freed by task 1:
kfree+0xeb/0x2f0
mlx5_free_cmd_msg+0xcd/0x140
cmd_exec+0xeba/0x1810
mlx5_cmd_exec+0x40/0x70
mlx5_core_mad_ifc+0x187/0x220
mlx5_MAD_IFC+0xd7/0x1b0
mlx5_query_mad_ifc_gids+0x1f3/0x650
mlx5_ib_query_gid+0xa4/0xc0
ib_query_gid+0x152/0x1a0
ib_query_port+0x21e/0x290
mlx5_port_immutable+0x30f/0x490
ib_register_device+0x5dd/0x1130
mlx5_ib_add+0x3e7/0x700
mlx5_add_device+0x124/0x510
mlx5_register_interface+0x11f/0x1c0
mlx5_ib_init+0x56/0x61
do_one_initcall+0xa3/0x250
kernel_init_freeable+0x309/0x3b8
kernel_init+0x14/0x180
ret_from_fork+0x24/0x30

The buggy address belongs to the object at ffff88005fda1ab0
which belongs to the cache kmalloc-32 of size 32
The buggy address is located 24 bytes inside of
32-byte region [ffff88005fda1ab0, ffff88005fda1ad0)
The buggy address belongs to the page:
page:00000000d5655c19 count:1 mapcount:0 mapping: (null)
index:0xffff88005fda1fc0
flags: 0x4000000000000100(slab)
raw: 4000000000000100 0000000000000000 ffff88005fda1fc0 0000000180550008
raw: ffffea00017f6780 0000000400000004 ffff88006c803980 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff88005fda1980: fc fc fb fb fb fb fc fc fb fb fb fb fc fc fb fb
ffff88005fda1a00: fb fb fc fc fb fb fb fb fc fc 00 00 00 00 fc fc
ffff88005fda1a80: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
ffff88005fda1b00: fc fc 00 00 00 00 fc fc fb fb fb fb fc fc fb fb
ffff88005fda1b80: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
==================================================================@

Cc: syzkaller <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # 4.11
Fixes: 3832125 ("IB/core: Add support for idr types")
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
crazycat69 pushed a commit that referenced this issue Jan 28, 2020
…isten()

With multi-transport support, listener sockets are not bound to any
transport. So, calling virtio_transport_reset(), when an error
occurs, on a listener socket produces the following null-pointer
dereference:

  BUG: kernel NULL pointer dereference, address: 00000000000000e8
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 0 PID: 20 Comm: kworker/0:1 Not tainted 5.5.0-rc1-ste-00003-gb4be21f316ac-dirty #56
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
  Workqueue: virtio_vsock virtio_transport_rx_work [vmw_vsock_virtio_transport]
  RIP: 0010:virtio_transport_send_pkt_info+0x20/0x130 [vmw_vsock_virtio_transport_common]
  Code: 1f 84 00 00 00 00 00 0f 1f 00 55 48 89 e5 41 57 41 56 41 55 49 89 f5 41 54 49 89 fc 53 48 83 ec 10 44 8b 76 20 e8 c0 ba fe ff <48> 8b 80 e8 00 00 00 e8 64 e3 7d c1 45 8b 45 00 41 8b 8c 24 d4 02
  RSP: 0018:ffffc900000b7d08 EFLAGS: 00010282
  RAX: 0000000000000000 RBX: ffff88807bf12728 RCX: 0000000000000000
  RDX: ffff88807bf12700 RSI: ffffc900000b7d50 RDI: ffff888035c84000
  RBP: ffffc900000b7d40 R08: ffff888035c84000 R09: ffffc900000b7d08
  R10: ffff8880781de800 R11: 0000000000000018 R12: ffff888035c84000
  R13: ffffc900000b7d50 R14: 0000000000000000 R15: ffff88807bf12724
  FS:  0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000000e8 CR3: 00000000790f4004 CR4: 0000000000160ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   virtio_transport_reset+0x59/0x70 [vmw_vsock_virtio_transport_common]
   virtio_transport_recv_pkt+0x5bb/0xe50 [vmw_vsock_virtio_transport_common]
   ? detach_buf_split+0xf1/0x130
   virtio_transport_rx_work+0xba/0x130 [vmw_vsock_virtio_transport]
   process_one_work+0x1c0/0x300
   worker_thread+0x45/0x3c0
   kthread+0xfc/0x130
   ? current_work+0x40/0x40
   ? kthread_park+0x90/0x90
   ret_from_fork+0x35/0x40
  Modules linked in: sunrpc kvm_intel kvm vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common irqbypass vsock virtio_rng rng_core
  CR2: 00000000000000e8
  ---[ end trace e75400e2ea2fa824 ]---

This happens because virtio_transport_reset() calls
virtio_transport_send_pkt_info() that can be used only on
connecting/connected sockets.

This patch fixes the issue, using virtio_transport_reset_no_sock()
instead of virtio_transport_reset() when we are handling a listener
socket.

Fixes: c0cfa2d ("vsock: add multi-transports support")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
crazycat69 pushed a commit that referenced this issue Feb 1, 2021
I was hitting the below panic continuously when attaching kprobes to
scheduler functions

	[  159.045212] Unexpected kernel BRK exception at EL1
	[  159.053753] Internal error: BRK handler: f2000006 [#1] PREEMPT SMP
	[  159.059954] Modules linked in:
	[  159.063025] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.11.0-rc4-00008-g1e2a199f6ccd #56
	[rt-app] <notice> [1] Exiting.[  159.071166] Hardware name: ARM Juno development board (r2) (DT)
	[  159.079689] pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO BTYPE=--)

	[  159.085723] pc : 0xffff80001624501c
	[  159.089377] lr : attach_entity_load_avg+0x2ac/0x350
	[  159.094271] sp : ffff80001622b640
	[rt-app] <notice> [0] Exiting.[  159.097591] x29: ffff80001622b640 x28: 0000000000000001
	[  159.105515] x27: 0000000000000049 x26: ffff000800b79980

	[  159.110847] x25: ffff00097ef37840 x24: 0000000000000000
	[  159.116331] x23: 00000024eacec1ec x22: ffff00097ef12b90
	[  159.121663] x21: ffff00097ef37700 x20: ffff800010119170
	[rt-app] <notice> [11] Exiting.[  159.126995] x19: ffff00097ef37840 x18: 000000000000000e
	[  159.135003] x17: 0000000000000001 x16: 0000000000000019
	[  159.140335] x15: 0000000000000000 x14: 0000000000000000
	[  159.145666] x13: 0000000000000002 x12: 0000000000000002
	[  159.150996] x11: ffff80001592f9f0 x10: 0000000000000060
	[  159.156327] x9 : ffff8000100f6f9c x8 : be618290de0999a1
	[  159.161659] x7 : ffff80096a4b1000 x6 : 0000000000000000
	[  159.166990] x5 : ffff00097ef37840 x4 : 0000000000000000
	[  159.172321] x3 : ffff000800328948 x2 : 0000000000000000
	[  159.177652] x1 : 0000002507d52fec x0 : ffff00097ef12b90
	[  159.182983] Call trace:
	[  159.185433]  0xffff80001624501c
	[  159.188581]  update_load_avg+0x2d0/0x778
	[  159.192516]  enqueue_task_fair+0x134/0xe20
	[  159.196625]  enqueue_task+0x4c/0x2c8
	[  159.200211]  ttwu_do_activate+0x70/0x138
	[  159.204147]  sched_ttwu_pending+0xbc/0x160
	[  159.208253]  flush_smp_call_function_queue+0x16c/0x320
	[  159.213408]  generic_smp_call_function_single_interrupt+0x1c/0x28
	[  159.219521]  ipi_handler+0x1e8/0x3c8
	[  159.223106]  handle_percpu_devid_irq+0xd8/0x460
	[  159.227650]  generic_handle_irq+0x38/0x50
	[  159.231672]  __handle_domain_irq+0x6c/0xc8
	[  159.235781]  gic_handle_irq+0xcc/0xf0
	[  159.239452]  el1_irq+0xb4/0x180
	[  159.242600]  rcu_is_watching+0x28/0x70
	[  159.246359]  rcu_read_lock_held_common+0x44/0x88
	[  159.250991]  rcu_read_lock_any_held+0x30/0xc0
	[  159.255360]  kretprobe_dispatcher+0xc4/0xf0
	[  159.259555]  __kretprobe_trampoline_handler+0xc0/0x150
	[  159.264710]  trampoline_probe_handler+0x38/0x58
	[  159.269255]  kretprobe_trampoline+0x70/0xc4
	[  159.273450]  run_rebalance_domains+0x54/0x80
	[  159.277734]  __do_softirq+0x164/0x684
	[  159.281406]  irq_exit+0x198/0x1b8
	[  159.284731]  __handle_domain_irq+0x70/0xc8
	[  159.288840]  gic_handle_irq+0xb0/0xf0
	[  159.292510]  el1_irq+0xb4/0x180
	[  159.295658]  arch_cpu_idle+0x18/0x28
	[  159.299245]  default_idle_call+0x9c/0x3e8
	[  159.303265]  do_idle+0x25c/0x2a8
	[  159.306502]  cpu_startup_entry+0x2c/0x78
	[  159.310436]  secondary_start_kernel+0x160/0x198
	[  159.314984] Code: d42000c0 aa1e03e9 d42000c0 aa1e03e9 (d42000c0)

After a bit of head scratching and debugging it turned out that it is
due to kprobe handler being interrupted by a tick that causes us to go
into (I think another) kprobe handler.

The culprit was kprobe_breakpoint_ss_handler() returning DBG_HOOK_ERROR
which leads to the Unexpected kernel BRK exception.

Reverting commit ba090f9 ("arm64: kprobes: Remove redundant
kprobe_step_ctx") seemed to fix the problem for me.

Further analysis showed that kcb->kprobe_status is set to
KPROBE_REENTER when the error occurs. By teaching
kprobe_breakpoint_ss_handler() to handle this status I can no  longer
reproduce the problem.

Fixes: ba090f9 ("arm64: kprobes: Remove redundant kprobe_step_ctx")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20210122110909.3324607-1-qais.yousef@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

2 participants