SIGSEGV in msquic on Linux ARM32 #103404

Closed
MichalStrehovsky opened this issue Jun 13, 2024 · 7 comments · Fixed by #105109
Labels
arch-arm32 area-System.Net.Quic os-linux Linux OS (any supported distro) test-run-core Test failures in .NET Core test runs tracking-external-issue The issue is caused by external problem (e.g. OS) - nothing we can do to fix it directly
Comments

@MichalStrehovsky
Member

We often see SIGSEGV crashes in System.Net.Http.Functional.Tests on Linux ARM32 in native AOT testing. I couldn't find any Linux ARM32 runs on top of CoreCLR, so I don't know whether we run these tests there.

Most recently in https://dev.azure.com/dnceng-public/public/_build/results?buildId=706302&view=logs&jobId=a8f24b3c-c71a-5a83-5031-ad8ed12efa6f.

I pulled down the core file and managed to find the msquic transport package to get symbols. The crash is in msquic:

(lldb) bt
* thread #1, name = 'System.Net.Http', stop reason = signal SIGSEGV
  * frame #0: 0xef4f799c libmsquic.so.2`QuicSendCanSendStreamNow(Stream=<unavailable>) at send.c:956:1
    frame #1: 0xef4c7204 libmsquic.so.2`QuicConnProcessPeerTransportParameters at connection.c.clog.h.lttng.h:1253:1
    frame #2: 0xef4c71a8 libmsquic.so.2`QuicConnProcessPeerTransportParameters(Connection=0x00000000, FromResumptionTicket='\x80') at connection.c:2976:13

Grab the dump and test symbols with runfo get-helix-payload -j fa19deed-d149-4234-8c44-9e123af06c24 -w System.Net.Http.Functional.Tests -o c:\myhell. Grab the transport package from https://dnceng.visualstudio.com/public/_artifacts/feed/dotnet9-transport/NuGet/runtime.linux-arm.runtime.native.System.Net.MsQuic.Transport/overview/9.0.0-alpha.1.24167.3

I don't know whether this belongs in the native AOT or the networking area path. I also don't know whether we do any regular testing on Linux-arm32 with CoreCLR (not musl-arm32, just arm32).

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jun 13, 2024
@MichalStrehovsky MichalStrehovsky added arch-arm32 os-linux Linux OS (any supported distro) labels Jun 13, 2024
@filipnavara
Member

I remember checking this one in the past and it was not NativeAOT specific.

Contributor

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

@ManickaP
Member

There's an ARM32 issue in MsQuic (microsoft/msquic#3958) that is fixed, but not out yet. The callstack is different though. @nibanks this looks like it might be a problem in MsQuic.

@ManickaP ManickaP removed the untriaged New issue has not been triaged by the area owner label Jun 13, 2024
@ManickaP ManickaP added this to the Future milestone Jun 13, 2024
@janvorli
Member

Looking at the call stack above, I can see that frame 2 has Connection=0x00000000. Maybe that's the source of the problem?

@liveans
Member

liveans commented Jun 21, 2024

@ManickaP Do you think it's worth closing this as a duplicate of #103703?

@ManickaP
Member

Aren't those different call stacks? We could probably merge them into one issue and copy the details from here over there.

@janvorli
Member

The other issue looks quite different, I don't think it would make sense to merge them together.

@karelz karelz added the tracking-external-issue The issue is caused by external problem (e.g. OS) - nothing we can do to fix it directly label Jun 25, 2024
@ManickaP ManickaP assigned liveans and unassigned liveans Jun 25, 2024
@liveans liveans added the test-run-core Test failures in .NET Core test runs label Jul 15, 2024
@liveans liveans removed their assignment Jul 18, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Aug 19, 2024
6 participants