Crash at latest under GraalVM #386

Open
sgammon opened this issue Jun 1, 2024 · 3 comments
Labels: bug (Something isn't working), upstream/coroutines

Comments

sgammon commented Jun 1, 2024

We have a suite of "self tests" in our binary which use the Jest sample. The output looks like this normally:

Running Elide self-tests...

 PASS  elide.tool.engine/OsArchTest
... bunch of tests ...
 PASS  elide.tool.engine/RubyEngineTest

Tests: 17 passed, 17 total
Time:  0s (207ms) 

Running these on the latest GraalVM release, with the latest release of Mosaic, produces a crash:

Process 1239 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x300000008)
    frame #0: 0x0000000104bd33d4 elide`dUpdater$AtomicReferenceFieldUpdaterImpl_accessCheck_Pp7eKPE4nbGIVfcKJI1jE5 + 84
elide`dUpdater$AtomicReferenceFieldUpdaterImpl_accessCheck_Pp7eKPE4nbGIVfcKJI1jE5:
->  0x104bd33d4 <+84>: ldr    w3, [x3, x4]
    0x104bd33d8 <+88>: ldrh   w4, [x2, #0x8a]
    0x104bd33dc <+92>: ldrh   w2, [x2, #0x8c]
    0x104bd33e0 <+96>: and    w3, w3, #0xffff

After investigating further, I was able to produce an exception:

java.lang.ClassCastException
	at java.base@23/java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.throwAccessCheckException(AtomicReferenceFieldUpdater.java:418)
	at java.base@23/java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.accessCheck(AtomicReferenceFieldUpdater.java:409)
	at java.base@23/java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.get(AtomicReferenceFieldUpdater.java:466)
	at kotlinx.coroutines.CancellableContinuationImpl.getParentHandle(CancellableContinuationImpl.kt:103)
	at kotlinx.coroutines.CancellableContinuationImpl.detachChild$kotlinx_coroutines_core(CancellableContinuationImpl.kt:569)
	at kotlinx.coroutines.CancellableContinuationImpl.detachChildIfNonResuable(CancellableContinuationImpl.kt:562)
	at kotlinx.coroutines.CancellableContinuationImpl.resumeImpl$kotlinx_coroutines_core(CancellableContinuationImpl.kt:503)
	at kotlinx.coroutines.CancellableContinuationImpl.resumeImpl$kotlinx_coroutines_core$default(CancellableContinuationImpl.kt:493)
	at kotlinx.coroutines.CancellableContinuationImpl.resumeUndispatched(CancellableContinuationImpl.kt:596)
	at kotlinx.coroutines.EventLoopImplBase$DelayedResumeTask.run(EventLoop.common.kt:497)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:263)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:95)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:69)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:47)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at elide.tool.cli.AbstractToolCommand.execute(AbstractToolCommand.kt:129)
        # (abridged)

The snippets of code in question:

AbstractToolCommand.kt:

runMosaicBlocking {
  runCatching {
    // (run all discovered self-tests)
  }
}

I have shortened this to use the counter example instead. The code I'm using is:

suspend fun MosaicScope.runCounter() {
  // TODO https://github.com/JakeWharton/mosaic/issues/3
  var count by mutableIntStateOf(0)

  setContent {
    Text("The count is: $count")
  }

  for (i in 1..20) {
    delay(250)
    count = i
  }
}

Versions:

Component  | Version
Kotlin     | 2.0.0
Compose    | 2.0.0 (built-in)
Mosaic     | 0.13.0
Coroutines | 1.9.0-RC
AtomicFU   | 0.24.0

The relevant lines from Coroutines show that this is failing at an AtomicFU callsite:
[Screenshot 2024-06-01 at 1:46:23 AM; image not reproduced here]

I have tried really hard to debug further and gotten nowhere. Some other details I've been able to confirm:

  • This started with a recent version of Coroutines; our previous release, built with the working version matrix below, does not show this issue
  • Rolling back the atomicfu or coroutines version alone does not fix it (I've pinned and made sure)
  • Our "debug" build, which uses a different suite of flags, but the same code and versions, does not display the issue
  • The bug only surfaces in optimized build modes, and changes based on the optimization mode
    • -O3 and -O4 produce crashes (bus error or segmentation fault)
    • -O2 produces the stacktrace above
    • -O1 and -O0 seem to produce a bus error

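For readers who want to repeat the pinning experiment from the list above, one possible way to force the two libraries back to the previously working versions in a Gradle Kotlin DSL build is sketched here. This is an assumption about how such a pin could be written; the issue does not show the actual build configuration, and the fragment is hypothetical.

```kotlin
// build.gradle.kts (hypothetical fragment; the real build script is not shown in this issue)
configurations.all {
  // Force the versions listed in the previously working matrix,
  // regardless of what transitive dependencies request.
  resolutionStrategy.force(
    "org.jetbrains.kotlinx:kotlinx-coroutines-core:1.8.0",
    "org.jetbrains.kotlinx:atomicfu:0.23.2",
  )
}
```

As noted in the list, pinning these two alone did not make the crash go away.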
The failure ultimately happens because System.classData is called, which is substituted under SVM. AtomicReferenceFieldUpdater calls this as part of accessCheck, through a failed call to isInstance; the failed isInstance check throws with throwAccessCheckException, which uses reflection and is present in the stacktrace.
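To make the accessCheck path concrete, here is a minimal plain-JVM sketch of how AtomicReferenceFieldUpdater.get ends up in throwAccessCheckException. The names Holder, Unrelated, updater, and readThroughUpdater are hypothetical and exist only for illustration; nothing here is taken from kotlinx.coroutines, and the sketch does not reproduce the native-image crash. It only shows the ordinary behavior: the updater rejects any receiver that is not an instance of the class it was created for.

```kotlin
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater

// Hypothetical classes for illustration only.
class Holder {
  // The updater requires a volatile, non-static field of the declared type.
  @Volatile @JvmField var value: String? = "initial"
}

class Unrelated

// Updater bound to Holder.value, created the same way field updaters are normally created.
val updater: AtomicReferenceFieldUpdater<Holder, String> =
  AtomicReferenceFieldUpdater.newUpdater(Holder::class.java, String::class.java, "value")

// Reads through the updater with an arbitrary receiver. Generics are erased at
// runtime, so the instance check inside the updater is the only guard.
fun readThroughUpdater(target: Any): Result<String?> {
  @Suppress("UNCHECKED_CAST")
  val raw = updater as AtomicReferenceFieldUpdater<Any, String>
  return runCatching { raw.get(target) }
}

fun main() {
  // Receiver is a Holder: the instance check passes and the field is read.
  println(readThroughUpdater(Holder()))
  // Receiver is not a Holder: the instance check fails and get() throws.
  println(readThroughUpdater(Unrelated()).exceptionOrNull())
}
```

On a regular JVM the second call fails with a message-less ClassCastException, the same exception type seen at the top of the stacktrace above. In the crash reported here the receiver is presumably of the correct type, which is why a failing instance check inside the native image looks suspicious.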

Working version matrix (previous):

Component  | Version
Kotlin     | 2.0.0-RC3
Compose    | 1.5.13-dev-k2.0.0-RC1-50f08dfa4b4 (Android)
Mosaic     | 0.11.0
Coroutines | 1.8.0
AtomicFU   | 0.23.2

I intend to cross-file with the Coroutines and GraalVM teams. I can supply test binaries, symbols, and a reproducer in-situ.

JakeWharton (Owner) commented

Thanks for such a detailed issue description. I'll follow along at the coroutines issue for now, as I suspect this behavior is entirely out of our control at this layer.

sgammon (Author) commented Jun 5, 2024

@JakeWharton I thought initially that Mosaic might be off the hook, but actually this might be multiple issues (or just one really complex one), and at least coroutines thinks it may have to do with Mosaic. I've isolated this to a simple reproducer; see the tracker issue at elide-dev/elide#972. Back on our own codebase, I can isolate this into an interesting stacktrace:

...
	Suppressed: kotlinx.coroutines.CoroutinesInternalError: Fatal exception in coroutines machinery for DispatchedContinuation[BlockingEventLoop@4a98064b, Continuation at com.jakewharton.mosaic.MosaicKt$runMosaic$2.invokeSuspend(mosaic.kt)@6732a44]. Please read KDoc to 'handleFatalException' method and report this incident to maintainers
		at kotlinx.coroutines.DispatchedTask.handleFatalException$kotlinx_coroutines_core(DispatchedTask.kt:140)
		... 21 more
	Caused by: java.lang.IncompatibleClassChangeError
		at kotlin.coroutines.CombinedContext.get(CoroutineContextImpl.kt:125)
		at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:93)
		... 20 more
...
Full stacktrace:
Exception in thread "main" java.lang.RuntimeException: Exception while trying to handle coroutine exception
	at kotlinx.coroutines.CoroutineExceptionHandlerKt.handlerException(CoroutineExceptionHandler.kt:35)
	at kotlinx.coroutines.CoroutineExceptionHandlerKt.handleCoroutineException(CoroutineExceptionHandler.kt:26)
	at kotlinx.coroutines.DispatchedTask.handleFatalException$kotlinx_coroutines_core(DispatchedTask.kt:142)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:111)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:263)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:95)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:69)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:47)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at com.jakewharton.mosaic.BlockingKt.runMosaicBlocking(blocking.kt:6)
	at elide.tool.cli.AbstractToolCommand.call(AbstractToolCommand.kt:260)
	at elide.tool.cli.AbstractToolCommand.call(AbstractToolCommand.kt:30)
	at picocli.CommandLine.executeUserObject(CommandLine.java:2045)
	at picocli.CommandLine.access$1500(CommandLine.java:148)
	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2465)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2457)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2419)
	at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2277)
	at picocli.CommandLine$RunLast.execute(CommandLine.java:2421)
	at picocli.CommandLine.execute(CommandLine.java:2174)
	at elide.tool.cli.Elide$Companion.exec$runtime(Elide.kt:232)
	at elide.tool.cli.Elide$Companion.entry(Elide.kt:195)
	at elide.tool.cli.ElideKt.main(Elide.kt:305)
	Suppressed: kotlinx.coroutines.CoroutinesInternalError: Fatal exception in coroutines machinery for DispatchedContinuation[BlockingEventLoop@4a98064b, Continuation at com.jakewharton.mosaic.MosaicKt$runMosaic$2.invokeSuspend(mosaic.kt)@6732a44]. Please read KDoc to 'handleFatalException' method and report this incident to maintainers
		at kotlinx.coroutines.DispatchedTask.handleFatalException$kotlinx_coroutines_core(DispatchedTask.kt:140)
		... 21 more
	Caused by: java.lang.IncompatibleClassChangeError
		at kotlin.coroutines.CombinedContext.get(CoroutineContextImpl.kt:125)
		at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:93)
		... 20 more
Exception in thread "main" java.lang.IncompatibleClassChangeError
	at kotlin.coroutines.CombinedContext.fold(CoroutineContextImpl.kt:131)
	at kotlin.coroutines.CombinedContext.toString(CoroutineContextImpl.kt:174)
	at kotlinx.coroutines.internal.DiagnosticCoroutineContextException.getLocalizedMessage(CoroutineExceptionHandlerImpl.kt:37)
	at [email protected]/java.lang.Throwable.toString(Throwable.java:519)
	at [email protected]/java.lang.String.valueOf(String.java:4507)
	at [email protected]/java.lang.Throwable.printEnclosedStackTrace(Throwable.java:745)
	at [email protected]/java.lang.Throwable.lockedPrintStackTrace(Throwable.java:713)
	at [email protected]/java.lang.Throwable.printStackTrace(Throwable.java:695)
	at [email protected]/java.lang.Throwable.printStackTrace(Throwable.java:682)
	at [email protected]/java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:698)
	at [email protected]/java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:690)
	at kotlinx.coroutines.internal.CoroutineExceptionHandlerImplKt.propagateExceptionFinalResort(CoroutineExceptionHandlerImpl.kt:31)
	at kotlinx.coroutines.internal.CoroutineExceptionHandlerImpl_commonKt.handleUncaughtCoroutineException(CoroutineExceptionHandlerImpl.common.kt:48)
	at kotlinx.coroutines.CoroutineExceptionHandlerKt.handleCoroutineException(CoroutineExceptionHandler.kt:26)
	at kotlinx.coroutines.DispatchedTask.handleFatalException$kotlinx_coroutines_core(DispatchedTask.kt:142)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:111)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:263)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:95)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:69)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:47)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at com.jakewharton.mosaic.BlockingKt.runMosaicBlocking(blocking.kt:6)
	at elide.tool.cli.AbstractToolCommand.call(AbstractToolCommand.kt:260)
	at elide.tool.cli.AbstractToolCommand.call(AbstractToolCommand.kt:30)
	at picocli.CommandLine.executeUserObject(CommandLine.java:2045)
	at picocli.CommandLine.access$1500(CommandLine.java:148)
	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2465)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2457)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2419)
	at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2277)
	at picocli.CommandLine$RunLast.execute(CommandLine.java:2421)
	at picocli.CommandLine.execute(CommandLine.java:2174)
	at elide.tool.cli.Elide$Companion.exec$runtime(Elide.kt:232)
	at elide.tool.cli.Elide$Companion.entry(Elide.kt:195)
	at elide.tool.cli.ElideKt.main(Elide.kt:305)

Does this exception even make any sense to you? I'm building with Gradle, and this was against the latest Mosaic and the latest Coroutines.
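For context on the two failing frames (CombinedContext.get and CombinedContext.fold via toString), the following standard-library-only sketch exercises the same operations on a regular JVM. RequestName, RequestTag, and buildContext are hypothetical names used purely for illustration; this is not the Mosaic or Elide code, and it does not reproduce the IncompatibleClassChangeError.

```kotlin
import kotlin.coroutines.AbstractCoroutineContextElement
import kotlin.coroutines.CoroutineContext

// Two hypothetical context elements, each with its own key.
class RequestName(val value: String) : AbstractCoroutineContextElement(RequestName) {
  companion object Key : CoroutineContext.Key<RequestName>
}

class RequestTag(val value: String) : AbstractCoroutineContextElement(RequestTag) {
  companion object Key : CoroutineContext.Key<RequestTag>
}

// Adding two distinct elements yields the stdlib's internal CombinedContext.
fun buildContext(): CoroutineContext = RequestName("self-test") + RequestTag("demo")

fun main() {
  val context = buildContext()
  // Runtime class of the combined context.
  println(context.javaClass.name)
  // Lookup by key: goes through CombinedContext.get, the frame in the first failure.
  println(context[RequestName]?.value)
  // toString folds over the elements: CombinedContext.fold, the frame in the second failure.
  println(context.toString())
}
```

Both operations dispatch through the CoroutineContext interface on each element, so an IncompatibleClassChangeError at these frames would be consistent with interface dispatch going wrong in the compiled image rather than with a logic error in the calling code. That reading is a guess, not something confirmed in this thread.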

sgammon (Author) commented Jun 5, 2024

Okay, sorry for the volume of comments, but I've been able to work around this for now by reverting to:

  • Mosaic: 0.12.0
  • AndroidX Compose: 1.5.13-dev-k2.0.0-RC1-50f08dfa4b4
