migrate to jdk 22 and fix upcalls #9

Open: 12 commits to merge into base: feature/jdk21-support

rutenkolk

Hi, in this pull request I propose migrating the project to JDK 22.

I also managed to track down the reason the upcall test failed.
After adding tests for other upcalls, I noticed that it only failed for JVM functions that return strings.

I eventually inserted a few debug prints into the generated bytecode of upcall-class and downcall-class and saw this:

("downcall-fn called")
hello from downcall-class/invoke
to prim-asm of supposed type [:coffi.ffi/fn [] :coffi.mem/c-string]. actual class of value:
jdk.internal.foreign.NativeMemorySegmentImpl
hello from downcall-class before invokeExact
hello from upcall-class/upcall
hello from upcall-class right before the invoke
hello from upcall-class right after the invoke
to prim-asm of supposed return type :coffi.mem/c-string. actual class of value:
java.lang.Long
java.lang.ClassCastException: class java.lang.Long cannot be cast to class java.lang.foreign.MemorySegment (java.lang.Long and java.lang.foreign.MemorySegment are in module java.base of loader 'bootstrap')

It turns out that the downcall function wrapped by downcall-class correctly handles the function argument as a MemorySegment (so no to-prim-asm conversion is needed), but the function wrapped by upcall-class returns a Long for the ret-type ::mem/c-string. I'm not entirely sure whether this holds for every value whose primitive-type is ::mem/pointer, but I suspect it does.
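To make the mismatch concrete, here is a minimal Java sketch of the failure mode described above. The class and method names are hypothetical and are not part of coffi; the sketch only assumes that the wrapped function hands back a boxed Long where the generated code expects a MemorySegment.

```java
import java.lang.foreign.MemorySegment;

final class UpcallReturnMismatch {
    // Hypothetical stand-in for the function wrapped by upcall-class:
    // it returns a raw address boxed as a Long, as seen in the debug output.
    static Object wrappedFunctionResult(long address) {
        return Long.valueOf(address);
    }

    // Mimics an unconditional cast of the return value to MemorySegment.
    // Returns true when the cast throws ClassCastException.
    static boolean castToSegmentFails(Object returned) {
        try {
            Object ignored = (MemorySegment) returned;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A boxed Long is not a MemorySegment, so the cast fails.
        System.out.println(castToSegmentFails(wrappedFunctionResult(0x1000L)));
        // A real segment passes the same cast.
        System.out.println(castToSegmentFails(MemorySegment.NULL));
    }
}
```

The sketch needs JDK 22, or JDK 21 with preview features enabled, because java.lang.foreign was finalized in JDK 22.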

Because of that, a simple change to to-prim-asm didn't work, since it would have to behave differently for upcall-class and downcall-class.

The solution I've settled on in this pull request is to still let upcall-class/upcall use the to-prim-asm instructions, but additionally call MemorySegment.ofAddress specifically for pointer types. There is a downside to doing that, though. Since this is a static method of an interface, calling it from bytecode requires a higher bytecode version than the default. Version 8 is fine, but it needs to be specified on the generated class. I've also bumped the insn version to 0.5.4, which allows emitting more recent JVM bytecode versions should that be necessary. In any case, with that change we can use invokestatic with the extra argument true, which sets ASM's itf flag, so we can specify that we're referring to an InterfaceMethodref and successfully call the interface method directly from bytecode.
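At the Java source level, the conversion described above might look like the following sketch. The helper name toSegment is hypothetical and does not appear in the pull request; the only real API used is MemorySegment.ofAddress(long), which creates a zero-length native segment at the given address.

```java
import java.lang.foreign.MemorySegment;

final class PointerReturnSketch {
    // Hypothetical helper: normalizes a pointer-typed return value so the
    // caller always receives a MemorySegment.
    static MemorySegment toSegment(Object returned) {
        if (returned instanceof MemorySegment segment) {
            return segment; // already the expected type, nothing to do
        }
        if (returned instanceof Long address) {
            // In bytecode this is an invokestatic on the interface
            // java/lang/foreign/MemorySegment, method ofAddress, descriptor
            // (J)Ljava/lang/foreign/MemorySegment;, emitted as an
            // InterfaceMethodref (hence the itf flag).
            return MemorySegment.ofAddress(address);
        }
        throw new IllegalArgumentException("unexpected pointer value: " + returned);
    }

    public static void main(String[] args) {
        MemorySegment segment = toSegment(Long.valueOf(0x1000L));
        System.out.println(segment.address());
        System.out.println(segment.byteSize());
    }
}
```

Because the resulting segment has zero length, it cannot be read from until it is resized with reinterpret, which is a restricted method.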

I'm not exactly sure whether the function wrapped by upcall-class should simply return a MemorySegment in the first place instead of a Long, but I wasn't sure how to robustly integrate this.

It does so because of how serialize is implemented for c-string. In migrating to the JDK 22 API, I kept the old behavior, which simply returns the address of the string as a Long. I'm not sure if other parts of the library depend on that specific behavior, so the additional call to MemorySegment.ofAddress seemed like the pragmatic choice to me.

Any feedback on this?

@rutenkolk
Author

(this also addresses what was discussed in #8)

@rutenkolk rutenkolk changed the title from "migrate to jdk 22 support and fix upcalls" to "migrate to jdk 22 and fix upcalls" on Jun 26, 2024
@rutenkolk
Author

In its current state, the tests pass on JDK 22 and the library can be loaded into another project, but when I try to actually call a function defined via defcfn, a ClassCastException is thrown while trying to cast a Long to an AbstractMemorySegmentImpl.

@IGJoshua
Owner

Hey, this is absolutely fantastic! I really appreciate the in-depth discussion of the issues, the process you went through, and the decision points we have left.

You've caught me right in the last week leading up to moving across the country, so I don't have time to get into the details to help decide exactly what to do about the c-string return type just now, but in about a week and a half I'll have some time between arriving at the new house and having my stuff arrive, and I can go into detail then.

I will say that I don't think just returning a Long is what's intended there. It was supposed to call .getUtf8String on the returned MemoryAddress impl, back when that existed. It looks like, when I was just getting the types to be happy during the initial migration to the JDK 20 API, I didn't fully work that out after they changed how addresses are returned.

I suspect this might be the case for any situation where a pointer is being returned by a coffi type that tries to read it as a memory segment, since I think during that conversion I missed that it returns a Long instead of a MemorySegment in that case.
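For reference, in the finalized JDK 22 API the role of .getUtf8String is played by MemorySegment.getString. The following sketch shows one way a raw address could be read back as a string; the class and method names are hypothetical, and the explicit maxBytes bound is an assumption made here for safety (a real caller receiving a pointer from native code usually does not know the size).

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

final class CStringRoundTrip {
    // Hypothetical helper: reads a NUL-terminated UTF-8 string at a raw address.
    // reinterpret is a restricted method; JDK 22 may print a native-access warning.
    static String readCString(long address, long maxBytes) {
        return MemorySegment.ofAddress(address)
                .reinterpret(maxBytes) // widen the zero-length segment
                .getString(0);         // reads up to the first NUL byte
    }

    // Allocates a C string, keeps only its address (as a serializer that
    // returns a Long would), then reads it back through that address.
    static String roundTrip(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment nativeString = arena.allocateFrom(text);
            long address = nativeString.address();
            return readCString(address, nativeString.byteSize());
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello from upcall"));
    }
}
```

The read has to happen while the arena is still open, since the memory behind the address is freed when the arena closes.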

Owner

@IGJoshua IGJoshua left a comment

This review is just about the build script change. I looked over the rest of the code and it looks good, but I need to spend more time than I have today to fully understand the implications of the changes, especially having the returned pointer types be converted from Long with .ofAddress.

I'll be back with some specific thoughts on that as soon as I can.

build.clj (outdated, resolved)
@rutenkolk rutenkolk requested a review from IGJoshua June 28, 2024 09:44