Musing about allocate() and allocateDirect() and WASM #1360

Open
swankjesse opened this issue Oct 1, 2023 · 1 comment

swankjesse commented Oct 1, 2023

WASI has me thinking about how Okio’s Buffer only offers a ByteArray backing store. I’m collecting my thoughts here...

The Java NIO ByteBuffer type has allocate() (garbage-collected in managed memory) and allocateDirect() (non-garbage-collected in unmanaged memory) APIs. The physical RAM on the machine is the same, but being garbage-collected or not has consequences when these buffers interact with system APIs.
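To make the distinction concrete, here is a minimal sketch using only the standard java.nio API. The class name BufferKinds and the helper describe() are made up for illustration; isDirect() and hasArray() are the real ByteBuffer methods that reveal which kind of buffer you are holding.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class, for illustration only.
final class BufferKinds {
    /** Reports whether a buffer's bytes live on the managed heap or outside it. */
    static String describe(ByteBuffer buffer) {
        if (buffer.isDirect()) {
            // Backed by memory outside the garbage-collected heap.
            return "direct";
        }
        // Heap buffers usually expose their byte[]; read-only ones do not.
        return buffer.hasArray() ? "heap (byte[] accessible)" : "heap (no accessible array)";
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(8192);         // backed by a byte[] on the managed heap
        ByteBuffer direct = ByteBuffer.allocateDirect(8192); // backed by unmanaged memory
        System.out.println(describe(heap));
        System.out.println(describe(direct));
    }
}
```

Both buffers offer the same read and write API; the difference only shows up when the bytes have to cross into native code.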

JNI needs to pin a garbage-collected ByteBuffer’s byte[], to prevent the GC from relocating it on the heap while it’s being accessed. Non-garbage-collected ByteBuffers don’t need to be pinned because nothing will ever move them.

Depending on the GC, pinning either requires setting a marker “don’t move this object” or copying the array to non-managed memory on pin() and copying it back to managed-memory on unpin(). (Are both of these strategies in play in modern JVMs or Android runtimes?) If the runtime cost of pin() is significant, then APIs that create buffers backed by unmanaged memory have value.

Direct buffers save calls to pin(), but they push memory-management work onto the application layer. Netty pools its direct buffers, and the application layer must release() each buffer when it’s done with it. If you forget to release(), the finalizer/cleaner will save you from a memory leak, but at a higher allocation cost.
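The burden this places on callers can be sketched with a toy pool. This is not Netty's implementation or API; DirectBufferPool, acquire(), release(), and idleCount() are hypothetical names, and the sketch ignores reference counting and leak detection entirely.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical toy pool, for illustration only.
final class DirectBufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int bufferSize;

    DirectBufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Reuses an idle buffer if one exists, otherwise allocates a new direct buffer. */
    synchronized ByteBuffer acquire() {
        ByteBuffer pooled = free.poll();
        return pooled != null ? pooled : ByteBuffer.allocateDirect(bufferSize);
    }

    /** Returns a buffer to the pool; the caller must not touch it afterwards. */
    synchronized void release(ByteBuffer buffer) {
        buffer.clear(); // resets position and limit, does not zero the contents
        free.push(buffer);
    }

    synchronized int idleCount() {
        return free.size();
    }

    public static void main(String[] args) {
        DirectBufferPool pool = new DirectBufferPool(8192);
        ByteBuffer buffer = pool.acquire();
        try {
            buffer.put((byte) 42);
        } finally {
            pool.release(buffer); // forgetting this is the bug the caller has to avoid
        }
        System.out.println(pool.idleCount());
    }
}
```

Every call site needs the try/finally dance, which is the developer-experience cost in question.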

Okio also pools its segments, but only to avoid zero-fill. It’s a low-stakes optimization that I wrote about here.

For kotlinx.io on the JVM, here are the variables:

  • What’s the runtime cost of pinning a managed segment?
  • What’s the runtime cost of going from monomorphic segments (one segment class) to bimorphic segments (two segment classes)?
  • What’s the developer experience cost of offering this choice? I’m imagining a beautiful world where I never needed to learn about direct buffers, and I never needed to decide between direct and non-direct every single time I used a buffer.
  • What’s the developer experience cost of releasing direct buffers?

With Okio I chose to say ‘fuck direct buffers! that’s needless complexity!’ and I’m still happy with this choice. My favourite gRPC bug report shows that the cost of pin() isn’t problematic in practice.

So that’s the JVM, but these musings started with WASM?

A bunch of design decisions in the JVM are being re-decided for WASM and WASI. There’s a managed heap (WASM GC) and there’s a different class called WebAssembly.Memory that’s used for inputs and outputs of system calls. I don’t believe there’s any mechanism to pin a byte array as a Memory. (I ought to request this!)

So Okio needs to do a byte-by-byte copy when doing a system call with a Buffer. If we had a Buffer that was backed by a Memory, perhaps we could avoid that fate? Though it could get awkward because the payloads and offsets all share the same memory?
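As a rough analogy in Java (not actual WASM or WASI code), the copy looks something like the sketch below. A direct ByteBuffer stands in for the linear memory, and LinearMemorySketch, copyIn(), and byteAt() are hypothetical names. The bump-pointer allocation is an assumption made only to show how every payload ends up as an integer offset into one shared region.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a linear memory, for illustration only.
final class LinearMemorySketch {
    // One flat region; everything in it is addressed by integer offsets.
    private final ByteBuffer memory;
    private int nextFree = 0;

    LinearMemorySketch(int size) {
        this.memory = ByteBuffer.allocateDirect(size);
    }

    /** Copies bytes from a managed array into the flat region and returns their offset. */
    int copyIn(byte[] source, int start, int length) {
        if (length > memory.capacity() - nextFree) {
            throw new IllegalStateException("out of scratch space");
        }
        int address = nextFree;
        ByteBuffer view = memory.duplicate(); // shares content, has its own position
        view.position(address);
        view.put(source, start, length);      // the copy that a pinning API would avoid
        nextFree += length;
        return address;
    }

    byte byteAt(int address) {
        return memory.get(address);
    }
}
```

A system call would then receive the returned offset and the length rather than the array itself, so the caller has to track where each payload sits in the shared region.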

It’s still pretty early for WASI. I should petition for APIs that operate on byte arrays to complement the ones that operate on memory. If we get that, we can dodge another needless direct/non-direct bifurcation. And if we don’t get that, well then I suppose that on WebAssembly a case could be made for segments backed by something other than a byte array.

@swankjesse

After further study of the WASM GC docs, I think it’ll be a natural win for WASI to support byte arrays directly, once WASM GC is stable.
