Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConstrainableInputStream pins virtual threads (Java-21) #2054

Closed
malkusch opened this issue Nov 18, 2023 · 6 comments · Fixed by #2058
Closed

ConstrainableInputStream pins virtual threads (Java-21) #2054

malkusch opened this issue Nov 18, 2023 · 6 comments · Fixed by #2058
Assignees
Milestone

Comments

@malkusch
Copy link

malkusch commented Nov 18, 2023

Sample to reproduce:

InputStream in = load();
Jsoup.parse(in, "UTF-8", "http://example.org/")

Running that with -Djdk.tracePinnedThreads=full in Java-21(on a virtual thread) will give the following print:

Thread[#35,ForkJoinPool-1-worker-4,5,CarrierThreads]
  …
  java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:405) <== monitors:1
  org.jsoup.internal.ConstrainableInputStream.read(ConstrainableInputStream.java:64)
  …

BufferedInputStream was refactored to be virtual thread friendly, however with a caveat: It becomes unfriendly when inherited (which ConstrainableInputStream does):

    public BufferedInputStream(InputStream in, int size) {
        …
        if (getClass() == BufferedInputStream.class) { // false in case of ConstrainableInputStream
            lock = InternalLock.newLockOrNull();
            …
        } else {
            lock = null;
            …
        }
    }
…
    public int read(byte[] b, int off, int len) throws IOException {
        if (lock != null) {
            … // Virtual Thread friendly code
        } else {
            synchronized (this) {
                return implRead(b, off, len); // Line 405
            }
        }
    }

With Loom gaining popularity with the latest LTS release, do you want to reconsider ConstrainableInputStream inheriting from BufferedInputStream? A possible mitigation might be to use composition instead, i.e. inherit from FilterInputStream and use a new instance of BufferedInputStream as source:

class ConstrainableInputStream extends FilterInputStream {
…
  static ConstrainableInputStream wrap(InputStream in,…) {
    var buffered = new BufferedInputStream(in);
    return new ConstrainableInputStream(buffered);
  }
}
@jhy jhy self-assigned this Nov 21, 2023
@jhy
Copy link
Owner

jhy commented Nov 22, 2023

Thanks for raising this.

After first review, a few notes:

  • Interestingly, I can only replicate the pinned thread warning if I have the InputStream be a remote URL. A local file, or a local URL, don't hit it, even with very high concurrency (10K). I guess that's just based on a timeout in the JDK that detects a pinned thread, and those aren't hitting it.
  • Updating this will be a bit tricky to maintain backward compatibility:
    • The Connection.response.bodyStream() returns a declared BufferedInputStream (and we return the extended ConstrainedInputStream). Could change this by adding a new method response.constrainedBodyStream() and have the bodyStream be unconstrained. And hope folks aren't using bodyStream() and expecting the constraints. Which might be OK because that's not actually documented.
    • Constrainable extends Buffered and changing it to extend Filtered makes the jar not binary compatible as superclass removed. Source compat is OK. Maybe deprecate it and add a new ConstrainedInputStream.
    • Need to rethink how the remaining size is calculated. The current impl hits the Buffered's visible fields markpos and read to understand where it's up to. If we're not extending, we can't hit that.

It's kind of annoying that we can't just supply a lock object in the Buffered constructor, and keep the rest as-is. We could use the multi-jar support to do that if it were a new constructor option.

@jhy jhy linked a pull request Nov 22, 2023 that will close this issue
@jhy jhy closed this as completed in #2058 Nov 22, 2023
@jhy jhy added this to the 1.17.1 milestone Nov 22, 2023
@jhy
Copy link
Owner

jhy commented Nov 22, 2023

OK, fixed!

It would be great if you could validate this by running a test on 1.17.1-SNAPSHOT. (git pull; mvn install).

I verified that the pin warnings from Jsoup.connect.get() are gone when hitting a remote URL.

It would be good to identify a way to have a integration test for this.

@malkusch
Copy link
Author

malkusch commented Nov 22, 2023

Thanks a ton for addressing this so timely. I will validate the patch this weekend and come back here.

A local file, or a local URL, don't hit it,

I think that might be related to another limitation in Loom. Some FS operations might not unload a blocking call. I don't remember where I read that, but a keyword was the missing io_uring support in the JDK. Here's an article hinting that (under "Blocking and unmounting").

It would be good to identify a way to have a integration test for this.

I had also that thought. There are JFR events to help that, but it would be nice (if possible at all) if a testing framework would provide such assertions. I will give it a thought and maybe raise an issue at another place, so that everybody can benefit from that.

Thanks again for the fix.

@malkusch
Copy link
Author

Here's my promised updated: I can confirm that 1.17.1-SNAPSHOT fixed the issue. No more thread pinning events. Thanks a ton.

@jhy
Copy link
Owner

jhy commented Nov 23, 2023

That's great, thanks for confirming!

@jhy
Copy link
Owner

jhy commented Aug 2, 2024

(Ah the way that BufferedInputStream is effectively locked for extension past Java 21 is bugging me. As part of #2186 I want to be able to replace the byte[] buffer with a reused buffer, vs creating a new one for each new InputStream read. Which would be easy enough as it's a protected field, designed to be accessible. But because of the instance check in the constructor, it's no longer viable. It would be great if there was a default non-synchronized BufferedInputStream we could re-use, which didn't include such extension restrictions. As it stands I guess we need to implement one directly which unfortunately adds a bit more to the jar size. Other ideas are of course welcomed!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants