Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit system ram to 4GB #4048

Merged
merged 2 commits into from
Mar 12, 2020
Merged

Limit system ram to 4GB #4048

merged 2 commits into from
Mar 12, 2020

Conversation

freimair
Copy link
Member

Might treat some symptoms of #3128, #3657, #3918, #3917, #3686, #3677

OpenJDK assumes total system RAM of 128GB. Setting it to 4GB manually
reduces the greediness of the JVM drastically.

Only after setting the MaxRAM to 1.5GB it finally went into a heap space error. 2GB work fine on my machine.

Proposed strategy: do a full release test with this, see how it goes.

This find is courtesy of @ghubstan #3918 (comment)

OpenJDK assumes total system RAM of 128GB. Setting it to 4GB manually
reduces the greediness of the JVM drastically.
@freimair freimair requested a review from cbeams as a code owner March 11, 2020 13:55
@wiz
Copy link
Member

wiz commented Mar 11, 2020

@ripcurlx @sqrrm please also consider cherry-pick for v1.2.8 - many users have RAM issues and it goes towards our goal of "improve reliability"

@ghubstan
Copy link
Member

@freimair: "Only after setting the MaxRAM to 1.5GB it finally went into a heap space error. 2GB work fine on my machine". 2 GB was the minimum MaxRAM for my machine too; I got an OOM when setting MaxRAM=1.5GB.

@ripcurlx
Copy link
Contributor

@ripcurlx @sqrrm please also consider cherry-pick for v1.2.8 - many users have RAM issues and it goes towards our goal of "improve reliability"

I am happy to cherry-pick this for v1.2.8 if someone with a linux system does the required testing for it.

@freimair
Copy link
Member Author

please specify "required testing". It has been developed on Linux. It worked.

I would rather see it tested on Windows.

@ripcurlx
Copy link
Contributor

This is just for the startup script when building from source, correct? So the created binaries won't have this change.

Copy link
Member

@ghubstan ghubstan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting this param value to "4GB" doesn't work on my machine; it needs to be "4G" or "4g".

$ java -XX:MaxRAM=4GB -version
Improperly specified VM option 'MaxRAM=4GB'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

$ java -XX:MaxRAM=4g -version
openjdk version "11" 2018-09-25
OpenJDK Runtime Environment 18.9 (build 11+28)
OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)

build.gradle Outdated Show resolved Hide resolved
@freimair
Copy link
Member Author

freimair commented Mar 11, 2020

This is just for the startup script when building from source, correct? So the created binaries won't have this change.

Yes, however, as discussed, I also changed the packager scripts.

Copy link
Contributor

@ripcurlx ripcurlx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@freimair Let's give this a try as discussed pointing users in this issue to try out master (if they are able to build and run it). I'll create new snapshot Linux binaries after the release for testing.

@cbeams
Copy link
Member

cbeams commented May 7, 2020

I just ran into the following heap space error on my always-on Bisq node. I upgraded that node from v1.3.2 to v1.3.4 on March 1st, and the heap space error occurred on March 4th. I haven't had any such errors before, or at least not for a very long time. Makes me think this change is probably the culprit. I just asked @bisq-network/support in Keybase whether they've been seeing similar reports from users, no response yet.

image

@cd2357
Copy link
Contributor

cd2357 commented May 7, 2020

I haven't had any such errors before, or at least not for a very long time. Makes me think this change is probably the culprit.

Probably the culprit, or probably just didn't fix the underlying issue.

I read somewhere that JavaFX also uses the system's video RAM, in addition to the normal RAM usually used by the Java VM.

This PR seems to restrict only the normal RAM usage -- we might have to investigate the VRAM side as well.

Btw, this means Bisq (or any JavaFX app) could behave differently on different systems, depending on that system's video settings (graphic card hardware? graphics drivers? amount of shared RAM between video and system? etc -- don't know what could play a role, but smth to keep in mind when looking at performance).

@cd2357 cd2357 mentioned this pull request May 7, 2020
@ghubstan
Copy link
Member

ghubstan commented May 7, 2020

Btw, this means Bisq (or any JavaFX app) could behave differently on different systems, depending on that system's video settings (graphic card hardware? graphics drivers? amount of shared RAM between video and system? etc -- don't know what could play a role, but smth to keep in mind when looking at performance).

JavaFX does behave differently with different GPUs. It does use native GPU drivers, and there have been many leak problems in JFX related to GPU and mesa drivers.

Here is a list of some jfx leak related issues -- old and new. You have to hunt around for those related to GPUs, and for issues about leak related bugs found and fixed after JFX 11 was released.

Issue 8188094 shows how to make JFX leak from a simple app running on Linux. The unresolved issue reports the problem in JDK/JFX 8/9, and I reproduced it running JFX 11 on Ubuntu + NVIDIA GPU in Feb. 2020, when I started working on this problem.

JFX 14 has some leak bugfixes, but Bisq is holding back on upgrading to OpenJDK 14 until the new java packaging tool is released from incubation to GA.

I did a lot of profiling to find the leaks (maybe you saw some of my comments in issue 3918) using JMC/JFR, JProfiler, async-profiler, bcc and bpftrace tools, under the assumption the problem was/is in Bisq and JFX. I'm sure there are leaks to find and fix, but the only firm conclusion my analysis supported was that Bisq's JVM was reserving far more resident and virtual memory at startup than the Linux JVM needs. (I could not find problems with Bisq's on-heap management.) The problem was more serious when it starts with an empty data dir, syncs the entire chain, and the JVM never gives back any memory (to the OS).

@cd2357
Copy link
Contributor

cd2357 commented May 7, 2020

Thanks for the detailed reply @ghubstan, that clarifies a lot.

I'm sure there are leaks to find and fix, but the only firm conclusion my analysis supported was that Bisq's JVM was reserving far more resident and virtual memory at startup than the Linux JVM needs. (I could not find problems with Bisq's on-heap management.) The problem was more serious when it starts with an empty data dir, syncs the entire chain, and the JVM never gives back any memory (to the OS).

Have you tried using the -XX:+UseG1GC flag? According to some reports it causes JavaFX to use more aggressive garbage collection, which reduces memory consumption and effectively give it back to the system.

@freimair
Copy link
Member Author

freimair commented May 15, 2020

I have seen that too but only once for my always-on bisq. Seems like a legitimate Out-of-Heap error since Bisq is resource hungry beyond compare. Restart Bisq and everything is fine again.

There are multiple efforts and fixes in the pipeline, though.

@cd2357 we played around with the -XX:+UseG1GC flag way back and it did not change anything. Only when I hunted down and found the faulty line of code that produced the resource leak, thing changed. Bisq still holds the whole data stores in RAM (not at first, but just click the "trades" tab and there you go) and it will do so until we change things.

As long as we do not fix the underlying issues, we will only treat single failures and waste time and resources over and over again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants