Set -XX:MaxRAMPercentage if resource limit set #54

Open
wants to merge 1 commit into
base: master

ReillyBrogan

We started noticing that the Spotinst controller was crash-looping on our largest Kubernetes cluster after we added the new m7a.medium instance type (which has 4GB of memory) to our Ocean. The pods were exiting with the message "Terminating due to java.lang.OutOfMemoryError: Java heap space". We switched to deploying Spotinst with the Helm charts, with requests/limits set to avoid node memory contention, but continued to see the errors unless we increased the pod memory limit to the 6GB-8GB range.

Since we deploy Java applications to k8s ourselves, we are very familiar with the memory characteristics of JVM applications running in containers, and we quickly deduced that the JVM was auto-sizing the maximum heap to its default of 25% of the available memory (the memory limit if set, the host's total memory if not). This cluster apparently requires a heap larger than 1GB for the Spotinst controller, so the controller would run out of heap space and exit even with 4GB of total memory given to it, despite that memory being only ~50% utilized. To get a 2GB heap we would need to run the controller with a memory limit of 8GB, which is excessive considering that usage would barely go above ~2.5GB with heap plus non-heap overhead.
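The arithmetic behind those numbers can be sketched as follows. This is only an approximation of the JVM's default sizing (it ignores other flags and the special handling of very small memory sizes), and the function names are hypothetical helpers rather than anything in the chart.

```python
import math

GIB = 1024 ** 3  # assumption: "GB" in the description is treated as GiB


def max_heap_bytes(memory_limit_bytes: int, max_ram_percentage: float = 25.0) -> int:
    """Approximate max heap the JVM picks for a given container memory limit."""
    return int(memory_limit_bytes * max_ram_percentage / 100)


def limit_needed_for_heap(target_heap_bytes: int, max_ram_percentage: float = 25.0) -> int:
    """Memory limit required for the JVM to arrive at the target max heap."""
    return math.ceil(target_heap_bytes * 100 / max_ram_percentage)


# With the default 25%, a 4 GiB limit yields only a 1 GiB heap.
print(max_heap_bytes(4 * GIB) / GIB)
# Reaching a 2 GiB heap at 25% needs an 8 GiB limit ...
print(limit_needed_for_heap(2 * GIB) / GIB)
# ... whereas at 80% a 2.5 GiB limit would be enough.
print(limit_needed_for_heap(2 * GIB, 80.0) / GIB)
```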

To prevent such excessive memory consumption, let's add the -XX:MaxRAMPercentage argument to the JAVA_OPTS environment variable (which is passed to the JVM on startup), but only if resources.limits.memory is set to a value. Since we want to ensure that the JVM has enough headroom for off-heap memory, let's scale the percentage with the memory limit so that pods running with smaller memory limits have a larger share of their memory set aside for off-heap.
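As a rough illustration of the "only if a limit is set" behaviour, the sketch below appends the flag to an existing options string and leaves the string untouched when no percentage was computed. The helper name is hypothetical, and the real change lives in the Helm templates rather than in Python.

```python
from typing import Optional


def with_max_ram_percentage(java_opts: str, percentage: Optional[int]) -> str:
    """Append -XX:MaxRAMPercentage to java_opts, or return it unchanged.

    `percentage` is None when no memory limit is configured, in which case
    the JVM keeps its default heap sizing.
    """
    if percentage is None:
        return java_opts
    # The flag takes a floating-point value, so format it with a decimal part.
    flag = f"-XX:MaxRAMPercentage={percentage:.1f}"
    return f"{java_opts} {flag}".strip()


print(with_max_ram_percentage("-Xss512k", 70))
print(with_max_ram_percentage("-Xss512k", None))
```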

This PR implements that. It sets the heap percentage to 50% for memory limits less than 512MB, 60% for 512MB-1GB, 70% for 1GB-2GB, and 80% for 2GB+. This is working for us now, and we've also tested it with some diagnostic arguments to confirm that the value is actually being set in the JVM. The values may need further tuning in a follow-up, but they are likely to be sufficient for now. I tested that the functions are robust against the .requests being nil: the dig function returns nil unless the key .resources.limits.memory actually exists and is set to something, and the rest of the function is skipped if dig returns nil.
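The tiering and the nil-safe lookup described above can be modelled roughly like this. It is a Python stand-in, not the template code from the PR: `dig` here is a hypothetical helper that only mimics the nested lookup, the limit is assumed to be already parsed into bytes (a real Kubernetes quantity such as 4Gi would need parsing first), units are assumed to be binary, and each tier is assumed to include its lower bound.

```python
from typing import Any, Optional

MIB = 1024 ** 2  # assumption: MB/GB in the description mean MiB/GiB
GIB = 1024 ** 3


def dig(mapping: Any, *keys: str) -> Any:
    """Walk nested dicts, returning None as soon as any level is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def heap_percentage(limit_bytes: int) -> int:
    """Map a memory limit to the heap percentage tiers listed in the PR."""
    if limit_bytes < 512 * MIB:
        return 50
    if limit_bytes < 1 * GIB:
        return 60
    if limit_bytes < 2 * GIB:
        return 70
    return 80


def percentage_for_values(values: Any) -> Optional[int]:
    """Return the tier for resources.limits.memory, or None if it is unset."""
    limit = dig(values, "resources", "limits", "memory")
    if not limit:  # missing key, nil, or empty value: skip the flag entirely
        return None
    return heap_percentage(limit)


print(percentage_for_values({"resources": {"limits": {"memory": 4 * GIB}}}))
print(percentage_for_values({"resources": {"requests": {"memory": 1 * GIB}}}))
```

Smaller limits keep a larger fraction free for off-heap use: at 256MiB half of the limit stays outside the heap, while at 4GiB only a fifth does.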
