Set -XX:MaxRAMPercentage if resource limit set #54
We started noticing that the Spotinst controller was crash-looping on our largest Kubernetes cluster after we added the new m7a.medium instance types (which have 4GB of memory) to our Ocean. The pods were exiting with the message "Terminating due to java.lang.OutOfMemoryError: Java heap space". We switched to the Helm charts to deploy Spotinst with requests/limits set, to avoid node memory contention, but continued to see the errors unless we raised the pod memory limit to the 6GB-8GB range.
Since we deploy Java applications to k8s ourselves, we are very familiar with the memory characteristics of JVM applications running in containers, and we quickly deduced that the JVM was auto-sizing the heap to the default of 25% of available memory (the memory limit if set, otherwise the host's total memory). Because this cluster apparently requires a heap larger than 1GB for the Spotinst controller, it would exhaust heap space and exit even with a 4GB memory limit (25% of 4GB = 1GB heap), despite that memory being only ~50% utilized. To get a 2GB heap we would need to run the controller with an 8GB memory limit, which is incredibly excessive considering actual usage barely exceeds ~2.5GB with heap + non-heap overhead combined.
To prevent such excessive memory consumption, let's add the `-XX:MaxRAMPercentage` argument to the `JAVA_OPTS` environment variable (which is read by the JVM on startup), but only if `resources.limits.memory` is set to a value. Since we want to ensure the JVM has enough memory overhead for off-heap use, let's scale the percentage by the memory limit, so that pods running with smaller memory limits have a larger share of memory set aside for off-heap.

This PR implements that. It sets the heap percentage to 50% for memory limits below 512MB, 60% for 512MB-1GB, 70% for 1GB-2GB, and 80% for 2GB and above. This is working for us now; we have also tested it with some diagnostic arguments to confirm that the value was indeed being picked up by the JVM. The values may need further tuning in a follow-up, but are likely sufficient for now. I tested that the functions are robust against `.requests` being nil: the `dig` function returns `nil` unless the key `.resources.limits.memory` actually exists and is set to something, and the rest of the function is skipped if `dig` returns `nil`.
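The template logic described above could be sketched roughly like this (the helper name, value paths, and unit handling are my own illustration, not the PR's exact code; in particular it assumes the limit is expressed in `Mi`, which a real chart would need to generalize):

```yaml
{{- /* _helpers.tpl (sketch): derive a MaxRAMPercentage tier from the
       configured memory limit. Emits nothing when no limit is set. */}}
{{- define "spotinst.maxRAMPercentage" -}}
{{- $limit := dig "limits" "memory" "" (.Values.resources | default dict) -}}
{{- if $limit -}}
{{- $mib := $limit | trimSuffix "Mi" | int -}}
{{- if lt $mib 512 -}}50{{- else if lt $mib 1024 -}}60{{- else if lt $mib 2048 -}}70{{- else -}}80{{- end -}}
{{- end -}}
{{- end -}}

# deployment.yaml (fragment): the flag is appended only when the helper
# produced a tier, so JAVA_OPTS stays empty when no memory limit is set.
env:
  - name: JAVA_OPTS
    value: "{{ with include "spotinst.maxRAMPercentage" . }}-XX:MaxRAMPercentage={{ . }}{{ end }}"
```

The `with` block makes the flag conditional on the helper returning a non-empty string, which mirrors the "skip the rest of the function if `dig` returns `nil`" behavior described above.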