Memory leak with add_process_metadata and k8s manifest for Auditbeat #24890

Closed
jsoriano opened this issue Apr 1, 2021 · 7 comments · Fixed by #29717
Labels
bug Team:Integrations Label for the Integrations team

Comments

@jsoriano (Member) commented Apr 1, 2021

There seems to be a memory leak in add_process_metadata that is reproduced with the reference configuration provided to run Auditbeat in Kubernetes.
In this scenario the processor is used to obtain the container.id from the process.pid, so that add_kubernetes_metadata can enrich events (see the sketch below). But the issue is also reproduced when add_kubernetes_metadata is not used.
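
For context, the processor chain in question looks roughly like this. This is a paraphrased sketch, not the exact reference manifest; match_pids and the container indexer / fields matcher are the documented options for these processors:

```yaml
processors:
  # add_process_metadata resolves container.id from the event's pid...
  - add_process_metadata:
      match_pids: ['process.pid']
  # ...so add_kubernetes_metadata can match the event to a pod by container.id.
  - add_kubernetes_metadata:
      indexers:
        - container:
      matchers:
        - fields:
            lookup_fields: ['container.id']
```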

I tried to reproduce this in a simpler scenario, with only Docker, but the memory usage of this processor didn't seem to grow beyond ~13MB. In the linked discuss issue there seem to be problems even with 1GB memory limits. The difference could be in the maximum number of pids allowed (sysctl kernel.pid_max).

add_process_metadata has a process cache whose entries are never cleaned. The key is the pid, so its size is effectively bounded by the maximum number of pids on the machine. The problem may be that kernel.pid_max can be quite big.

Some strategy should be applied to remove unneeded or expired entries from this cache, as in the sketch below.
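
One possible strategy, purely as an illustration and not necessarily what the eventual fix in #29717 implements, is to give each cache entry a TTL and evict expired entries both on access and in a periodic sweep. A minimal, self-contained Go sketch (all names here, pidCache, entry, sweep, are hypothetical):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry holds cached process metadata plus the time it expires.
type entry struct {
	metadata map[string]string
	expires  time.Time
}

// pidCache is a pid-keyed cache with TTL-based expiry. Expired entries
// are dropped on access and by a periodic sweep, so the cache no longer
// grows toward kernel.pid_max entries.
type pidCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[int]entry
}

func newPidCache(ttl time.Duration) *pidCache {
	return &pidCache{ttl: ttl, entries: make(map[int]entry)}
}

// Get returns the cached metadata for pid, evicting it if it has expired.
func (c *pidCache) Get(pid int) (map[string]string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[pid]
	if !ok || time.Now().After(e.expires) {
		delete(c.entries, pid) // drop stale entry on access
		return nil, false
	}
	return e.metadata, true
}

// Put stores metadata for pid with a fresh expiry time.
func (c *pidCache) Put(pid int, metadata map[string]string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[pid] = entry{metadata: metadata, expires: time.Now().Add(c.ttl)}
}

// sweep removes all expired entries; meant to run periodically.
func (c *pidCache) sweep() {
	c.mu.Lock()
	defer c.mu.Unlock()
	now := time.Now()
	for pid, e := range c.entries {
		if now.After(e.expires) {
			delete(c.entries, pid)
		}
	}
}

func main() {
	c := newPidCache(100 * time.Millisecond)
	c.Put(1234, map[string]string{"container.id": "abc123"})
	if m, ok := c.Get(1234); ok {
		fmt.Println("hit:", m["container.id"])
	}
	time.Sleep(150 * time.Millisecond)
	c.sweep() // a periodic sweep would normally run on a time.Ticker
	if _, ok := c.Get(1234); !ok {
		fmt.Println("expired and evicted")
	}
}
```

A TTL alone does not bound the cache if pids churn faster than entries expire, so a real fix might also want a hard cap on the number of entries (e.g. LRU eviction).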


@elasticmachine (Collaborator) commented:

Pinging @elastic/integrations (Team:Integrations)

@elasticmachine (Collaborator) commented:

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@mareckii commented Apr 1, 2021

Maybe it doesn't really matter, but I'm using GKE with Google COS as the operating system on the nodes.

heap2.prof.zip - both processors active
heap3.prof.zip - only add_process_metadata active

@jsoriano (Member, Author) commented Apr 1, 2021

@mareckii thanks for the memory profiles, it indeed looks like the problem is around the process cache in add_process_metadata. You mention in the discuss issue that after some days it ends up taking hundreds of MB. It'd be great if you could share a profile after a couple of days, to double-check whether it is the cache in add_process_metadata that keeps growing.

@jsoriano (Member, Author) commented Apr 1, 2021

It seems that Google COS is configured with a pid max of 2**22 (4194304), which is 128 times what I have on the machine where I tried to reproduce (32768). If the same memory usage ratio holds, a cache for that many pids would take more than 1.5GB.
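
The extrapolation is straightforward: ~13MB observed with pid_max 32768, scaled by 4194304 / 32768 = 128, gives roughly 13MB × 128 ≈ 1.6GB, assuming memory grows linearly with the number of cached pids.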

@mareckii could you confirm by checking the pid max on one of your affected machines? This can be checked with `cat /proc/sys/kernel/pid_max` or `sysctl kernel.pid_max`.

@mareckii commented Apr 7, 2021

Hi,

`cat /proc/sys/kernel/pid_max` returns 4194304.

If it helps, here is a memory dump after a few days:
heap4.prof.zip

@jsoriano (Member, Author) commented Apr 7, 2021

Thanks @mareckii.

Yes, in this profile most of the memory in use is allocated by the process cache in add_process_metadata; this is consistent with the high value of pid_max.

This cache should have a different strategy for these cases.
