oom kills #1218
Comments
Just a quick note here since I was running into something similar, maybe it will help. OOM kills might be preventing you from seeing the true logs on what is actually causing the issue. On my cluster I gave my pod way more memory than it needed, it stabilized, and I was able to see that the efs-plugin was stuck trying to delete a bunch of PVs, which was what was causing the out-of-memory errors before I temporarily gave it more memory to troubleshoot.
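In case it helps others, here is one way to confirm the OOM kill and read the previous container's logs once the pod has enough headroom. This is a generic sketch; the namespace and pod name are assumptions based on a default install, so adjust them to your setup:

```sh
# Look for "OOMKilled" under "Last State" for the efs-plugin container
# (pod name is a placeholder; namespace assumes the default install location)
kubectl -n kube-system describe pod efs-csi-node-xxxxx

# Once the pod stays up, read the logs of the previously killed container instance
kubectl -n kube-system logs efs-csi-node-xxxxx -c efs-plugin --previous
```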
@bonickle Once you got past the OOM issue, can you share what you found to be the root cause, as well as what your resolution was? I'm having the same issue, but still haven't determined why it's happening even after manually removing the PVs.
@df-lcb still looking into the root cause. If/when I figure it out, I'll drop an update.
From my point of view the root cause is that if you have dozens of EFS volumes, the OOM crash occurs because the driver mounts them all at the same time. This is also why you see so many mount and python processes in the process list. I would recommend sequential mounts or a limit on the number of parallel mounts.
Thanks @runningman84, I'll look into this.
Hi @df-lcb, any updates on this? We bumped the memory to 512Mi and the OOMKilled happens again.
Hey, @jiangfwa. Due to resource spiking, we had to temporarily bump ours up to 1.5Gi to get past the OOM.
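For anyone who needs to apply the same workaround, below is a minimal sketch of raising the efs-plugin container's memory on the node daemonset. The daemonset name, namespace, and the exact values are assumptions based on a default install; adjust them for your cluster, or set the equivalent resources in your Helm values instead:

```sh
# Raise the memory request/limit on the efs-plugin container of the node daemonset
# (names, namespace, and sizes are assumptions; pick limits that fit your workload)
kubectl -n kube-system patch daemonset efs-csi-node --type strategic -p '
spec:
  template:
    spec:
      containers:
      - name: efs-plugin
        resources:
          requests:
            memory: "512Mi"
          limits:
            memory: "1536Mi"'
```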
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to its standard staleness rules.
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to its standard staleness rules.
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to its standard staleness rules.
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
Just in case anyone encounters this issue, I think we found one potential cause. We have recently experienced frequent OOM kills within the efs-plugin container of the efs-csi-node pods. The container was eventually given a 2.5Gi memory limit to counter the OOM. We eventually realised that the underlying EFS file system was configured in "Bursting" throughput mode; the OOM restarts occurred during heavy EFS use, when the file system's throughput would eventually be throttled. We speculate that this caused a large number of open file transfers to pile up, leading to the OOM. Switching the EFS file system to "Provisioned" throughput mode appears to have fixed the memory issue so far.
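For reference, here is a hedged sketch of how the throughput mode can be checked and switched with the AWS CLI; the file system ID is a placeholder and the provisioned MiB/s figure is only an example, not a recommendation:

```sh
# Check the current throughput mode (fs-0123456789abcdef0 is a placeholder ID)
aws efs describe-file-systems --file-system-id fs-0123456789abcdef0 \
  --query 'FileSystems[0].ThroughputMode'

# Switch from bursting to provisioned throughput
# (128 MiB/s is an example value; size it for your workload and note it affects cost)
aws efs update-file-system --file-system-id fs-0123456789abcdef0 \
  --throughput-mode provisioned --provisioned-throughput-in-mibps 128
```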
/kind bug
What happened?
Lately we discovered that some of our EFS CSI pods crash due to OOM kills.
What you expected to happen?
na
How to reproduce it (as minimally and precisely as possible)?
Unclear, because it does not affect all clusters or nodes.
Anything else we need to know?:
The plugin container was tested with 64/256, 128/128, and 256/256 MiB requests/limits. Only increasing them to 512Mi/512Mi solved the issue.
According to the Prometheus stats, the container does not consume more than 100 MiB of memory.
Environment
Kubernetes version (use kubectl version): 1.28.x
Please also attach debug logs to help us better diagnose.
This is the dmesg output: