Filesystem based chunk storage results in "chunk_io_locked" exception followed by fluentbit process termination #4597

Closed
Sabari-Arunkumar-ML opened this issue Jan 11, 2022 · 0 comments

Sabari-Arunkumar-ML commented Jan 11, 2022

Version: 1.7.5
Environment: Ubuntu (containerized) (k8s)

We run under high load in production, and our pipeline uses multiple files and rewrite_tag filters.
Whenever new log files appear in the k8s cluster over a period of time, we restart Fluent Bit (SIGTERM followed by creation of a new process).

Fluent Bit crashes sporadically.

The following GDB backtrace was taken from one of the crash dump files:

#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007fe965090921 in __GI_abort () at abort.c:79
#2 0x0000000000436a72 in flb_signal_handler (signal=11) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/fluent-bit.c:514
#3 <signal handler called>
#4 0x000000000072e3ee in cio_chunk_is_locked (ch=0x36) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/lib/chunkio/src/cio_chunk.c:343
#5 0x0000000000478b7c in input_chunk_get (tag=0x7fe960446f30 "klog", tag_len=4, in=0x7fe9603f2a80, chunk_size=368, set_down=0x7fe96504df08) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/flb_input_chunk.c:630
#6 0x0000000000479121 in flb_input_chunk_append_raw (in=0x7fe9603f2a80, tag=0x7fe960446f30 "klog", tag_len=4, buf=0x7fe96001b9d0, buf_size=368) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/flb_input_chunk.c:865
#7 0x000000000048cf47 in in_emitter_add_record (tag=0x7fe960446680 "klog", tag_len=4, buf_data=0x7fe96624901f <error: Cannot access memory at address 0x7fe96624901f>, buf_size=368, in=0x7fe9603f2a80) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/plugins/in_emitter/emitter.c:117
#8 0x0000000000523635 in process_record (tag=0x7fe960452b90 "kubelet", tag_len=7, map=..., buf=0x7fe96624901f, buf_size=368, keep=0x7fe96504e160, ctx=0x7fe9603f2270) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/plugins/filter_rewrite_tag/rewrite_tag.c:324
#9 0x000000000052378b in cb_rewrite_tag_filter (data=0x7fe96624901f, bytes=368, tag=0x7fe960452b90 "kubelet", tag_len=7, out_buf=0x7fe96504e1f8, out_bytes=0x7fe96504e1e8, f_ins=0x1320a40, filter_context=0x7fe9603f2270, config=0x128e290)
at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/plugins/filter_rewrite_tag/rewrite_tag.c:375
#10 0x000000000044cc0c in flb_filter_do (ic=0x7fe96043ee70, data=0x7fe960018ce0, bytes=371, tag=0x7fe96043ef00 "kubelet", tag_len=7, config=0x128e290) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/flb_filter.c:118
#11 0x00000000004792ee in flb_input_chunk_append_raw (in=0x7fe960404f50, tag=0x7fe96043ef00 "kubelet", tag_len=7, buf=0x7fe960018ce0, buf_size=371) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/flb_input_chunk.c:911
#12 0x000000000048cf47 in in_emitter_add_record (tag=0x7fe9604520c0 "kubelet", tag_len=7, buf_data=0x7fe96045c5d0 "\222\327", buf_size=371, in=0x7fe960404f50) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/plugins/in_emitter/emitter.c:117
#13 0x0000000000523635 in process_record (tag=0x7fe96042ede0 "syslog", tag_len=6, map=..., buf=0x7fe96045c5d0, buf_size=371, keep=0x7fe96504e510, ctx=0x7fe960404740) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/plugins/filter_rewrite_tag/rewrite_tag.c:324
#14 0x000000000052378b in cb_rewrite_tag_filter (data=0x7fe96045c5d0, bytes=371, tag=0x7fe96042ede0 "syslog", tag_len=6, out_buf=0x7fe96504e5a8, out_bytes=0x7fe96504e598, f_ins=0x1322050, filter_context=0x7fe960404740, config=0x128e290)
at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/plugins/filter_rewrite_tag/rewrite_tag.c:375
#15 0x000000000044cc0c in flb_filter_do (ic=0x7fe9604519d0, data=0x7fe960013630, bytes=177, tag=0x7fe96029a320 "syslog", tag_len=6, config=0x128e290) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/flb_filter.c:118
#16 0x00000000004792ee in flb_input_chunk_append_raw (in=0x12c30e0, tag=0x7fe96029a320 "syslog", tag_len=6, buf=0x7fe960013630, buf_size=177) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/flb_input_chunk.c:911
#17 0x0000000000491487 in process_content (file=0x7fe9602a11b0, bytes=0x7fe96504e858) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/plugins/in_tail/tail_file.c:367
#18 0x000000000049316b in flb_tail_file_chunk (file=0x7fe9602a11b0) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/plugins/in_tail/tail_file.c:994
#19 0x000000000048d9ba in in_tail_collect_event (file=0x7fe9602a11b0, config=0x128e290) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/plugins/in_tail/tail.c:261
#20 0x0000000000498277 in tail_fs_event (ins=0x12c30e0, config=0x128e290, in_context=0x7fe96029dec0) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/plugins/in_tail/tail_fs_inotify.c:268
#21 0x000000000044c6cd in flb_input_collector_fd (fd=215, config=0x128e290) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/flb_input.c:1004
#22 0x000000000045c5d8 in flb_engine_handle_event (config=0x128e290, mask=1, fd=215) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/flb_engine.c:363
#23 flb_engine_start (config=0x128e290) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/flb_engine.c:624
#24 0x00000000004422db in flb_lib_worker (data=0x128e260) at /home/ec2-user/sabari/fb_v_1.7.5/fluent-bit/src/flb_lib.c:493
#25 0x00007fe965e0a6db in start_thread (arg=0x7fe96504f700) at pthread_create.c:463
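
Reading the trace: frame #4 is entered with ch=0x36, while every other pointer in the trace looks like 0x7fe96... or 0x12.... An address that small is not a plausible chunk pointer, so reading any field through it would raise SIGSEGV, which matches signal=11 in frame #2. The sketch below only illustrates that failure mode and a defensive check a caller could make while debugging. The struct, the function names, and the 4096-byte threshold are all hypothetical; this is not the chunkio implementation and it does not explain why the pointer became invalid.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a chunk handle. The real chunkio
 * struct has a different layout; only a lock flag is modeled here.
 */
struct chunk_sketch {
    int lock;
};

/*
 * Assumption: addresses inside the first page (below 4096) are never
 * valid objects on a typical Linux process, so values such as NULL or
 * 0x36 can be rejected without dereferencing them.
 */
#define SKETCH_MIN_PLAUSIBLE_ADDR ((uintptr_t) 4096)

/* Returns 1 if the pointer value could be a real object, 0 otherwise. */
int chunk_ptr_is_plausible(const struct chunk_sketch *ch)
{
    return (uintptr_t) ch >= SKETCH_MIN_PLAUSIBLE_ADDR;
}

/*
 * Returns 1 if locked, 0 if unlocked, and -1 if the pointer is
 * implausible (instead of crashing on the dereference).
 */
int chunk_is_locked_checked(const struct chunk_sketch *ch)
{
    if (!chunk_ptr_is_plausible(ch)) {
        return -1;
    }
    return ch->lock;
}
```

A check like this only turns the crash into an error code at the call site. A pointer that is stale but still in a mapped region would pass it, so it is a debugging aid rather than a fix.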

Note: We never hit this scenario when we were not using filesystem-based storage.
