SIGSEGV in output plugin when trying to format data from chunk that is not up in memory #8691

Closed
epsteina16 opened this issue Apr 9, 2024 · 1 comment · Fixed by #8694

Comments

@epsteina16
Contributor

Bug Report

Describe the bug
When Fluent Bit is run with multiple output plugins, filesystem storage for the input plugin, and a limited number of chunks up (storage.max_chunks_up), it occasionally crashes with a SIGSEGV while formatting the msgpack data. To reproduce the bug, one of the output plugins needs to be retrying its assigned tasks.

#0  0x55a92accb8e2      in  flb_utils_write_str() at src/flb_utils.c:793
#1  0x55a92ac7b77d      in  msgpack2json() at src/flb_pack.c:641
#2  0x55a92ac7bcf7      in  msgpack2json() at src/flb_pack.c:729
#3  0x55a92ac7bebf      in  flb_msgpack_to_json() at src/flb_pack.c:768
#4  0x55a92ac7cc91      in  flb_msgpack_to_json_str() at src/flb_pack.c:1169
#5  0x55a92b0c1ff8      in  plain_output() at plugins/out_file/file.c:329
#6  0x55a92b0c2b86      in  cb_file_flush() at plugins/out_file/file.c:607
#7  0x55a92acb0157      in  output_pre_cb_flush() at include/fluent-bit/flb_output.h:597
#8  0x55a92ba090ca      in  co_init() at lib/monkey/deps/flb_libco/amd64.c:117
#9  0xffffffffffffffff  in  ???() at ???:0

The cause seems to be https://github.com/fluent/fluent-bit/blob/master/src/flb_task.c#L164, which puts the chunk down to the filesystem without checking whether other output plugins are about to use it.
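
To make the suspected race concrete, here is a minimal sketch of the check I would expect before a chunk is put down. Everything in it is hypothetical: `struct demo_chunk` and the `demo_chunk_*` helpers are illustration-only names, not Fluent Bit's API, and this is not necessarily how the real code tracks chunk users. The idea is simply that the payload is only released once no output route still has to read it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical chunk model; these names are NOT Fluent Bit's API. */
struct demo_chunk {
    char  *data;   /* in-memory payload, NULL while the chunk is "down" */
    size_t size;   /* payload length in bytes */
    int    users;  /* output routes that still have to read the payload */
};

/* Bring the chunk "up": copy the payload into memory. Returns 0 or -1. */
static int demo_chunk_up(struct demo_chunk *c, const char *payload)
{
    size_t len = strlen(payload);
    char *buf = malloc(len + 1);

    if (buf == NULL) {
        return -1;
    }
    memcpy(buf, payload, len + 1);
    free(c->data);
    c->data = buf;
    c->size = len;
    return 0;
}

/* An output route (first attempt or retry) announces it will read the chunk. */
static void demo_chunk_acquire(struct demo_chunk *c)
{
    c->users++;
}

/* An output route is done reading the chunk. */
static void demo_chunk_release(struct demo_chunk *c)
{
    assert(c->users > 0);
    c->users--;
}

/*
 * Put the chunk "down" only if nobody is about to use it.
 * Returns true if the memory was released, false if it was kept up.
 */
static bool demo_chunk_try_down(struct demo_chunk *c)
{
    if (c->users > 0) {
        return false;   /* another output still needs the payload */
    }
    free(c->data);
    c->data = NULL;
    c->size = 0;
    return true;
}
```

With two outputs holding the chunk, `demo_chunk_try_down()` keeps the payload in memory until both have called `demo_chunk_release()`. Without such a check, the second output would format freed or unmapped memory, which matches the crash in the backtrace above.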

To Reproduce
Configuration file:

[SERVICE]
  flush 1
  storage.path /disk1/tmp/fbit-chunk-issue
  storage.max_chunks_up 5
  log_level debug
  grace 10

[INPUT]
  Name tail
  Path /disk1/tmp/fbit-chunk-issue/forfb.log
  Read_from_Head True
  storage.type filesystem
  mem_buf_limit 1

[OUTPUT]
  Name tcp
  Match *
  Host localhost
  Port 28554
  Format json_lines
  retry_limit 5
  workers 0

[OUTPUT]
  Name file
  Match *
  Path /disk1/tmp
  File ignore.log
  Format plain

Run Fluent Bit with this configuration for ~5 seconds before starting the TCP server. This forces a number of tasks to be rescheduled.
For the TCP server, I ran:
nc -l -k 28554

The log file I used in this configuration file has 7720000 lines of JSON.
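
If you need a comparable input file, a small helper along these lines can generate one. This is only a sketch: the record layout (`seq`, `message`) is an assumption made for illustration and is not the content of the file I actually used. Only the line count and the fact that each line is JSON come from my setup.

```c
#include <stdio.h>

/*
 * Write `count` JSON lines to `out`.
 * The record fields are made up for illustration.
 * Returns the number of lines written, or -1 on an I/O error.
 */
static long write_json_lines(FILE *out, long count)
{
    long i;

    for (i = 0; i < count; i++) {
        if (fprintf(out, "{\"seq\":%ld,\"message\":\"sample log line\"}\n", i) < 0) {
            return -1;
        }
    }
    if (fflush(out) != 0) {
        return -1;
    }
    return i;
}
```

To use it, open the file named in the tail input's Path with `fopen(path, "w")` and call `write_json_lines(fp, 7720000)`.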

Expected behavior
Fluent Bit should only put a chunk down when there is no output plugin that is about to use it.

Your Environment

  • Version used: Using a local build that was just updated from main (v3.0.1)
  • Operating System and version: Ubuntu 20.04
@edsiper
Member

edsiper commented Apr 9, 2024

Thanks for reporting the issue, troubleshooting, and pointing out the root cause! I submitted a fix here: #8694

edsiper added a commit that referenced this issue Apr 9, 2024
Signed-off-by: Eduardo Silva <eduardo@calyptia.com>