Fluentbit pods are crashing/restarting after SIGSEGV error #7751

Closed
RaniVedprakash opened this issue Jul 25, 2023 · 2 comments

Issue:
We recently migrated to Fluent Bit v2.1.2.
Since then, Fluent Bit pods crash and restart after a SIGSEGV error. So far this has only been encountered on Prod accounts.

Error:

[2023/07/25 05:26:19] [ info] [input:tail:tail.0] inotify_fs_add(): inode=22XXXX watch_fd=164 name=/var/lib/docker/containers/XXXXXXXXXXXXXX-json.log.1
[2023/07/25 05:26:19] [ info] [input:tail:tail.0] inotify_fs_add(): inode=22XXXX watch_fd=165 name=/var/log/containers/XXXXXXXXXXXXXXX.log
[2023/07/25 05:26:44] [error] [/src/fluent-bit/src/tls/openssl.c:488 errno=32] Broken pipe
[2023/07/25 05:26:44] [error] [tls] syscall error: error:00000005:lib(0):func(0):DH lib
[2023/07/25 05:26:44] [error] [/src/fluent-bit/src/flb_http_client.c:1238 errno=32] Broken pipe
[2023/07/25 05:26:44] [error] [tls] error: error:00000001:lib(0):func(0):reason(1)
[2023/07/25 05:26:44] [error] [output:s3:s3.1] PutObject request failed
[2023/07/25 05:26:44] [error] [/src/fluent-bit/src/tls/openssl.c:488 errno=32] Broken pipe
[2023/07/25 05:26:44] [error] [tls] syscall error: error:00000005:lib(0):func(0):DH lib
[2023/07/25 05:26:44] [error] [/src/fluent-bit/src/flb_http_client.c:1238 errno=32] Broken pipe
[2023/07/25 05:26:44] [error] [tls] error: error:00000001:lib(0):func(0):reason(1)
[2023/07/25 05:26:44] [error] [output:s3:s3.0] PutObject request failed
[2023/07/25 05:26:46] [error] [tls] error: error:00000001:lib(0):func(0):reason(1)
[2023/07/25 05:26:46] [error] [tls] error: error:00000001:lib(0):func(0):reason(1)
[2023/07/25 05:26:46] [error] [output:s3:s3.1] PutObject request failed
[2023/07/25 05:26:46] [engine] caught signal (SIGSEGV)
#0 0x564e83c08206 in __mk_list_del() at lib/monkey/include/monkey/mk_core/mk_list.h:141
#1 0x564e83c0823d in mk_list_del() at lib/monkey/include/monkey/mk_core/mk_list.h:147
#2 0x564e83c08754 in flb_bucket_queue_delete_min() at include/fluent-bit/flb_bucket_queue.h:113
#3 0x564e83c087a1 in flb_bucket_queue_pop_min() at include/fluent-bit/flb_bucket_queue.h:122
#4 0x564e83c0dbbc in output_thread() at src/flb_output_thread.c:249
#5 0x564e83c4d61f in step_callback() at src/flb_worker.c:43
#6 0x7ff87fc9aea6 in ???() at ???:0
#7 0x7ff87f54da2e in ???() at ???:0
#8 0xffffffffffffffff in ???() at ???:0
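
For triage, an excerpt like the one above can be summarised with a short script. The following is a minimal sketch, assuming log lines shaped like `[timestamp] [level] message` and backtrace frames shaped like `#N 0xADDR in func() at file:line`, as they appear in this report. The names `summarize`, `LOG_LINE`, `FRAME`, and `SAMPLE` are hypothetical helpers for illustration and are not part of Fluent Bit.

```python
import re
from collections import Counter

# Lines shaped like "[2023/07/25 05:26:44] [error] message" (format taken from the excerpt).
LOG_LINE = re.compile(r"^\[(?P<ts>[\d/]+ [\d:]+)\] \[\s*(?P<level>\w+)\] (?P<msg>.*)$")
# Backtrace frames shaped like "#0 0x... in func() at file:line".
FRAME = re.compile(r"^#(?P<idx>\d+)\s+0x[0-9a-f]+ in (?P<func>\S+)\(\) at (?P<loc>\S+)$")


def summarize(log_text):
    """Count error messages, collect backtrace frames, and find the signal timestamp."""
    errors = Counter()
    frames = []
    signal_ts = None
    for raw in log_text.splitlines():
        line = raw.strip()
        frame = FRAME.match(line)
        if frame:
            frames.append((int(frame["idx"]), frame["func"], frame["loc"]))
            continue
        entry = LOG_LINE.match(line)
        if not entry:
            continue
        if entry["level"] == "error":
            errors[entry["msg"]] += 1
        elif "caught signal" in entry["msg"]:
            signal_ts = entry["ts"]
    return errors, frames, signal_ts


# A few lines copied from the excerpt in this report.
SAMPLE = """\
[2023/07/25 05:26:46] [error] [tls] error: error:00000001:lib(0):func(0):reason(1)
[2023/07/25 05:26:46] [error] [output:s3:s3.1] PutObject request failed
[2023/07/25 05:26:46] [engine] caught signal (SIGSEGV)
#0 0x564e83c08206 in __mk_list_del() at lib/monkey/include/monkey/mk_core/mk_list.h:141
#1 0x564e83c0823d in mk_list_del() at lib/monkey/include/monkey/mk_core/mk_list.h:147
"""

if __name__ == "__main__":
    errors, frames, signal_ts = summarize(SAMPLE)
    print("signal at:", signal_ts)
    print("top frame:", frames[0])
    for message, count in errors.most_common():
        print(count, message)
```

Run against the full pod log, this makes it easier to see which output instances (`s3.0`, `s3.1`) reported failures shortly before the signal, and which function is at the top of the backtrace.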

ConfigMap:
apiVersion: v1
data:
  custom_parsers.conf: |
    [PARSER]
        Name docker
        Format json
        Time_Keep Off
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z
  fluent-bit.conf: |
    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level info
        Parsers_File parsers.conf
        Parsers_File custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port 2020
        Health_Check On
        storage.path /var/log/flb-storage/
        storage.sync normal
        storage.checksum off
        storage.max_chunks_up 128
        storage.backlog.mem_limit 5M

    [INPUT]
        Name tail
        Read_from_Head True
        Path /var/log/containers/*.log
        Exclude_Path   /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent*
        multiline.parser docker, cri
        DB                  var/log/flb_container.db
        Tag kube.*
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem

    [FILTER]
        Name kubernetes
        Match kube.**
        Merge_Log On
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log_Key       log
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
        Annotations On
        Labels On

    [FILTER]
        Name record_modifier
        Match *
        Record XXXXX-us-east-2
        Record region us-east-2
        Record team XXXX
        Record stage production
        Remove_key $.kubernetes.pod_id
        Remove_key $.kubernetes.master_url
        Remove_key $.kubernetes.container_image_id
        Remove_key $.kubernetes.namespace_id
        Remove_key $.kubernetes.container_hash

    [FILTER]
        Name parser
        Match *
        Key_Name log
        Reserve_Data true
        Parser json

    [OUTPUT]
        Name s3
        Match *
        bucket XXXX-bucket
        region us-east-2
        use_put_object On
        total_file_size 10M
        upload_timeout 1m
        s3_key_format /XXX/$TAG/%Y/%m/%d/%H/%M/%S/$UUID.gz
        static_file_path             On
        compression gzip
        s3_key_format_tag_delimiters .

    [OUTPUT]
        Name s3
        Match *
        bucket XXXX-log
        region us-east-2
        use_put_object On
        total_file_size 10M
        upload_timeout 1m
        s3_key_format /XXXX/$TAG/%Y/%m/%d/%H/%M/%S/$UUID.gz
        static_file_path             On
        compression gzip
        s3_key_format_tag_delimiters .

kind: ConfigMap
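
Both `[OUTPUT]` sections in this configuration use `Match *`, so every record is routed to both S3 buckets, which is consistent with the log showing failures from `s3.0` and `s3.1`. A minimal sketch of how such a classic-mode config could be inspected is shown below. The parser is a deliberately simplified illustration and not Fluent Bit's own: `parse_classic_config`, `outputs_by_match`, and `SAMPLE_CONF` are hypothetical names, and repeated keys within a section (such as several `Record` lines) keep only the last value.

```python
def parse_classic_config(text):
    """Split a classic-mode config into (SECTION, {key: value}) pairs.

    Simplified for illustration: keys are lowercased, repeated keys keep only
    the last value, and lines starting with '#' are skipped.
    """
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].upper(), {}))
        elif sections:
            parts = line.split(None, 1)  # key, then the rest of the line as value
            value = parts[1].strip() if len(parts) > 1 else ""
            sections[-1][1][parts[0].lower()] = value
    return sections


def outputs_by_match(sections):
    """Group the buckets of OUTPUT sections by their Match pattern."""
    grouped = {}
    for name, props in sections:
        if name == "OUTPUT":
            grouped.setdefault(props.get("match"), []).append(props.get("bucket"))
    return grouped


# Trimmed copy of the two OUTPUT sections from the ConfigMap in this report.
SAMPLE_CONF = """
[OUTPUT]
    Name s3
    Match *
    bucket XXXX-bucket
    region us-east-2

[OUTPUT]
    Name s3
    Match *
    bucket XXXX-log
    region us-east-2
"""

if __name__ == "__main__":
    for pattern, buckets in outputs_by_match(parse_classic_config(SAMPLE_CONF)).items():
        print(pattern, "->", buckets)
```

Listing outputs by match pattern this way makes overlapping routes visible at a glance when narrowing down which output instance is involved in a crash.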

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions bot added the Stale label on Dec 11, 2023.
This issue was closed because it has been stalled for 5 days with no activity.

github-actions bot closed this as not planned on Dec 19, 2023.