Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

input_chunk: prevent flb_input_chunk_protect when fs buffering(#4221) #4222

Closed
wants to merge 1 commit into from

Conversation

nokute78
Copy link
Collaborator

Fixes #4221.

#4077 removes return statement.
When it reaches mem_buf_limit, fluent-bit calls flb_input_chunk_protect and it pauses input plugin even if storage.type is filesystem.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

Documentation

  • [N/A] Documentation required for this feature

Configuration

[SERVICE]
    Storage.path hoge

[INPUT]
    Name dummy
    dummy {"msg":"hoge"}
    rate 1000
    Mem_Buf_Limit 1kb
    Storage.type filesystem
    Samples 1000

[OUTPUT]
    Name stdout
    Match *

Debug output

$ ../bin/fluent-bit -c a.conf 
Fluent Bit v1.9.0
* Copyright (C) 2019-2021 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2021/10/24 18:47:57] [ info] [engine] started (pid=45521)
[2021/10/24 18:47:57] [ info] [storage] created root path hoge
[2021/10/24 18:47:57] [ info] [storage] version=1.1.1, initializing...
[2021/10/24 18:47:57] [ info] [storage] root path 'hoge'
[2021/10/24 18:47:57] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2021/10/24 18:47:57] [ info] [storage] backlog input plugin: storage_backlog.1
[2021/10/24 18:47:57] [ info] [cmetrics] version=0.2.1
[2021/10/24 18:47:57] [ info] [input:storage_backlog:storage_backlog.1] queue memory limit: 95.4M
[2021/10/24 18:47:57] [ info] [sp] stream processor started
[0] dummy.0: [1635068877.578575668, {"msg"=>"hoge"}]
[1] dummy.0: [1635068877.578995732, {"msg"=>"hoge"}]
[2] dummy.0: [1635068877.579650781, {"msg"=>"hoge"}]
snip
[996] dummy.0: [1635068878.601826253, {"msg"=>"hoge"}]
[997] dummy.0: [1635068878.602608499, {"msg"=>"hoge"}]
[998] dummy.0: [1635068878.603778437, {"msg"=>"hoge"}]
[999] dummy.0: [1635068878.604611922, {"msg"=>"hoge"}]
^C[2021/10/24 18:48:02] [engine] caught signal (SIGINT)
[2021/10/24 18:48:02] [ info] [input] pausing storage_backlog.1
[2021/10/24 18:48:02] [ warn] [engine] service will stop in 5 seconds
[2021/10/24 18:48:06] [ info] [engine] service stopped

Valgrind output

$ valgrind --leak-check=full ../bin/fluent-bit -c a.conf 
==45526== Memcheck, a memory error detector
==45526== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==45526== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==45526== Command: ../bin/fluent-bit -c a.conf
==45526== 
Fluent Bit v1.9.0
* Copyright (C) 2019-2021 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2021/10/24 18:48:57] [ info] [engine] started (pid=45526)
[2021/10/24 18:48:57] [ info] [storage] version=1.1.1, initializing...
[2021/10/24 18:48:57] [ info] [storage] root path 'hoge'
[2021/10/24 18:48:57] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2021/10/24 18:48:57] [ info] [storage] backlog input plugin: storage_backlog.1
[2021/10/24 18:48:57] [ info] [cmetrics] version=0.2.1
[2021/10/24 18:48:57] [ info] [input:storage_backlog:storage_backlog.1] queue memory limit: 95.4M
[2021/10/24 18:48:57] [ info] [sp] stream processor started
==45526== Warning: client switching stacks?  SP change: 0x57e59c8 --> 0x6c61e10
==45526==          to suppress, use: --max-stackframe=21480520 or greater
==45526== Warning: client switching stacks?  SP change: 0x6c61d88 --> 0x57e59c8
==45526==          to suppress, use: --max-stackframe=21480384 or greater
==45526== Warning: client switching stacks?  SP change: 0x57e59c8 --> 0x6c61d88
==45526==          to suppress, use: --max-stackframe=21480384 or greater
==45526==          further instances of this message will not be shown.
[0] dummy.0: [1635068937.783051276, {"msg"=>"hoge"}]
[1] dummy.0: [1635068937.906606225, {"msg"=>"hoge"}]
snip
[997] dummy.0: [1635068938.946732724, {"msg"=>"hoge"}]
[998] dummy.0: [1635068938.947715166, {"msg"=>"hoge"}]
[999] dummy.0: [1635068938.948873731, {"msg"=>"hoge"}]
^C[2021/10/24 18:49:29] [engine] caught signal (SIGINT)
[2021/10/24 18:49:29] [ info] [input] pausing storage_backlog.1
[2021/10/24 18:49:29] [ warn] [engine] service will stop in 5 seconds
[2021/10/24 18:49:33] [ info] [engine] service stopped
==45526== 
==45526== HEAP SUMMARY:
==45526==     in use at exit: 0 bytes in 0 blocks
==45526==   total heap usage: 7,262 allocs, 7,262 frees, 26,190,517 bytes allocated
==45526== 
==45526== All heap blocks were freed -- no leaks are possible
==45526== 
==45526== For lists of detected and suppressed errors, rerun with: -s
==45526== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

…#4221)

Signed-off-by: Takahiro Yamashita <nokute78@gmail.com>
@@ -1440,6 +1440,7 @@ int flb_input_chunk_append_raw(struct flb_input_instance *in,
cio_chunk_down(ic->chunk);
}
}
return 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think removing this return so flb_input_chunk_protect gets to run in those cases as well was intentional, @pwhelan do you think this could cause a regression in regards to what you wanted to achieve?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will definitely cause a regression. Storage pausing should only turn on for storage.type filesystem when storage.pause_on_chunks_overlimit is activated or if the memory chunks themselves are overlimit. If the intended behavior is to not pause on memory chunks going over limit when storage.type is file system then that should be fixed in flb_input_chunk_protect.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leonardo-albertovich @pwhelan Thank you for comments. I'll drop this PR.

Is #4221 a issue or my misunderstanding ? Do we need fix flb_input_chunk_protect ?

@nokute78
Copy link
Collaborator Author

I'll drop this PR since it causes regression.
@leonardo-albertovich @pwhelan Thank you for reviewing.

@nokute78 nokute78 closed this Oct 25, 2021
@nokute78 nokute78 deleted the fix_4221 branch October 25, 2021 12:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mem_Buf_Limit + Storage.type filesystem seems not to work from v1.8.7
4 participants