You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
When using Fluent Bit 2.2.2 to collect syslogs using the syslog input (without parsing) and then parsing it in a separate parsing step it seems that some keys/values are put twice in the resulting msgbuf, appearing after using the rewrite_tag filter on it. If you parse the re-tagged msgbuf again after that with another parser you will leak memory.
To Reproduce
A complete package with fluent-bit.conf, parsers.conf and docker-compose.yml for use with Docker Compose V2 can be found here: leak-test.tar.gz, just start the test with "docker compose up" and see the valgrind output after running "docker compose restart".
The fluent-bit.conf demonstrating the problem looks like (using dummy input instead of syslog input):
[SERVICE]
flush 1
daemon Off
log_level trace
parsers_file parsers/parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_PORT 2020
[INPUT]
Name dummy
Dummy {"date":1710410357.881099,"log":"<30>1 2023-08-21T15:10:16.003413+02:00 testserver test-app 1014 test-app - 2023-08-21 15:10:16.002453+02:00 [info] MSGTEST [TEST@12345 from=\"user.1.test@test.com\" to=\"user.2.test@test.com\" direction=\"in\"]","src_up_raw":"tcp://192.168.37.1:39796"}
Tag syslog
[FILTER]
Name parser
Match_Regex syslog
Key_Name log
Parser syslog-rfc5424
#Parser syslog-rfc3164
Reserve_Data On
Preserve_Key On
[FILTER]
Name rewrite_tag
Match syslog
Rule $ident ^(.+)$ syslog.$1 true
[FILTER]
Name modify
Match syslog.test-app
Condition Key_Exists msgid
Condition Key_Exists pri
Condition Key_Exists extradata
Condition Key_Value_Equals extradata -
Remove msgid
Remove pri
Remove severity
Remove facility
Remove extradata
[FILTER]
Name parser
Match syslog.test-app
#Match syslog
key_name message
parser test-app
Reserve_Data On
Preserve_Key Off
[OUTPUT]
Name stdout
Match *
# Format json_lines
After the rewrite_tag step (which creates syslog.test-app) you can see that the items "time", "host", "ident" and "pid" exist twice (if this is converted to JSON the duplicate keys/values disappears).
Valgrind output after running the test for 89315 seconds:
fluent-bit-1 | ==1== 458,752 bytes in 56 blocks are possibly lost in loss record 1 of 2
fluent-bit-1 | ==1== at 0x48386AF: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
fluent-bit-1 | ==1== by 0x483ADE7: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
fluent-bit-1 | ==1== by 0x341FDB: msgpack_sbuffer_write (sbuffer.h:81)
fluent-bit-1 | ==1== by 0x341CAA: msgpack_pack_map (pack_template.h:745)
fluent-bit-1 | ==1== by 0x342513: flb_parser_regex_do (flb_parser_regex.c:143)
fluent-bit-1 | ==1== by 0x33DADE: flb_parser_do (flb_parser.c:976)
fluent-bit-1 | ==1== by 0x837C01: cb_parser_filter (filter_parser.c:280)
fluent-bit-1 | ==1== by 0x2AD60C: flb_filter_do (flb_filter.c:158)
fluent-bit-1 | ==1== by 0x2A30FA: input_chunk_append_raw (flb_input_chunk.c:1520)
fluent-bit-1 | ==1== by 0x2A3B65: flb_input_chunk_append_raw (flb_input_chunk.c:1861)
fluent-bit-1 | ==1== by 0x38CEBD: input_log_append (flb_input_log.c:71)
fluent-bit-1 | ==1== by 0x38CF48: flb_input_log_append (flb_input_log.c:90)
fluent-bit-1 | ==1==
fluent-bit-1 | ==1== 731,668,480 bytes in 89,315 blocks are definitely lost in loss record 2 of 2
fluent-bit-1 | ==1== at 0x48386AF: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
fluent-bit-1 | ==1== by 0x483ADE7: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
fluent-bit-1 | ==1== by 0x341FDB: msgpack_sbuffer_write (sbuffer.h:81)
fluent-bit-1 | ==1== by 0x341CAA: msgpack_pack_map (pack_template.h:745)
fluent-bit-1 | ==1== by 0x342513: flb_parser_regex_do (flb_parser_regex.c:143)
fluent-bit-1 | ==1== by 0x33DADE: flb_parser_do (flb_parser.c:976)
fluent-bit-1 | ==1== by 0x837C01: cb_parser_filter (filter_parser.c:280)
fluent-bit-1 | ==1== by 0x2AD60C: flb_filter_do (flb_filter.c:158)
fluent-bit-1 | ==1== by 0x2A30FA: input_chunk_append_raw (flb_input_chunk.c:1520)
fluent-bit-1 | ==1== by 0x2A3B65: flb_input_chunk_append_raw (flb_input_chunk.c:1861)
fluent-bit-1 | ==1== by 0x38CEBD: input_log_append (flb_input_log.c:71)
fluent-bit-1 | ==1== by 0x38CF48: flb_input_log_append (flb_input_log.c:90)
fluent-bit-1 | ==1==
fluent-bit-1 | ==1== LEAK SUMMARY:
fluent-bit-1 | ==1== definitely lost: 731,668,480 bytes in 89,315 blocks
fluent-bit-1 | ==1== indirectly lost: 0 bytes in 0 blocks
fluent-bit-1 | ==1== possibly lost: 458,752 bytes in 56 blocks
fluent-bit-1 | ==1== still reachable: 0 bytes in 0 blocks
fluent-bit-1 | ==1== suppressed: 0 bytes in 0 blocks
If you change the second regex parsing step (parser test-app) to match syslog instead of syslog.test-app then you still get double key/value pars but no memory leak ("fluent-bit-1 | ==1== All heap blocks were freed -- no leaks are possible").
Expected behavior
To not leak memory and not get double key/value pairs.
Your Environment
Version used: Fluent Bit 2.2.2
Environment name and version: Docker Compose version v2.24.6
Operating System and version: Ubuntu 22.04.4 LTS
Additional context
The reason to use a separate parsing step instead of directly in the syslog input is to be able to parse both RFC5424 and RFC3164 logs.
The text was updated successfully, but these errors were encountered:
If you have several keys with the same name in a map and use the parser filter
plugin with reserve_data set to true then it can assign out_buf multiple times
resulting in leaked memory. This fix makes it continue copying original fields
if reserve_data is set but stop parsing them, similar to when reserve_data is
set to false.
Bug Report
Describe the bug
When using Fluent Bit 2.2.2 to collect syslogs using the syslog input (without parsing) and then parsing it in a separate parsing step it seems that some keys/values are put twice in the resulting msgbuf, appearing after using the rewrite_tag filter on it. If you parse the re-tagged msgbuf again after that with another parser you will leak memory.
To Reproduce
A complete package with fluent-bit.conf, parsers.conf and docker-compose.yml for use with Docker Compose V2 can be found here: leak-test.tar.gz, just start the test with "docker compose up" and see the valgrind output after running "docker compose restart".
The fluent-bit.conf demonstrating the problem looks like (using dummy input instead of syslog input):
After the rewrite_tag step (which creates syslog.test-app) you can see that the items "time", "host", "ident" and "pid" exist twice (if this is converted to JSON the duplicate keys/values disappears).
If you change the second regex parsing step (parser test-app) to match syslog instead of syslog.test-app then you still get double key/value pars but no memory leak ("fluent-bit-1 | ==1== All heap blocks were freed -- no leaks are possible").
Expected behavior
To not leak memory and not get double key/value pairs.
Your Environment
Additional context
The reason to use a separate parsing step instead of directly in the syslog input is to be able to parse both RFC5424 and RFC3164 logs.
The text was updated successfully, but these errors were encountered: