in_tail is not picking up all log files in /var/log/containers/*.log #3357
I have done some additional testing: after evicting the pod called service5, which had a large log file, and then restarting the Fluentd pod, all log files are now picked up. I suspect this comes down to a few questions: how large is too large for Fluentd's in_tail, and are there any optimizations that can be made to help support this scenario? I am scouring https://docs.fluentd.org/input/tail and the only thing I see related to large files is https://docs.fluentd.org/input/tail#read_from_head, which we do have enabled. I also see https://docs.fluentd.org/input/tail#enable_watch_timer, which I need to research a bit more, but it could potentially help us since we are using the default value of true. Is there anything else that you would recommend here?
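For reference, here is a minimal sketch of the kind of in_tail source we use, with the two options above shown explicitly. The path, tag, and pos_file are illustrative stand-ins, not our exact configuration (which I have not pasted here):

```
<source>
  @type tail
  # Illustrative values; not our actual deployment config.
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  # Read pre-existing content from the beginning when a file is first discovered.
  read_from_head true
  # Default is true: a watch timer supplements inotify when detecting file changes.
  enable_watch_timer true
  <parse>
    @type none
  </parse>
</source>
```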
Is it reproducible?
I went to the K8s node where that evicted pod, service5, landed, and it looks like everything is working as expected. I have also deleted the Fluentd pod again to see if I could reproduce the behavior seen before, but things are still working. @ashie, if this happens again, is there any better information I could capture that would help troubleshoot this? Just let me know what to capture if this happens again and I will do so. Thanks for looking at this.
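One thing I could do next time (my own idea, and again the config below is an illustrative sketch rather than our real one): raise the log level on the tail source so in_tail emits more detail. Even at the default level, in_tail logs a "following tail of <path>" line for each file it starts watching, so those lines can be counted against the files actually present in /var/log/containers/:

```
<source>
  @type tail
  # Illustrative values; not our actual deployment config.
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  # Per-plugin log level override, for extra detail from in_tail.
  @log_level debug
  <parse>
    @type none
  </parse>
</source>
```

Any file in the directory without a matching "following tail of" line would then be exactly the set that was never picked up.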
Similar issue?: #3239
I wanted to see if Fluentd was able to read the full file and move on, and indeed I can see the following log lines in Fluentd:
This would indicate that it took 18 minutes to read that file, because once it finished reading the file, it moved on. I suspect the cause here is a large log file. This means that I can't reproduce the same thing that I saw before. I will dig more into this to see if it is related to #3239. Thanks, @ashie.
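For rough scale (my own arithmetic, using the ~350MB size mentioned in the report below): 350 MB read over 18 minutes is 350 MB / 1080 s ≈ 0.32 MB/s, far below raw disk throughput, which would be consistent with per-event processing, rather than I/O, dominating the read time.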
I've now found that #2478 is probably almost the same issue as this one.
Check the CONTRIBUTING guideline first; here is the list to help us investigate the problem.
Describe the bug
The in_tail plugin isn't following all files on startup of Fluentd.
To Reproduce
I was troubleshooting why we weren't getting logs from a specific pod. I killed the Fluentd container using the kubectl delete pod command. When it came back up, it wasn't following all of the files that exist in the /var/log/containers/ directory: it followed 5 of the 43 log files that currently exist there.
Expected behavior
Follow all logs that exist in the /var/log/containers/ directory whenever a file (symlink) is present there.
Your Environment
fluentd --version or td-agent --version: fluentd 1.12.2 (more specifically 2613fcb; we build from source)
cat /etc/os-release: CentOS 7
uname -r: 4.18.0-240.15.1.el8_3.x86_64
If you hit the problem with an older fluentd version, try the latest version first: Done.
Your Configuration
Your Error Log
Additional context
The last log here, service5.log, is one of our larger log files, which at the time was about 350MB, and I am wondering if this is a potential cause of the issue. Our limit before rotation is 500MB, though. I can't imagine this is enough information to really dig deep here, so please feel free to engage me in the Fluentd Slack channel; I can provide more context and/or learn what better information would help.