
Add a feature to clean up position informations on in_tail #1126

Closed
thkoch2001 opened this issue Jul 27, 2016 · 11 comments · Fixed by #2805

Comments

@thkoch2001

  • fluentd version: 0.12.20 - commit aee8086
  • environment: fluentd pods running on Google Container Engine / kubernetes

Google Container Engine uses fluentd pods to collect container log files via the in_tail plugin and forward the logs to Stackdriver logging.

When a container is deleted, kubernetes also deletes the container's log file, and there will never be a log file at that filesystem path again.

However, the obsolete line is never cleaned up from the position file, even though its position value is ffffffffffffffff.

We see production clusters with position files of over 10000 lines.

Can this cause performance problems with fluentd? Should this be fixed?

The config stanza for the container log files is:

<source>
  type tail
  format json
  time_key time
  path /var/log/containers/*.log
  pos_file /var/log/gcp-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag reform.*
  read_from_head true
</source>
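
For illustration, a minimal sketch that counts how many entries in such a pos_file are stale. It assumes one tab-separated line per path (path, position as 16 hex digits, inode as 16 hex digits) and treats a position of ffffffffffffffff, as described above, as stale; the layout is an assumption, so verify it against your fluentd version.

#!/usr/bin/env python3
# Sketch: count stale entries in a fluentd in_tail pos_file.
# Assumed layout (verify for your fluentd version):
#   <path><TAB><position as 16 hex digits><TAB><inode as 16 hex digits>
# A position of ffffffffffffffff is treated as "no longer watched".
import sys

UNWATCHED = 0xFFFFFFFFFFFFFFFF

def count_stale(pos_file_path):
    total = stale = 0
    with open(pos_file_path) as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) != 3:
                continue  # skip malformed or partially written lines
            total += 1
            if int(parts[1], 16) == UNWATCHED:
                stale += 1
    return total, stale

if __name__ == "__main__":
    total, stale = count_stale(sys.argv[1])
    print(f"{stale} stale of {total} entries")

Running it against /var/log/gcp-containers.log.pos shows how much of the file consists of entries for already-deleted containers.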
@repeatedly
Member

repeatedly commented Jul 27, 2016

in_tail removes untracked file positions during the start phase.
So if you restart fluentd, the pos_file is updated.

Can this cause performance problems with fluentd?

I'm not sure. I haven't received any reports of pos_file-related performance issues.

You can see the pos_file implementation in class FilePositionEntry.
I think this IO cost is negligible on a normal file system.
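
A rough illustration of why that per-update IO cost stays small: as far as I can tell from FilePositionEntry, each entry remembers the byte offset of its own line and overwrites only its fixed-width hex position field in place, so a single update costs the same whether the pos_file has 10 lines or 10000. A small Python sketch of the idea (names are mine, not the Ruby API):

POS_WIDTH = 16  # position stored as 16 hex digits, e.g. ffffffffffffffff

class FilePositionEntrySketch:
    def __init__(self, pos_file, field_offset):
        self.f = pos_file            # pos_file opened in "r+b" mode
        self.offset = field_offset   # byte offset of this entry's position field

    def update_pos(self, pos):
        # Seek to this entry's field and overwrite it in place;
        # the fixed width means no other line has to move.
        self.f.seek(self.offset)
        self.f.write(f"{pos:0{POS_WIDTH}x}".encode("ascii"))
        self.f.flush()

The complaint in this issue is about stale lines accumulating, which is separate from this per-update cost.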

@tagomoris
Member

tagomoris commented Jul 29, 2016

A file that grows indefinitely sounds terrible in a production environment (even if it grows slowly, and even if there are no reports of a performance regression).
I think we should add a feature to the tail plugin to clean it up.

But currently we have a plan to switch from the pos file to a storage plugin for that purpose.
I think we can add this cleanup feature at the same time as switching to storage plugins.

@tagomoris tagomoris changed the title in_tail plugin does not clean deleted log files from position file Add a feature to clean up position informations on in_tail Jul 29, 2016
@repeatedly
Member

I added a note about this issue to http://docs.fluentd.org/articles/in_tail.

@piosz

piosz commented Nov 18, 2016

cc @crassirostris

@mrkstate

mrkstate commented Jul 6, 2018

How do we get this feature (defect) fixed? This is creating a problem for us, as the pos file is growing rather quickly and the process has to keep combing through it. We see constant CPU usage even after no additional data has been written to the tailed log.

@roffe

roffe commented Dec 6, 2018

@mrkstate Run a cron that kills fluentd every 30 minutes https://github.com/roffe/kube-gelf/blob/master/cron.yaml
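
For reference, a minimal sketch of that idea (not the linked manifest): a CronJob that restarts the fluentd DaemonSet on a schedule, so in_tail rewrites the pos_file at startup. The namespace, DaemonSet name, image, and service account below are assumptions, and the service account needs RBAC permission to perform the rollout.

apiVersion: batch/v1          # batch/v1beta1 on older clusters
kind: CronJob
metadata:
  name: fluentd-restart
  namespace: logging
spec:
  schedule: "*/30 * * * *"    # every 30 minutes
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: fluentd-restart   # needs permission to patch the DaemonSet
          restartPolicy: OnFailure
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
            command: ["kubectl", "-n", "logging", "rollout", "restart", "daemonset/fluentd"]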

@Krishna1408

Hi @roffe, if I kill fluentd, won't it create a problem for logging? E.g. if fluentd was down for 2 minutes, won't I lose the log data for those 2 minutes?

@roffe

roffe commented Jan 9, 2019

@Krishna1408 no, since the position file will let it know where to pick up from last time.

@TiagoJMartins

Is restarting fluentd still the only solution to this? I've just confirmed it's still happening, and it would be nice to know if a fix is underway or if I can try to tackle the issue.

@ganmacs
Member

ganmacs commented Feb 5, 2020

This feature will be released in the next version. #2805
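
For anyone landing here later: if I read #2805 correctly, it adds a periodic pos_file compaction option to in_tail. A sketch of enabling it, with the parameter name to be double-checked against the in_tail documentation for your release (the syntax below is fluentd v1, unlike the v0.12 config at the top of this issue):

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/gcp-containers.log.pos
  # Periodically rewrite the pos_file, dropping entries for files that are
  # no longer watched (e.g. logs of deleted containers).
  pos_file_compaction_interval 72h
  tag reform.*
  read_from_head true
  <parse>
    @type json
    time_key time
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>

On versions without this option, a restart still cleans up the pos_file at the start phase, as noted earlier in this thread.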

@juliantaylor

We see position file corruption with compaction enabled, see #2918.
