pos file cleanup #3488

Closed
toyaser opened this issue Aug 17, 2021 · 4 comments

toyaser commented Aug 17, 2021

Describe the bug

Hi,

I am trying to understand how the fluentd pos file cleanup happens.

At the moment, I am using the Docker image of fluentd version 1.11.5-1.0, but I have also tried version 1.13.0-1.0.

Here is a sample conf file I have:

<source>
  @type tail
  @id in_tail_ks_application_log
  path "ks-*.log"
  pos_file "ks-auditlog.log.pos"
  pos_file_compaction_interval "1m"
  tag ks-audit-logs.*
  refresh_interval 10
  enable_watch_timer false
  enable_stat_watcher true
  limit_recently_modified "7d"
  read_from_head true
  <parse>
    @type json
  </parse>
</source>
<filter ks-audit-logs.**>
  @type genhashvalue_alt
  use_entire_record true
  hash_type sha256    # md5/sha1/sha256/sha512
  base64_enc true
  base91_enc false
  set_key _hash
  separator _
  inc_time_as_key false
  inc_tag_as_key false
</filter>
<match ks-audit-logs.**>
  @type copy
  <store>
    @type elasticsearch
    host "myhost"
    port "5601"
    scheme "https"
    ssl_version "TLSv1_2"
    logstash_prefix "index"
    logstash_dateformat %Y.%m
    tag_key @log_name
    user "a_user" 
    password "something"
    id_key _hash
    remove_keys _hash
    @log_level "info"
    logstash_format true
    include_tag_key true
    reload_connections false
    reconnect_on_error true
    reload_on_failure true
    <buffer>
      flush_thread_count 8
      flush_interval 5s
    </buffer>
  </store>
</match>

I am finding that when compaction runs to clean up the pos file, entries for files that were deleted while fluentd was stopped are never removed. I would expect those lines to be dropped from the pos file either at fluentd startup or when compaction runs, but neither happens, and the lines seem to stay there forever, as fluentd apparently never cleans them up.

I also found that if I change the file pattern in path, I would expect entries for files that no longer match the pattern to be removed from the pos file when fluentd restarts. I am finding that this is not the case either.

My concern is that the pos file is never cleaned up in these two scenarios, so it will just keep growing over time.
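
For illustration, here is roughly what the leftover entries look like (the file names, offsets and inodes below are made up; each pos file line is the path, the byte offset in hex, and the inode in hex). The second entry belongs to a file that was deleted while fluentd was stopped, and it never goes away:

$ cat ks-auditlog.log.pos
ks-app-1.log    0000000000001a2f    0000000000266e3a
ks-app-2.log    00000000000004d1    0000000000266e3b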

To Reproduce

Scenario 1

  1. Start fluentd and it will create the pos file with the log files it is monitoring.
  2. Stop fluentd and delete a couple of the files that fluentd was monitoring.
  3. Restart fluentd.
  4. Pos file will still have references to deleted files and they will be there even after compaction runs.
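
A rough shell sketch of this sequence (how fluentd is started and stopped here, and the file names, are just placeholders for my actual setup):

fluentd -c fluent.conf &                # start; the pos file gets populated
sleep 30
kill %1                                 # stop fluentd
rm ks-app-2.log                         # delete a file it was monitoring
fluentd -c fluent.conf &                # restart
sleep 120                               # wait well past pos_file_compaction_interval (1m)
grep ks-app-2.log ks-auditlog.log.pos   # the stale entry is still there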

Scenario 2

  1. Start fluentd and it will create the pos file with the log files it is monitoring.
  2. Stop fluentd.
  3. Change the path pattern of the files to monitor.
  4. Restart fluentd.
  5. Pos file will still have references to files that no longer match the path pattern.

The only scenario where pos file cleanup happens correctly is when files are deleted or renamed while fluentd is running.

Expected behavior

On fluentd restart and when compaction runs, the pos file should be cleaned of entries for files that no longer exist and for files that no longer match the path pattern.

Your Environment

- Fluentd version: 1.11.5-1.0 and 1.13.0-1.0
- Operating system: NAME="CentOS Linux" VERSION="7 (Core)"

Your Configuration

<source>
  @type tail
  @id in_tail_ks_application_log
  path "ks-*.log"
  pos_file "ks-auditlog.log.pos"
  pos_file_compaction_interval "1m"
  tag ks-audit-logs.*
  refresh_interval 10
  enable_watch_timer false
  enable_stat_watcher true
  limit_recently_modified "7d"
  read_from_head true
  <parse>
    @type json
  </parse>
</source>
<filter ks-audit-logs.**>
  @type genhashvalue_alt
  use_entire_record true
  hash_type sha256    # md5/sha1/sha256/sha512
  base64_enc true
  base91_enc false
  set_key _hash
  separator _
  inc_time_as_key false
  inc_tag_as_key false
</filter>
<match ks-audit-logs.**>
  @type copy
  <store>
    @type elasticsearch
    host "myhost"
    port "5601"
    scheme "https"
    ssl_version "TLSv1_2"
    logstash_prefix "index"
    logstash_dateformat %Y.%m
    tag_key @log_name
    user "a_user" 
    password "something"
    id_key _hash
    remove_keys _hash
    @log_level "info"
    logstash_format true
    include_tag_key true
    reload_connections false
    reconnect_on_error true
    reload_on_failure true
    <buffer>
      flush_thread_count 8
      flush_interval 5s
    </buffer>
  </store>
</match>

Your Error Log

N/A, no error, just unexpected behaviour.

Additional context

No response

@ashie
Member

ashie commented Aug 18, 2021

It's the same issue as #3433 and it has already been fixed by #3467, but the fix is not released yet.
Please try 1.14.0.rc: gem install fluentd --version=1.14.0.rc
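
For Docker users, one unofficial way to give the RC a quick try is to install the gem in a throwaway container on top of an existing image (the tag below simply mirrors the version reported above; this is only a sketch, not a supported installation path):

docker run --rm -u root --entrypoint sh fluent/fluentd:v1.13.0-1.0 -c \
  'gem install fluentd --version=1.14.0.rc --no-document && fluentd --version'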

@ashie ashie closed this as completed Aug 18, 2021
@ashie
Member

ashie commented Aug 18, 2021

If you have problems in v1.14.0.rc, please let me know. I'll reopen this.

@toyaser
Author

toyaser commented Aug 18, 2021

If you have problems in v1.14.0.rc, please let me know. I'll reopen this.

Thank you for the quick response. I am using the fluentd Docker image. Do you know if there are plans to update version 1.13?

@ashie
Member

ashie commented Aug 18, 2021

v1.14.0 isn't released yet; it's currently in the RC (release candidate) stage.
When we release it, we'll also release Docker images ASAP.
v1.14.0 will be released at the end of this month.
