Add log throttling per file #2702

rewiko · 2019-11-18T21:24:38Z

What this PR does / why we need it:
When running in a large cluster with a high volume of logs, it would be useful to throttle log shipping to avoid network saturation and to make it easier to calculate the maximum throughput per node, for example in a Kubernetes cluster.

The tail plugin watches files and, every second, reads from the last recorded position to the end of the file.
This change allows it to stop reading a file after X log lines have been read and to update the position in the pos file as usual.
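A minimal sketch of the idea described above, in plain Ruby and not taken from in_tail: read at most a fixed number of lines per pass, then record the offset so the next pass resumes there. `read_with_limit` and the in-memory `StringIO` are hypothetical stand-ins for the watcher's handler and the pos file, and treating -1 as "no limit" is an assumption of this sketch.

```ruby
require 'stringio'

# Hypothetical helper: read up to `limit` lines starting at byte offset `pos`.
# Returns the lines read and the new offset to store (the "pos file" value).
# A limit of -1 means "no limit" in this sketch.
def read_with_limit(io, pos, limit)
  io.seek(pos)
  lines = []
  # Check the limit before calling gets so no line is consumed and dropped.
  while (limit < 0 || lines.size < limit) && (line = io.gets)
    lines << line
  end
  [lines, io.pos]
end

log = StringIO.new("a\nb\nc\nd\ne\n")
first_batch, pos = read_with_limit(log, 0, 2)     # stops after 2 lines
second_batch, pos = read_with_limit(log, pos, -1) # resumes where it stopped
```

The second call picks up at the stored offset, so the lines skipped by the limit are delayed rather than lost, provided something triggers another pass.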

This commit adds log throttling by bytes for each file. It should work only when watch_timer is enabled and the stat_watcher (inotify) is disabled.

To make this feature work with any watch configuration (timer or inotify), I have added a sleep that runs once the read-bytes limit is reached. Because the sleep would block the process and affect the ingestion of other files, I have also added a basic thread array so that ingestion is multi-threaded.
However, I noticed that you rely on cool.io, and I was wondering whether I should use that library instead.

Would you be interested in this feature?

Some discussions before submitting the PR.

Docs Changes:

Adds read_lines_limit_per_notify, which defaults to -1, so no throttling is applied by default.

Release Note:

ganmacs

The tests fail, so could you fix them first?

ganmacs · 2019-11-26T01:05:48Z

lib/fluent/plugin/in_tail.rb

@@ -201,7 +204,15 @@ def start
      end

      refresh_watchers unless @skip_refresh_on_startup
-      timer_execute(:in_tail_refresh_watchers, @refresh_interval, &method(:refresh_watchers))
+
+      @threads['in_tail_refresh_watchers'] = Thread.new(@refresh_interval) do |refresh_interval|


Could you use the thread plugin helper? https://docs.fluentd.org/plugin-helper-overview/api-plugin-helper-thread

ganmacs · 2019-11-26T01:06:47Z

lib/fluent/plugin/in_tail.rb

+      @threads['in_tail_refresh_watchers'].priority = 10 # Default is zero; higher-priority threads will run before lower-priority threads.
+
+      @threads.each { |thr| 
+        thr.join


If it blocks here, all the code after it is blocked as well.

@threads is a Hash, so thr is an Array.
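To illustrate the second point with plain Ruby (no Fluentd code involved): iterating a Hash with a single block parameter yields `[key, value]` pairs, so `thr.join` in the hunk above is `Array#join`, which builds a string rather than waiting for a thread. The hash contents below are made up for the example.

```ruby
# Made-up stand-in for the plugin's @threads hash.
threads = { 'in_tail_refresh_watchers' => Thread.new { :done } }

# With one block parameter, each element is a [key, value] Array.
element_classes = []
threads.each { |thr| element_classes << thr.class }

# Destructure the pair (or use each_value) to reach the Thread itself.
# Thread#value joins the thread and then returns the block's result.
results = []
threads.each { |name, thr| results << [name, thr.value] }
```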

ganmacs · 2019-11-26T01:07:02Z

lib/fluent/plugin/in_tail.rb

+
+      log.debug "Thread refresh_watchers"
+      @threads.each { |thr| 
+            log.debug "Thread #{thr[0]} #{thr[1].status}"


ganmacs · 2019-11-26T01:11:35Z

lib/fluent/plugin/in_tail.rb

@@ -356,6 +392,7 @@ def update_watcher(path, pe)
        end
      end
      rotated_tw = @tails[path]
+


This blank line is unnecessary.

ganmacs · 2019-11-26T01:13:04Z

lib/fluent/plugin/in_tail.rb

      }
    end

    def stop_watchers(paths, immediate: false, unwatched: false, remove_watcher: true)
      paths.each { |path|
        tw = remove_watcher ? @tails.delete(path) : @tails[path]
+        if remove_watcher
+          @threads[path].exit


Thread#exit is dangerous. Could you finish this thread in a proper way?
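One common alternative, sketched here in plain Ruby rather than with Fluentd's helpers: let the worker check a stop signal and return on its own, then wait for it, so the thread is never terminated in the middle of updating shared state. The variable names and the use of a `Queue` as the stop signal are choices made for this example, not code from the PR.

```ruby
# A Queue doubles as a thread-safe stop signal for this example.
stop_signal = Queue.new
iterations = 0

worker = Thread.new do
  # Loop until something is pushed onto stop_signal.
  while stop_signal.empty?
    iterations += 1 # placeholder for one unit of work, e.g. one read pass
    sleep 0.01
  end
  :stopped_cleanly
end

sleep 0.05
stop_signal << :stop  # ask the worker to finish its current pass
result = worker.value # join, then fetch the block's return value
```

The worker always completes the pass it is in before exiting, which is the property that killing the thread from outside cannot guarantee.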

ganmacs · 2019-11-26T01:14:40Z

lib/fluent/plugin/in_tail.rb

-                    if @lines.size >= @watcher.read_lines_limit
+
+                    number_bytes_read += bytes_to_read 
+                    limit_bytes_per_second_reached = (number_bytes_read >= @watcher.read_bytes_limit_per_second and @watcher.read_bytes_limit_per_second > 0)


Suggested change

-                    limit_bytes_per_second_reached = (number_bytes_read >= @watcher.read_bytes_limit_per_second and @watcher.read_bytes_limit_per_second > 0)
+                    limit_bytes_per_second_reached = (number_bytes_read >= @watcher.read_bytes_limit_per_second && @watcher.read_bytes_limit_per_second > 0)
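The two operators give the same result inside the parentheses above, but `and` binds more loosely than `=`, which becomes a source of surprises once the parentheses are dropped. A small plain-Ruby illustration with made-up values:

```ruby
bytes_read = 10
limit = -1 # throttling disabled in this made-up example

# `&&` binds tighter than `=`, so the whole condition is assigned.
with_ampersands = bytes_read >= limit && limit > 0

# `and` binds looser than `=`, so this parses as
# (with_and = bytes_read >= limit) and (limit > 0).
with_and = bytes_read >= limit and limit > 0
```

Here `with_ampersands` ends up `false` while `with_and` ends up `true`, because only the first comparison is assigned in the `and` form.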

ganmacs · 2019-11-26T01:15:11Z

lib/fluent/plugin/in_tail.rb

+                        # sleep to stop reading files when we reach the read bytes per second limit, to throttle the log ingestion
+                        time_spent_reading = Time.new - start_reading 
+                        @watcher.log.debug("time_spent_reading: #{time_spent_reading} #{ @watcher.path}")
+                        if (time_spent_reading < 1)


Suggested change

-                        if (time_spent_reading < 1)
+                        if time_spent_reading < 1
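For context, the comment in this hunk says the code sleeps once the per-second byte limit is reached. A hedged plain-Ruby sketch of computing how long such a sleep would last, using a monotonic clock instead of `Time.new` so that wall-clock adjustments cannot distort the duration. `remaining_in_window` and `monotonic_now` are hypothetical helpers, not code from the PR.

```ruby
# Hypothetical helper: seconds left in a window that began at `started_at`
# (a monotonic timestamp), never negative.
def remaining_in_window(started_at, now, window = 1.0)
  elapsed = now - started_at
  elapsed < window ? window - elapsed : 0.0
end

# Monotonic time in seconds; unaffected by system clock changes.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

start_reading = monotonic_now
# ... read up to the byte limit here ...
pause = remaining_in_window(start_reading, monotonic_now)
# sleep(pause) would then hold this file back until the window ends
```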

ganmacs · 2019-11-26T01:23:36Z

lib/fluent/plugin/in_tail.rb

-          log.warn "Skip #{path} because unexpected setup error happens: #{e}"
-          next
+            begin
+              tw = setup_watcher(path, pe)


This can be a race condition. L334 should be called before pe is passed to setup_watcher, but the current code does not ensure that.

ganmacs · 2019-11-26T01:26:15Z

lib/fluent/plugin/in_tail.rb

          end
        end
+        if @threads[path].nil?
+          log.debug "Add Thread #{path}"
+          @threads[path] = Thread.new(path) do |path|


Why did you change this code to run on a new thread?

ganmacs · 2019-11-26T01:28:10Z

lib/fluent/plugin/in_tail.rb

                    @fifo.read_lines(@lines)
-                    if @lines.size >= @watcher.read_lines_limit
+
+                    number_bytes_read += bytes_to_read 


IO#readpartial does not always read bytes_to_read bytes. Is this code OK?
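A short plain-Ruby demonstration of the concern, using a pipe as a stand-in for the tailed file: `IO#readpartial` returns whatever is available up to the requested maximum, so a byte counter should add the size of the returned chunk rather than the requested size. The variable names are chosen for this example.

```ruby
# IO.pipe gives a stream where less data is available than requested.
reader, writer = IO.pipe
writer.write("hello")
writer.close

bytes_to_read = 8192
chunk = reader.readpartial(bytes_to_read) # returns what is available, up to 8192 bytes
reader.close

requested = bytes_to_read
actual = chunk.bytesize # count this, not bytes_to_read
```

Adding `requested` to the running total would overcount here and trip the throttle far too early.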

rewiko · 2019-11-27T08:39:48Z

> The tests fail, so could you fix them first?

Hello,
I will definitely add more tests and fix the failing ones. I created the PR so that we could discuss the design and to make sure you would be interested in this kind of feature.

I'm going to spend more time adding tests and reviewing your comments.
Thanks

ganmacs · 2019-11-27T20:45:47Z

OK, I like this feature :)
But there are some points to consider.

Creating a thread per file is not acceptable, because if there are 1000 files to monitor, it creates 1000 threads. https://github.com/fluent/fluentd/pull/2702/files#diff-1da710c9dcc8d0fc57996df7a9d39695R331
This patch restricts the byte size per file, right? I think that restricting all data (not per file) could be better for a situation like "to avoid network saturation and make it easier to calculate the max throughput per node".
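A rough sketch of the difference between the two scopes discussed here, in plain Ruby and without any threads: the same budget object can be shared by every file (a per-node cap) or created once per file (a per-file cap). `ByteBudget` and its `take` method are hypothetical names, not part of in_tail.

```ruby
# Hypothetical per-window byte budget. The caller passes the current time
# so the sketch stays deterministic and needs no sleeping.
class ByteBudget
  def initialize(limit_per_second)
    @limit = limit_per_second
    @window_start = nil
    @used = 0
  end

  # Returns how many of `wanted` bytes may be read at time `now` (seconds).
  def take(wanted, now)
    if @window_start.nil? || now - @window_start >= 1.0
      @window_start = now # start a fresh one-second window
      @used = 0
    end
    granted = [wanted, @limit - @used].min
    @used += granted
    granted
  end
end

# Shared budget: all files compete for the same 100 bytes per second.
shared = ByteBudget.new(100)
shared_grants = [shared.take(80, 0.0), shared.take(80, 0.1), shared.take(80, 1.2)]

# Per-file budgets: each path gets its own 100 bytes per second.
per_file = Hash.new { |h, path| h[path] = ByteBudget.new(100) }
per_file_grants = [per_file['a.log'].take(80, 0.0), per_file['b.log'].take(80, 0.1)]
```

With the shared budget, a single busy file can consume most of a window; with per-file budgets, the node-wide total grows with the number of files.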

rewiko · 2019-11-27T22:40:45Z

Hi @ganmacs,

> Creating a thread per file is not acceptable, because if there are 1000 files to monitor, it creates 1000 threads.

That was one of my performance concerns. I tested with 200 files and a high number of bytes per file, and the number of threads was not an issue, but I agree that having one thread per file is not ideal. I was thinking about using a thread pool, but that would need a bigger re-architecture, and I'm not sure inotify would work as expected.

> I think that restricting all data (not per file) could be better for a situation like "to avoid network saturation and make it easier to calculate the max throughput per node".

I agree. However, we are more interested in throttling logs per container in a Kubernetes environment, which means per file, so as not to affect containers that are sending at a reasonable rate.

In the commit, I implemented per-file log throttling, but that implementation works well only when you use the timer: it stops reading, and if it is not notified again, some logs might not be read.
Maybe I could reuse that concept: rather than just breaking, I could push the function, with a sleep and a notify, to be consumed by a thread pool. In that case, only throttled files would create threads.
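As a point of reference for the thread-pool idea, here is a generic fixed-size pool in plain Ruby: a handful of workers drain a queue of paths, so the thread count stays constant no matter how many files are queued. This is an illustration of the general pattern only; the pool size, the path names, and the placeholder work are all made up, and it says nothing about how this would interact with inotify or cool.io.

```ruby
# Hypothetical fixed-size pool: a few workers drain a queue of paths.
pool_size = 4
queue = Queue.new
processed = Queue.new # thread-safe collector for the example

workers = Array.new(pool_size) do
  Thread.new do
    # Queue#pop returns nil once the queue is closed and empty.
    while (path = queue.pop)
      processed << path # placeholder for a throttled read of `path`
    end
  end
end

1000.times { |i| queue << "file-#{i}.log" }
queue.close
workers.each(&:join)
handled = processed.size
```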

github-actions · 2020-12-18T10:11:43Z

This PR has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this PR will be closed in 30 days

repeatedly · 2020-12-18T10:21:05Z

cosmo0920 is now working on this feature in #3185 with the newer in_tail code.
Thanks for the idea!

Add log throttling per file

9f402c1

Remove debug log Signed-off-by: Anthony Comtois <anthony.comtois@sky.uk>

rewiko force-pushed the add-log-throttling-per-file branch from a4ec3f2 to 8ab733c Compare November 18, 2019 21:26

rewiko added 4 commits November 18, 2019 21:27

Update log throttling read bytes per second

2ce05b1

Signed-off-by: Anthony Comtois <anthony.comtois@sky.uk>

Update log throttling based on number of bytes and compatible with in…

7ff1365

…otify Signed-off-by: Anthony Comtois <anthony.comtois@sky.uk>

Add tail concurrency with Thread for TailWatcher

39370c4

Signed-off-by: Anthony Comtois <anthony.comtois@sky.uk>

implement thread based tailwatcher

416693c

Signed-off-by: Anthony Comtois <anthony.comtois@sky.uk>

rewiko force-pushed the add-log-throttling-per-file branch from 8ab733c to 416693c Compare November 18, 2019 21:31

ganmacs reviewed Nov 26, 2019

cosmo0920 mentioned this pull request Nov 30, 2020

Add log throttling per file (revised) #3185

Merged

github-actions bot added the stale label Dec 18, 2020

repeatedly closed this Dec 18, 2020
