
feat: add capability to purge old histogram data #460

Merged: 6 commits into metrics-rs:main on Mar 16, 2024

Conversation

@mnpw (Contributor) commented Mar 8, 2024

Copy of #451


What

  • add purge_timeout option to PrometheusBuilder
  • run a purger that purges based on the purge_timeout

Implements the third approach suggested there for purging old histogram data:

  • update the builder to generate a future which both drives the Hyper server future as well as a call to get_recent_metrics on an interval

Fixes #245
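The drain-into-distribution step this PR adds can be sketched with std-only Rust. All names below (`Registry`, `run_upkeep`, the count/sum aggregation) are hypothetical simplifications for illustration, not the crate's actual types; the real recorder aggregates into bucketed distributions rather than a bare count and sum.

```rust
use std::sync::Mutex;

// Hypothetical, simplified stand-in for the exporter's registry: raw
// histogram samples pile up between scrapes, and a periodic upkeep
// pass drains them into an aggregate (just count + sum here).
pub struct Registry {
    samples: Mutex<Vec<f64>>,
    count: Mutex<u64>,
    sum: Mutex<f64>,
}

impl Registry {
    pub fn new() -> Self {
        Registry {
            samples: Mutex::new(Vec::new()),
            count: Mutex::new(0),
            sum: Mutex::new(0.0),
        }
    }

    // Recording only appends; nothing is aggregated yet. This is why
    // the buffer grows without bound if nobody scrapes or runs upkeep.
    pub fn record(&self, value: f64) {
        self.samples.lock().unwrap().push(value);
    }

    // The upkeep pass: drain raw samples into the aggregate so the
    // buffer cannot grow unboundedly between scrapes.
    pub fn run_upkeep(&self) {
        let drained: Vec<f64> = self.samples.lock().unwrap().drain(..).collect();
        *self.count.lock().unwrap() += drained.len() as u64;
        *self.sum.lock().unwrap() += drained.iter().sum::<f64>();
    }

    // (count, sum, pending raw samples)
    pub fn snapshot(&self) -> (u64, f64, usize) {
        (
            *self.count.lock().unwrap(),
            *self.sum.lock().unwrap(),
            self.samples.lock().unwrap().len(),
        )
    }
}
```

After `run_upkeep`, the raw sample buffer is empty and the data lives only in the aggregate, which is exactly the property the purge/upkeep task is meant to guarantee on an interval.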

mnpw added 3 commits March 8, 2024 13:21
- add purge_timeout option to PrometheusBuilder
- run a purger that purges based on the purge_timeout
@tobz (Member) left a comment

This is looking good, but just a few notes/requests to try and make this a little more generic/understandable to folks.

Comment on lines 498 to 513
if let Ok(handle) = runtime::Handle::try_current() {
    handle.spawn(purger);
} else {
    let thread_name = "metrics-exporter-prometheus-purger";

    let runtime = runtime::Builder::new_current_thread()
        .enable_all()
        .build()
        .map_err(|e| BuildError::FailedToCreateRuntime(e.to_string()))?;

    thread::Builder::new()
        .name(thread_name.to_owned())
        .spawn(move || runtime.block_on(purger))
        .map_err(|e| BuildError::FailedToCreateRuntime(e.to_string()))?;
}
@tobz (Member):

No need for the conditional logic here: just spawn the future directly.

(We already document that this method must be called from within a Tokio runtime or else it will panic.)


let exporter_config = self.exporter_config.clone();
let recorder = self.build_recorder();
let handle = recorder.handle();

// use the handle to recorder
// #[cfg(not(feature = "push-gateway"))]
@tobz (Member):

Delete this line.


/// Purges the registry's histogram data by draining it into the distribution. This should be
/// called periodically to prevent the accumulation of histogram samples.
pub fn purge(&self) {
@tobz (Member):

I think we should just change the wording overall from "purge" to "upkeep", and make this run_upkeep.

Purging to me sounds like getting rid of, when realistically we're just doing periodic cleanup work.

Comment on lines 366 to 370
/// Sets the purge timeout for metrics.
///
/// If a purge timeout is set, the purger will call `.render()` on the registry, causing
/// the values from histograms to be drained out. This ensures that stale histogram values
/// do not persist indefinitely.
@tobz (Member):

Similar note here about just changing the wording overall from "purge"/"purging" to "upkeep".

I would also remove the specific bit about calling .render(), since it doesn't actually do that anymore. Just be generic, something like:

/// Sets the upkeep interval.
///
/// The upkeep task handles periodic maintenance operations, such as draining histogram data,
/// to ensure that all recorded data is up-to-date and prevent unbounded memory growth. 

@@ -128,6 +129,7 @@ impl PrometheusBuilder {
buckets: None,
bucket_overrides: None,
idle_timeout: None,
purge_timeout: None,
@tobz (Member):

I think we should actually enable this by default.

Thinking more about it, it's a quality of life improvement to avoid unbounded memory growth for people with a high rate of histogram metrics, or who scrape/push their Prometheus exporter infrequently.

Probably 5 seconds as the default?

@mnpw (Contributor, author):

Makes sense, 5 seconds is a good default. Also, I can't think of a use case where somebody would not want to run upkeep, so making it non-optional would be sensible.
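Making upkeep always-on with a fixed default, as discussed here, amounts to unconditionally driving a periodic task for the lifetime of the exporter. A std-thread sketch of that driver follows; the names are hypothetical, the merged code drives the task from the Tokio runtime rather than a dedicated thread, and the `tick` callback's stop condition exists only so the loop can be exercised in a test (in the exporter the task runs for the life of the process, with the proposed 5-second default interval):

```rust
use std::thread;
use std::time::Duration;

// Hypothetical sketch of an always-on upkeep driver. The callback
// performs one round of maintenance (e.g. draining histogram data)
// and returns false to stop, which only matters for testing.
pub fn spawn_upkeep<F>(interval: Duration, mut tick: F) -> thread::JoinHandle<()>
where
    F: FnMut() -> bool + Send + 'static,
{
    thread::Builder::new()
        .name("metrics-exporter-prometheus-upkeep".to_owned())
        .spawn(move || {
            // Run one upkeep pass, then sleep for the interval, forever
            // (until the callback signals a stop).
            while tick() {
                thread::sleep(interval);
            }
        })
        .expect("failed to spawn upkeep thread")
}
```

Running the pass before each sleep means the first upkeep happens immediately at startup rather than one full interval later; either ordering would satisfy the "prevent unbounded growth" goal.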

@tobz (Member) left a comment

Perfect. Nice and simple. 👍🏻

@tobz tobz merged commit 3c71988 into metrics-rs:main Mar 16, 2024
12 checks passed
@tobz tobz added C-exporter Component: exporters such as Prometheus, TCP, etc. E-simple Effort: simple. T-enhancement Type: enhancement. S-awaiting-release Status: awaiting a release to be considered fixed/implemented. labels Mar 16, 2024
@tobz (Member) commented Mar 16, 2024

Released in metrics-exporter-prometheus@v0.14.0.

Thanks again for your contribution!

@tobz tobz removed the S-awaiting-release Status: awaiting a release to be considered fixed/implemented. label Mar 16, 2024
Linked issue (may be closed by merging): Clearing/Expiration of old values in histogram/limit to histogram