Histogram unbounded memory even with run_upkeep #467

gauravkumar37 · 2024-03-26T21:02:36Z

Thanks for a wonderful library. After the recent merge of #460 I tested the upkeep and idle timeout configs but still see unbounded memory usage if /metrics endpoint is not being called.
Is this still expected even after upkeep?

Testing methodology:

Using axum with axum-prometheus
Using /metrics
Using wrk to benchmark an API with a simple health endpoint
Memory keep on increasing unbounded
Code has idle_timeout of 60s and upkeep of 5s
If I now open /metrics endpoint, memory eventually reduces and the more I open it periodically, the more stable the memory is

So, is my observed behavior still expected even after the introduction of upkeep?

The text was updated successfully, but these errors were encountered:

tobz · 2024-03-27T00:11:28Z

Hmm, the latest (0.6.1) version of axum-prometheus is still on 0.13.x of metrics-exporter-prometheus. Are you using a development version or patching metrics-exporter-prometheus to 0.14.0?

gauravkumar37 · 2024-03-27T06:54:18Z

Correct, hence I patched the version of metrics-exporter-prometheus in axum-prometheus to 0.14.0 to test, also able to verify that adding upkeep_timeout compiles correctly, also verified the 0.14.0 in Cargo.lock.
Patch location: https://github.com/gauravkumar37/axum-prometheus/tree/patch-1

tobz · 2024-03-27T13:23:07Z

Looking at how axum-prometheus uses metrics-exporter-prometheus, this makes sense now: PrometheusBuilder doesn't create the upkeep task unless PrometheusBuilder::build or PrometheusBuilder::install are called, whereas right now, it calls PrometheusBuilder::install_recorder.

Not really sure there's a ton we can do here, or to be frank, that I'm willing to do here. The number of ways to build/install the recorder is already too high for comfort because it's trying to be all things to all people. The simplest thing would be if axum-prometheus switched to using PrometheusBuilder::build and discarded the exporter future if it's not in push gateway mode. It would still have to install the recorder, but it would allow it the chance to grab a recorder handle, and it would allow the upkeep task to be created.

gauravkumar37 · 2024-03-27T13:27:56Z

This is what I am using inside axum framework:

    PrometheusMetricLayerBuilder::new()
        .with_metrics_from_fn(|| {
            PrometheusBuilder::new()
                .idle_timeout(MetricKindMask::HISTOGRAM, Some(Duration::from_secs(300)))
                .upkeep_timeout(Duration::from_secs(60))
                .set_buckets(&[0.1, 0.2, 0.3])
                .unwrap()
                .install_recorder()
                .unwrap()
        })
        .build_pair()

This does create the PrometheusBuilder and installs it. Is there something more to do here?

tobz · 2024-03-27T22:18:34Z

Right, you're using install_recorder, when you need to use build.

gauravkumar37 · 2024-03-29T20:31:14Z

Ok, I got it to work correctly but I must say, that was an unexpected nuance there :-)

So, something like this works and there is no more unbounded memory growth:

    PrometheusMetricLayerBuilder::new()
        .with_metrics_from_fn(|| {
            let (recorder, _) = PrometheusBuilder::new()
                // if metrics are not updated within this time, they will be removed
                .idle_timeout(MetricKindMask::HISTOGRAM, Some(Duration::from_secs(5 * 60)))
                // prevents unbounded memory growth by draining histogram data periodically
                .upkeep_timeout(Duration::from_secs(5))
                .set_buckets(buckets_in_secs.as_slice())
                .expect_or_log(&format!("Failed to set buckets: '{buckets_in_secs:?}'"))
                // instead of install_recorder, use build because the former doesn't start the upkeep tasks
                .build()
                .expect_or_log("Failed to build metrics recorder");
            let handle = recorder.handle();
            metrics::set_global_recorder(recorder).expect_or_log("Failed to set global recorder");
            handle
        })
        .build_pair()

++ @Ptrskay3 for changes in axum-prometheus

Ptrskay3 · 2024-03-30T06:05:23Z

Thanks for the ping. I'm open to changing axum_prometheus, if that makes it safer for most of the people. Not sure when I'll find the time to do it myself however. Created Ptrskay3/axum-prometheus#51.

tobz · 2024-03-31T17:26:24Z

@Ptrskay3 Thanks for opening that. 👍🏻

Like I said upthread, I'm also not a fan of how thorny the crate (this crate, not yours) has become in terms of the API... but it does seem like taking the approach shown by @gauravkumar37 would be the simplest one.

I'll also make a note for myself to potentially contribute that as a PR to axum_prometheus.

tobz added C-exporter Component: exporters such as Prometheus, TCP, etc. E-simple Effort: simple. T-ergonomics Type: ergonomics. T-request Type: request. labels Mar 27, 2024

Ptrskay3 mentioned this issue Mar 30, 2024

Change PrometheusBuilder::install_recorder for the default metrics Ptrskay3/axum-prometheus#51

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Histogram unbounded memory even with run_upkeep #467

Histogram unbounded memory even with run_upkeep #467

gauravkumar37 commented Mar 26, 2024

tobz commented Mar 27, 2024

gauravkumar37 commented Mar 27, 2024

tobz commented Mar 27, 2024 •

edited

gauravkumar37 commented Mar 27, 2024

tobz commented Mar 27, 2024

gauravkumar37 commented Mar 29, 2024 •

edited

Ptrskay3 commented Mar 30, 2024

tobz commented Mar 31, 2024

Histogram unbounded memory even with run_upkeep #467

Histogram unbounded memory even with run_upkeep #467

Comments

gauravkumar37 commented Mar 26, 2024

tobz commented Mar 27, 2024

gauravkumar37 commented Mar 27, 2024

tobz commented Mar 27, 2024 • edited

gauravkumar37 commented Mar 27, 2024

tobz commented Mar 27, 2024

gauravkumar37 commented Mar 29, 2024 • edited

Ptrskay3 commented Mar 30, 2024

tobz commented Mar 31, 2024

tobz commented Mar 27, 2024 •

edited

gauravkumar37 commented Mar 29, 2024 •

edited