
prometheus exporter: idle metrics not re-exported when becoming active again #372

Open
dezyh opened this issue May 25, 2023 · 7 comments
Labels
C-exporter Component: exporters such as Prometheus, TCP, etc. E-complex Effort: complex. T-bug Type: bug. T-ergonomics Type: ergonomics.

Comments

@dezyh

dezyh commented May 25, 2023

Issue

During periods of inactivity, after the idle_timeout period, metrics are correctly no longer exported. However, if these same metrics become active again, they are not re-exported.

This is problematic for my use case since my server provides long-lived streams with periods of activity lasting 15 minutes and then long periods of inactivity (>90 minutes). Additionally, even without supporting long-lived streams, clients can connect anytime before data is available, which would trigger the same idle timeout -> not re-exported issue.

Example

Currently I'm constructing and installing a prometheus push gateway exporter (with VictoriaMetrics instead of Prometheus):

PrometheusBuilder::new()
    .with_push_gateway(addr, Duration::from_secs(1))
    .expect("invalid push gateway endpoint")
    .idle_timeout(
        MetricKindMask::COUNTER | MetricKindMask::HISTOGRAM | MetricKindMask::GAUGE,
        Some(Duration::from_secs(30 * 60)), // 30 minutes
    )
    .install()
    .expect("failed to install prometheus exporter");

Timeline

  • @ 0m
    • client connects to server
    • client is assigned a long-lived stream with stream_id=A
    • metrics for stream_id=A start being exported (all 0s)
  • @ 30m
    • idle timeout kicks in
    • metrics for stream_id=A no longer exported
  • @ 60m
    • data is available
    • client's long-lived stream with stream_id=A starts receiving data
    • metrics not re-exported = no metrics for long-lived stream with stream_id=A being exported
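The mechanism behind this timeline (per the linked #314) can be reproduced with a toy sketch of a recency-tracked registry. All names here are hypothetical; this is not the actual metrics-util code. The point it illustrates: a held handle updates its atomic directly and bypasses the registry, so once an idle sweep purges the key, the registry never learns the metric went active again, whereas an emission-style call re-registers the key on every use.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Toy stand-in for an exporter's recency-tracked registry (hypothetical).
struct Registry {
    metrics: HashMap<String, Arc<AtomicU64>>,
    touched: HashMap<String, bool>, // keys updated since the last idle sweep
}

impl Registry {
    fn new() -> Self {
        Registry { metrics: HashMap::new(), touched: HashMap::new() }
    }

    /// Emission-style call: re-registers the key on every use, so the
    /// registry always knows the metric is live.
    fn increment(&mut self, key: &str, n: u64) {
        let c = self
            .metrics
            .entry(key.to_string())
            .or_insert_with(|| Arc::new(AtomicU64::new(0)));
        c.fetch_add(n, Ordering::Relaxed);
        self.touched.insert(key.to_string(), true);
    }

    /// Handle-style call: hands out the raw counter; later increments go
    /// straight to the atomic and bypass the registry's recency tracking.
    fn register(&mut self, key: &str) -> Arc<AtomicU64> {
        self.touched.insert(key.to_string(), true);
        Arc::clone(
            self.metrics
                .entry(key.to_string())
                .or_insert_with(|| Arc::new(AtomicU64::new(0))),
        )
    }

    /// Idle sweep: drop every metric not touched since the previous sweep.
    fn sweep_idle(&mut self) {
        let touched = std::mem::take(&mut self.touched);
        self.metrics.retain(|k, _| touched.get(k).copied().unwrap_or(false));
    }

    /// Keys that would appear in the next scrape/push.
    fn exported(&self) -> Vec<String> {
        let mut keys: Vec<_> = self.metrics.keys().cloned().collect();
        keys.sort();
        keys
    }
}

fn main() {
    let mut reg = Registry::new();
    let handle = reg.register("stream_a"); // @ 0m: register, hold the handle
    reg.sweep_idle(); // still live after the first sweep
    reg.sweep_idle(); // @ 30m: no updates seen since last sweep -> purged
    handle.fetch_add(1, Ordering::Relaxed); // @ 60m: active again, via the handle
    assert!(reg.exported().is_empty()); // ...but the metric is never re-exported
    reg.increment("stream_a", 1); // emission-style use re-registers it
    assert_eq!(reg.exported(), vec!["stream_a".to_string()]);
}
```

In this model the held `Arc<AtomicU64>` is the analogue of a `Counter` handle: incrementing it keeps the value correct but leaves no trace the idle sweep can see.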

Crate Versions

I'm currently using the following (which are a tiny bit out of date if that matters):

metrics = "0.20.1"
metrics-exporter-prometheus = { version = "0.11.0", features = ["push-gateway"] }
metrics-util = "0.14.0"
@tobz
Member

tobz commented May 30, 2023

@dezyh Hmmm, interesting.

This sounds like it's potentially related to #314.

Does your code only use the emission macros directly (i.e. only ever using counter!(...), histogram!(...), etc) or do you register metrics and interact with the handle types (Counter/Gauge/Histogram) themselves?

@dezyh
Author

dezyh commented May 31, 2023

I only register metrics and interact with the handle types. Let me do some testing and build some reproduction cases and I'll let you know if I find anything interesting.

@tobz
Member

tobz commented May 31, 2023

If you're using the handle types directly, then the linked issue is definitively the cause.

@tobz tobz added C-exporter Component: exporters such as Prometheus, TCP, etc. E-complex Effort: complex. T-bug Type: bug. T-ergonomics Type: ergonomics. labels Sep 7, 2023
@dezyh
Author

dezyh commented Nov 27, 2023

I just saw that #394 was merged, which would mean there's no work-around for this issue anymore.

Just to clarify from your last message, here is how we're currently using metrics. I believe this is "using the handle types directly".

use metrics::register_counter;

let messages_received = register_counter!("messages_received", "stream_id" => id);

messages_received.increment(1);

@tobz
Member

tobz commented Nov 27, 2023 via email

@dezyh
Author

dezyh commented Nov 28, 2023

The re-export issue occurs when using the handle types (register_counter!). The work-around was therefore to use the emission macros (counter!) instead.

I believe this work-around is no longer possible after #394, because the current emission macro (counter!) is being removed and replaced with the macro that returns the handle type (register_counter! -> counter!).

I might be confused about the recent changes, or about what counts as a handle type vs. an emission macro, so it's probably best to wait for the next release that includes #394; I'll try it out then and let you know.

(My previous message was just clarifying that register_counter! was the handle type mentioned previously. Sorry if there was any confusion.)

@tobz
Member

tobz commented Nov 29, 2023

I understand the confusion now.

Under the hood, a metric always has to be registered before it can be used, whether you're holding on to it long-term (such as keeping it as a field in a struct) or just immediately performing the operation (increment, etc) and then moving on.

When calling register_*!, you get back the return value from Recorder::register_counter, so in that case, the owned Counter handle type representing your counter. When calling counter!, we still register the counter... we just use the returned handle type as a temporary to perform the operation, and then it's dropped.

Now, circling back to my earlier comment: the change in #394 just means that let counter = counter!("name"); is the new way to do what let counter = register_counter!("name"); did previously. For emission, you call the operation method inline: instead of counter!("name", 42);, you would do counter!("name").increment(42);.
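For future readers, a rough before/after sketch of that migration (assuming the metrics 0.21-era macros vs. the post-#394 API; fragments, not a complete program):

```rust
// Before #394: separate registration and emission macros.
use metrics::{counter, register_counter};

let messages = register_counter!("messages_received", "stream_id" => "A");
messages.increment(1);            // owned handle, held long-term
counter!("messages_received", 1); // one-shot emission

// After #394: counter! always registers and returns the handle.
use metrics::counter;

let messages = counter!("messages_received", "stream_id" => "A");
messages.increment(1);                                          // hold it as before...
counter!("messages_received", "stream_id" => "A").increment(1); // ...or emit inline
```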
