Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smooth noncontinuous lines in line chart by casting null value to zero #135

Open
Wally110 opened this issue Oct 9, 2022 · 5 comments
Open

Comments

@Wally110
Copy link

Wally110 commented Oct 9, 2022

Is your feature request related to a problem? Please describe.
I have this use case: I need to monitor the fluctuation of metrics stored in database where if a metric value reduces to zero, the corresponding record doesn't exist at all in the database. This is a pretty common use case in monitoring noncontinuous metrics(e.g. order count, user count). When there is not any order in one day, we want the order count to be zero instead of null.

In current situation, query exporter would stop reporting one metric when the corresponding records in database doesn't exist. Prometheus can not read the update value of this metric and will keep reporting its last value. So what we see in the chart is that the line doesn't go down to zero but keep flat.

This is a real use case in my work. It would be great if query exporter have a switch that can cope with this situation.

Describe the solution you'd like
My desired solution is query exporter have a switch by which query exporter can automatically cast null to zero then Prometheus can display lines that reflect real changes of metrics.

I've added this feature to fix this as the way I described above. I'm willing to commit my code to help query exporter more powerful and can be applied in multiple scenarios. Does this make sense?

Describe alternatives you've considered
Another alternative is we add an extra database table that stores all the existed metric attributes. Even when a metric value disappears, query exporter can still read this metric and set value to zero.

I don't think this is a better idea since this adds complexity to database design and changes original scheme.

Additional context
For some use cases, metrics are always continuous so that this issue won't happen. But for those noncontinuous metrics monitoring problem, this feature can vastly reduce user's inconvenience in developing.

@albertodonato
Copy link
Owner

Hi, could you please provide an example of a query for such metrics?

Is there a case where you can't just use COALESCE in the query to ensure a value is always returned?

@Wally110
Copy link
Author

Wally110 commented Oct 9, 2022

Thanks for your reply.

Let's say we have a table orders :

date order_id
10/01/2022 1001
10/01/2022 1002
10/03/2022 1003

Now I want to monitor the number of orders by each day. So I query this table by:

select date, 
    count(distinct order_id) as order_count
from orders
group by date
order by date

Then I got query results as:

date order_count
10/01/2022 2
10/03/2022 1

With scrape interval set to 1 day, query exporter reports two records. Then Prometheus can not find the order_count on missing 10/02/2022 and will set metric value on this date to 2 in time series chart. But, we want order_count to be 0 on 10/02/2022 even when there is no record in table orders.

I fixed this in my use case by checking if previously reported metric was reported in current time. If not, modified query exporter will still report this metric but with value set to 0.

@albertodonato
Copy link
Owner

How is the metric defined in query-exporter? Specifically, how does the date get reported?

@Wally110
Copy link
Author

OK, sorry for the unclear. Let me rephrase here.
The goal is to monitor the fluctuation of number of orders for each day. And the detailed order info is stored in this table orders:

date order_id
10/01/2022 1001
10/01/2022 1002
10/03/2022 1003

date indicates the date when one order placed. order_id is the id of an order. We count distinct order_id to get the number of orders for each day.
Now we set the scrape_interval=1day, and the sql query goes to:

select count(distinct order_id) as order_count
from orders
where date between '$scrape_time-1day' and '$scrape_time' -- pseudo code

We define the metric as metric_order_count{}. For simplicity, we don't add any labels here.
Let's go through the process step by step as below:

  1. At the last second of 10/01/2022, query-exporter run the query and got the metric: metric_order_count{}=2. Prometheus read it and added a timestamp:
metric_order_count{timestamp:'10/01/2022'}=2
  1. At the last second of 10/02/2022, query-exporter run the query but didn't get the metric, cause no records of 10/02/2022 is in orders. Prometheus can not read it either, so the line trend in time series chart kept flat. This gave us an impression that the metric value on 10/01/2022 was still 2. No new metric value to refresh it !
  2. At the last second of 10/03/2022, query-exporter run the query and got the metric: metric_order_count{}=1. Prometheus read it and added a timestamp:
metric_order_count{timestamp:'10/03/2022'}=1

The real list of number of orders should be:

date order_count
10/01/2022 2
10/02/2022 0
10/03/2022 1

But we got:

date order_count
10/01/2022 2
10/02/2022 2
10/03/2022 1

COALESCE doesn't work here since no records of 10/02/2022 is in orders at all. Unless we fill in every single date manually into the table. In this case, we add one row:

date order_id
10/02/2022 null

Then we can use COALESCE. But this is not practical.

@albertodonato
Copy link
Owner

Hi, thanks for the explaination, but it would be useful to have a sample of config file you're using (both for the query and the metric configuration)

I'm not sure how you get metric_order_count{timestamp:'10/03/2022'}=1 given that you don't have labels. Also, if you have a COUNT() query, that should return zero and not null, if no row matches the query.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants