[BUG] Oban metric [:prom_ex, :plugin, :oban, :queue, :length, :count] is not fetching queue states when length is 0 #202

Open
linqueta opened this issue May 12, 2023 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@linqueta
Describe the bug
While building a chart at my company to show when a queue has had zero executing jobs for some time, I found that the PromEx Oban plugin does not report anything when the count for a queue state reaches 0.

Running the same query that is used to fetch job counts grouped by queue and state, I could see a pattern like this:

Example:
Queues: create_order and deliver_order

After an order is created, its create_order job completes and triggers a deliver_order job, so the query returns:

[ 
  {"create_order", "executing", 102},
  {"create_order", "completed", 16087},
  {"create_order", "discarded", 3030},
  {"deliver_order", "executing", 2535},
  {"deliver_order", "discarded", 116}
]

After some time my program finishes creating orders, and the query returns:

[ 
  {"create_order", "completed", 16187},
  {"create_order", "discarded", 3030},
  {"deliver_order", "executing", 2535},
  {"deliver_order", "discarded", 116}
]

So, I want to know when the queue create_order reaches 0 jobs in the state executing, but that is not possible: the query returns no row for a state with no jobs, and since this metric is implemented as last_value (implementation), it keeps the value from the last polling round that did return a row (e.g. if the value was 10 five seconds before the creation finished, the metric stays at 10).
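To illustrate the behaviour described above, here is a minimal sketch in plain Elixir. It is not the PromEx implementation: `StaleGaugeDemo` and its map-based store are hypothetical stand-ins for a last_value metric, and the rows are the two query results from this report.

```elixir
# Hypothetical stand-in for a last_value metric store:
# a map keyed by {queue, state}. Not PromEx code.
defmodule StaleGaugeDemo do
  # A polling round only writes the rows the query returned.
  # Keys that are missing from the round keep their previous value.
  def apply_round(gauges, rows) do
    Enum.reduce(rows, gauges, fn {queue, state, count}, acc ->
      Map.put(acc, {queue, state}, count)
    end)
  end
end

# First polling round: create_order still has executing jobs.
round_1 = [
  {"create_order", "executing", 102},
  {"create_order", "completed", 16087},
  {"create_order", "discarded", 3030},
  {"deliver_order", "executing", 2535},
  {"deliver_order", "discarded", 116}
]

# Second polling round: no row at all for create_order/executing.
round_2 = [
  {"create_order", "completed", 16187},
  {"create_order", "discarded", 3030},
  {"deliver_order", "executing", 2535},
  {"deliver_order", "discarded", 116}
]

gauges =
  %{}
  |> StaleGaugeDemo.apply_round(round_1)
  |> StaleGaugeDemo.apply_round(round_2)

# The stale value from round 1 survives.
IO.inspect(Map.get(gauges, {"create_order", "executing"}))
# prints 102
```

The completed count is updated to 16187 because round 2 contains a row for it, while the executing count is never overwritten.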

To Reproduce
Steps to reproduce the behavior:

  1. Add the following plugins to PromEx: Oban
  2. Create an application with at least one queue (the default queue is enough)
  3. Create a job that sleeps for 5 seconds on this queue and enqueue it once
  4. After 15 seconds (long enough to span at least one polling window), check /metrics for the queue default in the state executing: you will see the value 1 even though the job is already in the state completed

Expected behavior
I expected that, for every possible Oban job state ([:scheduled, :available, :executing, :retryable, :cancelled, :completed, :discarded]), the PromEx Oban plugin would set the metric to 0 when no jobs in that state are found in the database, so that the last_value metric does not keep a stale value.

Environment

  • Elixir version: 1.13.1-otp-23
  • Erlang/OTP version: 23.2.3
  • Grafana version: Not needed
  • Prometheus version: Not needed

Additional context

@linqueta linqueta added the bug Something isn't working label May 12, 2023
@linqueta
Author

Suggesting a possible solution:

After fetching all queue states from the database, we can use the function Oban.states() to set a value of 0, for each queue, for every state that was not found. For example, for the queue create_order we would set the state executing to 0 once no jobs are executing.

I can open a PR if it's reasonable.
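A rough sketch of the idea, in plain Elixir so it runs without Oban or PromEx. `ZeroFillDemo` and `zero_fill/1` are hypothetical names, and the state list is copied from the expected-behavior section of this issue rather than read from Oban.

```elixir
# Hypothetical helper, not part of PromEx or Oban.
defmodule ZeroFillDemo do
  # Assumption: state names copied from the issue text, as strings
  # to match the shape of the query rows.
  @states ~w(scheduled available executing retryable cancelled completed discarded)

  # Takes the {queue, state, count} rows returned by the query and
  # returns one row per queue/state pair, using 0 for missing pairs.
  def zero_fill(rows) do
    counts = Map.new(rows, fn {queue, state, count} -> {{queue, state}, count} end)
    queues = rows |> Enum.map(&elem(&1, 0)) |> Enum.uniq()

    for queue <- queues, state <- @states do
      {queue, state, Map.get(counts, {queue, state}, 0)}
    end
  end
end

# Query result after create_order has finished executing.
rows = [
  {"create_order", "completed", 16187},
  {"create_order", "discarded", 3030},
  {"deliver_order", "executing", 2535},
  {"deliver_order", "discarded", 116}
]

filled = ZeroFillDemo.zero_fill(rows)

IO.inspect(Enum.find(filled, &match?({"create_order", "executing", _}, &1)))
# prints {"create_order", "executing", 0}
```

One limitation of this sketch: a queue with no rows at all in the query result is not zero-filled, because the queue names are taken from the rows themselves. A real fix would probably need the list of configured queues from somewhere else.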
