Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Delayed startup for polled/manual metrics #119

Open
LostKobrakai opened this issue Feb 10, 2022 · 5 comments
Open

[FEATURE] Delayed startup for polled/manual metrics #119

LostKobrakai opened this issue Feb 10, 2022 · 5 comments
Assignees
Milestone

Comments

@LostKobrakai
Copy link
Contributor

LostKobrakai commented Feb 10, 2022

Is your feature request related to a problem? Please describe.
PromEx does start all metrics collection together, but manual or polled metrics might need processes started later in the supervision tree before being able to execute successfully. There's the :manual_metrics_start_delay config, but this neither supports polled metrics nor is is it really bullet prove given startup may not take a fixed amount of time.

Describe the solution you would like to see
A way to have polled and the initial manual metric fetching be delayed without a timeout.

How would you expect this feature to work
Both polled and manual metrics have a group name.
It would be interesting to just configure starting them like this for the supervision tree.

children = [
  {MyApp.PromEx, polled: :delay, manual: :delay},
  …,
  …,
  {MyApp.PromEx.StartMetrics, polled: ["name_a", "name_b"], manual: :all},
  …,
  {MyApp.PromEx.StartMetrics, polled: ["name_c"]}
]

Besides the plain problem statement, this also allows granular startup based on individual metric groups' dependencies, over the "one fixed deley" of :manual_metrics_start_delay.

Additional context

Edit:
This approach would also allow clean shutdowns if metric collection is stopped before dependencies of it are stopped.

@akoutmos
Copy link
Owner

Totally missed this issue. Sorry about that!

I am not sure that I follow 100% based on the example. But it sounds like you would defer those groups from starting and instead you would manually activate them based on the StartMetrics child processes?

@LostKobrakai
Copy link
Contributor Author

Yeah, that's the idea. We have quite a few periodic or manual metrics, which require our repo to be started, but of course we don't want to miss telemetry of the repo itself.

@akoutmos
Copy link
Owner

For the missed metrics problem, I would suggest adding PromEx as the first item in your application.ex supervision tree. IIRC Ecto emits an init event that PromEx attaches to. Phoenix on the other hand does not have any init events (I should probably make a PR for that to Phoenix now that I mention it), and PromEx jumps through some hoops to get the config metrics from Phoenix.

That aside, I can totally see this being added to PromEx as it is a useful feature. Is this something that you would be interested in tackling?

@akoutmos akoutmos added this to the 1.9.0 milestone Feb 26, 2022
@mikl
Copy link
Sponsor

mikl commented Aug 16, 2022

If you put prom_ex first in your application.ex, but have a polling metric that relies on the Ecto repo being available, prom_ex will run the polling code immediately, resulting in a runtime error like this:

[error] Error when calling MFA defined by measurement: MyApp.SalesOrders.PromExPlugin :execute_sales_orders_metrics []
Class=:error
Reason=%RuntimeError{
  message: "could not lookup Ecto repo MyApp.Repo because it was not started or it does not exist"
}

This can be mitigated by checking if the repo is running in your polling callback, or switching the order in application.ex, but that has side-effects.

@mikl
Copy link
Sponsor

mikl commented Aug 16, 2022

Should someone want simply skip the first metrics run (to avoid the error), you can add something like this to your metrics callback:

  if Enum.member?(Ecto.Repo.all_running(), MyApp.Repo) do
      :telemetry.execute(
        @event_name,
        %{count: Function.to_get_count()},
        %{}
      )
    end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants