What would you like added?
In addition to the labels discussed under #2176 and implemented in #2218 and #2225, it would also be good to see workflow-related metrics reported by the metrics server. These could be collected from the workflow_run events.
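To illustrate what collecting from workflow_run events could look like, here is a minimal sketch in Python (not ARC's actual implementation, which is written in Go). It assumes the webhook payload carries `action`, `workflow_run.run_started_at`, and `workflow_run.updated_at`, and it treats `updated_at` on a `completed` event as an approximation of the finish time. The function name `workflow_run_duration_seconds` is hypothetical.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: str) -> datetime:
    # GitHub timestamps are ISO 8601 with a trailing "Z" (UTC).
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def workflow_run_duration_seconds(event: dict) -> Optional[float]:
    """Return the run duration for a completed workflow_run event, else None.

    Assumption: `updated_at` of a completed run approximates its finish time.
    """
    if event.get("action") != "completed":
        return None
    run = event["workflow_run"]
    started = parse_ts(run["run_started_at"])
    finished = parse_ts(run["updated_at"])
    return (finished - started).total_seconds()


# Made-up sample payload, trimmed to the fields used above.
sample_event = {
    "action": "completed",
    "workflow_run": {
        "name": "CI",
        "conclusion": "success",
        "run_started_at": "2023-03-01T10:00:00Z",
        "updated_at": "2023-03-01T10:12:30Z",
    },
}

duration = workflow_run_duration_seconds(sample_event)
```

A metrics server could observe a value like this in a histogram labelled with the workflow name and conclusion, but the exact metric names and labels would be up to the implementation.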
In addition to:
the following would also be useful to see:
I might try to open a PR for this unless someone quicker beats me to it.
Why is this needed?
The sum of the job run times in a given workflow isn't equal to the actual time the workflow took to finish, because jobs run in parallel, wait in queues, etc.
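A small worked example of that difference, as a sketch with made-up numbers: the job names and timings below are hypothetical, and times are expressed as seconds since the workflow was triggered.

```python
from typing import List, NamedTuple


class Job(NamedTuple):
    name: str
    started_at: int    # seconds after the workflow was triggered
    completed_at: int  # seconds after the workflow was triggered


def summed_job_seconds(jobs: List[Job]) -> int:
    # What you get by adding up per-job run times.
    return sum(job.completed_at - job.started_at for job in jobs)


def wall_clock_seconds(jobs: List[Job], triggered_at: int = 0) -> int:
    # What an engineer actually waits: trigger until the last job finishes.
    return max(job.completed_at for job in jobs) - triggered_at


# Hypothetical workflow: three jobs run in parallel after a short queue,
# then a final job waits for them (and for a free runner) before starting.
jobs = [
    Job("lint", 30, 150),
    Job("unit-tests", 30, 630),
    Job("integration-tests", 45, 945),
    Job("deploy-preview", 1000, 1120),
]

total_job_time = summed_job_seconds(jobs)
workflow_time = wall_clock_seconds(jobs)
```

Here the per-job sum is 1740 seconds while the workflow finishes 1120 seconds after it was triggered, so neither number can be derived from the other without the start and end timestamps.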
If I wanted to measure how long engineers in our organisation wait for CI, workflow run times make more sense for us. If our CI team improved a shared or required workflow by replacing or rewriting a job or an action, or changed anything in the (self-hosted) infrastructure, e.g. using larger nodes or changing the autoscaling, we would like to measure what the impact of those changes was, if any.
Also, if we wanted to feed into our business metrics by measuring the time a PR took from start to finish, including how much of that time CI took, workflow run times would be better than individual job run times.
Additional context
In my organisation we're working on our own version of workflow and job metrics, but it has its own issues and adds toil for the team, which could be dropped if ARC provided these numbers out of the box.