
Mismatch temporality in BatchSpanProcessor Debug message #2620

Closed · MadVikingGod opened this issue Feb 16, 2022 · 6 comments · Fixed by #2640

MadVikingGod (Contributor) commented Feb 16, 2022

Currently, the debug message in the BatchSpanProcessor reports two different temporalities of metrics.

  • The number of spans exported - Delta
  • The number of dropped spans - Cumulative

This issue is to discuss and track changes to BSP and its debug message.

Found in #2615
Originally posted by @MrAlias in #2615 (comment)
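To make the mismatch concrete, below is a minimal, self-contained sketch of the pattern. It is not the actual opentelemetry-go source; the type and names (bspLike, queue, batch, dropped) are illustrative stand-ins for the BatchSpanProcessor's bookkeeping.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// bspLike is a stand-in for the BatchSpanProcessor's bookkeeping.
// Field and method names here are illustrative, not the actual implementation.
type bspLike struct {
	queue   chan int // stand-in for the span input channel
	batch   []int    // spans collected for the next export
	dropped uint32   // incremented when the queue is full; never reset
}

// enqueue drops the span instead of blocking when the queue is full,
// mirroring the behavior described in this issue.
func (b *bspLike) enqueue(span int) {
	select {
	case b.queue <- span:
	default:
		atomic.AddUint32(&b.dropped, 1)
	}
}

// export logs the two counters with their differing temporalities:
// len(b.batch) is per-export (delta), b.dropped is a running total (cumulative).
func (b *bspLike) export() {
	fmt.Printf("exporting spans count=%d dropped=%d\n",
		len(b.batch), atomic.LoadUint32(&b.dropped))
	b.batch = b.batch[:0] // the batch resets every export; the dropped count does not
}

func main() {
	b := &bspLike{queue: make(chan int, 2)}
	for i := 0; i < 5; i++ {
		b.enqueue(i) // only 2 fit; the other 3 are counted as dropped
	}
	for len(b.queue) > 0 {
		b.batch = append(b.batch, <-b.queue)
	}
	b.export() // count=2 dropped=3
	b.export() // count=0 dropped=3 — "dropped" still reports the old total
}
```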


MadVikingGod (Contributor, Author) commented Feb 16, 2022

To start this conversation, I want to point out that this is a Debug log message; that should help scope how much time we need to put into this.

The dropped-span count is only updated when processQueue() is blocked (usually because the exporter's ExportSpans() is blocked) and the input channel is full. This number is currently not reset anywhere.

The log message was placed before the ExportSpans() call so that there is a log message even when the exporter hangs or doesn't respect the ctx. Because of that, we won't know how many spans were dropped during this particular export.
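A minimal sketch of that ordering, with a toy hangingExport stand-in for the exporter (all names here are hypothetical, not the real BSP code):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// hangingExport pretends to be an exporter that hangs and ignores its context,
// the failure mode described above. Illustrative only.
func hangingExport(ctx context.Context, batch []int) error {
	time.Sleep(50 * time.Millisecond) // simulate a slow/hung export; ctx is ignored
	return nil
}

func processBatch(ctx context.Context, batch []int, dropped int) {
	// Logging *before* the export call means we still get a line even if the
	// exporter never returns — but the dropped count can't include any spans
	// dropped while this export is in flight.
	fmt.Printf("exporting spans count=%d dropped=%d\n", len(batch), dropped)
	_ = hangingExport(ctx, batch)
}

func main() {
	processBatch(context.Background(), []int{1, 2, 3}, 7)
}
```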

I see three simple solutions:

  1. Leave it as is. If users are building metrics from the debug message, I won't stop them, but that isn't the primary purpose of this message. We could still reset the counter, but then we should probably communicate that the value is the previous export's dropped spans, not this one's.
  2. Record what actually happened in place of what we will attempt, and reset the counter. This may not be accurate if spans are enqueued before exportSpans returns and processQueue resumes reading off the channel, but since this is for debug purposes it will be accurate enough to notice errors.
  3. Track how many spans have been exported over the life of this BSP, making both counts cumulative (see the sketch after this list).
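A rough sketch of what option 3 could look like, assuming hypothetical exported/dropped fields rather than the actual implementation:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// counters keeps both values cumulative over the life of the processor,
// so the debug message has a single temporality. Names are illustrative.
type counters struct {
	exported uint64 // total spans exported since the processor started
	dropped  uint64 // total spans dropped since the processor started
}

func (c *counters) logExport(batchSize int) {
	atomic.AddUint64(&c.exported, uint64(batchSize))
	fmt.Printf("exporting spans exported=%d dropped=%d\n",
		atomic.LoadUint64(&c.exported), atomic.LoadUint64(&c.dropped))
}

func main() {
	var c counters
	c.logExport(128) // exported=128 dropped=0
	atomic.AddUint64(&c.dropped, 3)
	c.logExport(64) // exported=192 dropped=3 — both values only grow
}
```

With both values cumulative, the debug line has a single temporality; the per-export count could still be logged alongside it if needed.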

Aneurysm9 (Member) commented:

Can we change the label to total_dropped or something like that to make it clearer that it is a running total? I'm not sure I see an issue with the differing temporalities, as long as it is understood. If that's not an option, I think I'd prefer option 3), making the exported span count cumulative.

pellared (Member) commented:

Option 3) looks like an easy-to-implement yet usable solution.

MadVikingGod (Contributor, Author) commented:

To hopefully not stall this for too long, I will create a PR to rename the label to total_dropped. We can always revisit if a better approach is offered.
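For illustration only, here is the shape of the log line before and after such a rename (using the standard log package; the real change would be in the BSP's debug call):

```go
package main

import "log"

func main() {
	count, dropped := 128, 3 // per-export batch size vs. running total of drops

	// Before the rename: "dropped" reads like a per-export value.
	log.Printf("exporting spans count=%d dropped=%d", count, dropped)

	// After the rename: the key states explicitly that the value is a running total.
	log.Printf("exporting spans count=%d total_dropped=%d", count, dropped)
}
```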
