Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prometheus: Remove unit suffix for unit 1 #4131

Closed

Conversation

ttys3
Copy link

@ttys3 ttys3 commented May 24, 2023

this feature introduced by #3352

according to https://prometheus.io/docs/instrumenting/writing_exporters/#naming

there's no _ratio related spec

and we use requests_total for a long long time, not requests_ratio_total (if use unit "1" for a counter now, we got this mixed metric name)

like do in https://github.com/labstack/echo-contrib/blob/1b74ff73cb919adf80a67dadf44dc6434324782b/echoprometheus/prometheus.go#LL172C4-L172C4

or https://github.com/labstack/echo-contrib/blob/1b74ff73cb919adf80a67dadf44dc6434324782b/prometheus/prometheus.go#L73

requests_total is simple and clear. it is an counter, it count user requests, not related to ratio.

https://prometheus.io/docs/practices/naming/#metric-names

...should have a suffix describing the unit, in plural form. Note that an accumulating count has total as a suffix, in addition to the unit if applicable.
http_request_duration_seconds
node_memory_usage_bytes
http_requests_total (for a unit-less accumulating count)
process_cpu_seconds_total (for an accumulating count with unit)
foobar_build_info (for a pseudo-metric that provides metadata about the running binary)
data_pipeline_last_record_processed_timestamp_seconds (for a timestamp that tracks the time of the latest record processed in a data processing pipeline)

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented May 24, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@pellared pellared added the blocked:CLA Waiting on CLA to be signed before progress can be made label May 24, 2023
@dmathieu dmathieu removed the blocked:CLA Waiting on CLA to be signed before progress can be made label May 24, 2023
CHANGELOG.md Outdated Show resolved Hide resolved
@pellared pellared changed the title chore: remove unit suffix for unit 1 prometheus: Remove unit suffix for unit 1 May 24, 2023
@pellared
Copy link
Member

@dashpole PTAL

@pellared
Copy link
Member

pellared commented May 24, 2023

For reference, related code in Collector: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/8f394d4ffea8d5f06a9019245746ff253be106fd/pkg/translator/prometheus/normalize_name.go#L155-L162:

	// Append _ratio for metrics with unit "1"
	// Some Otel receivers improperly use unit "1" for counters of objects
	// See https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aissue+some+metric+units+don%27t+follow+otel+semantic+conventions
	// Until these issues have been fixed, we're appending `_ratio` for gauges ONLY
	// Theoretically, counters could be ratios as well, but it's absurd (for mathematical reasons)
	if metric.Unit() == "1" && metric.Type() == pmetric.MetricTypeGauge {
		nameTokens = append(removeItem(nameTokens, "ratio"), "ratio")
	}

Co-authored-by: Robert Pająk <pellared@hotmail.com>
@ttys3
Copy link
Author

ttys3 commented May 24, 2023

For reference, related code in Collector: open-telemetry/opentelemetry-collector-contrib@8f394d4/pkg/translator/prometheus/normalize_name.go#L155-L162:

	// Append _ratio for metrics with unit "1"
	// Some Otel receivers improperly use unit "1" for counters of objects
	// See https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aissue+some+metric+units+don%27t+follow+otel+semantic+conventions
	// Until these issues have been fixed, we're appending `_ratio` for gauges ONLY
	// Theoretically, counters could be ratios as well, but it's absurd (for mathematical reasons)
	if metric.Unit() == "1" && metric.Type() == pmetric.MetricTypeGauge {
		nameTokens = append(removeItem(nameTokens, "ratio"), "ratio")
	}

wow, there comes more names normalization.

but I also do not think Append _ratio (for gauges only) is a good choice.

as the docs https://prometheus.io/docs/practices/naming/#base-units (which the commit open-telemetry/opentelemetry-collector-contrib@b828db4#diff-9f77c9a72be0a6feb89132d5c34b7cd4b4edbcd4e6c1845714d30a840c507d8e ref it to)

Percentages (unit is 1): Values are 0–1 (rather than 0–100). ratio is only used as a suffix for names like disk_usage_ratio. The usual metric name follows the pattern A_per_B.

a gauge can be any value, for example disk_free_space, which present the current free space of a disk. it should NOT append a _ratio obviously.

but as open-telemetry/opentelemetry-collector-contrib#10554 said clearly:

Unit 1 must be used only for fractions and ratios.

(do we have spec for this? Unit 1 must be used only for fractions and ratios ? many users may use unit 1 for counters)

maybe we should document that. so users won't set unit 1 for counters.

@dmathieu
Copy link
Member

The OpenTelemetry Specification seems to go against this change.
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/compatibility/prometheus_and_openmetrics.md#metric-metadata-1

Special case: Converting "1" to "ratio".

@ttys3
Copy link
Author

ttys3 commented May 24, 2023

The OpenTelemetry Specification seems to go against this change. open-telemetry/opentelemetry-specification@main/specification/compatibility/prometheus_and_openmetrics.md#metric-metadata-1

Special case: Converting "1" to "ratio".

the problem is, lot of people using unit 1 for counters or int gauges. like open-telemetry/opentelemetry-collector-contrib#10554

under current otel metric code:
if use unit "1" for a counter requests now, we got this mixed metric name requests_ratio_total for promethus.
that's strange.

so, maybe we should make more clear (maybe in the WithUnit func comment?):

  1. Unit 1 must be used only for fractions and ratios, for other situations, users should just use an empty unit.
  2. ref the spec about Unit 1 https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/compatibility/prometheus_and_openmetrics.md?rgh-link-date=2023-05-24T11%3A44%3A27Z#metric-metadata-1

@ttys3
Copy link
Author

ttys3 commented May 24, 2023

and there's a second issue, if the otel metric name is requests_total, after export to promethus, it becomes requests_total_ratio_total

@dmathieu
Copy link
Member

I think making things clearer in the spec would be the first step.

@ttys3
Copy link
Author

ttys3 commented May 24, 2023

not everyone can read the full spec. and sometimes it is even hard to find the right part from the spec.

add a simple comment in the WithUnit func, to make it clear what Unit 1 means. and also put a link to the related spec.

is this OK @dmathieu

@dmathieu
Copy link
Member

Sure.

@MrAlias
Copy link
Contributor

MrAlias commented May 24, 2023

not everyone can read the full spec. and sometimes it is even hard to find the right part from the spec.

add a simple comment in the WithUnit func, to make it clear what Unit 1 means. and also put a link to the related spec.

is this OK @dmathieu

The WithUnit function would not be the correct place to document the behavior of the Prometheus exporter. It applies to much more than the Prometheus exporter.

I agree with @dmathieu that if this is truly undesired behavior it should be addressed at the specification level. However, I'm not sure it is.As is pointed out in the mentioned issue, counters and gauges that measure non-ratio values should be annotating the unit with descriptions of what they measure.

@pellared
Copy link
Member

Closing per #4131 (comment)

@pellared pellared closed this Jun 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants