Counter metric name with duplicated _total suffix after upgrading otel version #3614

Closed
lucasoares opened this issue Jan 24, 2023 · 2 comments
Labels
area:metrics Part of OpenTelemetry Metrics bug Something isn't working pkg:SDK Related to an SDK package response needed Waiting on user input before progress can be made

Comments

@lucasoares

Description

I updated the OpenTelemetry package versions and fixed every breaking change, but now I can't get my metrics to work properly.

After upgrading the libraries, my counter metrics are exported with the _total suffix repeated in their names:

Data from the /metrics endpoint of the Prometheus exporter:

# HELP deckard_query_unlock_total Number of unlocked queries.
# TYPE deckard_query_unlock_total counter
deckard_query_unlock_total{lock_type="lock_nack",queue="panels-synchronizer",service_version="0.8.0",service_namespace="chapter-backend",deployment_environment="staging",pod="deckard-housekeeper-staging-5965c7c768-mmrmc",service_name="deckard"} 3
# HELP deckard_query_unlock_total_total Number of unlocked queries.
# TYPE deckard_query_unlock_total_total counter
deckard_query_unlock_total_total{lock_type="lock_nack",queue="deleted-universes-panels-synchronizer",service_version="0.8.0",service_namespace="chapter-backend",deployment_environment="staging",pod="deckard-housekeeper-staging-5965c7c768-mmrmc",service_name="deckard"} 2
# HELP deckard_query_unlock_total_total_total Number of unlocked queries.
# TYPE deckard_query_unlock_total_total_total counter
deckard_query_unlock_total_total_total{lock_type="lock_nack",queue="socialmetricsproprietary:tiktok:pages",service_version="0.8.0",service_namespace="chapter-backend",deployment_environment="staging",pod="deckard-housekeeper-staging-5965c7c768-mmrmc",service_name="deckard"} 0
# HELP deckard_query_unlock_total_total_total_total Number of unlocked queries.
# TYPE deckard_query_unlock_total_total_total_total counter
deckard_query_unlock_total_total_total_total{lock_type="lock_nack",queue="proprietarypage:instagramdm:pages",service_version="0.8.0",service_namespace="chapter-backend",deployment_environment="staging",pod="deckard-housekeeper-staging-5965c7c768-mmrmc",service_name="deckard"} 15
# HELP deckard_query_unlock_total_total_total_total_total Number of unlocked queries.
# TYPE deckard_query_unlock_total_total_total_total_total counter
deckard_query_unlock_total_total_total_total_total{lock_type="lock_nack",queue="proprietarypage:instagramdm:pages:retroactive",service_version="0.8.0",service_namespace="chapter-backend",deployment_environment="staging",pod="deckard-housekeeper-staging-5965c7c768-mmrmc",service_name="deckard"} 15

Apparently, every new combination of label values creates a new metric whose name carries one more _total suffix than the previous one.
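The growing names above are what one would expect if a suffix were appended to an already-suffixed name each time a new label combination is recorded. The following is only an illustrative sketch of that failure mode, not the exporter's actual code; appendTotal and ensureTotal are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"strings"
)

const totalSuffix = "_total"

// appendTotal is a hypothetical, non-idempotent helper: it always appends
// the suffix, so applying it repeatedly keeps growing the name.
func appendTotal(name string) string {
	return name + totalSuffix
}

// ensureTotal is the idempotent variant: it appends the suffix only when
// the name does not already end with it.
func ensureTotal(name string) string {
	if strings.HasSuffix(name, totalSuffix) {
		return name
	}
	return name + totalSuffix
}

func main() {
	naive, safe := "deckard_query_unlock", "deckard_query_unlock"
	// Simulate the name being processed once per new label combination.
	for i := 0; i < 3; i++ {
		naive = appendTotal(naive)
		safe = ensureTotal(safe)
		fmt.Println(naive, safe)
	}
}
```

With the non-idempotent helper the name gains one suffix per pass, which matches the shape of the output above; with the idempotent one it stays at a single _total.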

Environment

  • OS: Linux
  • Architecture: x86_64
  • Go Version: 1.18

Versions before:

	go.opentelemetry.io/contrib/instrumentation/go.mongodb.org/mongo-driver/mongo/otelmongo v0.31.0
	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.31.0
	go.opentelemetry.io/otel v1.7.0
	go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.6.3
	go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.6.3
	go.opentelemetry.io/otel/exporters/prometheus v0.30.0
	go.opentelemetry.io/otel/metric v0.30.0
	go.opentelemetry.io/otel/sdk v1.7.0
	go.opentelemetry.io/otel/sdk/metric v0.30.0
	go.opentelemetry.io/otel/trace v1.7.0

Versions after:

	go.opentelemetry.io/contrib/instrumentation/go.mongodb.org/mongo-driver/mongo/otelmongo v0.36.4
	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.36.4
	go.opentelemetry.io/otel v1.11.1
	go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.11.1
	go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.11.1
	go.opentelemetry.io/otel/exporters/prometheus v0.33.0
	go.opentelemetry.io/otel/metric v0.33.0
	go.opentelemetry.io/otel/sdk v1.11.1
	go.opentelemetry.io/otel/sdk/metric v0.33.0
	go.opentelemetry.io/otel/trace v1.11.1

Steps To Reproduce

Configuring metrics and exporter:

registry.go

A registry wrapper used to add custom labels to every exported metric by default, because of #3405.

package metrics

import (
	prom "github.com/prometheus/client_golang/prometheus"
	dto "github.com/prometheus/client_model/go"
)

var (
	_ prom.Gatherer   = &WrappedRegistry{}
	_ prom.Registerer = &WrappedRegistry{}
)

type WrappedRegistry struct {
	labels       []*dto.LabelPair
	promRegistry *prom.Registry
}

func NewWrappedRegistry(promRegistry *prom.Registry, labels ...*dto.LabelPair) *WrappedRegistry {
	return &WrappedRegistry{
		labels:       labels,
		promRegistry: promRegistry,
	}
}

func (wr *WrappedRegistry) Gather() ([]*dto.MetricFamily, error) {
	families, err := wr.promRegistry.Gather()
	if err != nil {
		return nil, err
	}

	for _, f := range families {
		for _, m := range f.Metric {
			m.Label = append(m.Label, wr.labels...)
		}
	}
	return families, nil
}

func (wr *WrappedRegistry) Register(collector prom.Collector) error {
	return wr.promRegistry.Register(collector)
}

func (wr *WrappedRegistry) MustRegister(collector ...prom.Collector) {
	wr.promRegistry.MustRegister(collector...)
}

func (wr *WrappedRegistry) Unregister(collector prom.Collector) bool {
	return wr.promRegistry.Unregister(collector)
}

metrics.go

import (
	"context"
	"net/http"
	"os"
	"strings"

	prometheusclient "github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	dto "github.com/prometheus/client_model/go"
	"gitlab-enterprise.stilingue.com.br/StilingueBackend/deckard/internal/logger"
	"gitlab-enterprise.stilingue.com.br/StilingueBackend/deckard/internal/version"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/prometheus"
	otelmetric "go.opentelemetry.io/otel/metric"
	"go.opentelemetry.io/otel/metric/global"
	"go.opentelemetry.io/otel/metric/instrument"
	"go.opentelemetry.io/otel/metric/instrument/asyncint64"
	"go.opentelemetry.io/otel/metric/instrument/syncint64"
	"go.opentelemetry.io/otel/metric/unit"
	"go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/metric/aggregation"
	"go.opentelemetry.io/otel/sdk/metric/view"
	"go.opentelemetry.io/otel/sdk/resource"
)

func init() {
	registry = NewWrappedRegistry(prometheusclient.NewRegistry(), createDefaultMetrics()...)

	var err error
	exporter, err = prometheus.New(
		prometheus.WithRegisterer(registry),
		prometheus.WithoutUnits(),
		prometheus.WithAggregationSelector(func(ik view.InstrumentKind) aggregation.Aggregation {
			switch ik {
			case view.SyncHistogram:
				return aggregation.ExplicitBucketHistogram{
					Boundaries: []float64{0, 1, 2, 5, 10, 15, 20, 30, 35, 50, 100, 200, 400, 600, 800, 1000, 1500, 2000, 5000, 10000, 15000, 50000},
					NoMinMax:   false,
				}
			}

			return metric.DefaultAggregationSelector(ik)
		}))

	if err != nil {
		panic(err)
	}

	provider := metric.NewMeterProvider(
		metric.WithReader(exporter),
		metric.WithResource(resource.Environment()),
	)

	global.SetMeterProvider(provider)

	createMetrics()

	http.Handle("/metrics", promhttp.HandlerFor(registry, promhttp.HandlerOpts{
		EnableOpenMetrics: true,
	}))

	go func() {
		err := http.ListenAndServe(":5555", nil)

		if err != nil {
			logger.S(context.Background()).Error("Error starting prometheus exporter.", err)
		}
	}()
}

func createDefaultMetrics() []*dto.LabelPair {
	result := make([]*dto.LabelPair, 0)

	strPtr := func(s string) *string { return &s }

	otelAttributes := os.Getenv("OTEL_RESOURCE_ATTRIBUTES")

	attributes := strings.Split(otelAttributes, ",")
	for _, attribute := range attributes {
		parts := strings.Split(attribute, "=")

		if len(parts) != 2 {
			continue
		}

		result = append(result, &dto.LabelPair{
			Name:  strPtr(strings.Replace(parts[0], ".", "_", -1)),
			Value: strPtr(parts[1]),
		})
	}

	otelServiceName := os.Getenv("OTEL_SERVICE_NAME")

	result = append(result, &dto.LabelPair{
		Name:  strPtr("service_name"),
		Value: strPtr(otelServiceName),
	})

	return result
}

func createMetrics() {
	meter = global.MeterProvider().Meter(version.Name)

	MetricsMap = NewQueryPoolMetricsMap()

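	// HousekeeperUnlock is assumed to be declared at package level, since it is
	// used below as metrics.HousekeeperUnlock. Assign with "=" here: ":=" would
	// declare a local variable and leave the package-level one unset.
	var err error
	HousekeeperUnlock, err = meter.SyncInt64().Counter(
		"deckard_query_unlock",
		instrument.WithDescription("Number of unlocked queries."),
	)
	if err != nil {
		panic(err)
	}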
}

How I'm using the meter:

		metrics.HousekeeperUnlock.Add(ctx, int64(len(ids)), attribute.String("queue", getQueueName()), attribute.String("lock_type", getLockType()))

Expected behavior

The metric should be exported under the same name for every combination of labels.

@lucasoares lucasoares added the bug Something isn't working label Jan 24, 2023
@MrAlias MrAlias added pkg:SDK Related to an SDK package area:metrics Part of OpenTelemetry Metrics labels Jan 26, 2023
@MrAlias
Contributor

MrAlias commented Jan 30, 2023

This looks to be the bug resolved by #3369.

Can you update your dependencies to the latest releases and re-test? This should be resolved for metrics packages > v0.33.0.

@MrAlias MrAlias added the response needed Waiting on user input before progress can be made label Jan 30, 2023
@MrAlias
Contributor

MrAlias commented Feb 2, 2023

I'm going to close this assuming upgrading resolved your issue. Please reopen if not.

@MrAlias MrAlias closed this as completed Feb 2, 2023