
Thanos-compactor crashed while pushing out-of-order labels #212

Open
SushilSanjayBhile opened this issue Nov 9, 2023 · 1 comment

SushilSanjayBhile commented Nov 9, 2023

After applying this ObjectStore spec, the thanos-compactor pod started crashing:

apiVersion: monitoring.banzaicloud.io/v1alpha1
kind: ObjectStore
metadata:
  name: spektra-thanos
spec:
  config:
    mountFrom:
      secretKeyRef:
        name: spektra-thanos-objectstore
        key: spektra-thanos-s3-config.yaml
  bucketWeb:
    deploymentOverrides:
      spec:
        template:
          spec:
            containers:
              - name: bucket
                image: quay.io/thanos/thanos:v0.26.0
  compactor:
    retentionResolutionRaw: 5270400s # 61d
    retentionResolution5m: 5270400s # 61d
    retentionResolution1h: 5270400s # 61d
    dataVolume:
      pvc:
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: MINIO_STORAGE_SIZE
    deploymentOverrides:
      spec:
        template:
          spec:
            containers:
              - name: compactor
                image: quay.io/thanos/thanos:v0.26.0

Here are some logs from the thanos-compactor pod:

level=warn ts=2023-11-09T07:00:36.915552097Z caller=index.go:267 group="0@{receive_replica=\"spektra-thanos-receiver-soft-tenant-0\", tenant_id=\"t1\"}" groupKey=0@12308314071310558143 msg="**out-of-order label set: known bug in Prometheus 2.8.0 and below**" labelset="{__measurement__=\"kubernetes_pod_volume\", __name__=\"fs_used_bytes\", app=\"unknown\", claim_name=\"unknown\", cluster=\"devtb7\", kubernetes_io_config_seen=\"2023-10-20T07:07:41.743485600-07:00\", kubernetes_io_config_source=\"api\", name=\"mongodb-kubernetes-operator\", namespace=\"spektra-system\", node_name=\"appserv85\", pod_template_hash=\"598cb5f96\", pod_name=\"mongodb-kubernetes-operator-598cb5f96-9t4nl\", project=\"unknown\", source=\"t1|devtb7|appserv85\", tenant=\"t1\", volume_name=\"kube-api-access-gdtk4\"}" series=10821
level=warn ts=2023-11-09T07:00:36.916184335Z caller=intrumentation.go:67 msg="changing probe status" status=not-ready reason="error executing compaction: compaction: group 0@12308314071310558143: block id 01HD6WW51TDS9NTVG9XA8BWGE5, try running with --debug.accept-malformed-index: index contains 1207 postings with out of order labels"
level=info ts=2023-11-09T07:00:36.916226751Z caller=http.go:84 service=http/server component=compact msg="internal server is shutting down" err="error executing compaction: compaction: group 0@12308314071310558143: block id 01HD6WW51TDS9NTVG9XA8BWGE5, try running with --debug.accept-malformed-index: index contains 1207 postings with out of order labels"
level=info ts=2023-11-09T07:00:36.917473056Z caller=http.go:103 service=http/server component=compact msg="internal server is shutdown gracefully" err="error executing compaction: compaction: group 0@12308314071310558143: block id 01HD6WW51TDS9NTVG9XA8BWGE5, try running with --debug.accept-malformed-index: index contains 1207 postings with out of order labels"
level=info ts=2023-11-09T07:00:36.917542971Z caller=intrumentation.go:81 msg="changing probe status" status=not-healthy reason="error executing compaction: compaction: group 0@12308314071310558143: block id 01HD6WW51TDS9NTVG9XA8BWGE5, try running with --debug.accept-malformed-index: index contains 1207 postings with out of order labels"
level=error ts=2023-11-09T07:00:36.917788817Z caller=main.go:158 err="group 0@12308314071310558143: block id 01HD6WW51TDS9NTVG9XA8BWGE5, **try running with --debug.accept-malformed-index**: index contains 1207 postings with out of order labels\ncompaction\nmain.runCompact.func7\n\t/app/cmd/thanos/compact.go:422\nmain.runCompact.func8.1\n\t/app/cmd/thanos/compact.go:476\ngithub.com/thanos-io/thanos/pkg/runutil.Repeat\n\t/app/pkg/runutil/runutil.go:75\nmain.runCompact.func8\n\t/app/cmd/thanos/compact.go:475\ngithub.com/oklog/run.(*Group).Run.func1\n\t/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\nerror executing compaction\nmain.runCompact.func8.1\n\t/app/cmd/thanos/compact.go:503\ngithub.com/thanos-io/thanos/pkg/runutil.Repeat\n\t/app/pkg/runutil/runutil.go:75\nmain.runCompact.func8\n\t/app/cmd/thanos/compact.go:475\ngithub.com/oklog/run.(*Group).Run.func1\n\t/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\ncompact command failed\nmain.main\n\t/app/cmd/thanos/main.go:158\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:255\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581"

From the logs we can see that the pod crashed because of out-of-order labels.
The warning also mentions that this is a known bug in Prometheus 2.8.0 and below.
The logs suggest running with --debug.accept-malformed-index to work around the issue, but the ObjectStore spec above has no option to specify this flag. I also went through the code, and the compactor deployment.go does not set this flag either.

Please add an option for this to apiVersion: monitoring.banzaicloud.io/v1alpha1, kind: ObjectStore, or add the flag to the compactor deployment here: https://github.com/banzaicloud/thanos-operator/blob/pkg/sdk/v0.3.7/pkg/resources/compactor/deployment.go#L54. A rough sketch of what that could look like is shown below.
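For illustration only, here is a minimal Go sketch (not the operator's actual code) of how such a flag could be wired into the compactor deployment. It assumes the compactor container is built as a corev1.Container with an Args slice, and it introduces a hypothetical acceptMalformedIndex boolean on the compactor spec; only the --debug.accept-malformed-index flag itself is taken from the Thanos error message above.

package compactor

import corev1 "k8s.io/api/core/v1"

// Spec is a stand-in for the operator's compactor configuration;
// AcceptMalformedIndex is a hypothetical new field, not an existing API.
type Spec struct {
	AcceptMalformedIndex bool
}

// compactorArgs builds the CLI args for the compactor container and appends
// --debug.accept-malformed-index when the (hypothetical) option is enabled.
func compactorArgs(spec Spec, baseArgs []string) []string {
	args := append([]string{"compact"}, baseArgs...)
	if spec.AcceptMalformedIndex {
		args = append(args, "--debug.accept-malformed-index")
	}
	return args
}

// setContainerArgs applies the args to the container named "compactor".
func setContainerArgs(containers []corev1.Container, args []string) {
	for i := range containers {
		if containers[i].Name == "compactor" {
			containers[i].Args = args
		}
	}
}

If the operator exposed this as a field on the ObjectStore compactor spec, enabling the workaround would then be a one-line change in the custom resource.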

@sushilbhile

Any update on this issue?

SushilSanjayBhile pushed a commit to SushilSanjayBhile/thanos-operator that referenced this issue Dec 4, 2023