Fix Segmentation fault on ingester due race condition when reading blocks that are being deleted #5119

Merged (2 commits) on Feb 2, 2023

@alanprot alanprot commented Feb 2, 2023

What this PR does:

When streaming metadata from ingesters, we close the querier before the request finishes.
This causes a race condition when the block the metadata is being read from is deleted: closing the querier decrements the pendingReaders wait group on the block, which allows the block to be deleted while the response is still being sent.

See:

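func (i *Ingester) LabelValuesStream(req *client.LabelValuesRequest, stream client.Ingester_LabelValuesStreamServer) error {
	resp, err := i.LabelValues(stream.Context(), req)
	// ...

// Excerpt from the LabelValues call above: the deferred Close runs when
// LabelValues returns, before the stream has sent resp.
	q, err := db.Querier(ctx, mint, maxt)
	if err != nil {
		return nil, err
	}
	defer q.Close()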
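
To make the race concrete, here is a minimal, self-contained sketch of the pattern in the excerpt above. The block and querier types, the zeroing in Delete, and the name labelValuesBuggy are hypothetical stand-ins, not the Cortex or Prometheus code. A real unmap would make the later read fault instead of returning zeros.

```go
package main

import (
	"fmt"
	"sync"
)

// block is a hypothetical stand-in for a TSDB block with a memory-mapped
// index. buf plays the role of the mmapped region.
type block struct {
	pendingReaders sync.WaitGroup
	buf            []byte
}

// querier is a hypothetical reader handle (not the Prometheus type).
type querier struct {
	b *block
}

// Querier registers a pending reader on the block.
func (b *block) Querier() *querier {
	b.pendingReaders.Add(1)
	return &querier{b: b}
}

// LabelValues returns zero-copy views into the block's buffer.
func (q *querier) LabelValues() [][]byte {
	return [][]byte{q.b.buf[0:3], q.b.buf[3:6]}
}

// Close releases the pending reader.
func (q *querier) Close() { q.b.pendingReaders.Done() }

// Delete waits for pending readers, then invalidates the buffer.
// Assumption: zeroing simulates the effect of unmapping the file.
func (b *block) Delete() {
	b.pendingReaders.Wait()
	for i := range b.buf {
		b.buf[i] = 0
	}
}

// labelValuesBuggy has the same shape as the excerpt: the deferred Close
// runs on return, while the caller still holds views into the buffer.
func labelValuesBuggy(b *block) [][]byte {
	q := b.Querier()
	defer q.Close()
	return q.LabelValues()
}

func main() {
	b := &block{buf: []byte("foobar")}
	vals := labelValuesBuggy(b)
	fmt.Printf("before delete: %q\n", vals[0])

	b.Delete() // does not block: no pending readers remain
	fmt.Printf("after delete:  %q\n", vals[0])
}
```

In this sketch the first print shows the value intact and the second shows zeroed bytes, because the deferred Close released the only pending reader before the caller was done with the result.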

This information is read from an mmapped file in the index header, so it is only safe to access while the block is still open; see:

https://github.com/prometheus/prometheus/blob/c70d85baed260f6013afd18d6cd0ffcac4339861/tsdb/index/index.go#L1517

This PR creates common methods that also return a cleanup function, which the caller should call after the request is done.
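
The cleanup-function pattern described here could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in this PR: the block type, labelValuesCommon, streamLabelValues, deleteAsync, and the send callback (standing in for stream.Send) are all hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// block is a hypothetical stand-in for a TSDB block.
type block struct {
	pendingReaders sync.WaitGroup
	values         []string
}

// labelValuesCommon (hypothetical name) returns the values together with a
// cleanup function. The pending reader is released only when the caller
// invokes cleanup, not when this function returns.
func labelValuesCommon(b *block) (vals []string, cleanup func()) {
	b.pendingReaders.Add(1)
	var once sync.Once
	cleanup = func() { once.Do(b.pendingReaders.Done) }
	return b.values, cleanup
}

// streamLabelValues keeps the block pinned until send has returned.
func streamLabelValues(b *block, send func([]string) error) error {
	vals, cleanup := labelValuesCommon(b)
	defer cleanup() // runs after send, so the data outlives the request
	return send(vals)
}

// deleteAsync requests deletion in the background and returns a channel
// that is closed once the block has actually been deleted.
func deleteAsync(b *block) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		b.pendingReaders.Wait() // blocks while any reader is pending
		b.values = nil
		close(done)
	}()
	return done
}

func main() {
	b := &block{values: []string{"a", "b"}}
	var deleted <-chan struct{}

	err := streamLabelValues(b, func(vals []string) error {
		deleted = deleteAsync(b) // deletion is requested mid-request
		select {
		case <-deleted:
			return fmt.Errorf("block deleted while still in use")
		default: // deletion is still waiting on the pending reader
		}
		fmt.Println("sent:", vals)
		return nil
	})

	<-deleted // unblocks only after cleanup has run
	fmt.Println("err:", err)
}
```

The design choice is that the function that opens the querier no longer decides when to close it. Ownership of the cleanup moves to the caller, which knows when the response has actually been sent.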

[signal SIGSEGV: segmentation violation code=0x1 addr=0x7fb8bd27e8c8 pc=0x471061]
 
goroutine 269815502 [running]:
runtime.throw({0x22eb250?, 0x3df2?})
        GoLang-1.x.119322.0/AL2_x86_64/DEV.STD.PTHREAD/build/lib/src/runtime/panic.go:1047 +0x5d fp=0xc06932ada0 sp=0xc06932ad70 pc=0x43bbdd
runtime.sigpanic()
        GoLang-1.x.119322.0/AL2_x86_64/DEV.STD.PTHREAD/build/lib/src/runtime/signal_unix.go:842 +0x2c5 fp=0xc06932adf0 sp=0xc06932ada0 pc=0x452585
runtime.memmove()
        GoLang-1.x.119322.0/AL2_x86_64/DEV.STD.PTHREAD/build/lib/src/runtime/memmove_amd64.s:184 +0x141 fp=0xc06932adf8 sp=0xc06932adf0 pc=0x471061
vendor/github.com/cortexproject/cortex/pkg/ingester/client.(*LabelValuesStreamResponse).MarshalToSizedBuffer(0xc1198c75a8, {0xc0bdef2600, 0x116f, 0x116f})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/ingester/client/ingester.pb.go:3749 +0xe9 fp=0xc06932ae38 sp=0xc06932adf8 pc=0x1ae2169
vendor/github.com/cortexproject/cortex/pkg/ingester/client.(*LabelValuesStreamResponse).Marshal(0xc1198c75a8)
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/ingester/client/ingester.pb.go:3729 +0x6e fp=0xc06932ae78 sp=0xc06932ae38 pc=0x1ae1eae
vendor/google.golang.org/protobuf/internal/impl.legacyMarshal({{}, {0x28ede00, 0xc0f202ad50}, {0x0, 0x0, 0x0}, 0x0})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/google.golang.org/protobuf/internal/impl/legacy_message.go:402 +0xa2 fp=0xc06932af00 sp=0xc06932ae78 pc=0x84d4c2
vendor/google.golang.org/protobuf/proto.MarshalOptions.marshal({{}, 0xd8?, 0x0, 0x0}, {0x0, 0x0, 0x0}, {0x28ede00, 0xc0f202ad50})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/google.golang.org/protobuf/proto/encode.go:166 +0x27b fp=0xc06932afa0 sp=0xc06932af00 pc=0x7e87db
vendor/google.golang.org/protobuf/proto.MarshalOptions.MarshalAppend({{}, 0xe0?, 0x5c?, 0x22?}, {0x0, 0x0, 0x0}, {0x28bb360?, 0xc0f202ad50?})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/google.golang.org/protobuf/proto/encode.go:125 +0x79 fp=0xc06932afe8 sp=0xc06932afa0 pc=0x7e8419
vendor/github.com/golang/protobuf/proto.marshalAppend({0x0, 0x0, 0x0}, {0x7fb8a9fdf878?, 0xc1198c75a8?}, 0xc0?)
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/golang/protobuf/proto/wire.go:40 +0xa5 fp=0xc06932b068 sp=0xc06932afe8 pc=0x876f05
vendor/github.com/golang/protobuf/proto.Marshal(...)
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/golang/protobuf/proto/wire.go:23
vendor/google.golang.org/grpc/encoding/proto.codec.Marshal({}, {0x2225ce0, 0xc1198c75a8})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/google.golang.org/grpc/encoding/proto/proto.go:45 +0x4e fp=0xc06932b0b8 sp=0xc06932b068 pc=0x97cf2e
vendor/google.golang.org/grpc/encoding/proto.(*codec).Marshal(0xc06932b0e0?, {0x2225ce0?, 0xc1198c75a8?})
        <autogenerated>:1 +0x37 fp=0xc06932b0d8 sp=0xc06932b0b8 pc=0x97d117
vendor/google.golang.org/grpc.encode({0x7fb98ee3ad98?, 0x3ce3668?}, {0x2225ce0?, 0xc1198c75a8?})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/google.golang.org/grpc/rpc_util.go:594 +0x44 fp=0xc06932b128 sp=0xc06932b0d8 pc=0x9e2984
vendor/google.golang.org/grpc.prepareMsg({0x2225ce0?, 0xc1198c75a8?}, {0x7fb98ee3ad98?, 0x3ce3668?}, {0x0, 0x0}, {0x0, 0x0})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/google.golang.org/grpc/stream.go:1610 +0xd2 fp=0xc06932b1a0 sp=0xc06932b128 pc=0x9f9cb2
vendor/google.golang.org/grpc.(*serverStream).SendMsg(0xc0b51aad80, {0x2225ce0?, 0xc1198c75a8})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/google.golang.org/grpc/stream.go:1503 +0xf3 fp=0xc06932b2f0 sp=0xc06932b1a0 pc=0x9f8ab3
vendor/github.com/opentracing-contrib/go-grpc.(*openTracingServerStream).SendMsg(0xc017d6cac8?, {0x2225ce0?, 0xc1198c75a8?})
        <autogenerated>:1 +0x34 fp=0xc06932b318 sp=0xc06932b2f0 pc=0xa7ed94
vendor/github.com/weaveworks/common/middleware.(*serverStream).SendMsg(0x49dc6a?, {0x2225ce0?, 0xc1198c75a8?})
        <autogenerated>:1 +0x35 fp=0xc06932b340 sp=0xc06932b318 pc=0xbb7c15
vendor/github.com/cortexproject/cortex/pkg/cortex.(*wrappedServerStream).SendMsg(0xc06932b380?, {0x2225ce0?, 0xc1198c75a8?})
        <autogenerated>:1 +0x35 fp=0xc06932b368 sp=0xc06932b340 pc=0x1d07dd5
vendor/github.com/cortexproject/cortex/pkg/util/grpcutil.(*wrappedServerStream).SendMsg(0x6?, {0x2225ce0?, 0xc1198c75a8?})
        <autogenerated>:1 +0x35 fp=0xc06932b390 sp=0xc06932b368 pc=0x166ac15
vendor/github.com/cortexproject/cortex/pkg/ingester/client.(*ingesterLabelValuesStreamServer).Send(0x49dc6a?, 0x0?)
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/ingester/client/ingester.pb.go:3148 +0x2b fp=0xc06932b3b8 sp=0xc06932b390 pc=0x1ade96b
vendor/github.com/cortexproject/cortex/pkg/ingester/client.SendLabelValuesStream.func1()
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/ingester/client/cortex_util.go:23 +0x26 fp=0xc06932b3d8 sp=0xc06932b3b8 pc=0x1acdf66
vendor/github.com/cortexproject/cortex/pkg/ingester/client.sendWithContextErrChecking({0x28d1c90, 0xc0f346b860}, 0xc06932b418)
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/ingester/client/cortex_util.go:54 +0x3f fp=0xc06932b400 sp=0xc06932b3d8 pc=0x1ace0bf
vendor/github.com/cortexproject/cortex/pkg/ingester/client.SendLabelValuesStream({0x28e1b20, 0xc0a0293d40}, 0xc1198c75a8)
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/ingester/client/cortex_util.go:22 +0x78 fp=0xc06932b448 sp=0xc06932b400 pc=0x1acdef8
vendor/github.com/cortexproject/cortex/pkg/ingester.(*Ingester).LabelValuesStream(0x40ceeb?, 0x224c880?, {0x28e1b20, 0xc0a0293d40})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/ingester/ingester.go:1346 +0x13d fp=0xc06932b4b8 sp=0xc06932b448 pc=0x1cbdabd
vendor/github.com/cortexproject/cortex/pkg/ingester/client._Ingester_LabelValuesStream_Handler({0x22c7480?, 0xc000832c00}, {0x28d99c8, 0xc089e0bae0})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/ingester/client/ingester.pb.go:3135 +0xd0 fp=0xc06932b4f8 sp=0xc06932b4b8 pc=0x1ade8f0
vendor/github.com/cortexproject/cortex/pkg/util/grpcutil.HTTPHeaderPropagationStreamServerInterceptor({0x22c7480, 0xc000832c00}, {0x28d98f0?, 0xc089e0bac0?}, 0xc089e0bac0?, 0x249dda0)
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/util/grpcutil/util.go:44 +0xab fp=0xc06932b558 sp=0xc06932b4f8 pc=0x1669e0b
vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainStreamServer.func1.1.1({0x22c7480?, 0xc000832c00?}, {0x28d98f0?, 0xc089e0bac0?})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:49 +0x3a fp=0xc06932b598 sp=0xc06932b558 pc=0xa7a75a
vendor/github.com/cortexproject/cortex/pkg/cortex.ThanosTracerStreamInterceptor({0x22c7480, 0xc000832c00}, {0x28d9b78?, 0xc089e0ba80?}, 0xc089e0ba80?, 0xc089e0b8e0)
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/cortex/tracing.go:23 +0x109 fp=0xc06932b608 sp=0xc06932b598 pc=0x1d06d89
vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainStreamServer.func1.1.1({0x22c7480?, 0xc000832c00?}, {0x28d9b78?, 0xc089e0ba80?})
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:49 +0x3a fp=0xc06932b648 sp=0xc06932b608 pc=0xa7a75a
vendor/github.com/weaveworks/common/middleware.StreamServerUserHeaderInterceptor({0x22c7480, 0xc000832c00}, {0x28d8468?, 0xc089e0ba40?}, 0x0?, 0xc089e0b900)
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/weaveworks/common/middleware/grpc_auth.go:48 +0xc2 fp=0xc06932b6a8 sp=0xc06932b648 pc=0xbb1fc2
vendor/github.com/cortexproject/cortex/pkg/util/fakeauth.SetupAuthMiddleware.func2({0x22c7480, 0xc000832c00}, {0x28d8468, 0xc089e0ba40}, 0xc017d6c210, 0xc089e0b900)
        Cortex-1.0.343.0/AL2_x86_64/DEV.STD.PTHREAD/build/gopath/src/vendor/github.com/cortexproject/cortex/pkg/util/fakeauth/fake_auth.go:35 +0xa6 fp=0xc06932b6e8 sp=0xc06932b6a8 pc=0x1ce5ae6

Which issue(s) this PR fixes:
Fixes #

Checklist

  • [NA] Tests updated
  • [NA] Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Alan Protasio <alanprot@gmail.com>
Signed-off-by: Alan Protasio <alanprot@gmail.com>
@alanprot alanprot marked this pull request as ready for review February 2, 2023 21:55
@alanprot alanprot changed the title from "fix query seg fault" to "Fix Segmentation fault on ingester due race condition when reading blocks that are being deleted" Feb 2, 2023

@yeya24 yeya24 left a comment

Thanks!

@alanprot alanprot merged commit 2e693f2 into cortexproject:master Feb 2, 2023
@alanprot alanprot deleted the fix/query-segfult branch February 2, 2023 23:50