storage/backend: fatal if there is panic during defrag #17757

Open · wants to merge 1 commit into base: main
Conversation

fuweid (Contributor) commented Apr 9, 2024

We should exit as soon as possible if there is a panic during defrag, because the tx might already be closed. An in-flight request might then use the invalid tx and panic as well, and the real panic can be shadowed by the new one, depending on goroutine scheduling. So we should fatal here.

How to reproduce bbolt#715:

```diff
diff --git a/server/etcdserver/api/v3rpc/maintenance.go b/server/etcdserver/api/v3rpc/maintenance.go
index 4f55c7c74..0a91b4225 100644
--- a/server/etcdserver/api/v3rpc/maintenance.go
+++ b/server/etcdserver/api/v3rpc/maintenance.go
@@ -89,7 +89,13 @@ func NewMaintenanceServer(s *etcdserver.EtcdServer, healthNotifier notifier) pb.
 func (ms *maintenanceServer) Defragment(ctx context.Context, sr *pb.DefragmentRequest) (*pb.DefragmentResponse, error) {
        ms.lg.Info("starting defragment")
        ms.healthNotifier.defragStarted()
-       defer ms.healthNotifier.defragFinished()
+       defer func() {
+               ms.healthNotifier.defragFinished()
+               if rerr := recover(); rerr != nil {
+                       time.Sleep(30 * time.Second)
+                       panic(rerr)
+               }
+       }()
        err := ms.bg.Backend().Defrag()
        if err != nil {
                ms.lg.Warn("failed to defragment", zap.Error(err))
```
```bash
$ make gofail-enable
$ make
$ GOFAIL_HTTP="127.0.0.1:1234" bin/etcd

# New terminal
$ curl http://127.0.0.1:1234/defragBeforeRename -XPUT -d'panic()'
$ bin/etcdctl defrag

# Old terminal
{"level":"info","ts":"2024-04-09T23:28:54.084434+0800","caller":"v3rpc/maintenance.go:90","msg":"starting defragment"}
{"level":"info","ts":"2024-04-09T23:28:54.087363+0800","caller":"backend/backend.go:509","msg":"defragmenting","path":"default.etcd/member/snap/db","current-db-size-bytes":20480,"current-db-size":"20 kB","current-db-size-in-use-bytes":16384,"current-db-size-in-use":"16 kB"}
{"level":"info","ts":"2024-04-09T23:28:54.091229+0800","caller":"v3rpc/health.go:62","msg":"grpc service status changed","service":"","status":"SERVING"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xbb2553]

goroutine 156 [running]:
go.etcd.io/bbolt.(*Tx).Bucket(...)
        go.etcd.io/bbolt@v1.4.0-alpha.0/tx.go:112
go.etcd.io/etcd/server/v3/storage/backend.(*baseReadTx).UnsafeRange(0xc00061ac80, {0x12b7780, 0x1a73420}, {0x1a113e8, 0xe, 0xe}, {0x0, 0x0, 0x0}, 0x1)
        go.etcd.io/etcd/server/v3/storage/backend/read_tx.go:103 +0x233
go.etcd.io/etcd/server/v3/storage/schema.UnsafeReadStorageVersion({0x7f6280613618?, 0xc00061ac80?})
        go.etcd.io/etcd/server/v3/storage/schema/version.go:35 +0x5d
go.etcd.io/etcd/server/v3/storage/schema.UnsafeDetectSchemaVersion(0xc000334b80, {0x7f6280613618, 0xc00061ac80})
        go.etcd.io/etcd/server/v3/storage/schema/schema.go:94 +0x47
go.etcd.io/etcd/server/v3/storage/schema.DetectSchemaVersion(0xc000334b80, {0x12b77f0, 0xc00061ac80})
        go.etcd.io/etcd/server/v3/storage/schema/schema.go:89 +0xf2
go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).StorageVersion(0xc000393c08)
        go.etcd.io/etcd/server/v3/etcdserver/server.go:2216 +0x105
go.etcd.io/etcd/server/v3/etcdserver.(*serverVersionAdapter).GetStorageVersion(0x0?)
        go.etcd.io/etcd/server/v3/etcdserver/adapters.go:77 +0x16
go.etcd.io/etcd/server/v3/etcdserver/version.(*Monitor).UpdateStorageVersionIfNeeded(0xc00002df70)
        go.etcd.io/etcd/server/v3/etcdserver/version/monitor.go:112 +0x5d
go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).monitorStorageVersion(0xc000393c08)
        go.etcd.io/etcd/server/v3/etcdserver/server.go:2259 +0xa8
go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).GoAttach.func1()
        go.etcd.io/etcd/server/v3/etcdserver/server.go:2440 +0x59
created by go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).GoAttach in goroutine 1
        go.etcd.io/etcd/server/v3/etcdserver/server.go:2438 +0xf9
```

REF: etcd-io/bbolt#715
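
The fix itself is small. Below is a minimal, self-contained sketch of the fail-fast pattern, using a hypothetical `fatalf` as a stand-in for `b.lg.Fatal` (which also flushes pending log entries), and a child process so the `os.Exit(1)` can be observed from outside:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// fatalf is a hypothetical stand-in for b.lg.Fatal: it reports the panic
// value and exits immediately, so the process never unwinds far enough to
// release the defrag locks to other goroutines.
func fatalf(msg string, v interface{}) {
	fmt.Printf("fatal: %s: %v\n", msg, v)
	os.Exit(1)
}

// defrag sketches the pattern this PR adds: recover inside the deferred
// function and fail fast instead of letting the panic unwind the stack.
func defrag() {
	defer func() {
		if rerr := recover(); rerr != nil {
			fatalf("unexpected panic during defrag", rerr)
		}
	}()
	panic("defragBeforeRename") // injected failure, as in the repro above
}

func main() {
	if len(os.Args) > 1 && os.Args[1] == "child" {
		defrag()
		return
	}
	// Run defrag in a child process so we can observe the fail-fast exit.
	out, err := exec.Command(os.Args[0], "child").CombinedOutput()
	fmt.Printf("output=%q exited=%v\n", string(out), err != nil)
}
```

Running it prints the real panic value exactly once, followed by the nonzero exit, instead of a second shadowing panic.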

Signed-off-by: Wei Fu <fuweid89@gmail.com>
fuweid (Contributor, Author) commented Apr 12, 2024

Ping @ahrtr @serathius @jmhbnz @siyuanfoundation, requesting a review on this. I hope it can help with failpoint testing.

Comment on lines +480 to +488

```go
defer func() {
	// NOTE: We should exit as soon as possible because that tx
	// might be closed. The inflight request might use invalid
	// tx and then panic as well. The real panic reason might be
	// shadowed by new panic. So, we should fatal here with lock.
	if rerr := recover(); rerr != nil {
		b.lg.Fatal("unexpected panic during defrag", zap.Any("panic", rerr))
	}
}()
```
Member:
During defragmentation, we acquire all locks, so no TX (neither batchTx nor readTx) will ever be closed by other operations, because all other operations are blocked. The only exception is that there may be some in-flight concurrent read operations, but those never close any TX either.

Also, etcdserver waits for all read TXs to finish when closing the bbolt db, so the defrag will not impact any read TXs either.

I agree that printing the real panic is a little better, so that it won't be shadowed by any other new panic. But I would suggest we pursue an overall solution, as mentioned in etcd-io/bbolt#715 (comment).

fuweid (Contributor, Author):

> But I would suggest we pursue an overall solution, as mentioned.

I do have a plan. I'm still working on a POC and will file a pull request for RFC when it's ready.

> I agree that printing the real panic is a little better, so that it won't be shadowed by any other new panic.

Side note: the original issue came from the robustness test suite. The failpoint was injected while serving requests, not in the shutdown phase.

Member:

> Side note: the original issue came from the robustness test suite. The failpoint was injected while serving requests, not in the shutdown phase.

In a real production environment, the only possible causes of a panic during defragmentation are system errors, e.g. power-off, OOM, or I/O issues.

fuweid (Contributor, Author) commented Apr 14, 2024:

Yes. So, in this patch, fatal within the lock is a kind of fail-fast. :) Even if we had a better drain solution, it would still be possible to hit a nil pointer while handling other in-flight requests.

Member:

I suggest not using b.lg.Fatal:

  • It calls os.Exit(1) to terminate the process immediately. Deferred functions won't be executed, so no cleanup tasks will run.
  • As mentioned above, during defragmentation we acquire all locks, so defragmentation won't conflict with any other operations. The TXN won't be closed by any other operation/goroutine during defragmentation.
  • Also, I don't see any relationship between this PR and the issue "UnsafeRange panicking during shutdown" #17223.
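
The first point is easy to verify on its own: `os.Exit` terminates the process without running deferred functions. A small stand-alone demo (re-executing itself as a child process so the exit can be observed):

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

func child() {
	defer fmt.Println("cleanup") // never runs: os.Exit skips deferred functions
	fmt.Println("exiting")
	os.Exit(1)
}

func main() {
	if len(os.Args) > 1 && os.Args[1] == "child" {
		child()
		return
	}
	out, _ := exec.Command(os.Args[0], "child").CombinedOutput()
	fmt.Printf("child printed %q\n", out) // note "cleanup" is absent
}
```

This is the trade-off under discussion: fail-fast loses cleanup, but guarantees nothing else runs against the invalid tx.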

fuweid (Contributor, Author):

Fatal calls os.Exit(1) after flushing pending log entries. I think that's better than a bare os.Exit(1) here.

> Also, I don't see any relationship between this PR and the issue #17223.

Yes, it's about bbolt#715. I was thinking it related to the termination phase, so I linked it. Sorry for the confusion.

> As mentioned above, during defragmentation we acquire all locks, so defragmentation won't conflict with any other operations. The TXN won't be closed by any other operation/goroutine during defragmentation.

Yes. However, if there is a panic, all the deferred functions will be called and the lock will be released. If any goroutine is pending on acquiring the lock, it can then get the lock and run a query. Whether that happens depends on goroutine scheduling: before the process exits because of the panic, other goroutines still have a chance to execute. That depends on the kernel and the Go runtime.

Based on the bbolt#715 comment, the failpoint causes the original db reference to be closed. The unsafe query then panics as well and shadows the original panic.
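
The scheduling hazard described above can be demonstrated in a few lines: a deferred unlock runs while the panic is unwinding, so a goroutine blocked on the same mutex gets to run before the process dies. (The `defrag`-flavored naming here is illustrative, not etcd's actual code.)

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var mu sync.Mutex
	acquired := make(chan string, 1)

	mu.Lock() // the defrag-style path holds the lock

	go func() {
		mu.Lock() // an in-flight request blocked on the same lock
		acquired <- "waiter got the lock during panic unwinding"
		mu.Unlock()
	}()

	func() {
		defer func() {
			// By the time this recover runs, the deferred Unlock below has
			// already executed (defers run LIFO), so the waiter has had a
			// chance to acquire the lock and touch state the panicking
			// path considered invalid.
			if r := recover(); r != nil {
				fmt.Println(<-acquired)
			}
		}()
		defer mu.Unlock() // runs first during unwinding, releasing the lock
		panic("defrag failed")
	}()
}
```

Calling Fatal from inside the recover, while still holding the lock, closes exactly this window.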

serathius self-requested a review on April 19, 2024.