Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data race during manager shutdown #2760

Open
ahmetb opened this issue Apr 5, 2024 · 2 comments
Open

Data race during manager shutdown #2760

ahmetb opened this issue Apr 5, 2024 · 2 comments

Comments

@ahmetb
Copy link
Member

ahmetb commented Apr 5, 2024

I'd like to report a data race caught by go test -race during my end-to-end test suite (which uses envtest, but does not use Ginkgo) which ended up failing my test.

The data race has occurred during the shutdown sequence. It seems during this time, one of the reconcilers was doing an API call, which ended up getting cancelled due to ctx cancellation I do as part of the tear down (to shut down envtest and the Runnables).

I am not at all sure if it is related to controller-runtime, as the concurrent reader/writer goroutine stacks are pointing to net/http:

But here it goes.

2024-04-02T18:22:08Z	INFO	Tests finished	{"duration": "2.049504249s", "exitCode": 0}
2024-04-02T18:22:08Z	INFO	Shutting down controllers
2024-04-02T18:22:08Z	INFO	Shutting down envtest
2024-04-02T18:22:08Z	INFO	Stopping and waiting for non leader election runnables
2024-04-02T18:22:08Z	INFO	Stopping and waiting for leader election runnables
2024-04-02T18:22:08Z	INFO	Shutdown signal received, waiting for all workers to finish	{"controller": "kubernetescontrolplane", "controllerGroup": "example.com", "controllerKind": "KubernetesControlPlane"}
2024-04-02T18:22:08Z	INFO	Shutdown signal received, waiting for all workers to finish	{"controller": "etcdcluster", "controllerGroup": "example.com", "controllerKind": "EtcdCluster"}
2024-04-02T18:22:08Z	INFO	All workers finished	{"controller": "kubernetescontrolplane", "controllerGroup": "example.com", "controllerKind": "KubernetesControlPlane"}
==================
WARNING: DATA RACE
Read at 0x00c0001361c0 by goroutine 269:
  net/http.rewindBody()
      /export/home/tester/.go/1.22.0/src/net/http/transport.go:683 +0xcb
  net/http.(*Transport).roundTrip()
      /export/home/tester/.go/1.22.0/src/net/http/transport.go:557 +0x6db
  net/http.(*Transport).RoundTrip()
      /export/home/tester/.go/1.22.0/src/net/http/roundtrip.go:17 +0x33
  k8s.io/client-go/transport.(*userAgentRoundTripper).RoundTrip()
      /export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/transport/round_trippers.go:168 +0x4f6
  net/http.send()
      /export/home/tester/.go/1.22.0/src/net/http/client.go:259 +0x9a6
  net/http.(*Client).send()
      /export/home/tester/.go/1.22.0/src/net/http/client.go:180 +0x14e
  net/http.(*Client).do()
      /export/home/tester/.go/1.22.0/src/net/http/client.go:724 +0x11c8
  net/http.(*Client).Do()
      /export/home/tester/.go/1.22.0/src/net/http/client.go:590 +0x6a6
  k8s.io/client-go/rest.(*Request).request()
      /export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/rest/request.go:1023 +0x3ac
  k8s.io/client-go/rest.(*Request).Do()
      /export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/rest/request.go:1063 +0x119
  sigs.k8s.io/controller-runtime/pkg/client.(*typedClient).PatchSubResource()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/client/typed_client.go:277 +0x924
  sigs.k8s.io/controller-runtime/pkg/client.(*subResourceClient).Patch()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/client/client.go:595 +0x484
  golnkd.in/k8s-compute-broker-controller/control-plane-manager/internal/controller.(*EtcdClusterReconciler).Reconcile()
      /export/content/data/multiproduct-post-commit/i001/workspace/k8s-compute-broker-controller_a60f2169246cc0c4f7d68a8124dca5ad12006477/control-plane-manager/internal/controller/etcdcluster_controller.go:81 +0x78d
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:119 +0x1a1
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:316 +0x59a
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:266 +0x399
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:227 +0xb2

Previous write at 0x00c0001361c0 by goroutine 304:
  net/http.(*readTrackingBody).Read()
      /export/home/tester/.go/1.22.0/src/net/http/transport.go:656 +0x44
  golang.org/x/net/http2.(*clientStream).writeRequestBody()
      /export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:1732 +0x5d9
  golang.org/x/net/http2.(*clientStream).writeRequest()
      /export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:1440 +0xf32
  golang.org/x/net/http2.(*clientStream).doRequest()
      /export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:1326 +0x33
  golang.org/x/net/http2.(*ClientConn).RoundTrip.gowrap1()
      /export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:1232 +0x44

Goroutine 269 (running) created at:
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:223 +0x80b
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:234 +0x2b9
  sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/manager/runnable_group.go:223 +0x1c6
  sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.gowrap1()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/manager/runnable_group.go:226 +0x41

Goroutine 304 (running) created at:
  golang.org/x/net/http2.(*ClientConn).RoundTrip()
      /export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:1232 +0x7d7
  golang.org/x/net/http2.(*Transport).RoundTripOpt()
      /export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:564 +0x644
  golang.org/x/net/http2.(*Transport).RoundTrip()
      /export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:513 +0x37
  golang.org/x/net/http2.noDialH2RoundTripper.RoundTrip()
      /export/home/tester/go-athens/pkg/mod/golang.org/x/net@v0.19.0/http2/transport.go:3136 +0x26
  net/http.(*Transport).roundTrip()
      /export/home/tester/.go/1.22.0/src/net/http/transport.go:553 +0x641
  net/http.(*Transport).RoundTrip()
      /export/home/tester/.go/1.22.0/src/net/http/roundtrip.go:17 +0x33
  k8s.io/client-go/transport.(*userAgentRoundTripper).RoundTrip()
      /export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/transport/round_trippers.go:168 +0x4f6
  net/http.send()
      /export/home/tester/.go/1.22.0/src/net/http/client.go:259 +0x9a6
  net/http.(*Client).send()
      /export/home/tester/.go/1.22.0/src/net/http/client.go:180 +0x14e
  net/http.(*Client).do()
      /export/home/tester/.go/1.22.0/src/net/http/client.go:724 +0x11c8
  net/http.(*Client).Do()
      /export/home/tester/.go/1.22.0/src/net/http/client.go:590 +0x6a6
  k8s.io/client-go/rest.(*Request).request()
      /export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/rest/request.go:1023 +0x3ac
  k8s.io/client-go/rest.(*Request).Do()
      /export/home/tester/go-athens/pkg/mod/k8s.io/client-go@v0.29.1/rest/request.go:1063 +0x119
  sigs.k8s.io/controller-runtime/pkg/client.(*typedClient).PatchSubResource()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/client/typed_client.go:277 +0x924
  sigs.k8s.io/controller-runtime/pkg/client.(*subResourceClient).Patch()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/client/client.go:595 +0x484
  golnkd.in/k8s-compute-broker-controller/control-plane-manager/internal/controller.(*EtcdClusterReconciler).Reconcile()
      /export/content/data/multiproduct-post-commit/i001/workspace/k8s-compute-broker-controller_a60f2169246cc0c4f7d68a8124dca5ad12006477/control-plane-manager/internal/controller/etcdcluster_controller.go:81 +0x78d
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:119 +0x1a1
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:316 +0x59a
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:266 +0x399
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
      /export/home/tester/go-athens/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.2/pkg/internal/controller/controller.go:227 +0xb2
==================
2024-04-02T18:22:08Z	DEBUG	Status updated	{"controller": "etcdcluster", "controllerGroup": "example.com", "controllerKind": "EtcdCluster", "EtcdCluster": {"name":"test"}, "namespace": "", "name": "test", "reconcileID": "c175233f-a6c2-46c4-82af-49e68f48bdbb"}
2024-04-02T18:22:08Z	ERROR	Reconciler error	{"controller": "etcdcluster", "controllerGroup": "example.com", "controllerKind": "EtcdCluster", "EtcdCluster": {"name":"test"}, "namespace": "", "name": "test", "reconcileID": "c175233f-a6c2-46c4-82af-49e68f48bdbb", "error": "failed to patch EtcdCluster status: Patch \"https://127.0.0.1:40187/apis/example.com/v1alpha1/etcdclusters/test/status\": context canceled", "errorCauses": [{"error": "failed to patch EtcdCluster status: Patch \"https://127.0.0.1:40187/apis/example.com/v1alpha1/etcdclusters/test/status\": context canceled"}]}
2024-04-02T18:22:08Z	INFO	All workers finished	{"controller": "etcdcluster", "controllerGroup": "example.com", "controllerKind": "EtcdCluster"}
2024-04-02T18:22:08Z	INFO	Stopping and waiting for caches
2024-04-02T18:22:08Z	INFO	Stopping and waiting for webhooks
2024-04-02T18:22:08Z	INFO	Stopping and waiting for HTTP servers
2024-04-02T18:22:08Z	INFO	controller-runtime.metrics	Shutting down metrics server with timeout of 1 minute
2024-04-02T18:22:08Z	INFO	Wait completed, proceeding to shutdown the manager
2024-04-02T18:22:08Z	INFO	Controllers stopped
2024-04-02T18:22:09Z	INFO	Teardown finished	{"duration": "1.107332537s"}
Found 1 data race(s)

@vincepri
Copy link
Member

vincepri commented Apr 8, 2024

Maybe related to #2757

Are you able to update to latest v0.17 commit?

@vincepri
Copy link
Member

vincepri commented Apr 8, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants