Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialize reminders as protobuf in state store #7129

Merged
merged 22 commits into from Dec 12, 2023

Conversation

ItalyPaleAle
Copy link
Contributor

@ItalyPaleAle ItalyPaleAle commented Oct 31, 2023

Part of #6121

Changes how reminders are serialized in the state store, switching from the current JSON serialization to protobuf. Using protobuf is more efficient, not just because the serialization itself is faster, but because the payload is more "compressed" and so the I/O cost (the biggest issue today) is reduced.

Results from the actor_reminder perf test (using Postgres) (higher is better):

Workflow perf test (TestSeriesWorkflowWithMaxVUs - Req Duration P90) (lower is better):

This is the first change that leverages #6838 for real to preserve backwards-compatibility and to be able to function in a cluster with mixed Dapr versions

  • Actor API level has been bumped to 20 (was 10 in 1.12)
  • If the cluster API level is 20 or higher, reminders are serialized as protobuf. This means that if all actor hosts in the cluster are running on "vNext" (assuming 1.13), then reminders are serialized to protobuf.
  • If the actors API level in the cluster is < 20 (either because there's at least 1 host running 1.12 or lower, or because the admin explicitly set the maximum API version to < 20 in the Helm chart), then reminders are serialized as JSON
  • It's always possible to parse reminders that are serialized as JSON, regardless of the current API level

TODO:

  • Figure out how to write a test for ensuring that this can read/write JSON depending on the API level in the cluster

@ItalyPaleAle ItalyPaleAle requested review from a team as code owners October 31, 2023 17:56
@ItalyPaleAle
Copy link
Contributor Author

/ok-to-perf actor_reminder

@dapr-bot
Copy link
Collaborator

dapr-bot commented Oct 31, 2023

Dapr perf test

🔗 Link to Action run

Commit ref: 1bd249c

✅ Build succeeded

  • Image tag: daprprf824263b21e
  • Test image tag: daprprf824263b21e

✅ Infrastructure deployed

  • Resource group name: Dapr-Perf-daprprf824263b21e
  • Azure region: westus3

✅ Perf tests succeeded

  • Image tag: daprprf824263b21e
  • Test image tag: daprprf824263b21e

@ItalyPaleAle
Copy link
Contributor Author

/ok-to-perf workflows

@dapr-bot
Copy link
Collaborator

dapr-bot commented Oct 31, 2023

Dapr perf test

🔗 Link to Action run

Commit ref: 1bd249c

✅ Build succeeded

  • Image tag: daprprf19623e7bfc
  • Test image tag: daprprf19623e7bfc

✅ Infrastructure deployed

  • Resource group name: Dapr-Perf-daprprf19623e7bfc
  • Azure region: westus3

✅ Perf tests succeeded

  • Image tag: daprprf19623e7bfc
  • Test image tag: daprprf19623e7bfc

@ItalyPaleAle ItalyPaleAle marked this pull request as draft October 31, 2023 18:01
@yaron2
Copy link
Member

yaron2 commented Oct 31, 2023

Can attest personally changing JSON to proto several years back has improved Redis throughput x10 on a project I worked on

Copy link

codecov bot commented Nov 1, 2023

Codecov Report

Attention: 35 lines in your changes are missing coverage. Please review.

Comparison is base (ed34172) 64.59% compared to head (e0d92d8) 64.62%.
Report is 1 commits behind head on master.

Files Patch % Lines
pkg/actors/reminders/reminders.go 70.00% 14 Missing and 10 partials ⚠️
pkg/http/api_directmessaging.go 82.35% 3 Missing and 3 partials ⚠️
pkg/resiliency/policy.go 66.66% 4 Missing ⚠️
pkg/actors/internal/api_level.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7129      +/-   ##
==========================================
+ Coverage   64.59%   64.62%   +0.03%     
==========================================
  Files         225      226       +1     
  Lines       21110    21191      +81     
==========================================
+ Hits        13635    13694      +59     
- Misses       6307     6319      +12     
- Partials     1168     1178      +10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@artursouza
Copy link
Member

I really like this change. It is a quick life improvement to users before we can offer a v2 experience for reminders.

ItalyPaleAle added a commit to ItalyPaleAle/dapr that referenced this pull request Nov 15, 2023
Follow-up from dapr#7134
Part of dapr#6838

Actor runtime can use this to change its behavior. Will allow completing dapr#7129

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
yaron2 pushed a commit that referenced this pull request Nov 17, 2023
Follow-up from #7134
Part of #6838

Actor runtime can use this to change its behavior. Will allow completing #7129

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
@ItalyPaleAle ItalyPaleAle changed the title [WIP] Serialize reminders as protobuf in state store Serialize reminders as protobuf in state store Nov 21, 2023
@ItalyPaleAle ItalyPaleAle marked this pull request as ready for review November 21, 2023 20:14
@ItalyPaleAle
Copy link
Contributor Author

/ok-to-perf workflows

@ItalyPaleAle
Copy link
Contributor Author

/ok-to-perf actor_reminder

@ItalyPaleAle
Copy link
Contributor Author

/ok-to-test

@dapr-bot
Copy link
Collaborator

dapr-bot commented Nov 21, 2023

Dapr perf test

🔗 Link to Action run

Commit ref: a9a376e

✅ Build succeeded

  • Image tag: daprprfd778409565
  • Test image tag: daprprfd778409565

✅ Infrastructure deployed

  • Resource group name: Dapr-Perf-daprprfd778409565
  • Azure region: westus3

✅ Perf tests succeeded

  • Image tag: daprprfd778409565
  • Test image tag: daprprfd778409565

@dapr-bot
Copy link
Collaborator

dapr-bot commented Nov 21, 2023

Dapr perf test

🔗 Link to Action run

Commit ref: a9a376e

✅ Build succeeded

  • Image tag: daprprf1ab5f025a9
  • Test image tag: daprprf1ab5f025a9

✅ Infrastructure deployed

  • Resource group name: Dapr-Perf-daprprf1ab5f025a9
  • Azure region: westus3

✅ Perf tests succeeded

  • Image tag: daprprf1ab5f025a9
  • Test image tag: daprprf1ab5f025a9

@dapr-bot
Copy link
Collaborator

dapr-bot commented Nov 21, 2023

Dapr E2E test

🔗 Link to Action run

Commit ref: a9a376e

✅ Build succeeded for linux/amd64

  • Image tag: dapre2e7f762eda2dl
  • Test image tag: dapre2e7f762eda2dl

✅ Infrastructure deployed

Cluster Resource group name Azure region
Linux Dapr-E2E-dapre2e7f762eda2dl westus3
Windows Dapr-E2E-dapre2e7f762eda2dw westus3
Linux/arm64 Dapr-E2E-dapre2e7f762eda2dla eastus

✅ Build succeeded for windows/amd64

  • Image tag: dapre2e7f762eda2dw
  • Test image tag: dapre2e7f762eda2dw

✅ Tests succeeded on windows/amd64

  • Image tag: dapre2e7f762eda2dw
  • Test image tag: dapre2e7f762eda2dw

✅ Tests succeeded on linux/amd64

  • Image tag: dapre2e7f762eda2dl
  • Test image tag: dapre2e7f762eda2dl

@ItalyPaleAle ItalyPaleAle added the autoupdate DaprBot will keep the Pull Request up to date with master branch label Nov 21, 2023
Copy link
Contributor

@mukundansundar mukundansundar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change is something that will be extremely useful. Please add version skew tests for this feature.

@ItalyPaleAle
Copy link
Contributor Author

the change is something that will be extremely useful. Please add version skew tests for this feature.

I'll see what I can do after I'm back next week.

The current setup for version skew tests doesn't allow testing for something like this. We may need some creative thinking.

…API level

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
Co-authored-by: Josh van Leeuwen <me@joshvanl.dev>
Signed-off-by: Alessandro (Ale) Segala <43508+ItalyPaleAle@users.noreply.github.com>
@ItalyPaleAle
Copy link
Contributor Author

@ItalyPaleAle it shouldn't be the case that we need more resources for the runners when things were working before. This suggests to me that resources are not being cleaned up between test cases, or a test case is completely de-funked.

I don't know what to say... Maybe more tests now are using more CPU and we're exhausting resources?

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
Comment on lines 93 to 110
// CreateStateTables creates the state tables in a new SQLite DB.
func (s *SQLite) CreateStateTables(t *testing.T) {
require.NotNil(t, s.conn, "Cannot perform if connection isn't active")
_, err := s.conn.Exec(`
CREATE TABLE metadata (
key text NOT NULL PRIMARY KEY,
value text NOT NULL
);
INSERT INTO metadata VALUES('migrations','1');
CREATE TABLE state (
key TEXT NOT NULL PRIMARY KEY,
value TEXT NOT NULL,
is_binary BOOLEAN NOT NULL,
etag TEXT NOT NULL,
expiration_time TIMESTAMP DEFAULT NULL,
update_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);`)
require.NoError(t, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Daprd is not running at this point.

@JoshVanL
Copy link
Contributor

JoshVanL commented Dec 1, 2023

@ItalyPaleAle it shouldn't be the case that we need more resources for the runners when things were working before. This suggests to me that resources are not being cleaned up between test cases, or a test case is completely de-funked.

I don't know what to say... Maybe more tests now are using more CPU and we're exhausting resources?

The error is because of the actor rebalancing test. You can see this here https://github.com/JoshVanL/dapr/actions/runs/7060286640/job/19219678365?pr=28

@ItalyPaleAle
Copy link
Contributor Author

@ItalyPaleAle it shouldn't be the case that we need more resources for the runners when things were working before. This suggests to me that resources are not being cleaned up between test cases, or a test case is completely de-funked.

I don't know what to say... Maybe more tests now are using more CPU and we're exhausting resources?

The error is because of the actor rebalancing test. You can see this here https://github.com/JoshVanL/dapr/actions/runs/7060286640/job/19219678365?pr=28

I don't see the actor rebalancing test failing there?

I see that the failures are:

  • Test_Integration/daprd/serviceinvocation/http/basic/run with a socket timeout:
    image
  • Test_Integration/actors/reminders/serialization/json with an out-of-memory (maybe just need bigger runners?)
    image

@JoshVanL
Copy link
Contributor

JoshVanL commented Dec 1, 2023

@ItalyPaleAle it shouldn't be the case that we need more resources for the runners when things were working before. This suggests to me that resources are not being cleaned up between test cases, or a test case is completely de-funked.

I don't know what to say... Maybe more tests now are using more CPU and we're exhausting resources?

The error is because of the actor rebalancing test. You can see this here https://github.com/JoshVanL/dapr/actions/runs/7060286640/job/19219678365?pr=28

I don't see the actor rebalancing test failing there?

I see that the failures are:

* `Test_Integration/daprd/serviceinvocation/http/basic/run ` with a socket timeout:
  <img alt="image" width="1034" src="https://private-user-images.githubusercontent.com/43508/287335412-68e3b9df-650b-4361-8b66-b1473146bcb7.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTEiLCJleHAiOjE3MDE0NTExNTIsIm5iZiI6MTcwMTQ1MDg1MiwicGF0aCI6Ii80MzUwOC8yODczMzU0MTItNjhlM2I5ZGYtNjUwYi00MzYxLThiNjYtYjE0NzMxNDZiY2I3LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFJV05KWUFYNENTVkVINTNBJTJGMjAyMzEyMDElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjMxMjAxVDE3MTQxMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPThlYmZjY2M4YTZkY2FlYjU3YjY4YzhmYTNjYjU3YTY1NDM2ZDVjNzZkMmRhNmM5MWZkZjAzOTQ4N2E4Mzk0MTYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.-Z96CMPIsTTb5ZAV-5EMQ6C9zv3LTXIOMON3co9BUQ4">

* `Test_Integration/actors/reminders/serialization/json` with an out-of-memory (maybe just need bigger runners?)
  <img alt="image" width="944" src="https://private-user-images.githubusercontent.com/43508/287335575-5d806fd0-7c21-4d25-b0c4-173ec7325725.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTEiLCJleHAiOjE3MDE0NTExNTIsIm5iZiI6MTcwMTQ1MDg1MiwicGF0aCI6Ii80MzUwOC8yODczMzU1NzUtNWQ4MDZmZDAtN2MyMS00ZDI1LWIwYzQtMTczZWM3MzI1NzI1LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFJV05KWUFYNENTVkVINTNBJTJGMjAyMzEyMDElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjMxMjAxVDE3MTQxMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTc1ODM5ZGQ4ODFmODAyMGYwOTljZjUyNzI5ZmM3NTQxNzI3YmQ0OTI0YThhYTcwZjk2YjZkOWJiNjE3YjE2ZmMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.zbTHqr6G7wdaWilsWOKwww_k2Xhtfb5pnlc5n8ZGS2Q">

Indeed, the rebalancing test has been disabled on that run https://github.com/JoshVanL/dapr/pull/28/files#diff-3cd7e807d7e6af5f4ef237072fabae2d4e158f9212c2ea314b6b1cb1cbe0a47d

ItalyPaleAle and others added 5 commits December 4, 2023 17:55
…serialize-protobuf

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
@yaron2
Copy link
Member

yaron2 commented Dec 6, 2023

@ItalyPaleAle once you address existing comments I think we'll be good to merge, brief review LGTM.

dapr-bot and others added 3 commits December 5, 2023 20:52
Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
@ItalyPaleAle
Copy link
Contributor Author

The second bug in the reminders rebalancing IT should be fixed by d25832b

Comment on lines +409 to 423
var sent bool
for {
// When the stream ends (which happens when the context is canceled) this returns an error and we can return
o, rerr := stream.Recv()
if rerr != nil {
errCh <- fmt.Errorf("error from placement: %w", rerr)
if !sent {
errCh <- fmt.Errorf("error from placement: %w", rerr)
}
return
}
if o.GetOperation() == "update" {
if o.GetOperation() == "update" && !sent {
errCh <- nil
return
sent = true
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes introduce a data race in the code. Please also open a separate PR for fixes #7250 with a description as to why the error began appearing and whey this code fixes this problem.

==================
WARNING: DATA RACE
Write at 0x00c0005bec10 by goroutine 104:
  runtime.slicecopy()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/runtime/slice.go:310 +0x0
  bytes.(*Buffer).Read()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/bytes/buffer.go:328 +0x16c
  google.golang.org/grpc/internal/transport.(*recvBufferReader).readAdditional()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:219 +0xfc
  google.golang.org/grpc/internal/transport.(*recvBufferReader).readClient()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:210 +0xf0
  google.golang.org/grpc/internal/transport.(*recvBufferReader).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:171 +0x2bc
  google.golang.org/grpc/internal/transport.(*transportReader).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:506 +0x5c
  io.ReadAtLeast()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/io/io.go:335 +0xc8
  io.ReadFull()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/io/io.go:354 +0xe8
  google.golang.org/grpc/internal/transport.(*Stream).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:490 +0xc4
  google.golang.org/grpc.(*parser).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:600 +0x5c
  google.golang.org/grpc.recvAndDecompress()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:730 +0x4c
  google.golang.org/grpc.recv()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:796 +0x84
  google.golang.org/grpc.(*csAttempt).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:1084 +0x3a8
  google.golang.org/grpc.(*clientStream).RecvMsg.func1()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:927 +0x4c
  google.golang.org/grpc.(*clientStream).withRetry()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:760 +0x360
  google.golang.org/grpc.(*clientStream).RecvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:926 +0xe4
  github.com/dapr/dapr/pkg/proto/placement/v1.(*placementReportDaprStatusClient).Recv()
      /home/josh/go/src/github.com/dapr/dapr/pkg/proto/placement/v1/placement_grpc.pb.go:62 +0x64
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement.func1()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:412 +0x50

Previous read at 0x00c0005bec12 by goroutine 228:
  encoding/binary.bigEndian.Uint32()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/encoding/binary/binary.go:158 +0xd4
  google.golang.org/grpc.(*parser).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:605 +0x70
  google.golang.org/grpc.recvAndDecompress()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:730 +0x4c
  google.golang.org/grpc.recv()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:796 +0x84
  google.golang.org/grpc.(*csAttempt).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:1084 +0x3a8
  google.golang.org/grpc.(*clientStream).RecvMsg.func1()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:927 +0x4c
  google.golang.org/grpc.(*clientStream).withRetry()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:760 +0x360
  google.golang.org/grpc.(*clientStream).RecvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:926 +0xe4
  github.com/dapr/dapr/pkg/proto/placement/v1.(*placementReportDaprStatusClient).Recv()
      /home/josh/go/src/github.com/dapr/dapr/pkg/proto/placement/v1/placement_grpc.pb.go:62 +0x64
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement.func1()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:412 +0x50

Goroutine 104 (running) created at:
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:408 +0x2b8
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).getPlacementClient()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:373 +0x3bc
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).Run.func7()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:116 +0x58

Goroutine 228 (running) created at:
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:408 +0x2b8
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).Run.func4()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:186 +0xb8
==================
==================
WARNING: DATA RACE
Write at 0x00c0000f2d48 by goroutine 104:
  google.golang.org/grpc/internal/transport.(*recvBufferReader).readAdditional()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:224 +0x254
  google.golang.org/grpc/internal/transport.(*recvBufferReader).readClient()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:210 +0xf0
  google.golang.org/grpc/internal/transport.(*recvBufferReader).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:171 +0x2bc
  google.golang.org/grpc/internal/transport.(*transportReader).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:506 +0x5c
  io.ReadAtLeast()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/io/io.go:335 +0xc8
  io.ReadFull()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/io/io.go:354 +0xe8
  google.golang.org/grpc/internal/transport.(*Stream).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:490 +0xc4
  google.golang.org/grpc.(*parser).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:600 +0x5c
  google.golang.org/grpc.recvAndDecompress()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:730 +0x4c
  google.golang.org/grpc.recv()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:796 +0x84
  google.golang.org/grpc.(*csAttempt).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:1084 +0x3a8
  google.golang.org/grpc.(*clientStream).RecvMsg.func1()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:927 +0x4c
  google.golang.org/grpc.(*clientStream).withRetry()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:760 +0x360
  google.golang.org/grpc.(*clientStream).RecvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:926 +0xe4
  github.com/dapr/dapr/pkg/proto/placement/v1.(*placementReportDaprStatusClient).Recv()
      /home/josh/go/src/github.com/dapr/dapr/pkg/proto/placement/v1/placement_grpc.pb.go:62 +0x64
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement.func1()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:412 +0x50

Previous write at 0x00c0000f2d48 by goroutine 228:
  google.golang.org/grpc/internal/transport.(*recvBufferReader).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:166 +0x258
  google.golang.org/grpc/internal/transport.(*transportReader).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:506 +0x5c
  io.ReadAtLeast()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/io/io.go:335 +0xc8
  io.ReadFull()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/io/io.go:354 +0xe8
  google.golang.org/grpc/internal/transport.(*Stream).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:490 +0xc4
  google.golang.org/grpc.(*parser).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:617 +0x180
  google.golang.org/grpc.recvAndDecompress()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:730 +0x4c
  google.golang.org/grpc.recv()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:796 +0x84
  google.golang.org/grpc.(*csAttempt).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:1084 +0x3a8
  google.golang.org/grpc.(*clientStream).RecvMsg.func1()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:927 +0x4c
  google.golang.org/grpc.(*clientStream).withRetry()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:760 +0x360
  google.golang.org/grpc.(*clientStream).RecvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:926 +0xe4
  github.com/dapr/dapr/pkg/proto/placement/v1.(*placementReportDaprStatusClient).Recv()
      /home/josh/go/src/github.com/dapr/dapr/pkg/proto/placement/v1/placement_grpc.pb.go:62 +0x64
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement.func1()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:412 +0x50

Goroutine 104 (running) created at:
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:408 +0x2b8
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).getPlacementClient()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:373 +0x3bc
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).Run.func7()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:116 +0x58

Goroutine 228 (running) created at:
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:408 +0x2b8
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).Run.func4()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:186 +0xb8
==================
==================
WARNING: DATA RACE
Write at 0x00c0000f2d50 by goroutine 104:
  google.golang.org/grpc/internal/transport.(*recvBufferReader).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:171 +0x2d0
  google.golang.org/grpc/internal/transport.(*transportReader).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:506 +0x5c
  io.ReadAtLeast()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/io/io.go:335 +0xc8
  io.ReadFull()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/io/io.go:354 +0xe8
  google.golang.org/grpc/internal/transport.(*Stream).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:490 +0xc4
  google.golang.org/grpc.(*parser).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:600 +0x5c
  google.golang.org/grpc.recvAndDecompress()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:730 +0x4c
  google.golang.org/grpc.recv()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:796 +0x84
  google.golang.org/grpc.(*csAttempt).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:1084 +0x3a8
  google.golang.org/grpc.(*clientStream).RecvMsg.func1()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:927 +0x4c
  google.golang.org/grpc.(*clientStream).withRetry()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:760 +0x360
  google.golang.org/grpc.(*clientStream).RecvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:926 +0xe4
  github.com/dapr/dapr/pkg/proto/placement/v1.(*placementReportDaprStatusClient).Recv()
      /home/josh/go/src/github.com/dapr/dapr/pkg/proto/placement/v1/placement_grpc.pb.go:62 +0x64
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement.func1()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:412 +0x50

Previous read at 0x00c0000f2d50 by goroutine 228:
  google.golang.org/grpc/internal/transport.(*recvBufferReader).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:158 +0x44
  google.golang.org/grpc/internal/transport.(*transportReader).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:506 +0x5c
  io.ReadAtLeast()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/io/io.go:335 +0xc8
  io.ReadFull()
      /nix/store/7y75b2ac80chlh6knld3943y46n3v9kf-go-1.21.4/share/go/src/io/io.go:354 +0xe8
  google.golang.org/grpc/internal/transport.(*Stream).Read()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/transport.go:490 +0xc4
  google.golang.org/grpc.(*parser).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:617 +0x180
  google.golang.org/grpc.recvAndDecompress()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:730 +0x4c
  google.golang.org/grpc.recv()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/rpc_util.go:796 +0x84
  google.golang.org/grpc.(*csAttempt).recvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:1084 +0x3a8
  google.golang.org/grpc.(*clientStream).RecvMsg.func1()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:927 +0x4c
  google.golang.org/grpc.(*clientStream).withRetry()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:760 +0x360
  google.golang.org/grpc.(*clientStream).RecvMsg()
      /home/josh/go/pkg/mod/google.golang.org/grpc@v1.59.0/stream.go:926 +0xe4
  github.com/dapr/dapr/pkg/proto/placement/v1.(*placementReportDaprStatusClient).Recv()
      /home/josh/go/src/github.com/dapr/dapr/pkg/proto/placement/v1/placement_grpc.pb.go:62 +0x64
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement.func1()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:412 +0x50

Goroutine 104 (running) created at:
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:408 +0x2b8
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).getPlacementClient()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:373 +0x3bc
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).Run.func7()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:116 +0x58

Goroutine 228 (running) created at:
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).reportStatusToPlacement()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:408 +0x2b8
  github.com/dapr/dapr/tests/integration/suite/actors/reminders.(*rebalancing).Run.func4()
      /home/josh/go/src/github.com/dapr/dapr/tests/integration/suite/actors/reminders/rebalancing.go:186 +0xb8
==================

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this race condition exists in master too. I'll fix it in another PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not get a race condition for this test on master.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The race condition is due to concurrent recv and send in the gRPC channel. Not sure why it doesn't appear on master (maybe because the background goroutine was terminated on the first message?) but that's not new code. Fixing that requires creating 2 separate streams and it's a moderately-sized code change. Can be fixed later

@ItalyPaleAle
Copy link
Contributor Author

Looks like the race condition is another bug in that test case (only in test code) that predates this PR. Fix doesn't seem immediate, but I can look into fixing that in a separate PR.

The remaining comments are about nits on test code, including naming, and can be fixed separately if needed. Looking forward getting this PR merged to fix the perf and integration tests failing in the master branch for many weeks :)

@artursouza
Copy link
Member

artursouza commented Dec 7, 2023

Looks like the race condition is another bug in that test case (only in test code) that predates this PR. Fix doesn't seem immediate, but I can look into fixing that in a separate PR.

The remaining comments are about nits on test code, including naming, and can be fixed separately if needed. Looking forward getting this PR merged to fix the perf and integration tests failing in the master branch for many weeks :)

There might be 2 different issues here, so I don't want to dismiss the race condition in this PR or a bug in the test code.

On the other hand, we should raise our bar in pull requests because our users (and maintainers) are paying a price for that. I suggest we move slower but get the PRs to a mergeable state and avoid "fix later" as much as possible. It is better to move slower in the right direction than fast in the wrong one.

dapr-bot and others added 2 commits December 11, 2023 06:52
Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
@yaron2 yaron2 merged commit 72a439c into dapr:master Dec 12, 2023
20 of 22 checks passed
@yaron2
Copy link
Member

yaron2 commented Dec 12, 2023

Merging this as the race condition is scoped to test code and we're getting close to code freeze. We have a plan with @JoshVanL to follow up with a fix.

JoshVanL added a commit to JoshVanL/dapr that referenced this pull request Dec 14, 2023
* Added support for reading/writing reminders as protobuf depending on API level

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* Increase target QPS

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* Added integration tests

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* 💄

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* Updated test code per review feedback

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* 💄

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* Mod tidy

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* Update tests/integration/suite/actors/reminders/serialization_json.go

Signed-off-by: Alessandro (Ale) Segala <43508+ItalyPaleAle@users.noreply.github.com>

* Update tests/integration/framework/process/daprd/options.go

Co-authored-by: Josh van Leeuwen <me@joshvanl.dev>
Signed-off-by: Alessandro (Ale) Segala <43508+ItalyPaleAle@users.noreply.github.com>

* Changed per review feedback

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* Lint

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* Should fix the test errors

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* Lint....

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

* Disable ITs on Windows due to SQLite limitations

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>

---------

Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
Signed-off-by: Alessandro (Ale) Segala <43508+ItalyPaleAle@users.noreply.github.com>
Co-authored-by: Dapr Bot <56698301+dapr-bot@users.noreply.github.com>
Co-authored-by: Josh van Leeuwen <me@joshvanl.dev>
Co-authored-by: Yaron Schneider <schneider.yaron@live.com>
ItalyPaleAle added a commit to ItalyPaleAle/dapr that referenced this pull request Dec 30, 2023
Signed-off-by: ItalyPaleAle <43508+ItalyPaleAle@users.noreply.github.com>
@ItalyPaleAle ItalyPaleAle added this to the v1.13 milestone Jan 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
autoupdate DaprBot will keep the Pull Request up to date with master branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants