[feature] ab-cast etcd #4

Merged
merged 1 commit into from Feb 27, 2021

jabolina
Owner

Creates a Core interface whose default implementation is the ReltCore. This core structure manages all goroutines and state: it issues requests to peers, receives messages, and shuts down resources. Requests are issued through the Coordinator interface, which should be implemented by the structure that sends and receives commands using the atomic broadcast.
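A rough sketch of how the two layers could fit together, not the actual code in this PR. The method names (Write, Watch, Close, Send, Receive, Shutdown), the simpleCore struct, and the loopback coordinator are all hypothetical and exist only for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// Coordinator is a hypothetical shape for the interface described above:
// whatever implements it issues and receives commands through an atomic
// broadcast protocol.
type Coordinator interface {
	// Write broadcasts data to the given destination.
	Write(ctx context.Context, destination string, data []byte) error
	// Watch returns the channel on which delivered messages arrive.
	Watch() <-chan []byte
	// Close releases the resources held by the coordinator.
	Close() error
}

// loopback is a toy Coordinator that delivers every write back locally.
// It stands in for a real implementation such as the EtcdCoordinator.
type loopback struct{ out chan []byte }

func newLoopback() *loopback { return &loopback{out: make(chan []byte, 8)} }

func (l *loopback) Write(ctx context.Context, _ string, data []byte) error {
	select {
	case l.out <- data:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func (l *loopback) Watch() <-chan []byte { return l.out }

func (l *loopback) Close() error { close(l.out); return nil }

// simpleCore owns a Coordinator and hides it from the caller
// (assumed role of the core structure, heavily simplified).
type simpleCore struct{ coord Coordinator }

func (c *simpleCore) Send(ctx context.Context, dest string, data []byte) error {
	return c.coord.Write(ctx, dest, data)
}

func (c *simpleCore) Receive() <-chan []byte { return c.coord.Watch() }

func (c *simpleCore) Shutdown() error { return c.coord.Close() }

// roundTrip sends msg through the core and reads it back.
func roundTrip(msg string) (string, error) {
	core := &simpleCore{coord: newLoopback()}
	defer core.Shutdown()
	if err := core.Send(context.Background(), "partition-a", []byte(msg)); err != nil {
		return "", err
	}
	return string(<-core.Receive()), nil
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```

The point of the split is that the core never needs to know which broadcast protocol is behind the Coordinator, so a different backend only has to satisfy the interface.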

For reference, under this design the current version would have a RabbitMQCoordinator responsible for sending and receiving messages. The coordinator currently available is the EtcdCoordinator, backed by an etcd server, as mentioned in #3.

Issues

Something seems to be wrong with the etcd version, as can be seen in the output:

$ go get "github.com/coreos/etcd/clientv3"
go: downloading github.com/coreos/etcd v0.5.0-alpha.5
go: downloading github.com/coreos/etcd v3.3.25+incompatible
go: found github.com/coreos/etcd/clientv3 in github.com/coreos/etcd v3.3.25+incompatible
go: finding module for package google.golang.org/grpc/resolver
go: finding module for package google.golang.org/grpc/credentials
go: finding module for package google.golang.org/grpc/codes
go: finding module for package github.com/google/uuid
go: finding module for package github.com/golang/protobuf/proto
go: finding module for package github.com/coreos/pkg/capnslog
go: finding module for package google.golang.org/grpc/grpclog
go: finding module for package go.uber.org/zap/zapcore
go: finding module for package google.golang.org/grpc/status
go: finding module for package google.golang.org/grpc/metadata
go: downloading google.golang.org/grpc v1.35.0
go: downloading github.com/golang/protobuf v1.4.3
go: finding module for package google.golang.org/grpc
go: finding module for package google.golang.org/grpc/balancer
go: finding module for package github.com/gogo/protobuf/gogoproto
go: finding module for package github.com/coreos/go-semver/semver
go: finding module for package google.golang.org/grpc/keepalive
go: finding module for package github.com/coreos/go-systemd/journal
go: finding module for package google.golang.org/genproto/googleapis/api/annotations
go: downloading github.com/google/uuid v1.2.0
go: downloading go.uber.org/zap v1.16.0
go: finding module for package google.golang.org/grpc/resolver/dns
go: finding module for package go.uber.org/zap
go: finding module for package google.golang.org/grpc/resolver/passthrough
go: downloading github.com/gogo/protobuf v1.3.2
go: downloading github.com/coreos/go-systemd v0.0.0-20191104093116-d3cd4ed1dbcf
go: downloading google.golang.org/genproto v0.0.0-20210212180131-e7f2df4ecc2d
go: finding module for package google.golang.org/grpc/connectivity
go: found github.com/coreos/pkg/capnslog in github.com/coreos/pkg v0.0.0-20180928190104-399ea9e2e55f
go: found github.com/google/uuid in github.com/google/uuid v1.2.0
go: found go.uber.org/zap in go.uber.org/zap v1.16.0
go: found google.golang.org/grpc in google.golang.org/grpc v1.35.0
go: found google.golang.org/grpc/codes in google.golang.org/grpc v1.35.0
go: found google.golang.org/grpc/credentials in google.golang.org/grpc v1.35.0
go: found google.golang.org/grpc/grpclog in google.golang.org/grpc v1.35.0
go: found google.golang.org/grpc/keepalive in google.golang.org/grpc v1.35.0
go: found google.golang.org/grpc/metadata in google.golang.org/grpc v1.35.0
go: found google.golang.org/grpc/status in google.golang.org/grpc v1.35.0
go: found github.com/gogo/protobuf/gogoproto in github.com/gogo/protobuf v1.3.2
go: found github.com/golang/protobuf/proto in github.com/golang/protobuf v1.4.3
go: found google.golang.org/grpc/balancer in google.golang.org/grpc v1.35.0
go: found google.golang.org/grpc/connectivity in google.golang.org/grpc v1.35.0
go: found google.golang.org/grpc/resolver in google.golang.org/grpc v1.35.0
go: found google.golang.org/grpc/resolver/dns in google.golang.org/grpc v1.35.0
go: found google.golang.org/grpc/resolver/passthrough in google.golang.org/grpc v1.35.0
go: found go.uber.org/zap/zapcore in go.uber.org/zap v1.16.0
go: found google.golang.org/genproto/googleapis/api/annotations in google.golang.org/genproto v0.0.0-20210212180131-e7f2df4ecc2d
go: found github.com/coreos/go-systemd/journal in github.com/coreos/go-systemd v0.0.0-20191104093116-d3cd4ed1dbcf
go: found github.com/coreos/go-semver/semver in github.com/coreos/go-semver v0.3.0
go: downloading google.golang.org/protobuf v1.25.0
go: downloading go.uber.org/multierr v1.5.0
go: downloading golang.org/x/net v0.0.0-20201021035429-f5854403a974
go: downloading golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f
go: downloading go.uber.org/atomic v1.6.0
go: downloading golang.org/x/text v0.3.3
# github.com/coreos/etcd/clientv3/balancer/resolver/endpoint
../../go/pkg/mod/github.com/coreos/etcd@v3.3.25+incompatible/clientv3/balancer/resolver/endpoint/endpoint.go:114:78: undefined: resolver.BuildOption
../../go/pkg/mod/github.com/coreos/etcd@v3.3.25+incompatible/clientv3/balancer/resolver/endpoint/endpoint.go:182:31: undefined: resolver.ResolveNowOption
# github.com/coreos/etcd/clientv3/balancer/picker
../../go/pkg/mod/github.com/coreos/etcd@v3.3.25+incompatible/clientv3/balancer/picker/err.go:37:44: undefined: balancer.PickOptions
../../go/pkg/mod/github.com/coreos/etcd@v3.3.25+incompatible/clientv3/balancer/picker/roundrobin_balanced.go:55:54: undefined: balancer.PickOptions

Digging through some issues (etcd-io/etcd#11931, etcd-io/etcd#12577), I found that the solution is to lock the grpc dependency at v1.26.0, but this can be harmful, since grpc has a problem handling DNS in this version.
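One common way to lock a dependency version is a replace directive in go.mod. The fragment below is only a sketch of that approach; it assumes the pin is done through go.mod and is not necessarily how this repository does it.

```
// go.mod fragment: force every import of grpc to resolve to v1.26.0
replace google.golang.org/grpc => google.golang.org/grpc v1.26.0
```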

@jabolina jabolina added the enhancement New feature or request label Feb 14, 2021
@jabolina jabolina self-assigned this Feb 14, 2021
@jabolina
Owner Author

Had to relax the broadcast timeout because of etcd's disk usage. Since the Raft protocol persists some information, the disk could take more than 200 ms to execute a write.

The current implementation is backed by etcd. Using this
approach we should have a primitive that is consistent and totally
orders all messages.

Along with these changes, added a generic interface so that different
atomic broadcast protocols can be used.
@jabolina
Owner Author

This uses a well-established and well-tested atomic broadcast implementation. With this approach we now have a communication primitive that is reliable and totally orders messages across all replicas. For everything to work properly, some changes were made. The first is that an etcd server must be running and able to accept connections; documentation for this can be found here.

Some extra configuration was added, specifically the DefaultTimeout time.Duration property, so the client can now define a timeout for write requests. Since the etcd server has a WAL, disk speed can affect the writes sent to the server. This property can be used to tune the primitive.
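A minimal sketch of how a timeout like this is typically applied to a write, using only the standard library. The Config struct here carries just the DefaultTimeout property mentioned above; writeWithTimeout, slowWrite, and timedOut are hypothetical helpers, and slowWrite only simulates a disk-bound write instead of talking to etcd.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Config mirrors only the DefaultTimeout property mentioned above.
type Config struct {
	DefaultTimeout time.Duration
}

// writeWithTimeout runs write under a deadline of cfg.DefaultTimeout.
// The write argument is a stand-in for a request sent to the etcd server.
func writeWithTimeout(cfg Config, write func(ctx context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), cfg.DefaultTimeout)
	defer cancel()
	return write(ctx)
}

// slowWrite simulates a write that needs d to be persisted to disk.
func slowWrite(d time.Duration) func(ctx context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// timedOut reports whether a simulated write taking writeMs milliseconds
// fails under a timeout of timeoutMs milliseconds.
func timedOut(timeoutMs, writeMs int) bool {
	cfg := Config{DefaultTimeout: time.Duration(timeoutMs) * time.Millisecond}
	write := slowWrite(time.Duration(writeMs) * time.Millisecond)
	return writeWithTimeout(cfg, write) != nil
}

func main() {
	fmt.Println("20ms timeout, 200ms write, timed out:", timedOut(20, 200))
	fmt.Println("500ms timeout, 20ms write, timed out:", timedOut(500, 20))
}
```

A tight timeout fails fast on a slow disk, while a relaxed one lets the write complete, which is the trade-off the property is meant to expose.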

Internally, every user request reaches a coordinator. The Coordinator interface is implemented by an atomic broadcast client; in this case, we have an EtcdCoordinator. To create the communication abstraction, we use the KV API for writing and the Watch API for receiving changes.

When creating a new relt object, the client must define a partition to bind to. Every message is sent on behalf of the defined partition, and the messages received are those whose destination is the configured partition. This abstraction is hidden by the KV: sending a message to a partition means writing to a distributed key-value structure, kept consistent by the Raft protocol, where the destination of the message is the key under which the value is written.

Using the Watch API for a partition, we are notified about commands applied to the distributed KV. The EtcdCoordinator listens only for changes applied to the configured partition. For every PUT applied to the KV, we publish the change back to the client.
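To make the partition-as-key idea concrete, here is a small in-memory stand-in for the KV and Watch APIs. It does not use etcd at all: memKV, Put, Watch, and deliveries are hypothetical names, and a single process with a mutex replaces the Raft-backed store. It only shows that a watcher on one partition sees the writes to that key, in the order they were applied.

```go
package main

import (
	"fmt"
	"sync"
)

// memKV is an in-memory stand-in for a key-value store with watchers.
type memKV struct {
	mu       sync.Mutex
	data     map[string][]byte
	watchers map[string][]chan []byte
}

func newMemKV() *memKV {
	return &memKV{
		data:     map[string][]byte{},
		watchers: map[string][]chan []byte{},
	}
}

// Put writes value under key (the destination partition) and notifies
// the watchers of that key only. Holding the lock while notifying keeps
// the notification order equal to the write order; a full watcher
// buffer would block here, which is acceptable for a sketch.
func (kv *memKV) Put(key string, value []byte) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	kv.data[key] = value
	for _, w := range kv.watchers[key] {
		w <- value
	}
}

// Watch subscribes to the changes applied to a single key.
func (kv *memKV) Watch(key string) <-chan []byte {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	ch := make(chan []byte, 16)
	kv.watchers[key] = append(kv.watchers[key], ch)
	return ch
}

// deliveries writes to two partitions and returns what each watcher saw.
func deliveries() (a, b []string) {
	kv := newMemKV()
	chA, chB := kv.Watch("partition-a"), kv.Watch("partition-b")
	kv.Put("partition-a", []byte("m1"))
	kv.Put("partition-b", []byte("m2"))
	kv.Put("partition-a", []byte("m3"))
	for len(chA) > 0 {
		a = append(a, string(<-chA))
	}
	for len(chB) > 0 {
		b = append(b, string(<-chB))
	}
	return a, b
}

func main() {
	a, b := deliveries()
	fmt.Println("partition-a:", a)
	fmt.Println("partition-b:", b)
}
```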

Some TODOs were left to be handled in the future. For example, what should we do when a new message is received but we are not able to send it through the channel to be consumed? There is also room for enhancement: at the moment we synchronize every write request and send a single write at a time; a better approach might issue writes to the etcd server concurrently.
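For the first TODO, one possible option (not a decision made in this PR) is to bound how long the coordinator waits for the consumer and report when a message could not be handed over. The deliver and tryTwice helpers below are hypothetical and only illustrate that option with the standard library.

```go
package main

import (
	"fmt"
	"time"
)

// deliver tries to hand msg to the consumer channel and gives up after
// wait. It reports whether the message was accepted, so the caller can
// decide what to do with a rejected message (retry, log, drop).
func deliver(out chan<- []byte, msg []byte, wait time.Duration) bool {
	timer := time.NewTimer(wait)
	defer timer.Stop()
	select {
	case out <- msg:
		return true
	case <-timer.C:
		return false
	}
}

// tryTwice delivers two messages to a channel with room for one and no
// consumer: the first fits in the buffer, the second has nowhere to go.
func tryTwice() (first, second bool) {
	out := make(chan []byte, 1)
	first = deliver(out, []byte("a"), 10*time.Millisecond)
	second = deliver(out, []byte("b"), 10*time.Millisecond)
	return first, second
}

func main() {
	first, second := tryTwice()
	fmt.Println("first accepted:", first)
	fmt.Println("second accepted:", second)
}
```

Note that dropping a message after the timeout would break the delivery guarantee for that consumer, so the return value matters more than the timeout itself.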

@jabolina jabolina merged commit 0e7cc81 into master Feb 27, 2021
@jabolina jabolina deleted the abcast-etcd branch February 27, 2021 01:52
@jabolina jabolina linked an issue Feb 27, 2021 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Use etcd for atomic broadcast
1 participant