Commitlog handling #1710

Open
yuriy-yarosh opened this issue Jun 29, 2023 · 5 comments

@yuriy-yarosh

Please answer these questions before submitting your issue. Thanks!

What version of Cassandra are you using?

4.1

What version of Gocql are you using?

1.5.2

What version of Go are you using?

1.20

What did you do?

Streamed Cassandra commit logs with the common Java reader/replayer, similar to cassandra-commitlog-extract.

What did you expect to see?

It would be really nice if gocql could parse Cassandra Commitlogs, so people could use golang for Cassandra CDC streaming.

What did you see instead?

That there's no Commitlog support in gocql.

@martin-sucha
Contributor

It seems to me that this can and should be implemented in an external library, not part of the core gocql. Each CQL database implementation (Cassandra, Scylla, ...) has a different CDC implementation. For example Scylla has https://github.com/scylladb/scylla-cdc-go.

@yuriy-yarosh
Author

yuriy-yarosh commented Jun 29, 2023

@martin-sucha yes, I'm well aware.
It's just that Cassandra does things very differently: it expects developers to consume the CommitLog files directly, instead of exposing them as a CQL-compatible construct (as Scylla did) or as any kind of consumable network stream.

The actual Cassandra CDC implementation is left on developers' shoulders, which is why we have things like Debezium.

There are literally no Go libraries for CommitLog encoding/decoding, and the exact CommitLog handling depends heavily on the number of available replicas and on the client implementation itself: we can't stream and replay what hasn't been synced across all the hosts yet. In that regard, nothing is available for Go yet.
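
To illustrate the file-based consumption described above, here is a minimal Go sketch. It assumes the Cassandra 4.x layout in which each CDC-enabled segment in the `cdc_raw` directory has a companion `<segment>_cdc.idx` file holding the flushed byte offset and, once Cassandra is finished with the segment, a second line reading `COMPLETED`. The names `cdcIndex` and `parseCDCIndex` are hypothetical; nothing like this exists in gocql, and decoding the mutations inside the segment is out of scope here.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// cdcIndex is a hypothetical type describing the state recorded in a
// `<segment>_cdc.idx` file (assumed Cassandra 4.x layout).
type cdcIndex struct {
	Offset    int64 // number of segment bytes already flushed to disk
	Completed bool  // true once the index file carries a COMPLETED line
}

// parseCDCIndex parses the text of an index file: the first line is an
// integer offset, the optional second line is the word COMPLETED.
func parseCDCIndex(content string) (cdcIndex, error) {
	var idx cdcIndex
	sc := bufio.NewScanner(strings.NewReader(content))
	if !sc.Scan() {
		return idx, fmt.Errorf("empty index file")
	}
	off, err := strconv.ParseInt(strings.TrimSpace(sc.Text()), 10, 64)
	if err != nil {
		return idx, fmt.Errorf("bad offset: %w", err)
	}
	idx.Offset = off
	if sc.Scan() && strings.TrimSpace(sc.Text()) == "COMPLETED" {
		idx.Completed = true
	}
	return idx, nil
}

func main() {
	// Sample content only; a real consumer would read the file from cdc_raw.
	idx, err := parseCDCIndex("4096\nCOMPLETED\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("offset=%d completed=%v\n", idx.Offset, idx.Completed)
}
```

A consumer would read a segment only up to `Offset`, and delete the segment from `cdc_raw` only after `Completed` is true and everything has been processed.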

@yuriy-yarosh
Author

My aim is to stream Cassandra CDC to a Knative Eventing CloudEvents channel with custom Apache Arrow marshaling and processing (FlatBuffers IPC is easier to optimize). I'd like to do that with zero additional hops and in zero-alloc Go. I had considered Rust as well.

@martin-sucha
Contributor

> My aim is to stream Cassandra CDC to a Knative Eventing CloudEvents channel with custom Apache Arrow marshaling and processing (FlatBuffers IPC is easier to optimize). I'd like to do that with zero additional hops and in zero-alloc Go. I had considered Rust as well.

How is gocql related to that?

@yuriy-yarosh
Author

@martin-sucha I'd like to do that in golang, reusing the existing gocql client with CommitLog sync bindings.
