Snapshot transfer performance #1678

Open
aderouineau opened this issue Feb 7, 2024 · 8 comments
@aderouineau

What version are you running?
8.19.0

Are you using Docker or Kubernetes to run your system?
Neither: calling application directly

Are you running a single node or a cluster?
Single quorum node with a single read-only node.

What did you do?

  • Launched a new single quorum node: ./rqlite-v8.19.0-linux-amd64/rqlited -node-id 1 data1
  • Loaded a 13GB SQLite DB using .boot
  • Launched a new read-only node on the same host and filesystem: ./rqlite-v8.19.0-linux-amd64/rqlited -raft-non-voter=true -http-addr=localhost:4011 -raft-addr=localhost:4012 -join=localhost:4002 -raft-log-level=INFO data2. This caused the read-only node to fetch the 10+GB snapshot from the leader; the logs show a rate of 105 MB/s.
  • Created 10GB file with random data: dd if=<(openssl enc -pbkdf2 -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt < /dev/zero) of=random_data bs=100M count=100 iflag=fullblock
  • Tested transfer speed: dd if=/home/derouin/rqlite/random_data of=/home/derouin/rqlite/random_data2 which gave a rate of 207 MB/s.
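
As an aside, the dd copy above writes through the page cache, so the 207 MB/s figure may overstate sustained disk throughput. A variant that forces the data to reach the device before reporting a rate (conv=fdatasync is a GNU dd flag) would be:

dd if=/home/derouin/rqlite/random_data of=/home/derouin/rqlite/random_data2 bs=1M conv=fdatasync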

Logs showing snapshot transfer:

[rqlited] 2024/02/07 04:20:26 node HTTP API available at http://localhost:4011
[rqlited] 2024/02/07 04:20:26 connect using the command-line tool via 'rqlite -H localhost -p 4011'
[raft] 2024/02/07 04:20:26 [WARN]  failed to get previous log: previous-index=4 last-index=0 error="log not found"
[raft] 2024/02/07 04:20:36 [INFO]  snapshot network transfer progress: read-bytes=766181376 percent-complete="5.76%"
[raft] 2024/02/07 04:20:46 [INFO]  snapshot network transfer progress: read-bytes=1736704000 percent-complete="13.06%"
[raft] 2024/02/07 04:20:56 [INFO]  snapshot network transfer progress: read-bytes=2762866688 percent-complete="20.77%"
[raft] 2024/02/07 04:21:06 [INFO]  snapshot network transfer progress: read-bytes=3797811200 percent-complete="28.55%"
[raft] 2024/02/07 04:21:16 [INFO]  snapshot network transfer progress: read-bytes=4835835904 percent-complete="36.35%"
[raft] 2024/02/07 04:21:26 [INFO]  snapshot network transfer progress: read-bytes=5834997760 percent-complete="43.87%"
[raft] 2024/02/07 04:21:36 [INFO]  snapshot network transfer progress: read-bytes=6840778752 percent-complete="51.43%"
[raft] 2024/02/07 04:21:46 [INFO]  snapshot network transfer progress: read-bytes=7867924480 percent-complete="59.15%"
[raft] 2024/02/07 04:21:56 [INFO]  snapshot network transfer progress: read-bytes=8833925120 percent-complete="66.41%"
[raft] 2024/02/07 04:22:06 [INFO]  snapshot network transfer progress: read-bytes=9785114624 percent-complete="73.56%"
[raft] 2024/02/07 04:22:16 [INFO]  snapshot network transfer progress: read-bytes=11205083136 percent-complete="84.24%"
[raft] 2024/02/07 04:22:26 [INFO]  snapshot network transfer progress: read-bytes=12725649408 percent-complete="95.67%"
[raft] 2024/02/07 04:22:32 [INFO]  copied to local snapshot: bytes=13301739520
[store] 2024/02/07 04:22:32 initiating node restore on node ID localhost:4012
[raft] 2024/02/07 04:22:42 [INFO]  snapshot restore progress: id=2-3-1707279626880 last-index=3 last-term=2 size-in-bytes=13301739520 read-bytes=5995560960 percent-complete="45.07%"
[raft] 2024/02/07 04:22:52 [INFO]  snapshot restore progress: id=2-3-1707279626880 last-index=3 last-term=2 size-in-bytes=13301739520 read-bytes=8808824832 percent-complete="66.22%"
[raft] 2024/02/07 04:23:02 [INFO]  snapshot restore progress: id=2-3-1707279626880 last-index=3 last-term=2 size-in-bytes=13301739520 read-bytes=11653480448 percent-complete="87.61%"
[store] 2024/02/07 04:23:08 successfully opened database at /home/derouin/rqlite/data2/db.sqlite due to restore
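
Working backwards from the timestamps: assuming the transfer started around 04:20:26 and finished at 04:22:32 (~126 seconds), the 13301739520 bytes copied work out to roughly the reported rate:

echo "scale=1; 13301739520 / 126 / 1000000" | bc   # ~105.5 MB/s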

What did you expect to happen?

I expected the rqlite snapshot transfer speed to be closer to 207 MB/s.

What happened instead?

The rqlite snapshot transfer speed was only 105 MB/s, roughly half the raw dd rate.

@aderouineau
Author

My test host is an AWS EC2 m5.2xlarge instance with 8 vCPUs, 32 GiB RAM, and EBS SSD storage.

@otoolep
Member

otoolep commented Feb 13, 2024

@aderouineau -- when a Snapshot is transferred to a node, in your mental model how many copies of the SQLite database are made on the receiving node?

@otoolep
Member

otoolep commented Feb 13, 2024

On the receiving node two copies are made of the received snapshot -- the first copies the snapshot into the node's internal snapshot store, and the second takes that stored copy and creates the working SQLite database that rqlite uses to serve queries.
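
A rough single-host illustration of that double write (hypothetical file names; rqlite's actual on-disk layout differs):

# Copy 1: the received snapshot lands in the node's snapshot store
dd if=received_snapshot of=store/snapshot.db bs=1M
# Copy 2: the stored snapshot becomes the working database
dd if=store/snapshot.db of=db.sqlite bs=1M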

But I see you're saying the network transfer speed itself is slower than you expect. One other thing to be aware of is that rqlite gzip-compresses the snapshot on the sending side, and decompresses it on the receiving side. It does this to reduce network bandwidth.

In your testing are you streaming the data over the local network, in addition to running your dd test? It doesn't look like it. If you want to closely simulate what two rqlite nodes are doing, the process looks like this:

  • node 1 is reading from disk
  • node 1 is compressing
  • node 1 is sending over the network
  • node 2 is receiving on the network
  • node 2 is decompressing
  • node 2 is writing to disk

@aderouineau
Author

Both nodes are running on the same machine and using localhost

@otoolep
Member

otoolep commented Feb 13, 2024

> Both nodes are running on the same machine and using localhost

I understand; I'm just pointing out that the dd test you are running does not fully simulate what actually happens when one node transfers a snapshot to another (assuming I'm not missing anything).

@aderouineau
Author

I'll look into a better benchmark and see if this is still a problem.

@otoolep
Member

otoolep commented Feb 13, 2024

Maybe run a test using netcat. I think if you put gzip and netcat between the dd invocations, that would be a good test:

dd -> gzip -> nc -> nc -> gunzip -> dd
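
A minimal sketch of that pipeline on a single host (port 4444 is arbitrary; listen-mode flags differ between netcat implementations, e.g. traditional netcat needs nc -l -p 4444):

# Terminal 1 (receiver): accept the stream, decompress, write to disk
nc -l 4444 | gunzip | dd of=random_data_received bs=1M

# Terminal 2 (sender): read from disk, compress, send over loopback
dd if=random_data bs=1M | gzip | nc localhost 4444

Note that random_data is essentially incompressible, so gzip will burn CPU for little size reduction; a copy of the actual SQLite file would be a fairer input.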

@otoolep
Member

otoolep commented Feb 13, 2024

Corrected the comment above.
