Retrieving Shards information from RedisCluster #260

mellis13 · 2021-08-12T22:32:48Z

For the application I am working on, it is helpful for performance to know the layout of the redis cluster with regards to hash slot distribution between the nodes. Is it possible to retrieve this information from RedisCluster? It seems like there aren't any access methods for the private member variable ShardsPool _pool. I'd like to avoid a redundant CLUSTER SLOTS command, since this is already done during RedisCluster object creation and RedisCluster maintains the most up-to-date information on shards. If there isn't a method, would the community be open to a method that returns access to _pool for inspection but not modification?

sewenew · 2021-08-13T13:57:39Z

> it is helpful for performance to know the layout of the redis cluster with regards to hash slot distribution between the nodes.

What's your scenario? Can you give an example? Since redis-plus-plus caches the slot mapping, when you send a command to Redis Cluster, it is sent directly to the right node, so there is no performance penalty.

Even if you get the underlying ShardsPool, it might not help: if the slot mapping changes (which might happen just after you get the ShardsPool), your copy will be out of date, and you will need to update it manually.

Regards

mellis13 · 2021-08-13T16:15:54Z

Thanks for the quick response. The use case is specific to a Redis module being used. The general idea is that a copy of data is placed on every cluster node to facilitate efficient parallel computation. This placement of copied data requires knowing the cluster slots assigned to each database node. Re-sharding is not a concern in this specific use case.

sewenew · 2021-08-15T02:34:06Z

> The general idea is that a copy of data is placed on every cluster node to facilitate efficient parallel computation. This placement of copied data requires knowing the cluster slots assigned to each database node. Re-sharding is not a concern in this specific use case.

If I understand correctly, it seems that you don't even need a Redis Cluster; instead, you need several standalone Redis instances so that you can parallelize the computation. You can create a pool of Redis objects (not RedisCluster) by creating one Redis object for each node, and randomly pick one from the pool for each operation. This also works even if you deploy these nodes as a Redis Cluster.

If you insist on using a RedisCluster and don't need to worry about re-sharding, you can also write your data to each node WITH THE SAME KEY using a Redis object. When you need to operate on the data, you can call RedisCluster::redis("random-hash-tag", false) with a randomly generated hash tag to pick a node at random, and send the command to it.

> If there isn't a method, would the community be open to a method that returns access to _pool for inspection but not modification?

In fact, you can manually create a ShardsPool with its constructor:

ShardsPool(const ConnectionPoolOptions &pool_opts,
           const ConnectionOptions &connection_opts,
           Role role);

The constructor will call CLUSTER SLOTS to get the slot mapping info. Then you can use the ShardsPool::shards method to get the slot-mapping info.

Why isn't there a method to get the underlying node/slot info?

As I mentioned in the comments above, this mapping might change at any time. With such a method, you might get out-of-date info, which could mislead you.

I'll keep this issue open to see if others have similar requirements for the node-slot mapping.

If you still have any problems with it, feel free to post them.

Regards

mlaczin · 2023-03-21T15:33:24Z

I have a use case for this, actually. We're using Redis JSON, and in particular JSON.MGET, which (with redis++) requires us to use the generic command interface. However, a general multi-key query is routed only to the node associated with the first key, so we need to collect keys that we know (via the hash slot) are on a particular node and send them in one batch.

For example, if we have five keys (1, 2, 3, 4, 5) on two nodes (A, B), where 1,2,3 are on A and 4,5 are on B, then this command will fail:

JSON.MGET 1 2 3 4 5 $

And there will be two nils in the output (for keys 4 and 5).

Thus we need to send two commands:

JSON.MGET 1 2 3 $
JSON.MGET 4 5 $

Here we compute the hash slots of 1, 2, 3, 4, 5 and divide the keys accordingly.

In our case, we have several million keys that need to be divided in this way. Consequently, having access to the hash slot ranges associated with each node would be helpful.
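
As a rough illustration of the key division described above, here is a minimal, self-contained C++ sketch that does not use redis++. It computes a key's hash slot as CRC16 (XMODEM variant) modulo 16384, honoring hash tags, and buckets keys per node and per slot. The SlotRange struct, the node names, and the helper names (node_for_slot, group_keys) are hypothetical and exist only for this example; in practice the ranges would have to come from something like a CLUSTER SLOTS reply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// CRC16-CCITT (XMODEM variant: polynomial 0x1021, initial value 0),
// the checksum Redis Cluster uses for key hashing.
std::uint16_t crc16(const std::string& data) {
    std::uint16_t crc = 0;
    for (unsigned char byte : data) {
        crc = static_cast<std::uint16_t>(crc ^ (byte << 8));
        for (int bit = 0; bit < 8; ++bit) {
            const bool top_bit_set = (crc & 0x8000) != 0;
            crc = static_cast<std::uint16_t>(crc << 1);
            if (top_bit_set) {
                crc ^= 0x1021;
            }
        }
    }
    return crc;
}

// Hash tag rule: if the key contains a non-empty {...} section, only the
// text between the first '{' and the next '}' is hashed.
std::string hash_target(const std::string& key) {
    const std::size_t open = key.find('{');
    if (open == std::string::npos) return key;
    const std::size_t close = key.find('}', open + 1);
    if (close == std::string::npos || close == open + 1) return key;
    return key.substr(open + 1, close - open - 1);
}

std::uint16_t key_slot(const std::string& key) {
    return static_cast<std::uint16_t>(crc16(hash_target(key)) % 16384);
}

// Hypothetical description of the slots served by one node.
struct SlotRange {
    std::uint16_t first;
    std::uint16_t last;
    std::string node;
};

// Returns the node owning `slot`, or an empty string if no range covers it.
std::string node_for_slot(const std::vector<SlotRange>& ranges, std::uint16_t slot) {
    assert(slot < 16384);
    for (const SlotRange& range : ranges) {
        if (slot >= range.first && slot <= range.last) return range.node;
    }
    return std::string();
}

using KeysBySlot = std::map<std::uint16_t, std::vector<std::string>>;

// Buckets keys first by owning node, then by hash slot.
std::map<std::string, KeysBySlot> group_keys(const std::vector<SlotRange>& ranges,
                                             const std::vector<std::string>& keys) {
    std::map<std::string, KeysBySlot> groups;
    for (const std::string& key : keys) {
        const std::uint16_t slot = key_slot(key);
        groups[node_for_slot(ranges, slot)][slot].push_back(key);
    }
    return groups;
}
```

Calling group_keys(ranges, keys) with ranges such as {0, 8191, "A"} and {8192, 16383, "B"} yields one bucket per node, each subdivided by slot. Keeping the per-slot subdivision is a deliberate choice in this sketch, since a batch that mixes slots may still be rejected even when all keys live on the same node. The table is also only a snapshot and becomes stale if the cluster is re-sharded.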

sewenew · 2023-03-23T07:17:30Z

@mlaczin If I understand correctly, even if you get the slot range info, you cannot call RedisCluster::command("JSON.MGET", ...) with 1 2 3 if these 3 keys do NOT belong to the same slot, because Redis Cluster requires that all keys in one command belong to the same slot, not merely the same node.

One solution is to use a hash tag to ensure these keys belong to the same slot. Once you do that, you can call RedisCluster::redis("hash-tag", false) to create a Redis object and send the command with it.
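
To make the hash-tag suggestion concrete, the following self-contained C++ sketch (again independent of redis++) shows why keys sharing a hash tag land in the same slot: only the text inside the first non-empty {...} is hashed. The helpers tagged_key and all_same_slot, and the "{tag}:id" key layout, are hypothetical names and conventions chosen for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// CRC16-CCITT (XMODEM variant: polynomial 0x1021, initial value 0),
// the checksum Redis Cluster uses for key hashing.
std::uint16_t crc16(const std::string& data) {
    std::uint16_t crc = 0;
    for (unsigned char byte : data) {
        crc = static_cast<std::uint16_t>(crc ^ (byte << 8));
        for (int bit = 0; bit < 8; ++bit) {
            const bool top_bit_set = (crc & 0x8000) != 0;
            crc = static_cast<std::uint16_t>(crc << 1);
            if (top_bit_set) {
                crc ^= 0x1021;
            }
        }
    }
    return crc;
}

// Hash tag rule: if the key contains a non-empty {...} section, only the
// text between the first '{' and the next '}' is hashed.
std::string hash_target(const std::string& key) {
    const std::size_t open = key.find('{');
    if (open == std::string::npos) return key;
    const std::size_t close = key.find('}', open + 1);
    if (close == std::string::npos || close == open + 1) return key;
    return key.substr(open + 1, close - open - 1);
}

std::uint16_t key_slot(const std::string& key) {
    return static_cast<std::uint16_t>(crc16(hash_target(key)) % 16384);
}

// Hypothetical key layout: "{tag}:id". Every key built with the same tag
// hashes to the same slot because only the tag is fed to CRC16.
std::string tagged_key(const std::string& tag, const std::string& id) {
    // An empty tag, or one containing '}', would not survive the tag rule.
    assert(!tag.empty() && tag.find('}') == std::string::npos);
    return "{" + tag + "}:" + id;
}

// True when every key in the batch maps to one slot, which is the condition
// a multi-key command has to satisfy in a cluster.
bool all_same_slot(const std::vector<std::string>& keys) {
    for (const std::string& key : keys) {
        if (key_slot(key) != key_slot(keys.front())) return false;
    }
    return true;
}
```

A batch built with tagged_key("batch1", "1"), tagged_key("batch1", "2"), and so on passes all_same_slot, so the whole batch could be sent through the Redis object obtained for the "batch1" hash tag. The trade-off is that all keys sharing a tag are stored on a single node, so the choice of tags determines how evenly data spreads across the cluster.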

Regards
