Failure on large redis clusters #2418

Open
1 of 2 tasks
boixu opened this issue Nov 9, 2023 · 1 comment

Comments


boixu commented Nov 9, 2023

Hi,

We have a situation where phpredis starts failing with the following errors when a large number of requests hit a cluster with 20 or more nodes:
"RedisCluster::__construct(): php_network_getaddresses: getaddrinfo failed: Name or service not known"
and "Fatal error: Uncaught RedisClusterException: Couldn't map cluster keyspace using any provided seed"

We use AWS ElastiCache as our Redis cluster. When the cluster has fewer than 20 nodes everything works fine, even under high load. Once the node count reaches 20 or more, we start seeing the errors above.
We use a DNS address (provided by ElastiCache) as the host endpoint rather than individually specified hosts.
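
Since the first error is raised while resolving the seed hostname, one thing that might be worth trying is to resolve the endpoint once, with a small retry, and pass the resulting IPs to RedisCluster as seeds. The following is only a sketch under stated assumptions, not a confirmed fix: `resolveSeeds()` and its parameters are hypothetical names, it handles IPv4 only (via `gethostbynamel`), and passing IPs may not suit TLS setups that verify hostnames.

```php
<?php
/**
 * Hypothetical helper: resolve a seed hostname to "ip:port" strings,
 * retrying a few times before giving up.
 * $resolver is injectable (defaults to gethostbynamel) so the retry
 * logic can be exercised without real DNS.
 */
function resolveSeeds(string $host, int $port, int $attempts = 3, ?callable $resolver = null): array
{
    $resolver = $resolver ?? 'gethostbynamel';
    for ($i = 0; $i < $attempts; $i++) {
        $ips = $resolver($host); // array of IPv4 strings, or false on failure
        if (is_array($ips) && count($ips) > 0) {
            return array_map(fn(string $ip): string => "$ip:$port", $ips);
        }
        usleep(50000 * ($i + 1)); // brief linear backoff before retrying
    }
    return []; // caller decides how to handle total failure
}

// Hypothetical usage (the endpoint name is a placeholder):
// $seeds = resolveSeeds('my-cluster.example.cache.amazonaws.com', 6379);
// $cluster = new RedisCluster(null, $seeds, 1.5, 1.5, true);
```

If the errors disappear with pre-resolved seeds, that would point at hostname resolution under load rather than at the slot-mapping logic itself.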

We know this is a phpredis issue because the same Redis cluster is also used by Go-based clients, and there we see no issues (no errors or slowness).

We have tried enabling redis.clusters.cache_slots = 1, but that did not help.

We also tested the latest phpredis (6.0.2), which did not help either.

I suspect it has something to do with the logic behind the cluster node connections.

Any help would be appreciated!
Thanks!

Expected behavior

No errors

Actual behavior

Getting the errors above

I'm seeing this behavior on

  • OS: CentOS 7
  • Redis: 6.2.6
  • PHP: 7.4.3
  • phpredis: 5.1.1 / 6.0.2

Steps to reproduce, backtrace or example script

Redis cluster with 20 nodes or more.
High load simulation, e.g. 1000 concurrent connections creating 1000 random read/write requests.

I can provide the PHP script I used to create the artificial load if needed.

I've checked

  • There is no similar issue from other users
  • Issue isn't fixed in develop branch

michael-grunder (Member) commented Jan 9, 2024

It's probably going to be tough to track this down without being able to replicate it, but the error indicates something is failing at the kernel/OS layer.

php_network_getaddresses: getaddrinfo failed: Name or service not known

That means that the getaddrinfo call is not able to resolve the host. Perhaps some sort of resource exhaustion is causing these symptoms?
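
One way to probe the resource-exhaustion idea is to run a small diagnostic on an affected host while it is under load. The sketch below is an assumption-based illustration: `resolveFailureRate()` and `fdUsage()` are hypothetical helper names, the descriptor count relies on Linux's `/proc/self/fd`, and the limit lookup needs the posix extension. It reports how often hostname resolution fails and how close the process is to its open-file limit.

```php
<?php
/**
 * Fraction of resolution attempts that fail for $host (0.0 to 1.0).
 * $resolver is injectable (defaults to gethostbynamel) so the counting
 * logic can be exercised without real DNS.
 */
function resolveFailureRate(string $host, int $attempts, ?callable $resolver = null): float
{
    $resolver = $resolver ?? 'gethostbynamel';
    $attempts = max(1, $attempts); // avoid division by zero
    $failures = 0;
    for ($i = 0; $i < $attempts; $i++) {
        $ips = $resolver($host);
        if (!is_array($ips) || count($ips) === 0) {
            $failures++;
        }
    }
    return $failures / $attempts;
}

/**
 * Approximate open file descriptor count and soft limit for this process.
 * Either value is null when it cannot be determined on this platform.
 */
function fdUsage(): array
{
    $open = null;
    if (is_dir('/proc/self/fd')) {
        $entries = scandir('/proc/self/fd');
        if ($entries !== false) {
            $open = count(array_diff($entries, ['.', '..']));
        }
    }
    $limit = null;
    if (function_exists('posix_getrlimit')) {
        $limits = posix_getrlimit();
        $limit = $limits['soft openfiles'] ?? null;
    }
    return ['open' => $open, 'soft_limit' => $limit];
}

// Hypothetical usage (the endpoint name is a placeholder):
// printf("failure rate: %.2f\n", resolveFailureRate('my-cluster.example.cache.amazonaws.com', 100));
// var_dump(fdUsage());
```

A non-zero failure rate together with a descriptor count near the soft limit would be consistent with exhaustion at the OS level, though other causes such as resolver rate limits are also possible.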
