IP resolver not updating IPs on failure #993

Closed
Excpt0r opened this issue Oct 11, 2021 · 8 comments

@Excpt0r
Contributor

Excpt0r commented Oct 11, 2021

Hi,

I use jetcd in an application that runs on Kubernetes and connects to a 3-node etcd cluster (also on Kubernetes).
When I scale the etcd cluster down to 0 instances and then back up to 3,
the jetcd client cannot reconnect to the cluster, even though the cluster is healthy again.
The new etcd pods get new IPs, and the DNS names are updated and resolve correctly within the JVM (when networkaddress.cache.ttl=10 is set).
However, the jetcd client keeps retrying with the previous, outdated IPs.

While debugging the logs, I thought the problem was related to grpc-java, so I opened ticket grpc/grpc-java#8574 there, which includes further logs.
It looks like jetcd's IP resolver implementation is not able to update the IPs after failures.

Tested with jetcd 0.5.10
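
For readers reproducing this, the `networkaddress.cache.ttl=10` setting mentioned above can also be applied programmatically. The following is a minimal sketch, not part of the original report; the class and method names (`DnsCacheTtl`, `setPositiveTtlSeconds`, `currentPositiveTtl`) are made up for illustration. It assumes the property is set early in startup, before the JVM performs its first name lookup, since the cache policy is typically read only once.

```java
import java.security.Security;

// Hypothetical helper, for illustration only.
final class DnsCacheTtl {
    private DnsCacheTtl() {}

    /** Sets the JVM-wide TTL (in seconds) for successful DNS lookups. */
    static void setPositiveTtlSeconds(int seconds) {
        // networkaddress.cache.ttl is a security property, not a system property,
        // so it is set through java.security.Security.
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
    }

    /** Returns the currently configured value, or null if it is unset. */
    static String currentPositiveTtl() {
        return Security.getProperty("networkaddress.cache.ttl");
    }

    public static void main(String[] args) {
        setPositiveTtlSeconds(10);
        System.out.println("networkaddress.cache.ttl=" + currentPositiveTtl());
    }
}
```

Note that this only affects the JVM's own DNS cache. As described in this issue, a client that holds on to previously resolved addresses will still use stale IPs regardless of this setting.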

@lburgazzoli
Collaborator

lburgazzoli commented Oct 11, 2021

Do you have any time to work on a fix?

@Excpt0r
Contributor Author

Excpt0r commented Oct 11, 2021

So far I do not know enough about jetcd, the changes made in #814, and how they interact with grpc-java to design a solution.
If you already have an idea of what needs to be changed, let me know and I will have a look.

Update: I just saw that #814 did not change much besides the name. But the question is why jetcd needs its own resolve logic.

lburgazzoli added a commit to lburgazzoli/etcd-io-jetcd that referenced this issue Oct 11, 2021
@lburgazzoli
Collaborator

I added a potential fix here: #994
Would you mind testing against that PR?

@lburgazzoli
Collaborator

> Update: Just saw that #814 did not change much besides the name. But the question is why jetcd needs its own resolve logic.

Because grpc-java does not provide a resolver that supports multiple hosts.
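
To illustrate what a resolver that supports multiple hosts and re-resolves them on demand could look like, here is a minimal, library-independent sketch. It is not jetcd's implementation and does not use the grpc-java API; `MultiHostResolver`, `Lookup`, `refresh`, and `addresses` are hypothetical names, and the pluggable `Lookup` hook exists only so the DNS step can be stubbed.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: resolves several host names and can be refreshed after a failure.
final class MultiHostResolver {
    /** Hypothetical lookup hook so that real DNS can be replaced by a stub. */
    interface Lookup {
        List<InetAddress> resolve(String host) throws UnknownHostException;
    }

    private final List<String> hosts;
    private final int port;
    private final Lookup lookup;
    private volatile List<InetSocketAddress> current = List.of();

    MultiHostResolver(List<String> hosts, int port, Lookup lookup) {
        this.hosts = List.copyOf(hosts);
        this.port = port;
        this.lookup = lookup;
    }

    /** Convenience factory that uses the JVM's own DNS resolution. */
    static MultiHostResolver usingSystemDns(List<String> hosts, int port) {
        return new MultiHostResolver(hosts, port,
                host -> Arrays.asList(InetAddress.getAllByName(host)));
    }

    /** Re-resolves every host from scratch; hosts that fail to resolve are skipped. */
    List<InetSocketAddress> refresh() {
        List<InetSocketAddress> resolved = new ArrayList<>();
        for (String host : hosts) {
            try {
                for (InetAddress address : lookup.resolve(host)) {
                    resolved.add(new InetSocketAddress(address, port));
                }
            } catch (UnknownHostException e) {
                // The host may be temporarily missing (e.g. pod not yet registered); skip it.
            }
        }
        current = List.copyOf(resolved);
        return current;
    }

    /** Returns the addresses from the most recent refresh. */
    List<InetSocketAddress> addresses() {
        return current;
    }

    public static void main(String[] args) {
        // Stubbed lookup: every host name maps to the loopback address.
        MultiHostResolver resolver = new MultiHostResolver(
                List.of("etcd-0", "etcd-1", "etcd-2"), 2379,
                host -> List.of(InetAddress.getLoopbackAddress()));
        System.out.println(resolver.refresh());
    }
}
```

The point of the sketch is that `refresh()` looks every host up again instead of reusing earlier results, so a caller that invokes it after a connection failure would pick up new pod IPs once DNS has been updated.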

@Excpt0r
Contributor Author

Excpt0r commented Oct 12, 2021

Thanks! I will test the PR in the next few days.

@Excpt0r
Contributor Author

Excpt0r commented Oct 18, 2021

Hi @lburgazzoli,
I just tested PR #994 and the results look good to me: jetcd now updates the DNS information on failure and can connect to the new pod IPs.
Thank you for your work!

@lburgazzoli
Collaborator

Cool!

I'll write some unit tests and then I can cut a release.

@lburgazzoli
Collaborator

Fixed by #994.
