Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restarting a node quickly and connecting will not emit onConnectevents in registrar #2378

Open
marcus-pousette opened this issue Jan 25, 2024 · 2 comments
Labels
need/triage Needs initial labeling and prioritization

Comments

@marcus-pousette
Copy link
Contributor

marcus-pousette commented Jan 25, 2024

  • Version:
    1.1.0

  • Subsystem:
    Registrar, connection manager?

Severity:

Low

Description:

I am trying to track down issues I have with connections and realised that I get unexpected behaviours when restarting nodes

Steps to reproduce the error:

Create a protocol handler with

this.components.registrar.register(multicodec, {
	onConnect: this.onPeerConnected.bind(this),
	onDisconnect: this.onPeerDisconnected.bind(this),
	notifyOnTransient: false
})
  • Start 2 libp2p nodes with tcp transport
  • Connect the nodes
  • Write data back and forth (not sure exactly to what extent), but this seem to matter
  • Now stop one one, and then start it again immediately.
  • Connect nodes agains immediately. I am dialing from the node that never closed.

Expected behaviour onPeerConnected is emitted 2 times for both nodes.

Actual behaviour. onPeerConnected is never emitted on the last connect for any node. And the onPeerDisconnected is emitted late, after dialing has been resolved (looks like that the connection take a while to close (?))

Putting a delay between stopping and starting the node, or between starting and connecting, will solve this issue, but in a real world usecase where I refresh a tab in browser I suspect that this issue is causing the issues I am having.

@marcus-pousette marcus-pousette added the need/triage Needs initial labeling and prioritization label Jan 25, 2024
@marcus-pousette marcus-pousette changed the title Restarting a node quickly and connecting will not emit onConnectevents in regigstrar Restarting a node quickly and connecting will not emit onConnectevents in registrar Jan 25, 2024
@marcus-pousette
Copy link
Contributor Author

This is what I get when I am starting up the node after closing

2024-01-25T23:54:38.968Z libp2p:transports creating listener for @libp2p/tcp on /ip4/127.0.0.1/tcp/0
2024-01-25T23:54:38.968Z libp2p:tcp:listener Listening on { address: '127.0.0.1', family: 4, port: 60420 }
2024-01-25T23:54:38.969Z libp2p:connection-manager dial 12D3KooWPQUiiod5RRfnRf2reXAP1mw5vM86m32atbXZuN3T49Yd
2024-01-25T23:54:38.969Z libp2p:connection-manager had an existing non-transient connection to 12D3KooWPQUiiod5RRfnRf2reXAP1mw5vM86m32atbXZuN3T49Yd
2024-01-25T23:54:38.969Z libp2p:tcp:socket 127.0.0.1:60410 socket end
2024-01-25T23:54:38.969Z libp2p:mplex initiator stream with id 1 and protocol /test/0.0.0 ended
2024-01-25T23:54:38.969Z libp2p:mplex receiver stream with id 1 and protocol /test/0.0.0 ended
2024-01-25T23:54:38.969Z libp2p:tcp:socket 127.0.0.1:60410 socket close
2024-01-25T23:54:38.969Z libp2p:connection:outbound:aynwkw1706226876843 closing connection to /ip4/127.0.0.1/tcp/60410/p2p/12D3KooWPQUiiod5RRfnRf2reXAP1mw5vM86m32atbXZuN3T49Yd
2024-01-25T23:54:38.970Z libp2p:tcp:socket 127.0.0.1:60410 socket was already destroyed when trying to close

It seems libp2p:connection-manager had an existing non-transient connection to 12D3KooWPQUiiod5RRfnRf2reXAP1mw5vM86m32atbXZuN3T49Yd
is skipping doing some things because of this. I assume this creates consequences down the line (?)

I dont get it by putting a delay before start and connect. Then it will look like this

2024-01-26T00:01:02.345Z libp2p:transports creating listener for @libp2p/tcp on /ip4/127.0.0.1/tcp/0
2024-01-26T00:01:02.345Z libp2p:tcp:listener Listening on { address: '127.0.0.1', family: 4, port: 61966 }
2024-01-26T00:01:05.346Z libp2p:connection-manager dial 12D3KooWLNL49UuYwaSRh1fbE7Qj2CJwYj9KftHX2iBBR2m6eZJF
2024-01-26T00:01:05.347Z libp2p:connection-manager:dial-queue creating dial target for 12D3KooWLNL49UuYwaSRh1fbE7Qj2CJwYj9KftHX2iBBR2m6eZJF [
'/ip4/127.0.0.1/tcp/61966/p2p/12D3KooWLNL49UuYwaSRh1fbE7Qj2CJwYj9KftHX2iBBR2m6eZJF'
]
2024-01-26T00:01:05.348Z libp2p:tcp dialing /ip4/127.0.0.1/tcp/61966/p2p/12D3KooWLNL49UuYwaSRh1fbE7Qj2CJwYj9KftHX2iBBR2m6eZJF
2024-01-26T00:01:05.348Z libp2p:tcp connection opened /ip4/127.0.0.1/tcp/61966/p2p/12D3KooWLNL49UuYwaSRh1fbE7Qj2CJwYj9KftHX2iBBR2m6eZJF
2024-01-26T00:01:05.348Z libp2p:tcp new outbound connection /ip4/127.0.0.1/tcp/61966/p2p/12D3KooWLNL49UuYwaSRh1fbE7Qj2CJwYj9KftHX2iBBR2m6eZJF
2024-01-26T00:01:05.348Z libp2p:tcp:socket starting the outbound connection upgrade
2024-01-26T00:01:05.348Z libp2p:tcp:socket selecting outbound crypto protocol [ '/noise' ]
2024-01-26T00:01:05.348Z libp2p:tcp:listener new inbound connection /ip4/127.0.0.1/tcp/61980
2024-01-26T00:01:05.348Z libp2p:tcp:socket starting the inbound connection upgrade

@marcus-pousette
Copy link
Contributor Author

marcus-pousette commented Jan 26, 2024

It seems to matter which node is dialling whom.. If the node that is not closing is dialing the the other node. Then it will fail, but the other way around it works. I assume that the node that is not stopping have a connection lingering that ought to close, which makes dialling again not possible. Given this, this might not be a problematic issue after all because it is usually the node that is starting that is starting to dial.

But the only issue perhaps is that I received a ConnectionImpl object after dialing (from the "wrong" side), even if it was perhaps closing, which made this problem less easy to understand.

You might close this if this is expected behaviour

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need/triage Needs initial labeling and prioritization
Projects
None yet
Development

No branches or pull requests

1 participant