
Using QUIC/HTTP3 to replace utp-native for the Data Transfer Layer in the networking domain #234

Closed
19 tasks done
CMCDragonkai opened this issue Aug 30, 2021 · 42 comments · Fixed by #525
Labels: design (Requires design) · development (Standard development) · epic (Big issue with multiple subissues) · r&d:polykey:core activity 4 (End to End Networking behind Consumer NAT Devices) · research (Requires research)

Comments

@CMCDragonkai

CMCDragonkai commented Aug 30, 2021

Specification

We can separate the networking into 3 layers:

  • P2P application layer - e.g. kademlia, automerge, nodes domain
  • RPC layer - e.g. grpc (we once considered jsonrpc, but it's too late now)
  • Data Transfer layer - UTP, Wireguard, QUIC

The Data Transfer Layer is particularly special since it is the lowest part of our stack and is always fundamentally built on top of UDP. These are its characteristics:

  • Must be built on top of UDP (due to hole-punching requirements)
  • Capable of NAT-traversal via hole punch packets
  • Capable of being proxied for traversing symmetric NAT
  • End to end encrypted - based on MTLS or otherwise
  • Low latency and high throughput (and ideally low power for maintaining the connection liveness)
  • Compatible with Linux, Mac, Windows, iOS, Android

Currently we use TLS on top of the UTP protocol as the Data Transfer Layer. This is implemented with forward and reverse proxy termination points bridging TCP to the UTP protocol.

The proxy termination bridge was necessary because we use GRPC as the RPC layer, whose implementation is fixed to the HTTP2 stack but offers an escape hatch via a CONNECT proxy. This is more of an implementation detail than anything else, but it does give us a fairly generic system that can turn any TCP protocol into something NAT-traversable.
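For reference, the CONNECT escape hatch is just a small text handshake before the connection becomes an opaque byte pipe. A minimal sketch in TypeScript (illustrative only, not Polykey's actual proxy code):

```typescript
// Build the HTTP CONNECT request a client (here, grpc) sends to the
// forward proxy to ask for a raw TCP tunnel to host:port.
function buildConnectRequest(host: string, port: number): string {
  return `CONNECT ${host}:${port} HTTP/1.1\r\nHost: ${host}:${port}\r\n\r\n`;
}

// Parse the proxy's status line; a 200 means the tunnel is established and
// all subsequent bytes flow opaquely (in our case, onwards over TLS/UTP).
function tunnelEstablished(response: string): boolean {
  const match = /^HTTP\/1\.[01] (\d{3})/.exec(response.split('\r\n')[0]);
  return match !== null && match[1] === '200';
}
```

After the 200, the proxy just pipes bytes in both directions, which is what lets any TCP protocol ride over the UTP transport.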

The usage of MTLS also enables seamless reuse of the same X.509 certificate that we already use for the rest of PK.

The underlying library being used here is utp-native https://github.com/mafintosh/utp-native. It is a C++ module that incorporates the libutp C++ library https://github.com/bittorrent/libutp and wraps it into a NodeJS module. It works fine for Linux, Windows, and Mac. However, there is no clear path to usage on Android or iOS. See mafintosh/utp-native#30. This could be done by using https://github.com/janeasystems/nodejs-mobile (but this is not as popular), or by compiling libutp natively for iOS and Android and then wrapping it as native code on NativeScript/React Native.

utp-native is also old, and has accumulated a number of unresolved issues.

All of this means that continuing to try to use utp-native might just mean flogging a dead horse.

An alternative already existed that we had previously used inside Matrix OS: Wireguard. The reason for not using it when we first started is that there were no NodeJS libraries available for it at the time, and we needed something quick to prototype with. Many existing P2P applications have been built on top of the UTP protocol, especially in the NodeJS ecosystem, so that's basically where we started. Even then, we went on a journey trying to use a raw JS UTP library that didn't work before eventually arriving at utp-native, and still had to adapt it in our network domain.

Trying to use WG would be a lot of work, however, and there are many things we have to consider to make it work.

Wireguard has its own issues. It is of course a C/C++ codebase as well. Originally it was made for Linux only; now it is available both inside and outside the Linux kernel. However, for an application like PK, wireguard would have to be a userspace library. The great thing is that all of this is now available: https://github.com/cloudflare/boringtun. With boringtun, it is claimed that it works on all major desktops and android/ios, and it's all userspace. It's a Rust library exposing a C interface that can be wrapped as a native module in JS (just like how we use utp-native and leveldown). It is however NEW and so may have a bunch of bugs: https://github.com/cloudflare/boringtun/issues

An additional issue is that Wireguard doesn't use X.509 certificates. It would completely replace the MTLS portion of the codebase. This is fine, as we can always derive subkeys from the root key for WG utilisation. We would, however, need to understand how to deal with certificate verification, given that we use a cert chain when rotating root keys. There is no chain in Wireguard, so any key rotation here would end up breaking connections, unless one were to connect and then verify at a higher level.
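To make the rotation problem concrete, here is a toy model of cert-chain-style verification under key rotation (the names and structure are invented for illustration; this is not Polykey's actual certificate logic). Each rotation signs the new public key with the previous private key, so a peer that trusts the old identity can walk forward to the new one:

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from 'node:crypto';

// One chain link: the new public key, signed by the previous private key.
type Link = { next: KeyObject; sig: Buffer };

// Rotate to a fresh keypair, signing the new public key with the old key.
function rotate(prevPriv: KeyObject): { link: Link; privateKey: KeyObject } {
  const { publicKey, privateKey } = generateKeyPairSync('ed25519');
  const der = publicKey.export({ type: 'spki', format: 'der' });
  return { link: { next: publicKey, sig: sign(null, der, prevPriv) }, privateKey };
}

// Walk the chain from a trusted root; every link must be signed by the
// previously trusted key.
function verifyChain(root: KeyObject, chain: Link[]): boolean {
  let trusted = root;
  for (const { next, sig } of chain) {
    const der = next.export({ type: 'spki', format: 'der' });
    if (!verify(null, der, trusted, sig)) return false;
    trusted = next;
  }
  return true;
}
```

With Wireguard's flat static keys there is nowhere to carry the `sig`, so a rotated peer just looks like a stranger unless this kind of verification happens at a higher layer after connecting.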

As for hole punching, it's possible that it does this automatically, but we would need to investigate its interface for hole punching to see how we would implement our hole punch relay and proxy relay mechanisms #182.

One advantage of using Wireguard is that we are already using it inside MatrixOS, so we can share expertise, knowledge, and tooling between MatrixOS and Polykey. Note, however, that WG in MatrixOS is the in-kernel one, not a userspace one. Our work on hole punching and proxy relaying could then be shared with MatrixOS, which could benefit from it as well.

Another alternative is QUIC. This is now available natively in NodeJS.

Because QUIC is so low level, it seems like a drop-in replacement for the combination of TLS + UTP. One advantage is that this drops the utp-native dependency. However, this doesn't solve how one might use QUIC on Android/iOS, which is the main reason we want to make the switch. But if we are going to do a whole heap of work to make UTP usable on Android/iOS, we might as well spend that work upgrading to a better-supported system.

One huge advantage of QUIC is that we can maintain the usage of TLS that we already use to secure GRPC client TLS #229, and it doesn't involve a different protocol. It seems TLS isn't going anywhere, and wireguard is unlikely to ever be used in general web contexts, which rely on the certificate authority system. There is also a risk that wireguard packets may be blocked by corporate firewalls, unlike QUIC, which is going to look like HTTP3 packets.

This is likely to impact the browser integration where a browser extension acts as a client. We already have problems using GRPC in our RPC layer so that the browser extension can use the same client protocol as our CLI and GUI, and adding in wireguard is not going to help CLI/GUI and browser extension communication unless this gets resolved: cloudflare/boringtun#139. So it does seem choosing wireguard would bifurcate our data transfer layer between agent-to-agent and agent-to-client, which is also not nice.

Integration and migration

With the quic system functional, we can begin the migration to using quic.

There are two parts to this: the server side and the client side. The client side is made up of the nodes domain, with the NodeConnection encapsulating the QUICClient and RPCClient. This should be a reasonable drop-in for the existing systems. The server side is made up of a single QUICServer and an RPCServer with the server manifest.

Additional context

In any case, we are probably going to need to drop down to native code to make sure that we can support all platforms.

Tasks

  • 1. nodes domain needs changes
    • 1. NodeConnection needs to be gutted and replaced with RPCClient and QUICClient usage. Besides this, usage of the NodeConnection is mostly the same.
    • 2. Tests need to be updated
  • 2. Verification logic needs to be transplanted for use with quic.
    • 1. This needs to be tested.
  • 3. Ensure that the proper connection information is provided by the streams from the quic system.
    • 1. This needs to be tested.
  • 4. PolykeyAgent needs to be updated
    • 1. GRPC agent server needs to be replaced with a RPCServer and QUICServer combo
    • 2. Proxy needs to be removed.
  • 5. Agent domain needs to be migrated to using the agnostic RPC code.
    • 1. Tests need to be migrated
  • 6. Old code needs to be removed
    • 1. network domain gutted
    • 2. GRPC domain gutted
    • 3. Remove protobuf? and other package dependencies that are not used anymore.
  • 7. tests!
    • 1. network domain tests need to be removed, any tests still needed should be transplanted.
    • 2. grpc domain tests need to be removed, any tests still needed should be transplanted.
  • [ ] 8. Update relevant handlers with pagination
  • [ ] 9. Update agent handlers to be timed cancellable, implement cancellation.
@CMCDragonkai

Another issue is the forward proxy being used to bridge the gRPC-required TCP port over HTTP. It would be nice for this not to be needed, but that seems to require working out grpc's http2 usage.

@CMCDragonkai

Relevant issue: nodejs/node#38233 (comment)

The way we are using UTP in our networking domain means we will need to send datagrams for hole punching, but still need a stream API to act as a tunnel/proxy.

The networking domain is pretty useful as it can be used for other P2P applications in the future and existing HTTP connections can work there.

@CMCDragonkai

Hyperswarm mentions "distributed hole punching", worth looking into how this is done and whether there are lessons we can apply here: https://hypercore-protocol.org/protocol/#hyperswarm.

Note that they are using Noise, while we are using TLS, and if quic becomes available then we'll stick with TLS.

@CMCDragonkai CMCDragonkai changed the title Attempt using Wireguard or QUIC to replace utp-native for the Data Transfer Layer in the networking domain Attempt using QUIC to replace utp-native for the Data Transfer Layer in the networking domain May 11, 2022
@CMCDragonkai CMCDragonkai changed the title Attempt using QUIC to replace utp-native for the Data Transfer Layer in the networking domain Using QUIC/HTTP3 to replace utp-native for the Data Transfer Layer in the networking domain May 11, 2022
@CMCDragonkai

I've changed this title to talk about QUIC/HTTP3. I don't think we're going down the line of wireguard at all.

It makes sense that most platforms (mobile, desktop, nodejs, web) will eventually support HTTP3/QUIC when it becomes a "web standard".

This means that all platforms will basically have a UDP-native transport.

Currently this is not true though. So this is a long-term standardisation process that will eventually reduce the complexity of our networking stack.

One critical requirement is the ability to send fire-and-forget UDP "messages" as punch packets for NAT-busting. If an HTTP3/QUIC implementation does not provide this ability, it's not really useful to us.
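Concretely, with node's dgram module a punch packet is just a connectionless send; the packet format below (a magic prefix plus a NodeId) is invented here for illustration and is not an actual Polykey wire format:

```typescript
import dgram from 'node:dgram';

// Invented punch-packet format: a magic prefix plus the sender's NodeId,
// sent repeatedly so the NAT opens (and keeps) a mapping for the peer.
const MAGIC = Buffer.from('PUNCH');

function encodePunch(nodeId: Buffer): Buffer {
  return Buffer.concat([MAGIC, nodeId]);
}

function decodePunch(msg: Buffer): Buffer | null {
  if (!msg.subarray(0, MAGIC.length).equals(MAGIC)) return null;
  return msg.subarray(MAGIC.length);
}

// Fire-and-forget: no connection state, just a datagram at the peer's
// publicly observed address/port.
function sendPunch(socket: dgram.Socket, nodeId: Buffer, host: string, port: number): void {
  socket.send(encodePunch(nodeId), port, host);
}
```

The whole point is that `sendPunch` needs no established connection; if a QUIC implementation only exposes stream APIs, we cannot do this step.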

Here's a list of HTTP3/QUIC implementations to keep track of:

@CMCDragonkai

CMCDragonkai commented May 11, 2022

Critical features of the data transfer layer:

  1. The ability to do a fire-and-forget punch-message for the purposes of busting through NAT.
  2. The ability to multiplex separate connections into 1 socket, and therefore that means 1 port. This is necessary for NAT busting. A connection is between node to node. This means the same port/socket is used for multiple connections to different nodes.
  3. The ability to multiplex separate queries and streams on the same connection. This is not the same as multiplexing connections. This allows asynchronous communication on a connection, meaning queries and streams are not blocking each other.
  4. Must be a common web standard that is likely to be implemented by all platforms and therefore not require a C-native library to implement.

Currently our network stack achieves 3 of the 4, missing only 4. The utp-native library gives us messages that implement 1., utp-native can multiplex independent utp connections into 1 socket, satisfying 2., and HTTP2 enables 3.
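Point 2. in code terms: one UDP socket with a connection table keyed by the remote address, so every peer is reached through the same local port. A sketch (invented structure, not our actual network domain):

```typescript
// One socket, many "connections": demultiplex inbound datagrams to
// per-peer connection state keyed by the remote address tuple.
type Conn = { remote: string; received: Buffer[] };

class ConnectionTable {
  private conns = new Map<string, Conn>();

  // The same local socket/port serves every peer; the key is the remote tuple.
  dispatch(address: string, port: number, msg: Buffer): Conn {
    const key = `${address}:${port}`;
    let conn = this.conns.get(key);
    if (conn === undefined) {
      conn = { remote: key, received: [] };
      this.conns.set(key, conn);
    }
    conn.received.push(msg);
    return conn;
  }

  get size(): number {
    return this.conns.size;
  }
}
```

This keying by remote tuple is what keeps the NAT mapping for our single port stable across all peer connections.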

The go version of QUIC has implemented this quic-go/quic-go#1464

@CMCDragonkai CMCDragonkai added the epic Big issue with multiple subissues label May 11, 2022
@CMCDragonkai

Implementation of QUIC in C which could be embedded as a native addon: https://github.com/microsoft/msquic

@CMCDragonkai

CMCDragonkai commented Nov 9, 2022

The ability to send fire and forget messages in QUIC is specified in RFC 9221 https://www.rfc-editor.org/rfc/rfc9221.html.


This is a relatively recent RFC (published March 2022), so it is possible that not all QUIC implementations support it.

I believe this is necessary to use QUIC for NAT hole punching. So we must examine the QUIC implementations for this support.

@CMCDragonkai

This will be useful: https://codeahoy.com/learn/libuv/ch3/

@CMCDragonkai

I wonder: if we were to expose the quiche C functions directly to JS, it may be possible to do the plumbing directly at the NodeJS level, where we just use the dgram module to open the UDP sockets and pass data through. That might actually work.

We would still need to maintain some notion of the "stream" objects. And that would have to be represented in C++ just like how we do it in js-db.
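A rough sketch of that JS-level plumbing, with the native quiche calls stubbed out (the `quicheConnRecv`/`quicheConnSend` names and behaviour are placeholders invented here, not quiche's real API):

```typescript
import dgram from 'node:dgram';

type NativeConn = { buffered: Buffer[] };

// Stubs standing in for native bindings (assumptions for illustration):
// a real binding would feed quiche's connection state machine...
function quicheConnRecv(conn: NativeConn, data: Buffer): void {
  conn.buffered.push(data);
}
// ...and emit fully formed QUIC packets ready to go on the wire.
function quicheConnSend(conn: NativeConn): Buffer | null {
  return conn.buffered.shift() ?? null;
}

// The JS-level plumbing: inbound datagrams go into the connection, and any
// packets the connection wants to emit go back out the same UDP socket.
function plumb(
  conn: NativeConn,
  socket: dgram.Socket,
  remote: { address: string; port: number },
): void {
  socket.on('message', (msg) => {
    quicheConnRecv(conn, msg);
    let out: Buffer | null;
    while ((out = quicheConnSend(conn)) != null) {
      socket.send(out, remote.port, remote.address);
    }
  });
}
```

All the socket handling stays in JS; only the QUIC state machine lives behind the native boundary.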

@CMCDragonkai

CMCDragonkai commented Nov 17, 2022

I had a quick look at WebRTC as well just to compare.

One of the things that is incoming is the concept of "web transport".

This is meant to replace WebSockets and relies on QUIC.

Browsers do not expose raw QUIC sockets, and are unlikely to do so in the near future.

However even for Web Transport, this has not been developed to be capable of P2P communication, because of the lack of NAT traversal techniques.

WebRTC is the only thing that is browser-native and capable of NAT-traversal and P2P communication with other browsers too.

However, WebRTC does not expose the underlying NAT traversal mechanism. It is done behind the scenes, with explicit support for ICE, STUN and TURN servers.

For example the RTCPeerConnection https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/RTCPeerConnection constructor explicitly takes configuration for ice servers https://developer.mozilla.org/en-US/docs/Web/API/RTCIceServer.

And its encryption system is not the same as TLS.

This basically means WebRTC is its own enclosed system. To make use of WebRTC (and thus be browser compatible) would mean to completely change all of our P2P capabilities to be based around WebRTC.

Right now, we don't really need browsers to be able to talk to agents in a P2P way, it's sufficient for browsers to simply talk to the PK agents via a client-server protocol.

Assuming a transport-agnostic RPC system that doesn't use node-native libraries, any reliable full-duplex stream is sufficient.

This means websockets and web transport are both suitable for creating this full-duplex stream to the PK agent, on which the RPC can then rely. (See also this crazy hack using the HTTP2 fetch API to construct a full-duplex stream: https://github.com/davedoesdev/browser-http2-duplex).

What does this mean for #166? Well, this would mean that GRPC is dropped, and instead a transport-agnostic RPC on top of web sockets would be used. Third parties attempting to communicate with the PK agent would have to perform RPC calls on top of websockets. This is not as nice as being able to use just a RESTful HTTP request/response API...

It's also possible to support both HTTP request/response API and also web sockets for the RPC system.

I imagine that something like JSON or BSON or CBOR will be needed for serialisation/deserialisation.

There is also the question of muxing/demuxing. Any custom muxing/demuxing at the RPC level is going to be a custom thing. Every single HTTP request/response is already muxed/demuxed. If each RPC call needs to be muxed/demuxed this does mean using a custom RPC system to talk to the PK agent. It would be ideal that the transport agnostic RPC system doesn't do its own muxing/demuxing, but instead relies on something underlying to do muxing/demuxing. The problem is... different transports support or don't support muxing/demuxing.

  1. HTTP - muxing is embedded with separate HTTP requests/responses for every RPC call, this is not efficient with HTTP1.1, but much more efficient in HTTP2 and HTTP3
  2. Web Sockets - there is no muxing, you have to build it in, unless you are creating a separate web socket connection for every RPC call, doing so makes it similar to HTTP 1.1
  3. QUIC - muxing is embedded by using separate streams per connection (1 connection can have multiple streams)

So it seems that since 3. would be used for P2P communication, then we don't really need muxing in our transport agnostic RPC. We can just rely on QUIC streams.

But for the PK client service using web sockets, we would either need a muxing/demuxing system attached to the RPC system, or we just end up creating a new websocket connection for each RPC call.

I think... we should just open 1 websocket connection per RPC call for the PK client service. This saves us some time so we don't need to build a muxing/demuxing system. It also makes it simpler for third party applications to integrate, they don't need the muxing/demuxing system.
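For comparison, the minimal id-tagged framing such a muxing/demuxing system would have needed might look like this (wire format invented for illustration: 4-byte stream id, 4-byte length, then payload):

```typescript
// Encode one RPC message onto a shared stream, tagged with its stream id so
// the receiver can demultiplex interleaved calls.
function encodeFrame(streamId: number, payload: Buffer): Buffer {
  const header = Buffer.alloc(8);
  header.writeUInt32BE(streamId, 0);
  header.writeUInt32BE(payload.length, 4);
  return Buffer.concat([header, payload]);
}

// Decode a frame back into its stream id and payload.
function decodeFrame(frame: Buffer): { streamId: number; payload: Buffer } {
  const streamId = frame.readUInt32BE(0);
  const length = frame.readUInt32BE(4);
  return { streamId, payload: frame.subarray(8, 8 + length) };
}
```

Opening one websocket per call lets us delete this entire layer, at the cost of one connection handshake per call.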


https://blog.codavel.com/http2-multiplexing

Now, I'm biasing the PK client service towards web sockets or HTTP because the assumption is that clients can only use these 2 basic technologies. HTTP is pretty much available everywhere in the standard libraries. Web socket connections are somewhat less available, usually requiring a third party library. But it's also possible that the third party supports QUIC. In that case, we could also just accept QUIC connections to the client service. This QUIC interface would be entirely client-server oriented, though; no NAT traversal would be involved.

In the future, either we make our P2P network compatible with webrtc, or, if web transport exposes low-level NAT traversal techniques, it would become possible for the networking portion of PK to run entirely inside browsers. This is unlikely to happen for a long time, because other parts of PK (crypto, rocksdb) are native as well.

[image: libp2p cross-browser/non-browser P2P diagram]

The above is libp2p's vision for cross-browser/non-browser P2P. Last time I checked libp2p's codebase which was about 4-5 years ago, it was way too messy to make use of. Maybe things are better now.

@CMCDragonkai

CMCDragonkai commented Nov 17, 2022

It turns out that in some cases, websockets are done over HTTP2. If that's true, then opening new websocket connections is as cheap as opening new streams on HTTP2. That would mean, again, that the RPC does not require its own muxer/demuxer.

https://www.rfc-editor.org/rfc/rfc8441.html

@CMCDragonkai

I found another Rust-to-NodeJS bindings library called napi-rs. It is more recent than neon, and it was aligned specifically with the Node-API bindings, whereas neon has been around longer and only recently migrated to the Node-API backend.

Going over the docs, napi-rs seems much more comprehensive than neon, which is funny given that neon is the older project.

Also while I've been experimenting with the neon and quiche integration, I've come up with this kind of code:

// Dependencies assumed: the neon and quiche crates.
use neon::prelude::*;
use std::cell::RefCell;

// Newtype over quiche::Config so we can implement neon's `Finalize` on it.
struct Config(quiche::Config);

// Boxed external reference handed out to JS; `RefCell` provides interior
// mutability, since JS callers only hold a shared reference.
type BoxedConfig = JsBox<RefCell<Config>>;

impl Finalize for Config {}

fn config_new(mut cx: FunctionContext) -> JsResult<BoxedConfig> {
  let config = quiche::Config::new(quiche::PROTOCOL_VERSION).or_else(
    |err| cx.throw_error(err.to_string())
  )?;
  let config = RefCell::new(Config(config));
  return Ok(cx.boxed(config));
}

fn config_verify_peer(mut cx: FunctionContext) -> JsResult<JsUndefined> {
  let config = cx.argument::<BoxedConfig>(0)?;
  let verify = cx.argument::<JsBoolean>(1)?.value(&mut cx);
  let mut config = config.borrow_mut();
  config.0.verify_peer(verify);
  return Ok(cx.undefined());
}

fn config_set_max_idle_timeout(mut cx: FunctionContext) -> JsResult<JsUndefined> {
  let config = cx.argument::<BoxedConfig>(0)?;
  let idle_timeout = cx.argument::<JsNumber>(1)?.value(&mut cx) as u64;
  let mut config = config.borrow_mut();
  config.0.set_max_idle_timeout(idle_timeout);
  return Ok(cx.undefined());
}

#[neon::main]
fn main(mut cx: ModuleContext) -> NeonResult<()> {
  cx.export_function("configNew", config_new)?;
  cx.export_function("configVerifyPeer", config_verify_peer)?;
  cx.export_function("configSetMaxIdleTimeout", config_set_max_idle_timeout)?;
  return Ok(());
}

What you can see here is a "newtype" abstraction that wraps the quiche::Config type. This is necessary because we are going to expose this Rust-native data structure to JS as an external reference via JsBox. This ends up requiring us to impl Finalize for Config {} so that there is a cleanup operation when the object eventually gets garbage collected.

This is actually far more succinct in comparison to the C++ code in js-db, which required manual usage of napi_create_reference... but part of the reason is that I was using C-biased libraries like napi-macros instead of the C++ framework node-addon-api. So with this Rust code there's much more magic going on.

Afterwards we can define several top-level functions. We still can't create a "Rust class" instance that is made available to JS directly. It seems neon should have this ability, but it's unclear how to achieve it; the examples in https://github.com/neon-bindings/examples all use variants of the above style.
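On the JS side, the flat function exports can at least be wrapped back into a class, so callers still get an object-oriented surface even without neon exposing Rust classes directly. A sketch with the native module stubbed out (the real addon would be loaded with `require('./index.node')`; the stub's behaviour here is invented for illustration):

```typescript
// Stub standing in for the compiled neon addon's flat exports.
type NativeConfig = { verifyPeer?: boolean; maxIdleTimeout?: number };
const native = {
  configNew: (): NativeConfig => ({}),
  configVerifyPeer: (c: NativeConfig, v: boolean): void => { c.verifyPeer = v; },
  configSetMaxIdleTimeout: (c: NativeConfig, t: number): void => { c.maxIdleTimeout = t; },
};

// JS-side class wrapping the opaque boxed handle plus the flat functions.
class Config {
  readonly handle: NativeConfig = native.configNew();

  verifyPeer(verify: boolean): void {
    native.configVerifyPeer(this.handle, verify);
  }

  setMaxIdleTimeout(ms: number): void {
    native.configSetMaxIdleTimeout(this.handle, ms);
  }
}
```

This is the bridging boilerplate that napi-rs's class support would let us delete.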

I just checked the napi-rs docs, and they have an actual section addressing how to create a "Rust class object" that is exposed to JS directly; then we can just construct objects in JS without further bridging. I'm going to try napi-rs now.

@CMCDragonkai

Going to shift the implementation discussion to MatrixAI/js-quic#1.

@tegefaulkes

I've updated the task list.

tegefaulkes added a commit that referenced this issue May 25, 2023
* Related #512
* Related #495
* Related #234

[ci skip]
tegefaulkes added a commit that referenced this issue Jun 5, 2023
* Related #512
* Related #495
* Related #234

[ci skip]
@CMCDragonkai

CMCDragonkai commented Jun 26, 2023

What is the status of this task list? I see that this will be closed by #525.

Note that I have closed MatrixAI/js-quic#1 because all subtasks were done and a prototype prerelease was provided, which is being integrated in #525. However, new refactoring is taking place in MatrixAI/js-quic#26.

@tegefaulkes

tegefaulkes commented Jun 26, 2023

#525 has the most up to date task list. Most of the conversion has been done except for a few points.

  1. Blocked on custom TLS verification and secured event changes in js-quic.
  2. Reverse connections need to be handled and tracked in the NodeConnectionManager. Also blocked by 1.
  3. Monitors can be applied to the object map locking in the NodeConnectionManager; this may simplify some logic.
  4. Tests need to be fixed and updated, and general cleanup done.

@CMCDragonkai

CMCDragonkai commented Jun 26, 2023

Reverse connections need to be handled and tracked in the NodeConnectionManager. Also blocked by

Blocked by what?

@tegefaulkes

Sorry, typo; blocked by the custom TLS verification changes in point 1.

tegefaulkes added a commit that referenced this issue Jul 7, 2023
* Related #512
* Related #495
* Related #234

[ci skip]
tegefaulkes added a commit that referenced this issue Jul 10, 2023
* Related #512
* Related #495
* Related #234

[ci skip]
@CMCDragonkai

Remember to tick off the tasks when done too.

@tegefaulkes

The task list is being updated in the agent migration PR. #525

I'll mirror the progress here.

tegefaulkes added a commit that referenced this issue Jul 20, 2023
* Related #512
* Related #495
* Related #234

[ci skip]
@CMCDragonkai

@tegefaulkes can you update the task list here too:

[image: screenshot of the task list]

@CMCDragonkai

Oh I see you've ticked them already, but 8 and 9 are not done; those are for stage 2.

@CMCDragonkai

Great! We just need to integrate pagination and timed cancellability into all handlers.
