Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Websocket #755

Closed
krizhanovsky opened this issue Jun 18, 2017 · 6 comments
Closed

Websocket #755

krizhanovsky opened this issue Jun 18, 2017 · 6 comments

Comments

@krizhanovsky
Copy link
Contributor

krizhanovsky commented Jun 18, 2017

Need to support Websocket (RFCs 6455), both the ws:// and wss:// schemes.

HTTP/1

Parser extensions

  • Upgrade: websocket header RFC 7230 6.7 must be implemented. We support only websocket, so values like h2c or h2c, websocket should be just ignored.
  • Upgrade directive for Connection header. This must be done for requests (HTTP/1 and HTTP/2) and responses. It seems we shouldn't do anything if a server just advertises Upgrade w/o preceding client Upgrade request.

Since Upgrade and Connection is a hop by hop header, if we ignore their values, then a backend receives a request w/o the headers.

If all the headers are good and we can upgrade to websocket, then a new websocket flag for the request must be set and tfw_h1_adjust_req() must regenerate the headers. Note that Sec-WebSocket-* headers aren't hop-by-hop and shouldn't be stripped.

All the extensions must be tested for HTTP/1 in test_http_parser.c

Scheme parsing

Req_Uri state at the moment handles only http:// scheme, which must be fixed to allow https:// and extend with the new ws:// and wss://. Since we'll need to match 4 strings, it makes sense to introduce new states with switch on prefixes (see as headers parsing is done). It seems that browsers never (or almost) never send full URLs, so the states should be cold and unlikely.

HTTP/2

Websockets for HTTP/2 are described in 8441 and we have a very special case of proxying HTTP/2 CONNECT to HTTP/1 websocket.

Nginx just proxies CONNECT method. I don't understand how reliable websockets in these setups, but there are issues in the Internet with websockets over HTTP/2 proxying by Nginx, e.g. RocketChat/Rocket.Chat#15028 . The problem is that RFC 8441 does not require an HTTP/2 client to set Sec-WebSocket-Key header, but maybe the most implementations actually do set the header.

HAproxy implements the header computation for this case (see h1_search_websocket_key()). I believe we also should generate the header from tfw_h2_adjust_req() plus to generation of Upgrade and Connection headers as in tfw_h1_adjust_req().

Parser extensions

  1. send SETTINGS_ENABLE_CONNECT_PROTOCOL to a client
  2. CONNECT method in the h2 request parser. The parser must take care about restrictions described in RFC 7540 8.3: the ":scheme" and ":path" pseudo-header fields MUST be omitted and the ":authority" pseudo-header field contains the host and port to connect to.
  3. parse :protocol pseudo header and allow websocket value only. It's not in the HPACK static table, but it seems we need to support it for the dynamic table compression from the client side
  4. Extend the Req_HdrPsSchemeV state with http scheme for ws

Please add the h2 tests to t/unit/test_http_parser.c.

Backend connections

WebSockets must use separate TCP backend connection (i.e. a connection shall not be used to send send websocket frames from other client and/or HTTP messages from the same or other client). The same backend connection can be upgraded to websocket, meaning that a new TCP connection must be established (provisioned) with the backend to satisfy upcoming HTTP client requests. In this sense the issue depends/linked with #710 (dynamic server connections).

Once we receive 101 response from the server, we do

  1. set client TfwConn->proto.type to a new types Conn_WsClnt and Conn_WssClnt and server's type to Conn_WsSrv and Conn_WssSrv, just like appropriate HTTP and HTTPS connections
  2. move the current server connection out of the scheduler and initialize the reconnection procedure (as the connection was terminated). The websocket connection must be properly closed on Tempesta shutdown
  3. WS and WSS network handling must copy the similar handling for HTTP and HTTPS, but WS protocol is just a plain proxy, which simply retransmits all the data to the linked peer socket. Should be implemented in new websocket.[ch] files.
  4. point 3 implies that the step 1 must also reset TfwConnHooks for the connection to the appropriate websocket hooks

Since we need to introduce a new connection type, it makes sense to port 2eae1da to not to bother with GFSM here.

Rate limits

Client and server inactivity timeouts as well as upgraded connection timeout must be configurable. Websocket server is a frequent target for DDoS attacks, so ratelimits for websocket client connections client_ws_timeout must be implemented. We can handle the server timeouts just as normal TCP timeouts.

General considerations

Understanding the protocol

There is no need to decode or analyse WebSocket binary frames somehow, just pass them through. In future, probably we'll add some security checks, but not in the issue.

TCP proxying

It's recommended to use wss://, WebSocket over TLS, so we should just retransmit WebSocket application data messages after handshake. Because of TLS we can not use IPVS, and basically there is not much logic for IPVS.

Testing

The code probably can be tested by Node.js using the Websocket module. See Nginx's test as an example.

Functional tests for the issue are in #881, which depends on migration to Python3. If the migration isn't here in time, then just a simple Python3 scripts using the Python websockets must be developed. The scripts will later be used for #881.

References

Server-Sent Events, WebSockets, and HTTP by Mark Nottingham

@krizhanovsky krizhanovsky added this to the 1.0 WebOS milestone Jun 18, 2017
@vankoven
Copy link
Contributor

vankoven commented Jun 23, 2017

Note, that Upgrade is hop-by-hop header and it won't be forwarded to backend server. So special processing of the header is required in the http parser.

* TODO: RFC 6455 WebSocket Protocol

Other proxy servers require special directives in configuration file to allow passing Upgrade header. E.g. for nginx: http://nginx.org/en/docs/http/websocket.html

@krizhanovsky krizhanovsky mentioned this issue Jan 8, 2018
7 tasks
@krizhanovsky krizhanovsky modified the milestones: backlog, 0.7 HTTP/2 Jan 9, 2018
@vankoven
Copy link
Contributor

vankoven commented Apr 3, 2018

Protocol upgrade request transforms connection into tunnel which doesn't use HTTP messages. So passing the request in the existing server connection will make it unusable and broken:

  • Tempesta will forward other HTTP requests into this tunnel
  • When server will respond with something, it will be treated as HTTP message parsing error, connection will be closed and client wouldn't receive it's data.

Currently protocol upgrade request will be forwarded to backend server without upgrade query. See comment above. Seems that a new server connection is required to serve the request (#710). The connection is not suitable to be used in schedulers.

@krizhanovsky
Copy link
Contributor Author

Many modern web sites use websockets, so severity is crucial

@ttaym
Copy link
Contributor

ttaym commented Feb 21, 2022

For now i do no observe any dependency of the issue on CONNECT verb support. It support only needed for explicit proxies. And if Tempesta FW honours it this effectively prevents Tempesta FW from seeing websocket communication of client with backend server.

ttaym added a commit to ttaym/tempesta that referenced this issue Feb 21, 2022
Contributes to tempesta-tech#755

Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 21, 2022
Almost literaly follow ak patch from 2eae1da

Replace GFSM calls with direct calls to TLS and HTTP handlers
 on low level networking layers.

GFSM was designed to build graphs of network protocols FSMs (this
design was inspired by FreeBSD netgraph). However, during the years
neither we nor external users have any requirements to introduce
any modules which use GFSM to hook TLS or HTTP entry code. There
are only 2 users of the mechanism for TLS and HTTP for now:
1. TLS -> HTTP protocols handling
2. HTTP limits (the frang module)

This patch replaces GFSM calls with direct calls to
tfw_http_req_process(), tfw_tls_msg_process() and frang_tls_handler()
in following paths:
1. sync sockets -> TLS
2. sync sockets -> HTTP
3. TLS -> HTTP
4. TLS -> Frang

As the result the function tfw_connection_recv() was eliminated.
Now the code is simpler and has lower overhead.

We still might need GFSM for the user-space requests handling (tempesta-tech#77)
and Tempesta Language (tempesta-tech#102).

Contributes to tempesta-tech#755

Based-on-patch-by: Alexander K <ak@tempesta-tech.com>
Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 22, 2022
Almost literaly follow ak patch from 2eae1da

Replace GFSM calls with direct calls to TLS and HTTP handlers
 on low level networking layers.

GFSM was designed to build graphs of network protocols FSMs (this
design was inspired by FreeBSD netgraph). However, during the years
neither we nor external users have any requirements to introduce
any modules which use GFSM to hook TLS or HTTP entry code. There
are only 2 users of the mechanism for TLS and HTTP for now:
1. TLS -> HTTP protocols handling
2. HTTP limits (the frang module)

This patch replaces GFSM calls with direct calls to
tfw_http_req_process(), tfw_tls_msg_process() and frang_tls_handler()
in following paths:
1. sync sockets -> TLS
2. sync sockets -> HTTP
3. TLS -> HTTP
4. TLS -> Frang

As the result the function tfw_connection_recv() was eliminated.
Now the code is simpler and has lower overhead.

We still might need GFSM for the user-space requests handling (tempesta-tech#77)
and Tempesta Language (tempesta-tech#102).

Contributes to tempesta-tech#755

Based-on-patch-by: Alexander K <ak@tempesta-tech.com>
Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
@ttaym ttaym linked a pull request Feb 22, 2022 that will close this issue
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 22, 2022
Almost literaly follow ak patch from 2eae1da

Replace GFSM calls with direct calls to TLS and HTTP handlers
 on low level networking layers.

GFSM was designed to build graphs of network protocols FSMs (this
design was inspired by FreeBSD netgraph). However, during the years
neither we nor external users have any requirements to introduce
any modules which use GFSM to hook TLS or HTTP entry code. There
are only 2 users of the mechanism for TLS and HTTP for now:
1. TLS -> HTTP protocols handling
2. HTTP limits (the frang module)

This patch replaces GFSM calls with direct calls to
tfw_http_req_process(), tfw_tls_msg_process() and frang_tls_handler()
in following paths:
1. sync sockets -> TLS
2. sync sockets -> HTTP
3. TLS -> HTTP
4. TLS -> Frang

As the result the function tfw_connection_recv() was eliminated.
Now the code is simpler and has lower overhead.

We still might need GFSM for the user-space requests handling (tempesta-tech#77)
and Tempesta Language (tempesta-tech#102).

Contributes to tempesta-tech#755

Based-on-patch-by: Alexander K <ak@tempesta-tech.com>
Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 22, 2022
Almost literaly follow ak patch from 2eae1da

Replace GFSM calls with direct calls to TLS and HTTP handlers
 on low level networking layers.

GFSM was designed to build graphs of network protocols FSMs (this
design was inspired by FreeBSD netgraph). However, during the years
neither we nor external users have any requirements to introduce
any modules which use GFSM to hook TLS or HTTP entry code. There
are only 2 users of the mechanism for TLS and HTTP for now:
1. TLS -> HTTP protocols handling
2. HTTP limits (the frang module)

This patch replaces GFSM calls with direct calls to
tfw_http_req_process(), tfw_tls_msg_process() and frang_tls_handler()
in following paths:
1. sync sockets -> TLS
2. sync sockets -> HTTP
3. TLS -> HTTP
4. TLS -> Frang

As the result the function tfw_connection_recv() was eliminated.
Now the code is simpler and has lower overhead.

We still might need GFSM for the user-space requests handling (tempesta-tech#77)
and Tempesta Language (tempesta-tech#102).

Contributes to tempesta-tech#755

Based-on-patch-by: Alexander K <ak@tempesta-tech.com>
Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 22, 2022
Almost literaly follow ak patch from 2eae1da

Replace GFSM calls with direct calls to TLS and HTTP handlers
 on low level networking layers.

GFSM was designed to build graphs of network protocols FSMs (this
design was inspired by FreeBSD netgraph). However, during the years
neither we nor external users have any requirements to introduce
any modules which use GFSM to hook TLS or HTTP entry code. There
are only 2 users of the mechanism for TLS and HTTP for now:
1. TLS -> HTTP protocols handling
2. HTTP limits (the frang module)

This patch replaces GFSM calls with direct calls to
tfw_http_req_process(), tfw_tls_msg_process() and frang_tls_handler()
in following paths:
1. sync sockets -> TLS
2. sync sockets -> HTTP
3. TLS -> HTTP
4. TLS -> Frang

As the result the function tfw_connection_recv() was eliminated.
Now the code is simpler and has lower overhead.

We still might need GFSM for the user-space requests handling (tempesta-tech#77)
and Tempesta Language (tempesta-tech#102).

Contributes to tempesta-tech#755

Based-on-patch-by: Alexander K <ak@tempesta-tech.com>
Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit that referenced this issue Feb 24, 2022
Almost literaly follow ak patch from 2eae1da

Replace GFSM calls with direct calls to TLS and HTTP handlers
 on low level networking layers.

GFSM was designed to build graphs of network protocols FSMs (this
design was inspired by FreeBSD netgraph). However, during the years
neither we nor external users have any requirements to introduce
any modules which use GFSM to hook TLS or HTTP entry code. There
are only 2 users of the mechanism for TLS and HTTP for now:
1. TLS -> HTTP protocols handling
2. HTTP limits (the frang module)

This patch replaces GFSM calls with direct calls to
tfw_http_req_process(), tfw_tls_msg_process() and frang_tls_handler()
in following paths:
1. sync sockets -> TLS
2. sync sockets -> HTTP
3. TLS -> HTTP
4. TLS -> Frang

As the result the function tfw_connection_recv() was eliminated.
Now the code is simpler and has lower overhead.

We still might need GFSM for the user-space requests handling (#77)
and Tempesta Language (#102).

Contributes to #755

Based-on-patch-by: Alexander K <ak@tempesta-tech.com>
Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit to ttaym/tempesta that referenced this issue Feb 24, 2022
Contributes to tempesta-tech#755

Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit that referenced this issue Feb 25, 2022
Contributes to #755

Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit to ttaym/tempesta that referenced this issue Mar 1, 2022
Contributes to tempesta-tech#755

Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit to ttaym/tempesta that referenced this issue Mar 4, 2022
Contributes to tempesta-tech#755

Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit to ttaym/tempesta that referenced this issue Mar 5, 2022
Contributes to tempesta-tech#755

Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
ttaym added a commit that referenced this issue Mar 16, 2022
Contributes to #755

Signed-off-by: Aleksey Mikhaylov <aym@tempesta-tech.com>
@krizhanovsky
Copy link
Contributor Author

The backend can be tested with Node.js's wscat (see https://stackoverflow.com/a/47861907) as wscat -P -l 8080

@krizhanovsky krizhanovsky self-assigned this Jun 5, 2022
@krizhanovsky
Copy link
Contributor Author

It seems Webscokets over HTTP/2 aren't used:

Support of the feature in non-browser websocket implementations is crucial since we need automated tests for the feature. Otherwise we might find at some point that the feature was silently broken. #1625 is quite sophisticated and changes the logic of current HTTP messages management, so we might introduce additional bugs for the feature, which isn't used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants