HTTP/2 #309

Closed
7 tasks done
krizhanovsky opened this issue Oct 11, 2015 · 6 comments

Comments

@krizhanovsky
Contributor

krizhanovsky commented Oct 11, 2015

The initial implementation of HTTP/2 was done in #1176. The code needs a review and further development. I suggest creating a new pull request instead of continuing to develop #1176.

Consider Cache Digests as important for Ideal HTTP Performance.

There is an HTTP/2 test suite which should be used to test the implementation.

The scope of this issue is just a robust and performant architecture for HTTP/2 (with #687 in mind: we should not let similar issues pass into master, and there will probably be cheap (quick) opportunities to fix some of the #687 problems). All small extensions, protocol features and bugs should be reported in small separate tasks.

A minimal requirement for the issue to be done is that an HTTP/2 capable browser must be able to load tempesta-tech.com via HTTP/2.

QUIC foundations

Some HTTP/2 mechanisms are moved to QUIC (#724): streams, frames, compression. So please make the required TODO comments and necessary code adjustments for further extension to QUIC. In particular, at least these docs affect the current HTTP/2 design:

  • QUIC transport defines main primitives of QUIC.

  • QPACK for HPACK. Avoiding Head-of-Line Blocking is probably the most influential feature of QPACK; it affects the synchronization design of HTTP/2 streams and maybe more connection-wide attributes.

  • HTTP/3 for the whole new code. Chapters 4 and A.2 discuss similarities and differences in the framing layers.

Notes

We should not support Huffman encoding, see #1176 (comment). Only Huffman decoding is required now. However, in #1125 we'll generate HTTP/2 requests, and Huffman encoding does make a difference for HTTP headers. So I'd keep the current Huffman encoding in the source code, but leave it unused until #1125.

Unfortunately we cannot pass Huffman-encoded strings to the HTTP parser, because upper and lower case characters are encoded with different numbers of bits and there is no clear transition between them (like 0x20 for base ASCII). So we have to decode HTTP headers before passing them to the HTTP parser.

HTTP/2 amplification threat

HTTP/2 HPACK introduces an HTTP/2 amplification threat. Protection against the attack is left for #488.

Framing

HTTP/2 and QUIC use frames that are very close, but still different in many details (see QUIC transport chapter 12 and HTTP/3 chapters 4 and A.2), so it seems we cannot just reuse the code, but some logic is probably reusable.

Anyway the logic is relatively complex and defines a logical layer, so I propose to develop it in a new source code file (logic module) http_frame.c, which can later be split into HTTP/2- and HTTP/3-specific code.

Since the max frame size is 16MB, we should call the decoder layer on each frame chunk, except when there is not enough payload for processing.

The framing layer determines the type of a frame: service, headers, or body. Header frame payload should go through the decoder layer, while body (data) frame payload should skip the decoder and go to the HTTP parser directly. The parser code must be split to handle body and headers separately. Service frames must manage the current stream state and do not involve calling the HTTP parser.
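To make the frame type dispatch above more concrete, here is a minimal user-space sketch of parsing the 9-octet HTTP/2 frame header (RFC 7540, section 4.1) and classifying the frame. The names h2_frame_hdr, h2_parse_frame_hdr() and h2_classify() are hypothetical and are not taken from the Tempesta FW sources; a real implementation would also have to accept a header split across skbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Frame types from RFC 7540, section 6. */
enum h2_frame_type {
	H2_DATA = 0x0, H2_HEADERS = 0x1, H2_PRIORITY = 0x2,
	H2_RST_STREAM = 0x3, H2_SETTINGS = 0x4, H2_PUSH_PROMISE = 0x5,
	H2_PING = 0x6, H2_GOAWAY = 0x7, H2_WINDOW_UPDATE = 0x8,
	H2_CONTINUATION = 0x9
};

/* Hypothetical classification, used only in this sketch. */
enum h2_frame_class { H2_CLS_SERVICE, H2_CLS_HEADERS, H2_CLS_BODY };

struct h2_frame_hdr {
	uint32_t length;    /* 24-bit payload length */
	uint8_t  type;
	uint8_t  flags;
	uint32_t stream_id; /* 31 bits, the reserved bit is dropped */
};

#define H2_FRAME_HDR_SZ 9

/* Returns 0 on success, -1 if fewer than 9 octets are available. */
int h2_parse_frame_hdr(const uint8_t *buf, size_t len, struct h2_frame_hdr *h)
{
	assert(buf && h);
	if (len < H2_FRAME_HDR_SZ)
		return -1;
	h->length = ((uint32_t)buf[0] << 16) | ((uint32_t)buf[1] << 8) | buf[2];
	h->type = buf[3];
	h->flags = buf[4];
	h->stream_id = (((uint32_t)buf[5] << 24) | ((uint32_t)buf[6] << 16)
			| ((uint32_t)buf[7] << 8) | buf[8]) & 0x7fffffffU;
	return 0;
}

/* Header payload goes to the decoder, body to the parser, the rest is service. */
enum h2_frame_class h2_classify(uint8_t type)
{
	switch (type) {
	case H2_HEADERS:
	case H2_CONTINUATION:
		return H2_CLS_HEADERS;
	case H2_DATA:
		return H2_CLS_BODY;
	default:
		return H2_CLS_SERVICE;
	}
}
```

The caller would parse the header once per frame and then route each following payload chunk according to the class, without consulting the HTTP parser state.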

Streams

An HTTP/2 or HTTP/3 stream is essentially an FSM with defined state transitions. A stream and frame type must be passed to tfw_http_req_process(). If we define a stream with a TfwStream data structure, then TfwStream->parser should replace the current TfwConn->parser.

I propose to use 128 for SETTINGS_MAX_CONCURRENT_STREAMS and handle the streams in a binary heap storing identifiers and pointers to dynamically allocated stream descriptors. Maybe there is a better data structure.
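As a rough illustration of the stream FSM, the sketch below models only the receive side of a client-initiated stream for HEADERS, DATA and RST_STREAM (RFC 7540, section 5.1). It is a simplification under stated assumptions: the reserved and half-closed (local) states, CONTINUATION, PRIORITY and WINDOW_UPDATE are left out, and h2_stream and h2_stream_recv() are hypothetical names, not the TfwStream API.

```c
#include <assert.h>
#include <stdint.h>

enum h2_stream_state {
	H2_ST_IDLE,
	H2_ST_OPEN,
	H2_ST_HALF_CLOSED_REMOTE,
	H2_ST_CLOSED
};

#define H2_F_END_STREAM 0x1
enum { H2_T_DATA = 0x0, H2_T_HEADERS = 0x1, H2_T_RST_STREAM = 0x3 };

/* Hypothetical stream descriptor for this sketch. */
struct h2_stream {
	uint32_t             id;
	enum h2_stream_state state;
};

/*
 * Apply a frame received from the client to the stream state.
 * Returns 0 if the frame is acceptable, -1 on a protocol error.
 */
int h2_stream_recv(struct h2_stream *s, uint8_t type, uint8_t flags)
{
	int end = flags & H2_F_END_STREAM;

	assert(s);
	switch (s->state) {
	case H2_ST_IDLE:
		/* Only HEADERS may open a stream. */
		if (type != H2_T_HEADERS)
			return -1;
		s->state = end ? H2_ST_HALF_CLOSED_REMOTE : H2_ST_OPEN;
		return 0;
	case H2_ST_OPEN:
		if (type == H2_T_RST_STREAM) {
			s->state = H2_ST_CLOSED;
			return 0;
		}
		if (type != H2_T_DATA && type != H2_T_HEADERS)
			return -1;
		if (end)
			s->state = H2_ST_HALF_CLOSED_REMOTE;
		return 0;
	case H2_ST_HALF_CLOSED_REMOTE:
		/* The peer has finished sending; only a reset is accepted here. */
		if (type == H2_T_RST_STREAM) {
			s->state = H2_ST_CLOSED;
			return 0;
		}
		return -1;
	default:
		return -1;
	}
}
```

Because the check needs only the frame header, it can run before any payload is decoded, which keeps invalid frames away from the decoder and the parser.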

Flow control should be left until #498. For now we should announce the largest possible window and ignore the client window, i.e. just send the whole response to a client. Keeping the response in Tempesta memory and sending it in small chunks may lead to a significant security flaw. It's bad to ignore the client window, so #498 was marked as crucial and moved to the next milestone.

Streams prioritization and dependency is the subject for #1196.

All in all the stream logic implies relatively complex logic about flow control, prioritization and so on, so I propose also to move it to a new http_stream.c. Control stream(s) logic, in sense of frames mentioned below, must be handled also here by calling appropriate HTTP, framing, and connection routines. Note stream operation in HTTP/3 described in https://tools.ietf.org/html/draft-ietf-quic-transport-18#section-2 and https://tools.ietf.org/html/draft-ietf-quic-http-18#section-3 as well.

Only HEADERS, DATA, CONTINUATION, RST_STREAM, PING, GOAWAY frames must be implemented in this issue; SETTINGS probably should also be supported for some cases. A client never sends a PUSH_PROMISE frame, so its implementation is left for #1194. The WINDOW_UPDATE frame is left for #498.

HTTP/2 <-> HTTP/1.1 message transformation

I propose to extend TfwHttpHdrTbl so that it can contain plain HTTP/1.1 header strings as well as indexes to encoded HTTP/2 headers. The interfaces to the current table should be generalized in such a way that all the logic can get the necessary information about a header regardless of whether it's HPACK encoded or in plain TfwStr format.

However, in TfwHttpHdrTbl we can handle only Huffman-decoded headers.

Note that HTTP/2 defines message length in other ways than HTTP/1.1, so the current chunked, Content-Length and connection close logic must be adjusted.

Decoder layer

The HTTP parser function must be called from a new decoder layer. The decoder layer must be aware of the HTTP parser states and feed the current decoded chunk to the parser. The decoder should be responsible for:

  1. HPACK index decoding (only part (value) of an HTTP header may be fed to the parser)
  2. Huffman decoding
  3. HTTP normalization (#2).
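HPACK index decoding starts with the prefix-coded integer representation from RFC 7541, section 5.1. The sketch below shows that step only. hpack_int_decode() is a hypothetical name, and the function is stateless for brevity: it asks the caller to retry when the integer is truncated, whereas a decoder fed by frame chunks would have to keep the partial value in its context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode an HPACK integer with an N-bit prefix (RFC 7541, section 5.1).
 * Returns the number of octets consumed, 0 if the input is truncated,
 * or -1 if the value uses more continuation octets than this sketch allows.
 */
int hpack_int_decode(const uint8_t *buf, size_t len, unsigned int prefix_bits,
		     uint32_t *out)
{
	uint32_t max, val;
	unsigned int shift = 0;
	size_t i;

	assert(buf && out);
	assert(prefix_bits >= 1 && prefix_bits <= 8);
	if (!len)
		return 0;

	max = (1U << prefix_bits) - 1;
	val = buf[0] & max;
	if (val < max) {
		/* The value fits into the prefix. */
		*out = val;
		return 1;
	}
	for (i = 1; i < len; ++i) {
		uint32_t b = buf[i];

		/* At most 4 continuation octets, so val cannot overflow. */
		if (shift > 21)
			return -1;
		val += (b & 0x7f) << shift;
		shift += 7;
		if (!(b & 0x80)) {
			*out = val;
			return (int)(i + 1);
		}
	}
	return 0;
}
```

With the example from RFC 7541 C.1.2, the octets 0x1f 0x9a 0x0a and a 5-bit prefix decode to 1337.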

We should take advantage of the translation of a bit string into a character string. In particular, there should be multiple Huffman decoding tables implementing character filtration (just return an error for particular bits mapping to a prohibited character). Next, the Huffman decoding state machine can be generated in such a way that, for example, 010101 00000 100001 will be immediately blocked or translated to a space (the bit string means %0A, \n). There is no need to implement any functionality of #2, but the decoder must be designed appropriately. (In #2 we can add support for a run-time updated Huffman decoder table to support reconfiguration of the translation tables.) However, the current alphabet checking must be implemented for HTTP/2 using the Huffman decoder. Also, please remove http_norm.h and all the related stuff.

The decoder must receive a chunk of data, execute the decoding logic for an encoded string, if any, and call the parser for the string. In this case the parser will have many entry points, so #1131 is crucial now.

The decoder must write data to a new HTTP request because decoded data is usually larger than original.

In the general case a Huffman-encoded symbol can cross byte bounds, so the decoder FSM must store some context data to be able to process frame chunks passed from the framing layer.
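The following sketch illustrates both points: the bits of an unfinished symbol are kept in a context between chunks, and a filter can reject a character at decoding time. It is deliberately naive. It walks the input bit by bit and uses only a small subset of the RFC 7541 Appendix B code table, while a real decoder would use generated multi-bit state tables. All names (huff_ctx, huff_decode_chunk(), huff_reject_percent()) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a Huffman code table: code bits, code length, symbol. */
struct huff_sym {
	uint8_t code;
	uint8_t nbits;
	char    ch;
};

/* Small subset of the RFC 7541 Appendix B table, enough for the example. */
static const struct huff_sym huff_subset[] = {
	{ 0x00, 5, '0' }, { 0x01, 5, '1' }, { 0x03, 5, 'a' }, { 0x05, 5, 'e' },
	{ 0x14, 6, ' ' }, { 0x15, 6, '%' }, { 0x21, 6, 'A' },
};
#define HUFF_SUBSET_SZ (sizeof(huff_subset) / sizeof(huff_subset[0]))

/* Bits of a symbol that was cut by a chunk boundary (hypothetical layout). */
struct huff_ctx {
	uint8_t code;
	uint8_t nbits;
};

/* Hypothetical filter hook: returns nonzero if the character is allowed. */
typedef int (*huff_filter_t)(char ch);

/* Example filter, chosen arbitrarily for illustration: reject '%'. */
int huff_reject_percent(char ch)
{
	return ch != '%';
}

/*
 * Decode one chunk bit by bit. Returns the number of characters written
 * to @out, or -1 on an unknown code, a filtered character or a full buffer.
 */
int huff_decode_chunk(struct huff_ctx *ctx, const uint8_t *in, size_t len,
		      char *out, size_t out_sz, huff_filter_t allowed)
{
	size_t i, k, n = 0;
	int bit;

	assert(ctx && out);
	for (i = 0; i < len; ++i) {
		for (bit = 7; bit >= 0; --bit) {
			ctx->code = (uint8_t)((ctx->code << 1)
					      | ((in[i] >> bit) & 1));
			ctx->nbits++;
			for (k = 0; k < HUFF_SUBSET_SZ; ++k)
				if (huff_subset[k].nbits == ctx->nbits
				    && huff_subset[k].code == ctx->code)
					break;
			if (k == HUFF_SUBSET_SZ) {
				/* Longest code here is 6 bits, padding at most 7. */
				if (ctx->nbits > 7)
					return -1;
				continue;
			}
			if (allowed && !allowed(huff_subset[k].ch))
				return -1;
			if (n == out_sz)
				return -1;
			out[n++] = huff_subset[k].ch;
			ctx->code = ctx->nbits = 0;
		}
	}
	return (int)n;
}

/* At the end of a string only up to 7 padding bits, all ones, may remain. */
int huff_finish(const struct huff_ctx *ctx)
{
	return (ctx->nbits < 8 && ctx->code == (1U << ctx->nbits) - 1) ? 0 : -1;
}
```

The string %0A is encoded as the octets 0x54 0x10 0xff. If the framing layer delivers 0x54 alone, the first call emits '%' and leaves two bits in the context; the second call finishes '0' and 'A' from the remaining octets.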

HTTP parser

The parser must eat HTTP/1.1 and HTTP/2 messages depending on information from the framing/streaming layer. The HTTP/2 path must be preferred (in the sense of conditions and likely paths). This is required since HTTP/2 headers use binary separators instead of CRLF.

It's also possible to process the binary delimiters on the decoder side (after all, the delimiters determine the type of the current header: indexed, encoded, or plain) and leave the current HTTP parser to process only the ASCII parts of the headers.

tfw_http_parse_req() must be split into method, URI, HTTP version, headers, and body parsing parts. Only some of the parts must be called for HTTP/2.

Characters filtration, the SIMD alphabets checking, must not be executed for HTTP/2 strings since we have Huffman decoders.

Headers conversion

Headers adjustments, tfw_http_adjust_{req,resp}(), are the right place to convert headers between formats (from/to HTTP/2 to/from HTTP/1.1). They're also good for this because of #1103, and we need/have extra space for the format conversion and changing the headers.

Responses should be encoded in-place (they're always smaller than HTTP/1.1) in tfw_http_adjust_resp() leaving more optimization opportunities for #1103. The new added headers must be immediately compressed in-place, so HTTP XFRM logic must be adjusted for HTTP/2 (create HPACK'ed headers instead of compress plain strings added by the current logic).

Caching

It'd be good to keep HTTP/2 and HTTP/1.1 headers in cached entries, but now it's not necessary and we can translate HTTP/1.1 headers to HTTP/2 in tfw_http_adjust_resp().

TODO

The code must be done in separate branches producing many sequential small PRs, each of which must go through a review and be merged into master. It's easier to review smaller code and we can catch issues earlier. Also this way we can avoid heavy rebases. I suggest splitting PRs so that there is no more than 1 of the tasks below in a PR.

  • Framing decoding. In general, the framing layer should be hooked the same way as the TLS and current HTTP layers. It's not necessary to use GFSM, but it might be useful to handle data offsets. The current tfw_http_req_process() handles the message parsing and its processing logic (scheduling, caching etc.) and this can remain - different HTTP/2 frames can be treated by the function just as separate chunks. However, from the framing layer we know the exact type of the current frame and we should call the right (headers or body) parser instead of analyzing the current parsing state.

  • Starting HTTP/2 (RFC 7540 chapter 3): ALPN handling, initial transmission of 101 (Switching protocols) response and parsing the client preface.

  • Stream FSMs handling. At this stage functional test for PING frame must work.

  • HTTP/2 <-> HTTP/1.1 message transformation to transfer messages between HTTP/2 clients and HTTP/1.1 backends.

  • HPACK and Huffman logic.

  • HTTP/1.1 response streaming for pipelined requests - with HTTP/2 we don't have blocking requests any more, so each received response can be forwarded to a client immediately. So it seems seq_queue becomes HTTP/1.1 specific, and in the context of #687 (Improve the architecture that supports the correct order of HTTP responses) it'd be good not to use it for HTTP/2 at all.

  • Update the Wiki list of known implementations of HTTP/2.
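For the "Starting HTTP/2" task, the client connection preface is the fixed 24-octet string from RFC 7540, section 3.5. Since it may arrive split across several skbs, it has to be matched incrementally. A minimal sketch, with the hypothetical names h2_preface_ctx and h2_preface_feed():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Client connection preface, RFC 7540 section 3.5 (24 octets). */
static const char h2_preface[] = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";
#define H2_PREFACE_LEN (sizeof(h2_preface) - 1)

/* How many preface octets have been matched so far on this connection. */
struct h2_preface_ctx {
	size_t matched;
};

/*
 * Feed the next chunk of connection data.
 * Returns 1 when the preface is complete, 0 if more data is needed,
 * -1 on mismatch. @consumed is set to the number of octets taken from
 * @data; the rest, if any, already belongs to the first frame.
 */
int h2_preface_feed(struct h2_preface_ctx *ctx, const char *data, size_t len,
		    size_t *consumed)
{
	size_t need, n;

	assert(ctx && data && consumed);
	assert(ctx->matched <= H2_PREFACE_LEN);

	need = H2_PREFACE_LEN - ctx->matched;
	n = len < need ? len : need;
	*consumed = 0;
	if (memcmp(h2_preface + ctx->matched, data, n))
		return -1;
	ctx->matched += n;
	*consumed = n;
	return ctx->matched == H2_PREFACE_LEN;
}
```

The octets left after the preface in the last chunk would be handed to the framing layer, where the first frame is expected to be SETTINGS.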

@krizhanovsky krizhanovsky added this to the 0.6 TBD milestone Oct 11, 2015
@krizhanovsky krizhanovsky modified the milestones: 0.5.0 Web Server, 0.6 OS Oct 27, 2015
@krizhanovsky krizhanovsky modified the milestones: 0.6 OS, 0.5.0 Web Server Nov 27, 2015
@krizhanovsky krizhanovsky modified the milestones: 0.5.0 Web Server, 0.6 OS Nov 17, 2016
@krizhanovsky krizhanovsky mentioned this issue May 29, 2017
2 tasks
@krizhanovsky krizhanovsky self-assigned this Jul 17, 2017
@krizhanovsky krizhanovsky modified the milestones: 0.5.0 Web Server, 0.7 HTTP/2 Jan 8, 2018
@krizhanovsky krizhanovsky modified the milestones: 1.0 Beta, 0.7 HTTP/2 Jul 15, 2018
@krizhanovsky krizhanovsky removed their assignment Feb 16, 2019
@aleksostapenko
Contributor

aleksostapenko commented Feb 26, 2019

HTTP/2 layers overview

  1. Layers UP flow order:
  • TLS
  • Framing
    • Incorporated with the Stream FSM and Stream Scheduler, so that the Stream ID, dependent Streams and errors such as incorrect FSM states are determined quickly, before decoding/processing the frame payload;
    • After passing through Stream FSM and Stream Scheduler, the further processing depends on the type of frame:
      • service frames should not be passed to upper layers;
      • HEADERS frames' payload must be passed to Decoder at first, and then - to HTTP parser;
      • DATA frames' payload must be passed to HTTP parser directly;
  • Decoder FSM (must be called on each skb for HEADERS frames' payload);
  • HTTP Parser (must be called on each skb for DATA and HEADERS frames' payload; for HEADERS - from the Decoder layer);
  2. Layers DOWN flow order: TLS <- Framing <- Encoder.

Stream Priority/Dependencies

It seems that a full-featured scheduler for TfwStreams is needed for each TfwCliConn, called after the Stream FSM (for both UP and DOWN flows). Maybe the base part of this functionality is worth implementing in the context of #309, leaving the more specific logic for #1196. To generalize the current approach and the HTTP/2 implementation, one possible variant is to add a pointer to a TfwStreamSched structure into TfwCliConn, with a separate instance of that structure for each TfwCliConn. A TfwStreamSched instance in turn can contain and maintain TfwStream instances according to the scheduling strategy (based on Stream Dependencies/Priorities):

  1. For pure HTTP it can be just one TfwStream instance for one client connection without any scheduling;
  2. For HTTP/2 at the beginning (HTTP/2 #309: HTTP/2 <-> HTTP/1.1 proxy) there may be the base default scheduling logic - no dependencies and only equal weights for Streams (i.e. all dependencies are on stream 0x0, and all weights equal 16: https://tools.ietf.org/html/rfc7540#section-5.3.5); in this case the transformed HTTP/2 requests (assembled into HTTP/1.1 requests) must be sent to backends in the order they came from the TCP connection (with the head-of-line blocking problem arising on server connections, but this cannot be avoided in the HTTP/2 -> HTTP/1.1 proxy case); fields parser, *msg, seq_queue and seq_qlock must be moved from TfwCliConn to TfwStream (requests from TfwStream.seq_queue are linked with TfwSrvConn.fwd_queue in the usual way - to map HTTP/1.1 responses to the corresponding HTTP/2 Streams);
  3. For full HTTP/2 Multiplexing support the scheduling based on TfwStream's dependency/priority must be implemented.

HTTP/2 layers implementation

tfw_http_req_process() is overloaded already, and we need a lot of additional logic in it for HTTP/2 (Framing, Stream FSM, Stream Scheduler, Decoder). To avoid additional complication and massive code copy-paste, in my opinion, tfw_http_req_process() should be split into two parts that can be reused from both HTTP/1.1 and HTTP/2 contexts (from __tfw_http_msg_process()): the first part must depend on the HTTP/1.1 or HTTP/2 context, and the second, common part (scheduling, caching etc.) must remain almost unchanged (except that TfwStream must replace the TfwCliConn instance in places where the *msg, seq_queue and seq_qlock fields are used). In this variant the layers will be ordered as follows:

  1. TLS layer;
  2. HTTP layer;
  3. HTTP/2 or HTTP/1.1 protocol choice; in __tfw_http_msg_process() - atop of processing of skb chain, formed on TLS layer;
  4. HTTP/2 Framing, Stream FSM, Stream Scheduler; collect skbs with HTTP/2 frames and pass them to the next layer with frame type; service frames are processed just right here;
  5. tfw_http_req_process() for HTTP/2 instead of tfw_http_parse_req() from ss_skb_process() - calls assembling logic for request;
  6. HTTP/2 request assembling layer; this layer is necessary since it seems that for HTTP/2 <-> HTTP/1.1 request transformation - we need to accumulate HTTP/2 frames (HEADERS, DATA, DATA, ..., HEADERS in context of particular Stream) - because headers should be concatenated with bodies for HTTP/1.1 (and, by the way, for responses we need to split HTTP/1.1 responses to form HTTP/2 separate HEADER and DATA frames, with creation of additional Streams); in future HTTP/2 <-> HTTP/2 proxy implementation, this layer should treat separate frames as requests; regarding received frame type this layer in turn calls HTTP/2 Decoder or HTTP parser;
  7. HTTP/2 Decoder decodes HTTP/2 HEADERS frames and calls separate parts of the split tfw_http_parse_req(); also allocations for new skbs will be needed on the previous layer since decoded data is larger than the original;
  8. HTTP Parser.

Starting/switching HTTP/2

Considering #1176 (comment), the possible variant for HTTP/2 starting and subsequent switching between HTTP/1.1 <-> HTTP/2 processing inside Tempesta FW - can be the special flag for TfwCliConn which can be set during TLS ALPN procedure.
Based on this flag, the further processing (HTTP/2 or HTTP/1.1) can be chosen in __tfw_http_msg_process(), as described in the previous section. The described variant does not use GFSM switching: with GFSM, a separate HTTP/2 FSM would have to be registered on top of TLS, which would cause both the HTTP and HTTP/2 FSMs to be processed during the tfw_gfsm_move() call from the TLS layer. To avoid such behavior the current GFSM model would have to be changed; thus, in my opinion, GFSM usage looks more difficult for switching to HTTP/2 processing.
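A sketch of the flag approach follows. It scans the ALPN protocol name list from the ClientHello (entries are a 1-octet length followed by the name, RFC 7301) for "h2" and sets a flag on the connection. The structure cli_conn_sketch, the flag CONN_F_H2 and both functions are assumptions made for illustration and do not reflect the actual TfwCliConn layout or the Tempesta TLS API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CONN_F_H2 0x1U /* hypothetical flag bit: connection speaks HTTP/2 */

/* Hypothetical stand-in for the client connection descriptor. */
struct cli_conn_sketch {
	unsigned int flags;
};

/*
 * Scan the body of an ALPN ProtocolNameList (without its 2-octet length).
 * Returns 1 if "h2" is offered, 0 if not, -1 if the list is malformed.
 */
int alpn_find_h2(const uint8_t *list, size_t len)
{
	size_t off = 0;

	assert(list);
	while (off < len) {
		size_t n = list[off++];

		/* Empty names and names running past the list are invalid. */
		if (n == 0 || n > len - off)
			return -1;
		if (n == 2 && !memcmp(list + off, "h2", 2))
			return 1;
		off += n;
	}
	return 0;
}

/* Set the HTTP/2 flag during the handshake; later layers only test it. */
int conn_apply_alpn(struct cli_conn_sketch *conn, const uint8_t *list,
		    size_t len)
{
	int r;

	assert(conn);
	r = alpn_find_h2(list, len);
	if (r == 1)
		conn->flags |= CONN_F_H2;
	return r;
}
```

The message processing entry point would then branch on the flag once per call, which avoids registering a second FSM on top of TLS.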

HTTP/2 and QUIC with HTTP/3

In the context of the HTTP/2 features which must be implemented, the main parts of a future, very rough QUIC architecture may look like:

  1. New sock.c (for implementation QUIC Connections management, Congestion control and Loss recovery over UDP);
  2. TLS layer;
  3. Framing and Streams management;
  4. QPACK as scaled for multiconnection support HPACK;
  5. HTTP/3 (as HTTP/2 over QUIC).

However, with described scheme there are several questions, which can lead to different QUIC implementation, and should be discussed:

Besides, for HTTP/2 over QUIC implementation (which in fact is HTTP/3), it is worth to mention the most significant HTTP/3 specifics (compared with HTTP/2):

Therefore, three implementation variants for HTTP/2 and QUIC<->HTTP/3 could be specified:

  1. Model defined at the beginning of current paragraph: implement QUIC<->HTTP/3 in our existing TLS<->HTTP flow (as well as HTTP/2). Considering issues mentioned above - about significant mismatch between QUIC model and current Tempesta FW layers architecture, this variant looks very hard to implement (or maybe even impossible), since it will require significant changes in current Tempesta FW architecture;
  2. New, completely separate FSM for QUIC<->HTTP/3 processing in parallel with the existing TLS<->HTTP FSMs; the new sock.c implementation in this case must be only a thin wrapper over plain UDP; and all the core QUIC functionality including QUIC Connections management, QUIC Loss Recovery and Congestion Control, QUIC Framing, QUIC TLS, QUIC Stream management (with performing Stream Dependencies/Prioritization) should be implemented inside new FSM with separate HTTP/3 layer atop; in this context HTTP/2 must be implemented on the basis of current TLS<->HTTP flow inside of existing HTTP FSM;
  3. This variant is similar to the previous one, except that HTTP/2 must be implemented as a completely separate FSM (also in parallel with the existing TLS<->HTTP FSMs) - as the foundation of the future QUIC<->HTTP/3 implementation (as described in the previous variant); this way also seems quite difficult due to the issues mentioned above about mismatches between the HTTP/2 and QUIC<->HTTP/3 models; a part of HTTP/2 functionality is included into QUIC, and the other part (with some additions like QPACK support) is included into HTTP/3; besides, in some cases this separation is rather non-trivial - like with Stream Prioritization/Dependencies or double Framing on the QUIC and HTTP/3 layers (these cases are described in the issues above); therefore, adaptation of HTTP/2 for the QUIC<->HTTP/3 structure may result (and most likely will) in a non-optimal HTTP/2 implementation;

In my opinion, the second variant among described above is most suitable for implementation; the drawback of this approach is the necessity to implement in fact two different models with rather large part of functionality overlapped; but very likely that quite a lot of code can be reused from HTTP/2 implementation for future QUIC<->HTTP/3 FSM; in this context, the maximum goal is to distinguish common HTTP/2/3 top layer to use it without changes in both implementations, but some obstacles can significantly complicate this task:

  • Different Frame formats for HTTP/2 and HTTP/3;
  • For HTTP/2 - Framing layer must be moved up, atop of Stream FSM layer (to fit the model of HTTP/3 Frames atop of QUIC Frames and Streams) which seems hard to implement since Stream FSM must know at least about Stream ID (from Framing layer); or Stream FSM layer should be switchable (can be disabled) for common HTTP/2/3 framework;
  • Stream Scheduler (for HTTP/2) must be split into two parts (manage and perform) - to fit the prioritization model in the QUIC<->HTTP/3 stack;
  • Possible performance penalties for HTTP/2 due to unnecessary levels/interfaces/stubs.

Flow Control

It seems that we should implement at least WINDOW_UPDATE frame in #309 - in order to announce the largest possible window (second paragraph in https://tools.ietf.org/html/rfc7540#section-5.2.2).
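A sketch of building such a frame is below. The connection-level window starts at 65,535 octets and can only be raised with WINDOW_UPDATE on stream 0, so announcing the maximum of 2^31-1 means sending an increment of 2^31-1 - 65535 = 0x7fff0000 (RFC 7540, sections 6.9 and 6.9.1). The function name and buffer handling are hypothetical; in Tempesta the frame would be written into an skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define H2_T_WINDOW_UPDATE   0x8
#define H2_MAX_WINDOW        0x7fffffffU /* 2^31 - 1 */
#define H2_DEFAULT_WINDOW    65535U
#define H2_WINDOW_UPDATE_SZ  13          /* 9-octet header + 4-octet payload */

/*
 * Write a WINDOW_UPDATE frame into @buf.
 * Returns the frame size, or 0 if the buffer is too small or the
 * increment is outside the range 1..2^31-1.
 */
size_t h2_build_window_update(uint8_t *buf, size_t buf_sz, uint32_t stream_id,
			      uint32_t increment)
{
	assert(buf);
	if (buf_sz < H2_WINDOW_UPDATE_SZ)
		return 0;
	if (increment == 0 || increment > H2_MAX_WINDOW)
		return 0;

	stream_id &= 0x7fffffffU; /* the reserved bit must be zero */

	buf[0] = 0;               /* payload length: 4, in 24 bits */
	buf[1] = 0;
	buf[2] = 4;
	buf[3] = H2_T_WINDOW_UPDATE;
	buf[4] = 0;               /* no flags are defined for this frame */
	buf[5] = (uint8_t)(stream_id >> 24);
	buf[6] = (uint8_t)(stream_id >> 16);
	buf[7] = (uint8_t)(stream_id >> 8);
	buf[8] = (uint8_t)stream_id;
	buf[9] = (uint8_t)(increment >> 24);
	buf[10] = (uint8_t)(increment >> 16);
	buf[11] = (uint8_t)(increment >> 8);
	buf[12] = (uint8_t)increment;
	return H2_WINDOW_UPDATE_SZ;
}
```

Per-stream windows can be raised separately through SETTINGS_INITIAL_WINDOW_SIZE, which is outside this sketch.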

@krizhanovsky
Contributor Author

krizhanovsky commented Mar 3, 2019

Several comments:

For pure HTTP it can be just one TfwStream instance for one client connection without any scheduling;

In HTTP/2 and HTTP/3 there is always a 0th control stream, so it probably makes sense to embed TfwStream into TfwCliConn to save some allocations and pointer dereferences and to make the HTTP/1-compatibility logic simpler.

HTTP/2 request assembling layer; this layer is necessary ....

How is it different from the current assembling of HTTP messages from ingress skb chunks? I'd expect that after decoding we can fall through to our current message processing logic.

headers should be concatenated with bodies for HTTP/1.1

Not necessary. In #498 we exactly want to proxy message chunks as is. So probably in HTTP/2, when we assembled HTTP headers and can execute scheduling logic, we can proxy data frames w/o assembling. However, if it's simpler for now just to assemble whole messages - I'm fine with it. I'm only against introducing additional architectural layer which will be reworked in #498 - just don't want to do the same work twice.

Maybe the layer is required - I didn't actually get this

in future HTTP/2 <-> HTTP/2 proxy implementation, this layer should treat separate frames as requests; regarding received frame type this layer in turn calls HTTP/2 Decoder or HTTP parser;

why do we need the assembling layer for full HTTP/2 proxying?

for responses we need to split HTTP/1.1 responses to form HTTP/2 separate HEADER and DATA frames, with creation of additional Streams

I think technically we can just insert two skb frames with appropriate HTTP/2 frame headers and, of course, update TfwStream state.

can be the special flag for TfwCliConn which can be set during TLS ALPN procedure. ... Described variant does not use GFSM switching

Yes. GFSM is a generalization of hooks mechanism which allows several FSM to subscribe to a particular events and do context switches. FreeBSD netgraph uses similar approach for network protocols handling, but I don't think that it's an efficient way. Moreover, TfwConn may describe UDP flow for HTTP/3 in future.

we should implement at least WINDOW_UPDATE frame in #309

Agree.

It seems that for QUIC we must implement separate (Framing/StreamFSM/Scheduler) layer

The question is how much of the code we'll be able to reuse for HTTP/3. Now we only can keep HTTP/3 in mind during HTTP/2 development.

QUIC Loss Recovery mechanism involves not only QUIC Connections layer, but also higher Framing layer too

I think the framing layer will provide downcalls plus to upcalls for HTTP/2, just the same way as current connection.[ch] provides hooks and normal function APIs.

QUIC integrates TLS inward and this results in very closely intertwined interaction between QUIC Connections ... TLS handshakes must live atop of QUIC Frames

Yes, that's true. The only thing that we can do now is to separate framing layer as much as possible to be able to call it from the new UDP sock.c version. And the things aren't so bad - now we call TLS record creation exactly from TCP layer - we'll do very close things for HTTP/3. In #1031 we'll do more in this direction.

three implementation variants for HTTP/2 and QUIC<->HTTP/3 could be specified

The 3rd variant is absolutely not an option. We'll definitely die if we try to implement TCP/TLS/HTTP/2 concurrently with UDP/HTTP/3, especially given that the QUIC standard is constantly changing. Now we need to implement HTTP/2 ASAP and should only keep HTTP/3 in mind to choose a more HTTP/3-friendly architecture if we have a choice. It'd be good if we can reuse the same logic, at least as a copy & paste foundation (e.g. we can copy some HTTP/2 framing logic and adjust it for HTTP/3). So I propose to come back to choosing between the 1st and 2nd options when we have HTTP/2 and start with #724.

aleksostapenko added a commit that referenced this issue Oct 4, 2019
1. Correction of comment about HTTP/2 design;
2. Some corrections in the Huffman decoding functionality.
aleksostapenko added a commit that referenced this issue Oct 4, 2019
aleksostapenko added a commit that referenced this issue Oct 11, 2019
HTTP/2 HPACK layer implementation (#309).
aleksostapenko added a commit that referenced this issue Nov 4, 2019
1. Changes in HPACK decoder to copy only Huffman-decoded and dynamically indexed headers;
2. Appropriate changes in HPACK-decoder/parser unit-tests.
aleksostapenko added a commit that referenced this issue Nov 4, 2019
Corrections as a result of HPACK decoder/encoder/parser unit-tests debugging.
aleksostapenko added a commit that referenced this issue Feb 27, 2020
Parse name, colon, LWS, value and RWS of HTTP/1.1-response headers into
separate chunks to facilitate the name/value splitting and colon/OWS
eviction during HTTP/1.1=>HTTP/2 response transformation.
@krizhanovsky krizhanovsky added the h2 label Mar 3, 2020
aleksostapenko added a commit that referenced this issue Mar 4, 2020
…ion-h2-cache

HTTP/2 implementation: HTTP/2-cache (#309).
aleksostapenko added a commit that referenced this issue Mar 5, 2020
@krizhanovsky
Contributor Author

Done in #1368
