Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove chunked encoding in http/1 responses #1418

Merged
merged 17 commits into from
Nov 8, 2022
Merged

Remove chunked encoding in http/1 responses #1418

merged 17 commits into from
Nov 8, 2022

Conversation

vankoven
Copy link
Contributor

@vankoven vankoven commented Jun 9, 2020

Chunked encoding is stripped from the response while it's
being parsed, if the response is forwarded to h1 client, the
encoding is applied back for correct message framing.
Since HTTP/2 is first citizen, we don't worry about performance
degradation for http1 clients, and this solution allows us to
avoid any further worries about unneeded chunked encoding in other
subsystems like cache.

Obsoletes #1410 Fixes #1379

Instead of stripping skb fragments during parsing, another approach is used:
a new TfwStr cut member was added into http parser object to track all the fragments to be removed from the message. It's generic enough to allow removal of any fragments from
both response and request.

Once parser has finished its job (on both TFW_POSTPONE TFW_PASS return codes), the
cut storage is stripped from the message. Since http message is processed skb-by-skb, the cut function will be called for every skb only once. This approach doesn't modify skb in parser internals, thus there is no need to pass modified skb parts to lower levels, which complicates code and makes it hardly tangled.

Since cut may (and will!) grow together with usual strings not supposed to be removed, a realloc-race is possible: both strings will grow together, which will block zero-copy realloc operations. To avoid that I've created a new TfwPool for parser needs. It's created for every new server connection. Once message is parsed and whole cut object was stripped, the memory required for cut object is released. This is not usual, since we normally never free objects from pools manually, but it allows to have only one pool for all messages in the single connections, and it shouldn't be exhausted.

Thus during parsing, the body from h1 responses is split into two parts: resp->body and parser->cut. First contains only data, the second - only chunk descriptors. Current realisation uses the same code for request and response parsing, and makes it effectively: usually MOVE_fixup macroses creates a lot of chunks with 1 or two characters, which is not suitable for large bodies. Instead chunks are combined together by their role, i.e. adjastment TfwStr chunks with chunk descriptors are glued together. According to my observations, it allows to have 3-6 times less TfwStr chunks to store single chunk descriptor.

Unfortunately the same approach can't be used to remove chunked token from the Transfer-Encoding header. Server is allowed to apply chunked as not final encoding, so chunked token must be stripped only if it's final encoding. We can't guarantee this during parsing, so the chunked token is marked with flag, and removed later manually.

UPD: See comment below

Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had the discussion in the chat and I wrote my thought in the comments.

Also there was an assumption that the PR might fix #1438 , so please check this and if it actually fixes the problem, then a scenario for the bug is required.

tempesta_fw/http.c Outdated Show resolved Hide resolved
tempesta_fw/cache.c Outdated Show resolved Hide resolved
tempesta_fw/http_parser.c Outdated Show resolved Hide resolved
tempesta_fw/http_msg.c Outdated Show resolved Hide resolved
tempesta_fw/http.c Outdated Show resolved Hide resolved
tempesta_fw/ss_skb.c Outdated Show resolved Hide resolved
tempesta_fw/http_parser.c Outdated Show resolved Hide resolved
tempesta_fw/http_parser.c Outdated Show resolved Hide resolved
tempesta_fw/http_parser.c Outdated Show resolved Hide resolved
tempesta_fw/sock_srv.c Outdated Show resolved Hide resolved
@const-t
Copy link
Contributor

const-t commented Sep 24, 2022

For stripping chunked parts at body parsing stage we flag chunks that should be cut, then cut chunked parts from body during HTTP1 to HTTP2 message transformation. For cached responses we always store into cache HTTP2 friendly version. During caching we not modify message, only copy data from body excluding chunks. Modification take place only during HTTP2 framing.
For non-cachable responses to HTTP1 client simply forward message without any modification.

Besides body modification we also remove Transfer-Encoding header from response to HTTP2 client. It compliance with RFC 9113 8.2.2. During caching we also ignore Transfer-Encoding and not place into cache. All encodings from Transfer-Encoding move to Content-Encoding header, except chunked encoding.

If message doesn't have Transfer-Encoding: chunked and Content-Length when backend close connection, we try to determine size of the body and then add Content-Length header and forward it to client, however such messages are not cached, because we can't be ensured that message is fully received.

@RomanBelozerov
Copy link
Contributor

RomanBelozerov commented Oct 20, 2022

I get errors which are not in master:

  1. h2 request with 'Transfer-Encoding: chunked' (capitalized) header and chunked body. Header is not cut as expected.
  2. h2 request with 'Content-Length: 5 (capitalized) header. Tempesta does not see this header and add new one.
  3. h2 request with 'transfer-encoding: chunked' (lowercase) header. Tempesta failed with:
dmesg
[80145.842735] ------------[ cut here ]------------
[80145.844171] WARNING: CPU: 3 PID: 181027 at /root/tempesta/fw/http_stream.c:80 tfw_h2_stream_fsm+0x12d/0x2b0 [tempesta_fw]
[80145.845916] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) btrfs blake2b_generic xor raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c cpuid binfmt_misc nfnetlink_queue nfnetlink_log nfnetlink cfg80211 iptable_mangle xt_mark bpfilter sha256_ssse3 sha512_ssse3 rfcomm bnep vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock nls_iso8859_1 intel_rapl_msr intel_rapl_common crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm snd_seq_midi btusb snd_seq_midi_event btrtl snd_rawmidi btbcm btintel snd_seq bluetooth snd_seq_device snd_timer vmw_balloon joydev input_leds snd vmw_vmci serio_raw ecdh_generic ecc soundcore mac_hid sch_fq_codel vmwgfx ttm drm_kms_helper cec rc_core fb_sys_fops syscopyarea sysfillrect sysimgblt ipmi_devintf ipmi_msghandler drm msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon efi_pstore pstore_zone
[80145.845994]  ip_tables x_tables autofs4 hid_generic usbhid hid mptspi mptscsih mptbase psmouse crc32_pclmul e1000 ahci scsi_transport_spi libahci i2c_piix4 pata_acpi [last unloaded: tempesta_lib]
[80145.862854] CPU: 3 PID: 181027 Comm: python3 Kdump: loaded Tainted: G           OE     5.10.35+ #1
[80145.864392] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[80145.866416] RIP: 0010:tfw_h2_stream_fsm+0x12d/0x2b0 [tempesta_fw]
[80145.867418] Code: 89 a0 01 00 00 0f 0b 83 f8 06 74 6b 83 f8 07 75 f4 45 31 c0 41 80 fd 02 0f 84 56 ff ff ff 41 80 fd 09 0f 85 46 ff ff ff eb 8c <0f> 0b 41 b8 03 00 00 00 e9 3d ff ff ff 41 b8 03 00 00 00 e9 3c ff
[80145.870504] RSP: 0018:ffffa6cac05484c8 EFLAGS: 00010293
[80145.871352] RAX: 0000000000000002 RBX: ffff8962f71bba68 RCX: 0000000000000001
[80145.872478] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8962f71bba98
[80145.873803] RBP: ffffa6cac05484f8 R08: ffffa6cac0548514 R09: 0000000000000000
[80145.874969] R10: 0000000000000002 R11: 0000000000000000 R12: ffff8962f71bba98
[80145.876079] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000001
[80145.877197] FS:  00007f3aa3ab3700(0000) GS:ffff8964f5ec0000(0000) knlGS:0000000000000000
[80145.878681] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[80145.879604] CR2: 00007f3aa3cd0b70 CR3: 0000000181cd0003 CR4: 0000000000370ee0
[80145.880785] Call Trace:
[80145.881190]  <IRQ>
[80145.881590]  tfw_h2_stream_id_close+0xe7/0x110 [tempesta_fw]
[80145.882767]  tfw_h2_prep_resp+0x95/0x310 [tempesta_fw]
[80145.883618]  ? alloc_pages_current+0x87/0xe0
[80145.884296]  tfw_h2_send_resp+0x5c/0xb0 [tempesta_fw]
[80145.885095]  tfw_h2_send_err_resp+0x9a/0xc0 [tempesta_fw]
[80145.886092]  tfw_http_cli_error_resp_and_log+0x3d3/0x440 [tempesta_fw]
[80145.888196]  ? tfw_http_parse_req+0xf1d0/0xf1d0 [tempesta_fw]
[80145.893988]  tfw_http_req_process+0xf8/0x970 [tempesta_fw]
[80145.896972]  ? alloc_pages_current+0x87/0xe0
[80145.897810]  ? __get_free_pages+0x11/0x40
[80145.898931]  ? tfw_pool_alloc_pages+0x47/0x50 [tempesta_fw]
[80145.900418]  ? __tfw_pool_new+0x2d/0x70 [tempesta_fw]
[80145.901242]  tfw_http_msg_process_generic+0x197/0x670 [tempesta_fw]
[80145.902372]  ? ss_skb_chop_head_tail+0xbb/0x1d0 [tempesta_fw]
[80145.903356]  ? memcpy_fast+0xe/0x10 [tempesta_lib]
[80145.904136]  tfw_h2_frame_process+0x308/0x440 [tempesta_fw]
[80145.905051]  tfw_http_msg_process+0x42/0x50 [tempesta_fw]
[80145.906162]  tfw_connection_recv+0x59/0xa0 [tempesta_fw]
[80145.907147]  tfw_tls_connection_recv+0x19d/0x380 [tempesta_fw]
[80145.908124]  ss_tcp_process_data+0x1e7/0x3f0 [tempesta_fw]
[80145.909102]  ss_tcp_data_ready+0x44/0xb0 [tempesta_fw]
[80145.910216]  tcp_data_ready+0x2b/0xd0
[80145.910985]  tcp_data_queue+0x483/0xd30
[80145.911693]  tcp_rcv_established+0x230/0x670
[80145.912429]  ? sk_filter_trim_cap+0xde/0x240
[80145.913220]  tcp_v4_do_rcv+0x140/0x200
[80145.914005]  tcp_v4_rcv+0xcfd/0xe10
[80145.915113]  ? iptable_mangle_hook+0x36/0xf0 [iptable_mangle]
[80145.916227]  ip_protocol_deliver_rcu+0x30/0x1b0
[80145.917191]  ip_local_deliver_finish+0x48/0x60
[80145.918099]  ip_local_deliver+0xfa/0x110
[80145.918787]  ? ip_protocol_deliver_rcu+0x1b0/0x1b0
[80145.919544]  ip_rcv_finish+0x87/0xa0
[80145.920114]  ip_rcv+0xcc/0xe0
[80145.920605]  ? ip_rcv_finish_core.isra.0+0x420/0x420
[80145.921421]  __netif_receive_skb_one_core+0x88/0xa0
[80145.922317]  __netif_receive_skb+0x18/0x60
[80145.923078]  process_backlog+0xa9/0x160
[80145.923741]  net_rx_action+0x13e/0x390
[80145.924383]  __do_softirq+0xd9/0x291
[80145.924971]  asm_call_irq_on_stack+0x12/0x20
[80145.925681]  </IRQ>
[80145.926141]  do_softirq_own_stack+0x3d/0x50
[80145.926848]  do_softirq.part.0+0x46/0x50
[80145.927486]  __local_bh_enable_ip+0x50/0x60
[80145.928197]  ip_finish_output2+0x1ab/0x590
[80145.928846]  ? __cgroup_bpf_run_filter_skb+0x3c3/0x3d0
[80145.929715]  __ip_finish_output+0xd8/0x220
[80145.931006]  ip_finish_output+0x2d/0xb0
[80145.931722]  ip_output+0x7a/0x100
[80145.932263]  ? __ip_finish_output+0x220/0x220
[80145.932970]  ip_local_out+0x3d/0x50
[80145.933535]  __ip_queue_xmit+0x17a/0x470
[80145.934321]  ? get_page_from_freelist+0xed8/0x1100
[80145.935088]  ip_queue_xmit+0x15/0x20
[80145.935643]  __tcp_transmit_skb+0xa94/0xc90
[80145.936301]  tcp_write_xmit+0x453/0x1350
[80145.936915]  __tcp_push_pending_frames+0x37/0x100
[80145.937657]  tcp_push+0xfc/0x100
[80145.938275]  tcp_sendmsg_locked+0xd36/0xe70
[80145.938949]  tcp_sendmsg+0x2d/0x50
[80145.939477]  inet_sendmsg+0x43/0x70
[80145.940057]  sock_sendmsg+0x5e/0x70
[80145.940599]  sock_write_iter+0x93/0xf0
[80145.941195]  new_sync_write+0x192/0x1b0
[80145.941787]  vfs_write+0x1ca/0x280
[80145.942427]  ksys_write+0xb1/0xe0
[80145.942976]  __x64_sys_write+0x1a/0x20
[80145.943598]  do_syscall_64+0x38/0x90
[80145.944165]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[80145.944964] RIP: 0033:0x7f3aa7b1e0af
[80145.945578] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 49 65 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2d 44 89 c7 48 89 44 24 08 e8 7c 65 f8 ff 48
[80145.948602] RSP: 002b:00007f3aa3ab0790 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[80145.949911] RAX: ffffffffffffffda RBX: 0000000000000091 RCX: 00007f3aa7b1e0af
[80145.951052] RDX: 0000000000000091 RSI: 00007f3a9c004113 RDI: 0000000000000007
[80145.952164] RBP: 00007f3a9c004030 R08: 0000000000000000 R09: 0000000000000008
[80145.953261] R10: 00007f3a9c004115 R11: 0000000000000293 R12: 00007f3a9c004113
[80145.954484] R13: 0000000000000091 R14: 00007f3a9c001f80 R15: 0000000000000000
[80145.955749] ---[ end trace 1122a337d9c74038 ]---

Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

vankoven and others added 17 commits November 8, 2022 14:32
For this purposes was extended Transfer-Encoding parser, while
we don't have complex parser logic we can use single parser
function for both headers. For h2 used separate function
Got rid of unnecessary error print during request body less
method check
and Content-Encoding at the same time

Such responses look suspicious, because Transfer-Encoding
other then chunked used very rare especially with
Content-Encoding in same request.
Responses to HTTP2 clients must not have connection-specific
headers such as Transfer-Encoding, therefore we drop
Transfer-Encoding from response by marking it "hop-by-hop"
and copy encodings except "chunked" from Transfer-Encoding to new
header Content-Encoding.

During caching we skip Transfer-Encoding and save Content-Encoding
to cache with value from Transfer-Encoding except "chunked",
i.e data in cache store in HTTP2 friendly fashion. That implies
in response from cache we will have Content-Encoding for both
protocols HTTP2 and HTTP1.

Also now we add "Content-Length" header instead of chunking body
when response dosn't have framing information and conneсtion was
terminated by server.
Choosed another strategy to strip chunks from body.
Now we don't fill *cut* pool as previously, just
flag each chunk and then remove it during caching
and http2 framing. Cut pool not used.

Chunked message for http1 clients not strip,
just forwards as is.
cutting data and parsig Content-Encoding

Also has been removed old test file.
Previous value was the same as TFW_STR_FULL_INDEX
that leads to confusing tfw_hpack_encode function.
As result all headers from trailer part was
considered by tfw_hpack_encode as hpack full
indexable.

Also has been fixed inappropriate usage of
TFW_STR_FULL_INDEX in http2 unit tests
The purpose of TFW_STR_CUT is flag body chunks
which must be stripped from chunked body.
Previously for this purposes was used TFW_STR_NAME
which is confusing.
…splitting

Added comments and TODO. Clarified existing comments.
What has been done:
 -tfw_cache_h2_add_hdr: Added index bound check in BUG_ON.
 -__cache_entry_size: Prettified using of assingment operator.
 -tfw_http_resp_copy_encodings: Changed return code on error.
 -tfw_http_resp_terminate: Now we try to cache response here.
 -__parse_check_encodings: Moved before using in macro.
 -__body_fixup_append: Rmoved unused argument.
 -tfw_perfstat_seq_show: Now socket buffers count not only for DEBUG.
 -__tfw_h2_make_frames has been reworked to function.
 -tfw_http_resp_copy_encodings now works with splitted
 TfwStr chunk. e.g ["gz", "ip"] will be copied as
 ["gzip"]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove chunked encoding on h1 -> h2 response transformation
7 participants