
Added support for streaming multipart decoding #222

Open · wants to merge 1 commit into master

Conversation

tzickel (Contributor) commented Jun 16, 2018

This is feature complete and should parse the stream like the normal MultipartDecoder class (and passes its tests).

The added benefit over the normal API is better memory use and time savings on large inputs, provided you can handle the data as a stream. The normal code reads all of the stream chunks into memory, joins them into one copy (then discards the chunks), and then splits the data into parts, while still retaining the original copy inside the request itself unless you explicitly delete the request right after using MultipartDecoder.
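To make the memory contrast concrete, here is a stdlib-only sketch of the two reading strategies (the helper names are mine, not part of this PR or requests_toolbelt):

```python
import io

def read_all(stream):
    # Buffer-everything strategy: peak memory is at least the full
    # body size, before any splitting into parts even begins.
    return stream.read()

def iter_chunks(stream, chunk_size=8192):
    # Streaming strategy: hand back fixed-size chunks as they arrive,
    # so peak memory stays around chunk_size regardless of body size.
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk

body = io.BytesIO(b'x' * 100_000)
total = sum(len(chunk) for chunk in iter_chunks(body, chunk_size=4096))
print(total)  # 100000
```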

```python
import requests
from requests_toolbelt import MultipartStreamDecoder

r = requests.get('some multipart result url', stream=True)  # Output needs to be streamable
with MultipartStreamDecoder.from_response(r) as decoder:
    for part in decoder:
        print(part.headers)
        for stream in part:
            print(stream)
        # print(part.content)  # Read comment below
```
  • The context manager ensures the input stream is depleted on an exception or when it is not fully read (mainly useful if you want to reuse the socket with HTTP keep-alive).
  • StreamingPart (part in the example) lets you either stream the part or use part.content or part.text to get it all at once, like before.
  • If you use MultipartStreamDecoder.from_response, you can pass chunk_size to set how much data to try to read per iteration.
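The depletion behaviour described in the first bullet can be sketched with a stdlib-only context manager (DrainOnExit is a hypothetical name for illustration, not the PR's class):

```python
import io

class DrainOnExit:
    """Read and discard whatever remains of a stream on exit, so the
    underlying socket could be reused (e.g. for HTTP keep-alive)."""

    def __init__(self, stream, chunk_size=8192):
        self._stream = stream
        self._chunk_size = chunk_size

    def __enter__(self):
        return self._stream

    def __exit__(self, exc_type, exc, tb):
        # Deplete the remainder even if the caller stopped reading
        # early or an exception was raised; do not suppress it.
        while self._stream.read(self._chunk_size):
            pass
        return False

stream = io.BytesIO(b'x' * 10_000)
with DrainOnExit(stream) as s:
    s.read(100)  # the caller reads only part of the body
print(stream.read())  # b'' -- the rest was drained on exit
```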

tzickel (Contributor, Author) commented Jun 21, 2018

https://gist.github.com/tzickel/4a81503acdb843dab4f03cfe950e84f3

This is a benchmark for this code that shows a potential use case. You can adjust the data size and chunk size at the end and compare the timings of the two versions (peak memory measurement is much trickier and depends on your OS). In your own use case you might write the big part to disk instead of keeping it in memory as done here, in which case the peak memory used by this code will be at most about chunk_size.
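The disk-writing variant mentioned above can be sketched with stdlib pieces only (the function names are mine; real code would iterate a StreamingPart instead of this stand-in chunk source):

```python
import io
import os
import tempfile

def iter_chunks(stream, chunk_size=4096):
    # Stand-in for iterating a streamed part.
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk

def spool_to_disk(chunks, path):
    # Write each chunk as it arrives: only one chunk is ever held in
    # memory, so peak usage is about chunk_size, not the part size.
    written = 0
    with open(path, 'wb') as f:
        for chunk in chunks:
            f.write(chunk)
            written += len(chunk)
    return written

source = io.BytesIO(os.urandom(50_000))
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    print(spool_to_disk(iter_chunks(source), path))  # 50000
    print(os.path.getsize(path))  # 50000
finally:
    os.remove(path)
```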
