Streaming API #24

Open
Wolvan opened this issue Feb 10, 2020 · 8 comments
Labels
enhancement New feature or request

Comments

@Wolvan

Wolvan commented Feb 10, 2020

I am considering using this library with big files (read: archives >4GB). Is it possible to stream the output of a file extraction without storing it in memory? Otherwise I'll probably end up with multiple GB of RAM usage just to hold the data that the library extracted.

@nika-begiashvili
Owner

Do you have any specific API in mind? Should we just return chunks of typed arrays?

@Wolvan
Author

Wolvan commented Feb 14, 2020

Chunked typed arrays would work perfectly. I don't think that browsers have a standardized streaming interface, so just continuously returning the chunks (in order, of course) in a callback is a decent implementation.
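
A minimal sketch of the callback shape described above, in JavaScript. `extractInChunks`, its parameters, and the chunk size are hypothetical and not part of libarchive.js; the input `Uint8Array` merely stands in for data that a real decompressor would produce piece by piece.

```javascript
// Hypothetical API shape, NOT part of libarchive.js.
// `source` stands in for bytes a real decompressor would produce incrementally.
function extractInChunks(source, chunkSize, onChunk) {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  for (let offset = 0; offset < source.length; offset += chunkSize) {
    // subarray() returns a view over the same buffer, so no bytes are copied.
    const chunk = source.subarray(offset, offset + chunkSize);
    // Chunks are delivered strictly in order; `offset` lets the caller check that.
    onChunk(chunk, offset);
  }
}

// Usage: record the offset and size of each chunk as it arrives.
const seen = [];
extractInChunks(new Uint8Array(10), 4, (chunk, offset) => {
  seen.push([offset, chunk.length]);
});
console.log(seen); // three chunks: offsets 0, 4, 8 with sizes 4, 4, 2
```

Passing the offset alongside each chunk is one way to let the consumer confirm ordering without the producer keeping earlier chunks around.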

nika-begiashvili added the enhancement label Feb 14, 2020
@AndreiRinea

I would also be very happy to have this feature enhancement!

@amykhailovskyi

I'm just wondering if there are any plans to have it.

@nika-begiashvili
Owner

Unfortunately, I do not have this planned yet due to lack of time.

@nika-begiashvili
Owner

Revisiting this, I have a question about the use case: if there's a single large file, wouldn't it end up in RAM anyway, even if it's streamed in chunks? Unless it's streamed to the network right away, in which case it would make more sense to decompress on the server.
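
One case where streamed chunks would not pile up in RAM is a consumer that folds each chunk into a small running state and then drops it, such as a checksum or a search. The sketch below uses a 32-bit FNV-1a hash purely as an illustration; `createFnv1a` is a hypothetical helper, and nothing here is libarchive.js API.

```javascript
// Hypothetical consumer: folds chunks into a 32-bit FNV-1a hash.
// Only the hash state is retained between chunks, never the chunks themselves.
function createFnv1a() {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  return {
    update(chunk) {
      for (const byte of chunk) {
        hash ^= byte;
        // Math.imul keeps the multiplication within 32-bit integer arithmetic.
        hash = Math.imul(hash, 0x01000193); // FNV 32-bit prime
      }
    },
    digest() {
      return hash >>> 0; // reinterpret as an unsigned 32-bit integer
    },
  };
}

// Feeding the data in pieces gives the same result as feeding it whole.
const encoder = new TextEncoder();
const whole = createFnv1a();
whole.update(encoder.encode("hello world"));

const pieces = createFnv1a();
pieces.update(encoder.encode("hello "));
pieces.update(encoder.encode("world"));

console.log(whole.digest() === pieces.digest()); // true
```

A consumer that needs the whole file at once (for example, to build a single `Blob` in memory) would not benefit in the same way.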

@venkatd

venkatd commented Jan 13, 2024

Hi @nika-begiashvili we are interested in a streaming API.

We sometimes need to process 10GB+ files in the browser. We are only interested in a subset of the files in these archives, selected by a pattern (this subset is less than 1% of the overall size). Our use case would be to scan the archive to get a list of file paths, then selectively unarchive files that match the pattern.
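
The two-step flow described above (list paths first, then pick a subset by pattern) could be sketched as follows. The `{ path, size }` entry shape, the sample entries, and both helper functions are assumptions made for illustration; they do not reflect what libarchive.js returns.

```javascript
// Hypothetical listing, as a header-only scan of an archive might produce it.
// The { path, size } shape and these values are made up for the example.
const entries = [
  { path: "logs/2024-01-01.log", size: 4_000_000_000 },
  { path: "logs/2024-01-02.log", size: 5_900_000_000 },
  { path: "meta/manifest.json", size: 60_000_000 },
  { path: "meta/index.json", size: 40_000_000 },
];

// Step 1 result -> keep only the entries whose path matches the pattern.
function selectEntries(all, pattern) {
  return all.filter((entry) => pattern.test(entry.path));
}

// Share of the archive's uncompressed bytes that the selection covers.
function selectedFraction(all, selected) {
  const total = (list) => list.reduce((sum, e) => sum + e.size, 0);
  return total(selected) / total(all);
}

const wanted = selectEntries(entries, /^meta\/.*\.json$/);
console.log(wanted.map((e) => e.path)); // the two meta/*.json paths
console.log(selectedFraction(entries, wanted)); // 0.01
```

Only the entries in `wanted` would then be handed to the (not yet existing) selective extraction step, so the bulk of the archive never needs to be decompressed into memory.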

Is this something that is theoretically possible with the way libarchive is designed? We'd be willing to sponsor an improvement.

@nika-begiashvili
Owner

Yes, I think that should be possible, since a JavaScript File object can be read in chunks and libarchive does provide custom read callbacks, although it will need to call JavaScript functions from C.
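
As a rough illustration of the JavaScript half of that idea, a `File` (which is a `Blob`) can be read sequentially with `Blob.slice()` and `Blob.arrayBuffer()`. `readBlobInChunks` and `demo` are hypothetical names; the bridge from these asynchronous reads to libarchive's read callback in C is not shown and would need separate work.

```javascript
// Reads a Blob (or File) sequentially in fixed-size chunks.
// `readBlobInChunks` is a hypothetical helper, not libarchive.js API.
async function* readBlobInChunks(blob, chunkSize) {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    // arrayBuffer() resolves with the bytes of just this slice.
    const buffer = await blob.slice(offset, offset + chunkSize).arrayBuffer();
    yield new Uint8Array(buffer);
  }
}

// Usage with an in-memory Blob standing in for a user-selected File.
async function demo() {
  const blob = new Blob([new Uint8Array([1, 2, 3, 4, 5])]);
  const sizes = [];
  for await (const chunk of readBlobInChunks(blob, 2)) {
    sizes.push(chunk.length);
  }
  return sizes; // [2, 2, 1]
}

demo().then((sizes) => console.log(sizes));
```

Because each slice is awaited before the next one is requested, at most one chunk is held by the reader at a time.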


5 participants