CSV compression only change shortcut #96

Open
jonkeane opened this issue Sep 23, 2022 · 0 comments

For CSVs (and later for JSON files, once we support them), if the only thing being changed is the compression, we can take a shortcut: compress or decompress the files directly instead of routing them through a pyarrow dataset.

Doing this ends up being slightly tricky because reading lines and gzipping them in Python isn't particularly fast, so we should try to find an approach that avoids that (or we could rely on system utilities if they exist...)
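
One possible shape for this, as a rough sketch only: prefer the system `gzip` binary when it is on the PATH, and otherwise stream raw bytes in large blocks so nothing is read line by line. The names `gzip_file`, `gunzip_file`, and `CHUNK_SIZE` are hypothetical and not part of the project, and the speed of either path has not been measured here.

```python
import gzip
import shutil
import subprocess
from pathlib import Path

# Block size for the pure-Python path; 1 MiB is an arbitrary illustrative choice.
CHUNK_SIZE = 1024 * 1024


def gzip_file(src: Path, dst: Path) -> None:
    """Compress src into dst as gzip without parsing the CSV contents."""
    if shutil.which("gzip"):
        # A system utility exists, so let it do the compression.
        with open(dst, "wb") as out:
            subprocess.run(["gzip", "-c", str(src)], stdout=out, check=True)
        return
    # Fallback: copy raw bytes in large blocks instead of iterating over lines.
    with open(src, "rb") as raw, gzip.open(dst, "wb") as out:
        shutil.copyfileobj(raw, out, CHUNK_SIZE)


def gunzip_file(src: Path, dst: Path) -> None:
    """Decompress a gzip file into dst, again as a plain byte stream."""
    with gzip.open(src, "rb") as comp, open(dst, "wb") as out:
        shutil.copyfileobj(comp, out, CHUNK_SIZE)
```

Because both functions treat the file as opaque bytes, the same approach would apply to JSON files unchanged.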
