BatchedParquetReader ignores with_columns #8374

Open
2 tasks done
sarthak-sehgal opened this issue Apr 20, 2023 · 0 comments · May be fixed by #16244
Labels
bug (Something isn't working), needs triage (Awaiting prioritization by a maintainer), rust (Related to Rust Polars)

Comments

@sarthak-sehgal

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

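BatchedParquetReader accepts the projection argument but ignores the columns the user sets via ParquetReader.with_columns. As a result, a reader built with with_columns and then batched returns every column in the file. It might be a good idea to derive the projection from the columns, as done here.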
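To make the suggestion concrete, here is a minimal sketch of what "get projection from columns" could look like: requested column names are looked up in the file schema and turned into positional indices. This is not the Polars implementation. The helper names `columns_to_projection` and `resolve_projection`, the plain `&[&str]` schema, and the rule that an explicit projection takes precedence over column names are all assumptions made for illustration.

```rust
/// Hypothetical helper (not a Polars API): map requested column names to
/// their positional indices in the schema, preserving the requested order.
fn columns_to_projection(schema: &[&str], columns: &[String]) -> Result<Vec<usize>, String> {
    columns
        .iter()
        .map(|name| {
            schema
                .iter()
                .position(|field| *field == name.as_str())
                .ok_or_else(|| format!("column '{}' not found in schema", name))
        })
        .collect()
}

/// Hypothetical helper (not a Polars API): decide which projection a batched
/// reader should use.
/// Assumption: an explicit projection wins; otherwise the column names are
/// resolved against the schema; `None` means "read all columns".
fn resolve_projection(
    schema: &[&str],
    projection: Option<Vec<usize>>,
    columns: Option<Vec<String>>,
) -> Result<Option<Vec<usize>>, String> {
    match (projection, columns) {
        (Some(indices), _) => Ok(Some(indices)),
        (None, Some(names)) => columns_to_projection(schema, &names).map(Some),
        (None, None) => Ok(None),
    }
}
```

With a schema of col0, col1, col2, resolving the columns ["col1"] would give the projection [1], which is what the batched reader would then need to honor. A name that is not in the schema is reported as an error rather than silently dropped.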

If this is a bug rather than a design decision, I am happy to open a PR for it. I would like to start contributing to Polars, and this seems like a good entry point!

Reproducible example

Works as expected (only the first column is selected):

let pq_reader = ParquetReader::new(input_file)
        .with_projection(Some(vec![0]))
        .batched(chunk_size)
        .unwrap();

Selects all columns (with_columns is ignored):

let pq_reader = ParquetReader::new(input_file)
        .with_columns(Some(vec![String::from("col1")]))
        .batched(chunk_size)
        .unwrap();

Expected behavior

BatchedParquetReader should return only the columns specified via with_columns.

Installed versions

v0.27.2 and v0.28.0

parquet
@sarthak-sehgal added the bug and rust labels Apr 20, 2023
@sarthak-sehgal changed the title from "BatchedParquetReader ignore with_columns" to "BatchedParquetReader ignores with_columns" Apr 21, 2023
@stinodego added the needs triage label Jan 13, 2024
@cyc4188 linked a pull request May 15, 2024 that will close this issue