DOC: Clarify allowed values for on_bad_lines in read_csv #58662

wjandrea · 2024-05-09T22:00:05Z

Move the callable options out of the version added/changed tags and improve the flow.

~~[ ] closes #xxxx (Replace xxxx with the GitHub issue number)~~
~~[ ] Tests added and passed if fixing a bug or adding a new feature~~
All code checks passed.
~~[ ] Added type annotations to new arguments/methods/functions.~~
~~[ ] Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.~~

Please check my work to make sure I got the details right. One thing I'm unsure about is whether "process" is the right term for the PyArrow `invalid_row_handler`, since all it does is return "skip" or "error".

BTW, `invalid_row_handler` isn't mentioned by name, so at first I thought "signature as described in pyarrow documentation" was referring to the signature of the `ParseOptions` constructor.

Also, the description should probably state explicitly what happens when the function returns a list of strings with the right number of elements (with `engine='python'`). I haven't used it myself, but I'm inferring that the list gets forwarded to the parser that turns a table of strings into a DataFrame.
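A minimal sketch for checking that inference, assuming a pandas version that accepts a callable for `on_bad_lines` (1.4 or later). The sample data and the helper name `keep_first_two` are hypothetical and only serve the illustration; this is not taken from the pandas docs or test suite.

```python
import io

import pandas as pd

# Hypothetical sample: the second data row has one field too many.
data = "a,b\n1,2\n3,4,5\n6,7\n"


def keep_first_two(bad_line):
    # With engine="python", bad_line arrives as a list of strings,
    # e.g. ["3", "4", "5"]. Returning a list with the expected number
    # of elements keeps the row; returning None would drop it.
    return bad_line[:2]


df = pd.read_csv(
    io.StringIO(data),
    engine="python",  # callables need the Python engine here
    on_bad_lines=keep_first_two,
)
print(df)
print(df.dtypes)
```

If the inference holds, the repaired row appears in the frame alongside the good rows, and the printed dtypes show whether its string values went through the usual type conversion.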

mroeschke · 2024-05-10T17:01:24Z

Thanks @wjandrea

wjandrea added 3 commits May 9, 2024 17:45

Clarify allowed values for on_bad_lines in read_csv

c8c5958

Move the callable options out of the version added/changed tags and improve the flow.

typo

c3b6c9e

space before colon

trim trailing whitespace

614ffd1

mroeschke approved these changes May 10, 2024

mroeschke added the Docs label May 10, 2024

mroeschke added this to the 3.0 milestone May 10, 2024

mroeschke merged commit e67241b into pandas-dev:main May 10, 2024
51 checks passed

wjandrea deleted the patch-1 branch May 10, 2024 17:58

wjandrea mentioned this pull request May 10, 2024

Clarify on_bad_lines PyArrow in read_csv #58666

Merged

1 task
