MNT Deal with pandas 2.0 vs 2.1 #1299

Open
riedgar-ms opened this issue Sep 29, 2023 · 6 comments
Labels
dependencies Pull requests that update a dependency file

Comments

@riedgar-ms
Member

riedgar-ms commented Sep 29, 2023

In pandas v2.1, a new DataFrame.map() API was added and the previous DataFrame.applymap() was deprecated at the same time. Since our builds fail on warnings, this causes problems wherever the latest pandas cannot be installed (notably for Python 3.8, but also in some of our 'otherml_packages' tests).

PR #1302 squares this particular circle by rewriting the handful of DataFrame.applymap() calls as nested DataFrame.apply() calls. This is not going to be particularly performant, but the DataFrames in question should be relatively small.

This issue is to track the nested 'apply' calls, so they can be improved once we drop pandas v2.0.
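For readers unfamiliar with the pattern, here is a minimal sketch of what a nested 'apply' replacement for applymap() can look like. The helper names `elementwise` and `elementwise_when_available` are hypothetical and are not taken from PR #1302; the second helper only illustrates one possible way to prefer DataFrame.map() when the installed pandas provides it.

```python
import pandas as pd


def elementwise(df, func):
    """Apply func to every cell of df without calling DataFrame.applymap()."""
    # The outer apply walks the columns; the inner Series.apply walks the
    # cells of each column. No deprecated API is touched.
    return df.apply(lambda col: col.apply(func))


def elementwise_when_available(df, func):
    """Hypothetical variant: use DataFrame.map() if this pandas has it."""
    if hasattr(df, "map"):  # present from pandas 2.1 onwards
        return df.map(func)
    return elementwise(df, func)


df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
doubled = elementwise(df, lambda x: x * 2)
print(doubled)
```

The nested form makes one Python-level call per cell, which is why it is only reasonable for small DataFrames.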

Note: This has been edited, so some of the discussion below may be obsolete.

@riedgar-ms riedgar-ms added the dependencies Pull requests that update a dependency file label Sep 29, 2023
@adrinjalali
Member

The issue with not handling this now is that downstream libraries are getting the FutureWarning and their CI fails because of it (I work on one of those libraries).

@riedgar-ms
Member Author

Sorry.... FutureWarning from what? pandas?

@adrinjalali
Member

Yes, the warning is propagated and it's making my CI fail in skops.
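To illustrate why a propagated FutureWarning can break a strict CI run, here is a rough sketch using only the standard `warnings` module. The function `legacy_call` is a hypothetical stand-in for any library call that emits the deprecation; it is not code from this project or from skops, and the message text is made up for the example.

```python
import warnings


def legacy_call():
    # Hypothetical stand-in for a library function that hits a deprecated API.
    warnings.warn("applymap is deprecated", FutureWarning, stacklevel=2)
    return "ok"


def run_strict(fn):
    """Run fn with FutureWarning promoted to an exception, as a strict CI might."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", FutureWarning)
        try:
            return fn()
        except FutureWarning as exc:
            return f"failed: {exc}"


def run_filtered(fn):
    """Same strict setup, but with one known warning explicitly ignored."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", FutureWarning)
        # Filters added later are checked first, so this narrower rule wins.
        warnings.filterwarnings(
            "ignore", message=".*applymap.*", category=FutureWarning
        )
        return fn()


print(run_strict(legacy_call))
print(run_filtered(legacy_call))
```

A targeted ignore like the one in `run_filtered` is a stopgap a downstream project could use until the upstream call is replaced; it does not remove the warning at its source.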

@riedgar-ms
Member Author

Per my comment in Discord, my proposal is to keep pandas 2.0 for now (possibly older; one of my follow-ups is going to be working out what our true lower bounds are), and release v0.10 in the New Year. We would then move to pandas 2.1 after that release is out, and retain this issue to track the need for that update.

@adrinjalali
Member

So when would people stop getting the future warnings coming from pandas? Are you saying we'd fix those warnings somehow in the next release, or the one after? If the one after, it'd mean people would get those warnings for quite a long time from our side.

@riedgar-ms riedgar-ms mentioned this issue Oct 12, 2023
7 tasks
riedgar-ms added a commit that referenced this issue Oct 16, 2023
## Description

The recent `pandas` 2.1 release simultaneously introduced a new `DataFrame.map()` API and deprecated the `DataFrame.applymap()` API, which we use in `MetricFrame`. Since our builds fail on warnings, this left us with a conundrum. This PR (temporarily) solves it by changing the `DataFrame.applymap()` calls to a nested `DataFrame.apply()` pattern. This is not particularly great, but the pattern should not be used in performance-critical portions of `MetricFrame`, and it can suffice until we decide to require `pandas` to be at least version 2.1.

Also make sure that Python 3.7 is fully excised and add 3.11 to the regular builds and setup.

Issue #1299 exists to track the need to tidy all of this up once we drop `pandas` v2.0 support.

## Tests

- [ ] no new tests required
- [ ] new tests added
- [x] existing tests adjusted

## Documentation

- [x] no documentation changes needed
- [ ] user guide added or updated
- [ ] API docs added or updated
- [ ] example notebook added or updated
@riedgar-ms
Member Author

And for extra fun, pandas 2.2 is now out and wreaking its havoc. Some fixes are in #1351.

@riedgar-ms riedgar-ms mentioned this issue Feb 13, 2024
7 tasks