Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add brain to fix tensorflow import errors. #2174

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

hmc-cs-mdrissi
Copy link

@hmc-cs-mdrissi hmc-cs-mdrissi commented May 9, 2023

Summary

Fix the import false positives with tensorflow. I was guided by this comment, although the implementation is rather different as that comment was for much older tensorflow version.

Type of Changes

Type
βœ“ πŸ› Bug fix

Description

Closes pylint issue.

I've manually tested fix on tensorflow 2.5, 2.8, and 2.12 latest for tensorflow.estimator which gives import-error without this fix, but imports fine with it. 2.5 was released about 2 years ago (May 2021) and is newest version to support python 3.9. If you go further back eventually this solution may not work as packaging/import magic has evolved for tensorflow. However even tf 2.5 is already too old to be compatible with pylint. tf 2.5 has upper bound on typing extensions <4 making it depenency conflict to try to install it + pylint. So I think working that far back is long enough. Also as this only adds a fallback for import-errors even if user installs a very old tf version like 1.10 and ignores dependency conflicts this will not worsen their experience.

Most tensorflow imports go through tensorflow.python so this adds fallback to try that. You can see that happening here.

Testing

How should I test this? Should I? It's fairly short simple brain. If it's fine to add tensorflow as a dependency to requirements_full similar to numpy I can add it there and try to add a test case similar to the ones found in test/brain.

@DanielNoord
Copy link
Collaborator

I have not had a look at this PR, but was looking into the brains recently for performance implications.

What about putting this into a pylint-tensorflow plugin? The issue with these brains is that they are loaded for all users of pylint even if they are not using tensorflow. That will obviously be a considerable group.
This is very much out of the scope of your PR @hmc-cs-mdrissi but I think we should consider this before adding more brains for stuff outside of the stdlib.

@codecov
Copy link

codecov bot commented May 9, 2023

Codecov Report

Merging #2174 (7a7a582) into main (351c8de) will decrease coverage by 0.02%.
The diff coverage is 80.00%.

❗ Current head 7a7a582 differs from pull request most recent head 0411505. Consider uploading reports for the commit 0411505 to get more accurate results

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2174      +/-   ##
==========================================
- Coverage   92.60%   92.59%   -0.02%     
==========================================
  Files          94       95       +1     
  Lines       10800    10810      +10     
==========================================
+ Hits        10001    10009       +8     
- Misses        799      801       +2     
Flag Coverage Ξ”
linux 92.34% <80.00%> (-0.02%) ⬇️
pypy 87.53% <80.00%> (-0.01%) ⬇️
windows 92.18% <80.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Ξ”
astroid/brain/brain_tensorflow.py 80.00% <80.00%> (ΓΈ)

@hmc-cs-mdrissi
Copy link
Author

hmc-cs-mdrissi commented May 9, 2023

Hmm I'm fine with that. There is no pylint-tensorflow package today though so it would need to be created/uploaded to pypi first. It's easier on my side to maintain this one file vs add a repository/package to maintain pylint-tensorflow (probably skipping setting up CI/related things). I'd also guess visibility of pylint-tensorflow will be low relative to tensorflow and most users won't install it. If you decide (entirely fair) to want this not in astroid proper I'm fine closing this pr.

I think this particular brain is about as cheap performance wise as can be though. It adds one import-error fallback that checks for exact module name of tensorflow and exits if any other name.

edit: One other idea is could astroid have extras and there be astroid[tensorflow] with small astroid-tensorflow package here?

@Pierre-Sassoulas
Copy link
Member

The issue with these brains is that they are loaded for all users of pylint even if they are not using tensorflow.

It's easier on my side to maintain this one file vs add a repository/package to maintain pylint-tensorflow (probably skipping setting up CI/related things).

Great points. I did not check specifically how much time was taken in startup to load brain but everything count and adds up on each invocation of pylint. We probably need to create a pylint-dev/pylint-x project for each of our brains if we want plugin to be a reasonable solution for contributors and get better performance overall. This could start with a cookie cutter template for brain-plugins, then a migration of our brains to their own project. We also need to warn users that they will need to install those plugins in the future. Seems like the path forward long term tbh, but it's not something that is fast and doable overnight.

So meanwhile we should merge brains like we always did so astroid is still useful for tensorflow users, right ?

@Pierre-Sassoulas Pierre-Sassoulas added Enhancement ✨ Improvement to a component Brain 🧠 Needs a brain tip labels May 9, 2023
@Pierre-Sassoulas Pierre-Sassoulas added this to the 3.0.0a3 milestone May 9, 2023
@hmc-cs-mdrissi
Copy link
Author

I think my main open question left is recommendations for testing this. Is it fine to add tensorflow here for full test dependencies and then add a few test cases checking it works for imports/still returns an import error for a fake module?

@Pierre-Sassoulas
Copy link
Member

For any other lib I would say to add it like we did for numpy, urllib, etc. But tensorflow is like 400 mb to download, right ?

@hmc-cs-mdrissi
Copy link
Author

Yes. It is a large wheel and not an ideal dependency for 1 test case.

I'm fine maintaining it without test here as I will use this for my own internal codebase (~30k lines) where pylint is enforced in CI and will run on a lot of tensorflow imports.

@Pierre-Sassoulas
Copy link
Member

I'm considering creating a proof of concept project in pylint-dev for the tensorflow plugin (we need autoregistry it when the package is installed ? I'm not sure of the specific, but there will be some rough edges to smooth). A lot of person could benefit from your contribution. In short if we don't do it for tensorflow we're never going to start.

@hmc-cs-mdrissi
Copy link
Author

I am fine with that. If you make a initial repository pylint-tensorflow even if minimal at first (manual pypi release + copy over setup.cfg settings) I'll move this pr there and help with initial setup files.

@Pierre-Sassoulas
Copy link
Member

Pierre-Sassoulas commented May 9, 2023

So I created a repository (https://github.com/pylint-dev/pylint-tensorflow) added you as maintainer and took the namespace in pypi (https://pypi.org/project/pylint-tensorflow/). This is still very rudimentary but let's start small and add what we need when we need it (release not operational at the moment I need to add a pypi API key). There's a ton of thing that could be done better like using orgs label, and probably a million other thing, I'm thinking of using this to see what need to be done to make pylint plugin maintenance easier in the long term.

@crzdg
Copy link

crzdg commented May 10, 2023

Just a user of pylint and tensorflow here. I just ran into the problem with pylint throwing import-errors as reported. This discussion looks like you work on a solution for that. However, I also ran into the import-error problem for tensorflow-serving-api-package. Maybe the plugin also could incorporate the handle of those false positives as well. Would be great if there is a solution available somewhat covering the landscape of tf-packages.

However, thanks a lot for tackling this. This definitely will make it easier to write clean tensorflow code.

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can now release in https://github.com/pylint-dev/pylint-tensorflow. I wonder if this MR should add an extra require of pylint-tensorflow that would register the proper transform automatically when the package is available. The issue of course is that it introduce a circular import. Maybe it's better if users have to install pylint-tensorflow on their own and pylint-tensorflow depends on astroid not vice versa. Or we could consider that astroid -> pylint-tensorflow -> astroid[tensorflow] is not a circular dependency.

@DanielNoord
Copy link
Collaborator

We can now release in https://github.com/pylint-dev/pylint-tensorflow. I wonder if this MR should add an extra require of pylint-tensorflow that would register the proper transform automatically when the package is available. The issue of course is that it introduce a circular import. Maybe it's better if users have to install pylint-tensorflow on their own and pylint-tensorflow depends on astroid not vice versa. Or we could consider that astroid -> pylint-tensorflow -> astroid[tensorflow] is not a circular dependency.

Let's let pylint-tensorflow depend on pylint, that fixes the astroid dependency automatically and couples the plugin to what it refers to.

@Pierre-Sassoulas
Copy link
Member

Let's let pylint-tensorflow depend on pylint, that fixes the astroid dependency automatically and couples the plugin to what it refers to.

That would mean astroid can't be used for tensorflow parsing alone without a dependency to pylint (simply to get the tensorflow brain). I think astroid -> pylint-tensorflow -> astroid[tensorflow] or having to add pylint-tensorflow explicitly is more acceptable than that.

@DanielNoord
Copy link
Collaborator

Let's let pylint-tensorflow depend on pylint, that fixes the astroid dependency automatically and couples the plugin to what it refers to.

That would mean astroid can't be used for tensorflow parsing alone without a dependency to pylint (simply to get the tensorflow brain). I think astroid -> pylint-tensorflow -> astroid[tensorflow] or having to add pylint-tensorflow explicitly is more acceptable than that.

Why would you ever want only astroid and pylint-tensorflow? Shouldn't it be astroid-tensorflow then?

@Pierre-Sassoulas
Copy link
Member

Hmm, right, we might want to mix astroid-x and pylint-x for simplicity's sake.

@DanielNoord
Copy link
Collaborator

Hmm, right, we might want to mix astroid-x and pylint-x for simplicity's sake.

I think it is very likely that people will also start asking about ignoring specific messages for specific constructs after we have made such a package. I think one cookie cutter way of doing these plugins is probably best. I don't see a lot of value of using astroid on its own, especially in combination with a plugin, so I think it is fine to have all plugins be couple to pylint. If we ever see a need for astroid specific plugins we can fix that then.

@Pierre-Sassoulas
Copy link
Member

OK so this MR should be closed in favor of something in pylint-tensorflow, and user should know they have to install pylint-tensorflow (so probably needs a doc MR in pylint once there's actually something in pylint-tensorflow), right ?

@DanielNoord
Copy link
Collaborator

OK so this MR should be closed in favor of something in pylint-tensorflow, and user should know they have to install pylint-tensorflow (so probably needs a doc MR in pylint once there's actually something in pylint-tensorflow), right ?

Yes! We can keep a list of "supported" pylint plugins somewhere in our docs.

@hmc-cs-mdrissi
Copy link
Author

I’m following discussion and will try to close and reopen the pr in new repo today. May take me a day or two though depending on how busy I am with work.

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 3.0.0a3, 3.0.0a4 May 14, 2023
@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 3.0.0a4, 3.0.0a5 Jun 6, 2023
@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 3.0.0a5, 3.0.0a6 Jun 13, 2023
@jacobtylerwalls jacobtylerwalls added the Blocked 🚧 A PR or issue blocked by another PR or issue label Jun 19, 2023
@jacobtylerwalls jacobtylerwalls modified the milestones: 3.0.0a6, 3.0.0b0 Jul 4, 2023
@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 3.1.0, 3.2.0 Feb 4, 2024
@jacobtylerwalls jacobtylerwalls modified the milestones: 3.2.0, 3.3.0 May 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Blocked 🚧 A PR or issue blocked by another PR or issue Brain 🧠 Needs a brain tip Enhancement ✨ Improvement to a component
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants