Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong handling of . with namespace packages #261

Closed
kolypto opened this issue Nov 24, 2020 · 11 comments · Fixed by #264
Closed

Wrong handling of . with namespace packages #261

kolypto opened this issue Nov 24, 2020 · 11 comments · Fixed by #264

Comments

@kolypto
Copy link

kolypto commented Nov 24, 2020

We have a namespace package:

# this is a namespace package
import pkg_resources
pkg_resources.declare_namespace(__name__)

which is incorrectly handled by this backport.

Prepared('first.second').exact_matches
# Must give:
['first.second.dist-info', 'first.second.egg-info']
# Actually gives:
['first_second.dist-info', 'first_second.egg-info']

As a result, this search fails:

list(FastPath('/opt/first.second').search(Prepared('first.second')))
# Must give:
[PosixPath('/opt/first.second/first.second.egg-info')]
# Actually gives:
[]

as a result, our build broke as soon as this package was updated :)

Please revert this change: dbdef6b . It should not replace . with _!

In Cpython, it's still like it used to be: https://github.com/python/cpython/blob/master/Lib/importlib/metadata.py#L478

@kolypto
Copy link
Author

kolypto commented Nov 24, 2020

This is the error that I was getting:

importlib_metadata.PackageNotFoundError: No package metadata was found for first.second

It worked with normal packages; it fails for namespace packages.

@kolypto
Copy link
Author

kolypto commented Nov 24, 2020

Here's how to reproduce this error:

$ mkdir bugtest   && cd bugtest
$ pyenv local 3.7.9
$ python -m venv venv 
$ . ./venv/bin/activate
$ pip install importlib_metadata
Successfully installed importlib-metadata-3.1.0 zipp-3.4.0
$ mkdir -p first/second
$ echo '__import__("pkg_resources").declare_namespace(__name__)' > first/__init__.py
$ echo '' > first/second/__init__.py
$ python
>

Now in Python:

import first.second  # works

Try importlib_metadata:

from importlib_metadata import Distribution

Distribution.from_name('first.second')
# importlib_metadata.PackageNotFoundError: No package metadata was found for first.second

this is what fails:

from importlib_metadata import Prepared, FastPath

list(FastPath('./').search(Prepared('first.second')))  # -> 
[]

try deeper:

from importlib_metadata import Prepared

Prepared('first.second').exact_matches  # -> 
['first_second.dist-info', 'first_second.egg-info']

so you see that it's the . -> _ conversion that ruins the whole thing :)

@analog-cbarber
Copy link

Yes, this is a pretty serious bug. It is going to break any project using namespace packages that use this package.

@FFY00
Copy link
Member

FFY00 commented Nov 29, 2020

@jaraco I assume you want this behavior in new versions? If so I can open a PR that only ops-in to normalization in Python >=3.10.

@FFY00
Copy link
Member

FFY00 commented Nov 29, 2020

Well, actually, I am not sure if that is the correct approach as people might want to use this because it backports the new behavior. Let me know what you think.

@jaraco
Copy link
Member

jaraco commented Dec 1, 2020

I don't believe the issue is with namespace packages, but with any package whose name has a . in its name. In #253, we attempted to create a backward-compatible change that would also honor the mangled name on the file system. During that PR, I added tests in an attempt to guarantee compatibility, but it seems I've missed that expectation.

@jaraco
Copy link
Member

jaraco commented Dec 1, 2020

Here's how to reproduce this error:

I don't believe this repro is correct. Importlib_metadata operates on package metadata and not on packages themselves. Creating a Python package called first.second won't create any metadata. To create the metadata, you need a .dist-info file that matches the distribution name of the package (which could be first.second).

It's this test that's intended to capture the legacy expectation. In that test, the fixture ensures that foo.bar-1.0.0.dist-info is on the sys.path, then loads that metadata using various different forms of the name. It uses distribution(name), which is the same as Distribution.from_name(name).

Prepared('first.second').exact_matches
# Must give:
['first.second.dist-info', 'first.second.egg-info']
# Actually gives:
['first_second.dist-info', 'first_second.egg-info']

Here, exact_matches is intended to match the name exactly as it will appear when normalized (based in the specification). So I think it's reasonable to see this output.

I think I see the issue now. The issue is that the compatibility fallback is looking for a name with a version but will miss metadata directories without a version.

jaraco added a commit that referenced this issue Dec 1, 2020
…hasn't been updated to the new normalization technique. Ref #261.
jaraco added a commit that referenced this issue Dec 1, 2020
…hasn't been updated to the new normalization technique. Ref #261.
@jaraco
Copy link
Member

jaraco commented Dec 1, 2020

I've found a fix for the issue, but I'm pretty sure the fix, although it passes all the tests, will break package discovery. This code is so hairy and hyper-optimized that this one tweak has really mucked things up.

@jaraco
Copy link
Member

jaraco commented Dec 2, 2020

In 4098b51, I've captured the breakage with package discovery with a test and addressed the concern.

I still have a concern that now legacy egg discovery is no longer accurate.

@FFY00
Copy link
Member

FFY00 commented Dec 2, 2020

You mean this? Seems to be working

>>> import importlib_metadata
>>> importlib_metadata.Distribution.from_name('jaraco.classes')._path
PosixPath('/usr/lib/python3.8/site-packages/jaraco.classes-3.1.0-py3.8.egg-info')
>>> importlib_metadata.Distribution.from_name('jaraco_classes')._path
PosixPath('/usr/lib/python3.8/site-packages/jaraco.classes-3.1.0-py3.8.egg-info')

@jaraco
Copy link
Member

jaraco commented Dec 2, 2020

I meant this case for legacy .egg files/directories with an EGG-INFO file in it.

I addressed that concern in 6036a37 (after some refactoring).

I didn't bother writing tests for it as I'm unsure how important that feature is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants