Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InvalidRequirement produced for git URL in extra #357

Closed
hauntsaninja opened this issue Dec 15, 2021 · 10 comments · Fixed by #358
Closed

InvalidRequirement produced for git URL in extra #357

hauntsaninja opened this issue Dec 15, 2021 · 10 comments · Fixed by #358

Comments

@hauntsaninja
Copy link
Contributor

hauntsaninja commented Dec 15, 2021

Originally filed at pypa/packaging#494

mkdir reqparse
cd reqparse

cat > setup.py <<HERE
from setuptools import setup

setup(
    name="reqparse",
    extras_require={
        "test": [
            "gym-retro @ git+https://github.com/openai/retro.git@v0.8.0",
        ]
    },
)
HERE

python3 -m venv env
source env/bin/activate
python3 -m pip install packaging -e .

cat reqparse.egg-info/requires.txt

# this now gives me
# packaging.requirements.InvalidRequirement: Parse error at "'extra =='": Expected string_end
python3 -c '
import importlib.metadata
from packaging.requirements import Requirement
Requirement(importlib.metadata.distribution("reqparse").requires[0])
'

This results in the equivalent of:

from packaging.requirements import Requirement
Requirement('gym-retro@ git+https://github.com/openai/retro.git@v0.8.0; extra == "test"')

uranusjr suggested adding a space before the semicolon here: https://github.com/python/cpython/blob/908fd691f96403a3c30d85c17dd74ed1f26a60fd/Lib/importlib/metadata/__init__.py#L678

Happy to make a PR for that — if so, should I also make a PR to CPython, or is there a separate process for porting?

@FFY00
Copy link
Member

FFY00 commented Dec 15, 2021

Happy to make a PR for that — if so, should I also make a PR to CPython, or is there a separate process for porting?

Please do. The fix will be committed here and then synced to CPython later.

@jaraco
Copy link
Member

jaraco commented Dec 15, 2021

Would someone explain what the problem is and why adding the space corrects that condition?

@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Dec 15, 2021

The issue is that importlib.metadata.requires produces requirements that cause packaging.requirements.Requirement to raise an error: packaging.requirements.InvalidRequirement: Parse error at "'extra =='": Expected string_end

Grammar stuff is tricky, but I thiiiiink packaging's behaviour is compliant to the grammar outlined in PEP 508. What I think is happening is that the semicolon gets parsed as part of the urlspec (probably because semicolon is included in pchar in the grammar), so we're not able to match quoted_marker? in (name:n wsp* extras?:e wsp* urlspec:v (wsp+ | end) quoted_marker?. By adding a whitespace, we ensure this isn't the case, since that causes the parser to recognise the end of urlspec.

@pradyunsg
Copy link
Member

pradyunsg commented Dec 16, 2021

In other more different reductive words, you need a space before the semicolon, because this is one of the paper cuts in PEP 508's URLs. 🤷🏻‍♂️

@uranusjr
Copy link
Contributor

uranusjr commented Dec 16, 2021

What I think is happening is that the semicolon gets parsed as part of the urlspec (probably because semicolon is included in pchar in the grammar)

You're correct. Semicolon is a valid character in a URL, so the PEP 508 parser's behaviour is correct to include it in the urlspec run. You could argue that the rules should be smarter to accomplish this punctuation-before-space case (most Markdown parsers implement this so you can stick a comma right after a URL and get the desired rendering result), but adding spaces around the URL is the safest format for tools to emit.

@jaraco
Copy link
Member

jaraco commented Dec 16, 2021

Thanks for the deeper explanation. That's exactly what I was looking for. I also looked at the grammar and the complete grammar. Interestingly, both supply different definitions for url_req. The complete grammar provides a better definition because it doesn't require whitespace at the end of a url_req whereas the grammar does.

Either way, the spec does require the space between the urlspec and the ;, so importlib_metadata is indeed incorrectly formatting the req.

@jaraco
Copy link
Member

jaraco commented Dec 16, 2021

I do observe, though, that name_req does not require the space. The space is only required if a url_req is used.

@jaraco
Copy link
Member

jaraco commented Dec 16, 2021

Thanks @hauntsaninja and others for reporting, triaging, and drafting a patch. Fix is rolling out in 4.8.3.

Does this fix need a backport to 2.x (Python 2.7 support)? That is, are there use-cases on older Pythons affected by this bug that would benefit from the fix?

@jaraco
Copy link
Member

jaraco commented Dec 16, 2021

I realize this change needs a port to CPython, but so also does 4.6.4 and later. I'm unsure if the changes from 4.7 and 4.8 can be safely backported in CPython, so I'll probably need to introduce those changes and the bugfix changes separately.

@jaraco
Copy link
Member

jaraco commented Dec 17, 2021

Ported to CPython in bpo-46105.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants