Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

glob() recursively matches on * #98

Closed
nh2 opened this issue Jul 9, 2023 · 4 comments · Fixed by #99
Closed

glob() recursively matches on * #98

nh2 opened this issue Jul 9, 2023 · 4 comments · Fixed by #99

Comments

@nh2
Copy link
Contributor

nh2 commented Jul 9, 2023

This a bug made it into the Python 3.12 stdlib.

Issue

When you have a .zip containing a/b/c.txt, and on the top-level do .glob('*.txt'), it produces a match when it shouldn't.

.glob() is accidentally implemented to always behave like the recursive rglob() if a * is used.

Repro

Using the released Python 3.12 zipfile stdlib; the same happens with current zipp:

#! /usr/bin/env bash
set -e

mkdir -p zipp-testdir/a/b/
touch zipp-testdir/a/b/c.txt

rm -f zipp-testdir.zip
zip -r zipp.zip zipp-testdir

python3 --version
python3 -c 'import zipfile; print(list( zipfile.Path(zipfile.ZipFile("zipp.zip")).glob("*.txt") ))'

Prints

[Path('zipp.zip', 'zipp-testdir/a/b/c.txt')]

when it shouldn't.

Reason

This code

zipp/zipp/__init__.py

Lines 370 to 375 in ee6d711

matches = re.compile(fnmatch.translate(pattern)).fullmatch
return (
child
for child in self._descendants()
if matches(str(child.relative_to(self)))
)

uses fnmatch.translate() when the fnmatch docs say

Note that the filename separator ('/' on Unix) is not special to this module. See module glob for pathname expansion

@nh2

This comment was marked as off-topic.

@nh2
Copy link
Contributor Author

nh2 commented Jul 9, 2023

Unfortunately the test suite for .glob() is extremely sparse and does not check such things.

I added a (failing) test for this in PR #99.

@nh2

This comment was marked as off-topic.

@jaraco
Copy link
Owner

jaraco commented Jul 12, 2023

Thanks for the report. This functionality was added in #85 (zipp 3.11) and Python 3.12 (not a regression). The early implementation attempted to re-use the selection functionality from pathlib, which was messy because it required constructing a Path object with all of the platform-specific behaviors that brought. I re-wrote the implementation to have a native implementation, but clearly missed some important cases.

It sure would be nice to have fixes for these. I'm not entirely confident a proper solution can be developed in short order, though I plan to work on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants