Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

black ignores file in current folder if current folder is git workspace and (parent of) current folder is a symbolic link #4205

Closed
bersbersbers opened this issue Feb 2, 2024 · 5 comments · Fixed by #4222
Labels
C: file collection Related to file collection (e.g. gitignore & cache) or file discovery and all of its configuration. T: bug Something isn't working

Comments

@bersbersbers
Copy link

bersbersbers commented Feb 2, 2024

Describe the bug

black incorrectly ignores a file in the current folder

To Reproduce

(.venv) C:\>dir

12/21/2023  10:19 AM    <JUNCTION>     Code [C:\ws]
...
02/02/2024  08:33 AM    <DIR>          ws

(.venv) C:\>cd Code

(.venv) C:\Code>mkdir Bug

(.venv) C:\Code>cd Bug

(.venv) C:\Code\Bug>git init
Initialized empty Git repository in C:/ws/Bug/.git/

(.venv) C:\Code\Bug>echo 1 > bug.py

(.venv) C:\Code\Bug>black --verbose bug.py
Identified `C:\ws\Bug` as project root containing a .git directory.
bug.py ignored: is a symbolic link that points outside C:\ws\Bug
No Python files are present to be formatted. Nothing to do 😴

Expected behavior
black should not ignore bug.py

Environment

black, 24.1.2.dev11+g632f44bd68 (compiled: no)
Python (CPython) 3.12.1
Windows 10 22H2

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Feb 11, 2024

Hm, I tried the following and couldn't repro:

. λ mkdir actual
. λ ln -sf actual symlink
. λ cd symlink
./symlink λ ls
./symlink λ mkdir bug
./symlink λ cd bug
./symlink/bug λ git init
Initialized empty Git repository in /Users/shantanu/dev/black/actual/bug/.git/
./symlink/bug λ echo "x     = 'asdf'" > bug.py
./symlink/bug λ black --verbose bug.py
Identified `/Users/shantanu/dev/black/actual/bug` as project root containing a .git directory.
Found input source: "bug.py"
reformatted bug.py

All done! ✨ 🍰 ✨
1 file reformatted.

Perhaps something Windows/junction-specific is happening?

I also wasn't able to repro with any --stdin-filename variants

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Feb 11, 2024

Okay, while looking into --stdin-filename variants I noticed relative path vs absolute path is an issue. When located in ~/dev/black/symlink/bug:

~/dev/black/symlink/bug main λ black bug.py -v --diff
Identified `/Users/shantanu/dev/black/actual/bug` as project root containing a .git directory.
Found input source: "bug.py"
--- bug.py	2024-02-11 00:29:53.369145+00:00
+++ bug.py	2024-02-11 02:12:44.358110+00:00
@@ -1 +1 @@
-x     = 'asdf'
+x = "asdf"
would reformat bug.py

All done! ✨ 🍰 ✨
1 file would be reformatted.

~/dev/black/symlink/bug main λ black ~/dev/black/symlink/bug/bug.py -v --diff
Identified `/Users/shantanu/dev/black/actual/bug` as project root containing a .git directory.
/Users/shantanu/dev/black/symlink/bug/bug.py ignored: is a symbolic link that points outside /Users/shantanu/dev/black/actual/bug
No Python files are present to be formatted. Nothing to do 😴

This could pretty easily account for microsoft/vscode-black-formatter#444

hauntsaninja added a commit to hauntsaninja/black that referenced this issue Feb 11, 2024
This relates to psf#4015, psf#4161 and the behaviour of os.getcwd()

Black is a big user of pathlib and as such loves doing `.resolve()`,
since for a long time it was the only good way of getting an absolute
path in pathlib. However, this has two problems:

The first minor problem is performance, e.g. in psf#3751 I (safely) got rid
of a bunch of `.resolve()` which made Black 40% faster on cached runs.

The second more important problem is that always resolving symlinks
results in unintuitive exclusion behaviour. For instance, a gitignored
symlink should never alter formatting of your actual code. This was
reported by users a few times.

In psf#3846, I improved the exclusion rule logic for symlinks in
`gen_python_files` and everything was good.

But `gen_python_files` isn't enough, there's also `get_sources`, which
handles user specified paths directly (instead of files Black
discovers). So in psf#4015, I made a very similar change to psf#3846 for
`get_sources`, and this is where some problems began.

The core issue was the line:
```
root_relative_path = path.absolute().relative_to(root).as_posix()
```
The first issue is that despite root being computed from user inputs, we
call `.resolve()` while computing it (likely unecessarily). Which means
that `path` may not actually be relative to `root`. So I started off
this PR trying to fix that, when I ran into the second issue. Which is
that `os.getcwd()` (as called by `os.path.abspath` or `Path.absolute` or
`Path.cwd`) also often resolves symlinks!
```
>>> import os
>>> os.environ.get("PWD")
'/Users/shantanu/dev/black/symlink/bug'
>>> os.getcwd()
'/Users/shantanu/dev/black/actual/bug'
```
This also meant that the breakage often would not show up when input
relative paths.

This doesn't affect `gen_python_files` / psf#3846 because things are always
absolute and known to be relative to `root`.

Anyway, it looks like psf#4161 fixed the crash by just swallowing the error
and ignoring the file. Instead, we should just try to compute the actual
relative path. I think this PR should be quite safe, but we could also
consider reverting some of the previous changes; the associated issues
weren't too popular.

At the same time, I think there's still behaviour that can be improved
and I kind of want to make larger changes, but maybe I'll save that for
if we do something like psf#3952

Hopefully fixes psf#4205, fixes psf#4209, actual fix for psf#4077
hauntsaninja added a commit to hauntsaninja/black that referenced this issue Feb 11, 2024
This relates to psf#4015, psf#4161 and the behaviour of os.getcwd()

Black is a big user of pathlib and as such loves doing `.resolve()`,
since for a long time it was the only good way of getting an absolute
path in pathlib. However, this has two problems:

The first minor problem is performance, e.g. in psf#3751 I (safely) got rid
of a bunch of `.resolve()` which made Black 40% faster on cached runs.

The second more important problem is that always resolving symlinks
results in unintuitive exclusion behaviour. For instance, a gitignored
symlink should never alter formatting of your actual code. This kind of
thing was reported by users a few times.

In psf#3846, I improved the exclusion rule logic for symlinks in
`gen_python_files` and everything was good.

But `gen_python_files` isn't enough, there's also `get_sources`, which
handles user specified paths directly (instead of files Black
discovers). So in psf#4015, I made a very similar change to psf#3846 for
`get_sources`, and this is where some problems began.

The core issue was the line:
```
root_relative_path = path.absolute().relative_to(root).as_posix()
```
The first issue is that despite root being computed from user inputs, we
call `.resolve()` while computing it (likely unecessarily). Which means
that `path` may not actually be relative to `root`. So I started off
this PR trying to fix that, when I ran into the second issue. Which is
that `os.getcwd()` (as called by `os.path.abspath` or `Path.absolute` or
`Path.cwd`) also often resolves symlinks!
```
>>> import os
>>> os.environ.get("PWD")
'/Users/shantanu/dev/black/symlink/bug'
>>> os.getcwd()
'/Users/shantanu/dev/black/actual/bug'
```
This also meant that the breakage often would not show up when input
relative paths.

This doesn't affect `gen_python_files` / psf#3846 because things are always
absolute and known to be relative to `root`.

Anyway, it looks like psf#4161 fixed the crash by just swallowing the error
and ignoring the file. Instead, we should just try to compute the actual
relative path. I think this PR should be quite safe, but we could also
consider reverting some of the previous changes; the associated issues
weren't too popular.

At the same time, I think there's still behaviour that can be improved
and I kind of want to make larger changes, but maybe I'll save that for
if we do something like psf#3952

Hopefully fixes psf#4205, fixes psf#4209, actual fix for psf#4077
@hauntsaninja
Copy link
Collaborator

@bersbersbers would you mind testing #4222 and seeing if it fixes the case that you're seeing?

@bersbersbers
Copy link
Author

@hauntsaninja I can still repro the bug with black 24.1.1. 196b586 from #4222 fixes it, though - thanks!

hauntsaninja added a commit that referenced this issue Feb 12, 2024
This relates to #4015, #4161 and the behaviour of os.getcwd()

Black is a big user of pathlib and as such loves doing `.resolve()`,
since for a long time it was the only good way of getting an absolute
path in pathlib. However, this has two problems:

The first minor problem is performance, e.g. in #3751 I (safely) got rid
of a bunch of `.resolve()` which made Black 40% faster on cached runs.

The second more important problem is that always resolving symlinks
results in unintuitive exclusion behaviour. For instance, a gitignored
symlink should never alter formatting of your actual code. This kind of
thing was reported by users a few times.

In #3846, I improved the exclusion rule logic for symlinks in
`gen_python_files` and everything was good.

But `gen_python_files` isn't enough, there's also `get_sources`, which
handles user specified paths directly (instead of files Black
discovers). So in #4015, I made a very similar change to #3846 for
`get_sources`, and this is where some problems began.

The core issue was the line:
```
root_relative_path = path.absolute().relative_to(root).as_posix()
```
The first issue is that despite root being computed from user inputs, we
call `.resolve()` while computing it (likely unecessarily). Which means
that `path` may not actually be relative to `root`. So I started off
this PR trying to fix that, when I ran into the second issue. Which is
that `os.getcwd()` (as called by `os.path.abspath` or `Path.absolute` or
`Path.cwd`) also often resolves symlinks!
```
>>> import os
>>> os.environ.get("PWD")
'/Users/shantanu/dev/black/symlink/bug'
>>> os.getcwd()
'/Users/shantanu/dev/black/actual/bug'
```
This also meant that the breakage often would not show up when input
relative paths.

This doesn't affect `gen_python_files` / #3846 because things are always
absolute and known to be relative to `root`.

Anyway, it looks like #4161 fixed the crash by just swallowing the error
and ignoring the file. Instead, we should just try to compute the actual
relative path. I think this PR should be quite safe, but we could also
consider reverting some of the previous changes; the associated issues
weren't too popular.

At the same time, I think there's still behaviour that can be improved
and I kind of want to make larger changes, but maybe I'll save that for
if we do something like #3952

Hopefully fixes #4205, fixes #4209, actual fix for #4077
@JasonGross
Copy link

It looks like #4222 already has testing, so I'm not sure if this adds much, but I posted this on the vscode plugin and it was suggested that I post it here:


@JasonGross you should recommend this as a smoke test to black itself.

Originally posted by @karthiknadig in microsoft/vscode-black-formatter#444 (comment)


In case this ever comes back in the future, I wrote the following script for debugging this before realizing that it had already been reported and fixed (note that a previous version of black had problems with symlinks only in the presence of pyproject.toml):

#!/usr/bin/env bash
verbose=""
if [ "$1" == "-v" ] || [ "$1" == "--verbose" ]; then
    verbose=1
    set -x
fi
if [ -z "$PYTHON" ]; then
    PYTHON=python
fi
"$PYTHON" -m black --version
uname -a
testdir="$(mktemp -d)"
trap "rm -rf $testdir" EXIT
cd "$testdir"
subdir="."
mkdir -p "directory/$subdir"
ln -s directory symlink
cd "directory/$subdir"
runtest() {
    for cwd in "directory" "symlink"; do
        for usedir in "directory" "symlink"; do
            cd "$testdir/$cwd/$subdir"
            usefile="$testdir/$usedir/$subdir/test.py"
            echo "import numpy as np# foo" > test.py
            printf "%s in %s using %s..." "$1" "$cwd" "$usedir"
            if [ -n "$verbose" ]; then
                printf "\n"
                echo "import numpy as np# foo" | "$PYTHON" -m black --stdin-filename "$usefile" -
                printf "=====================================\n"
            else
                result="$(echo "import numpy as np# foo" | "$PYTHON" -m black --stdin-filename "$usefile" - 2>&1)"
                if printf -- "$result" | grep -q "1 file reformatted."; then
                    printf "pass\n"
                else
                    printf "fail\n"
                fi
            fi
        done
    done

}
runtest "without pyproject.toml"
touch pyproject.toml
runtest "with pyproject.toml"

Originally posted by @JasonGross in microsoft/vscode-black-formatter#444 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: file collection Related to file collection (e.g. gitignore & cache) or file discovery and all of its configuration. T: bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants