test_repeated_merge_spill fails with ValueError #892

Closed
j-bennet opened this issue Jun 28, 2023 · 8 comments

@j-bennet
Contributor

j-bennet commented Jun 28, 2023

Error looks like this:

ValueError: 'C:\\Miniconda3\\envs\\etc\\jupyter\\jupyter_notebook_config.d\\dask_labextension.json' is not in the subpath of 'C:\\Miniconda3\\envs\\test\\Lib\\site-packages' OR one path is relative and the other is absolute

Linked issue:

Full error:

__________________________ test_repeated_merge_spill __________________________
[gw1] win32 -- Python 3.9.16 C:\Miniconda3\envs\test\python.exe

upload_cluster_dump = <function upload_cluster_dump.<locals>._upload_cluster_dump at 0x000001850A5E3CA0>
benchmark_all = <function benchmark_all.<locals>._benchmark_all at 0x000001850A5EB4C0>
cluster_kwargs = {'embarrassingly_parallel': {'backend_options': {'multizone': True, 'region': 'us-east-1', 'spot': True, 'spot_on_dema...spot_on_demand_fallback': True}, 'n_workers': 10, 'package_sync': True, 'scheduler_vm_types': ['m6i.large'], ...}, ...}
dask_env_variables = {'DASK_COILED__TOKEN': '***'}
C:\Miniconda3\envs\test\lib\site-packages\importlib_metadata\__init__.py:546: in <genexpr>
    (subdir / name)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = WindowsPath('C:/Miniconda3/envs/etc/jupyter/jupyter_notebook_config.d/dask_labextension.json')
other = (WindowsPath('C:/Miniconda3/envs/test/Lib/site-packages'),)
parts = ['C:\\', 'Miniconda3', 'envs', 'etc', 'jupyter', 'jupyter_notebook_config.d', ...]
drv = 'C:', root = '\\'

    def relative_to(self, *other):
        """Return the relative path to another path identified by the passed
        arguments.  If the operation is not possible (because this is not
        a subpath of the other path), raise ValueError.
        """
        # For the purpose of this method, drive and root are considered
        # separate parts, i.e.:
        #   Path('c:/').relative_to('c:')  gives Path('/')
        #   Path('c:/').relative_to('/')   raise ValueError
        if not other:
            raise TypeError("need at least one argument")
        parts = self._parts
        drv = self._drv
        root = self._root
        if root:
            abs_parts = [drv, root] + parts[1:]
        else:
            abs_parts = parts
        to_drv, to_root, to_parts = self._parse_args(other)
        if to_root:
            to_abs_parts = [to_drv, to_root] + to_parts[1:]
        else:
            to_abs_parts = to_parts
        n = len(to_abs_parts)
        cf = self._flavour.casefold_parts
        if (root or drv) if n == 0 else cf(abs_parts[:n]) != cf(to_abs_parts):
            formatted = self._format_parsed_parts(to_drv, to_root, to_parts)
>           raise ValueError("{!r} is not in the subpath of {!r}"
                    " OR one path is relative and the other is absolute."
                             .format(str(self), str(formatted)))
E           ValueError: 'C:\\Miniconda3\\envs\\etc\\jupyter\\jupyter_notebook_config.d\\dask_labextension.json' is not in the subpath of 'C:\\Miniconda3\\envs\\test\\Lib\\site-packages' OR one path is relative and the other is absolute.

C:\Miniconda3\envs\test\lib\pathlib.py:939: ValueError
@dchudz
Collaborator

dchudz commented Jun 28, 2023

oh sorry, I saw #893 first and commented there.

@dan-blanchard

dan-blanchard commented Jun 28, 2023

I believe I know what's going on here, but I'm not entirely sure how to proceed. Here are the facts:

I recently added a feature to package sync where it will try to determine the correct version of a package to use when people have multiple versions of the same package installed in their site-packages directory. Sometimes this is because of an editable install over an existing install, other times it's because of pip installing over conda or vice versa. For every version of the package that we have metadata for in site-packages, we look at the list of files that were supposed to be installed for that package, and then compare the hashes of the files to the actual files on the filesystem.
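
The hash comparison described above could look roughly like the sketch below. This is not the package sync implementation; `record_hash_matches` and `count_matching_files` are hypothetical helper names, and the sketch assumes the RECORD convention of an unpadded URL-safe base64 digest, read through the standard library's `importlib.metadata`.

```python
import base64
import hashlib
from importlib import metadata


def record_hash_matches(path, file_hash):
    """Return True if the file at `path` matches a RECORD-style hash entry.

    `file_hash` is expected to look like importlib.metadata's FileHash:
    `.mode` is the algorithm name, `.value` the unpadded urlsafe-base64 digest.
    """
    digest = hashlib.new(file_hash.mode, path.read_bytes()).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return encoded == file_hash.value


def count_matching_files(dist):
    """Return (matching, checked) counts for one installed distribution."""
    matching = 0
    checked = 0
    for package_path in dist.files or []:
        if package_path.hash is None:
            continue  # no hash recorded for this entry (e.g. RECORD itself)
        target = package_path.locate()
        if not target.is_file():
            continue  # listed in the metadata but missing on disk
        checked += 1
        if record_hash_matches(target, package_path.hash):
            matching += 1
    return matching, checked
```

With something like this, the candidate whose files match best would be the one to trust, e.g. `count_matching_files(metadata.distribution("some-package"))` for a hypothetical installed package name.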

However, on Windows, trying to retrieve the list of files that were installed for the dask-labextension package raises a ValueError because of this line in importlib_metadata/__init__.py:

        paths = (
            (subdir / name)
            .resolve()
            .relative_to(self.locate_file('').resolve())
            .as_posix()
            for name in text.splitlines()
        )

Because dask-labextension uses jupyter-packaging to install data files outside of the site-packages directory, the relative_to call raises a ValueError. I'm not certain why this is only happening on Windows—maybe the files are inside site-packages on other platforms?—but it feels like a bug in either jupyter-packaging or importlib_metadata that you cannot call Distribution.files with packages like this.
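
The failing call can be reproduced on any platform with `PureWindowsPath`, using the two paths from the traceback. This is only a minimal sketch of the `relative_to` behaviour; `is_under` is a hypothetical helper, not part of `importlib_metadata`.

```python
from pathlib import PureWindowsPath

# Paths copied from the traceback in this issue.
site_packages = PureWindowsPath(r"C:\Miniconda3\envs\test\Lib\site-packages")
data_file = PureWindowsPath(
    r"C:\Miniconda3\envs\etc\jupyter\jupyter_notebook_config.d\dask_labextension.json"
)


def is_under(path, base):
    """Return True if `path` is located inside `base`."""
    try:
        path.relative_to(base)
    except ValueError:
        # relative_to() raises ValueError when `path` is not a subpath of `base`.
        return False
    return True


print(is_under(data_file, site_packages))
print(is_under(site_packages / "dask" / "__init__.py", site_packages))
```

The first check is the case that blows up inside the generator expression: the data file lives outside `site-packages`, so there is no relative path to compute.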

I can try to work around this on the package sync side, but it's probably going to be hacky.
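
One possible shape for such a workaround, purely as an illustration: wrap the `files` lookup and treat a `ValueError` as "file list unavailable". `safe_files` is a hypothetical name, and this is not necessarily what the eventual fix did.

```python
from importlib import metadata


def safe_files(dist):
    """Return the distribution's file list, or [] if it cannot be computed.

    Assumption: a ValueError from `dist.files` means some recorded path lies
    outside site-packages, so the list is treated as unavailable.
    """
    try:
        return list(dist.files or [])
    except ValueError:
        return []


# Example: distributions whose file list could not be read come back empty.
for dist in metadata.distributions():
    files = safe_files(dist)
```

The obvious cost is that a package with even one out-of-tree file loses its whole file list, which is part of why this feels hacky.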

@dan-blanchard

This docstring in importlib_metadata sure makes it sound like jupyter-packaging is violating some standard about where files can be installed:

        Read installed-files.txt and return lines in a similar
        CSV-parsable format as RECORD: each file must be placed
        relative to the site-packages directory and must also be
        quoted (since file names can contain literal commas).

@dan-blanchard

Actually, I'm going to call this an instance of python/importlib_metadata#455

@dchudz
Collaborator

dchudz commented Jun 28, 2023

I won't reopen but closing may be a mistake until the fix is actually deployed b/c Dask's CI will probably make new issues when this happens again.

@j-bennet
Contributor Author

I won't reopen but closing may be a mistake until the fix is actually deployed b/c Dask's CI will probably make new issues when this happens again.

Yeah I'll keep an eye on those.

@dchudz
Collaborator

dchudz commented Jun 28, 2023 via email

@ntabris
Member

ntabris commented Jun 29, 2023

Deploy with intended fix just went out, coiled 0.7.12 is on pypi and will be on conda-forge later once the bots do their thing.
