Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No such file or directory due to race condition #8159

Open
3 of 4 tasks
andyhasit opened this issue Jul 6, 2023 · 11 comments
Open
3 of 4 tasks

No such file or directory due to race condition #8159

andyhasit opened this issue Jul 6, 2023 · 11 comments
Labels
kind/bug Something isn't working as expected status/triage This issue needs to be triaged

Comments

@andyhasit
Copy link

andyhasit commented Jul 6, 2023

  • Poetry version: 1.5.1
  • Python version: 3.10.4
  • OS version and name: Ubuntu 22
  • pyproject.toml:
  • I am on the latest stable Poetry version, installed using a recommended method.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have consulted the FAQ and blog for any relevant entries or release notes.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option) and have included the output below.

This is similar to6958 but is not a duplicate as its a different use case.

This was happening for us on 1.3.1 and previous versions too.

Issue

Poetry fails to either install or upgrade a package (in this case setuptools) because it fails to find the .dist-info directory of another package (in this case proto_plus-1.22.2.dist-info) and that's because that package was upgraded from 1.22.2 to 1.22.3 just before (see output) but I suspect that operation didn't update the register in time.

In this case it's the pip uninstall which fails:

 ['/usr/local/bin/python3.10', '-m', 'pip', 'uninstall', 'setuptools', '-y']

(copied from output below)

But the same issue can happen with a pip install too:

'['/usr/local/bin/python3.10', '-m', 'pip', 'install', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/usr/local', '--no-deps', '/root/.cache/pypoetry/artifacts/2a/5a/46/20df50732755163cfab8d91b52f952990368c16a4c1005b865230dca24/google_cloud_core-2.3.2-py2.py3-none-any.whl']' returned non-zero exit status 1.

(taken from a different run that failed)

Re-running the job usually works, but it can fail on other packages or other projects in the job. This happens on maybe 1 out of 50-100 installs, but seeing we have 30+ projects/installs per job, jobs as a whole fail quite frequently. This has been happening on and off for well over a year, but weirdly it started happening with increasing frequency lately (on 1.3.1) and we just updated to 1.5.1 hoping it might resolve but to no avail.

There is no change to our CI environment that we are aware of (but it could even be a network change causing some packages to install much faster than before/on some occasions). We have no record of this happening locally but that could be down to number of installs.

I suspect this is related to the race condition fixed by 6186 and suspect that if we switch off parallel installations that fixes it.

The output below is from a private runner for github CI, and the command is running against a docker container.

Skipping virtualenv creation, as specified in config file.
docker run -v /runner-tmp/np-v2/np-v2/.venv/lib/python3.10/site-packages:/usr/local/lib/python3.10/site-packages  eu.gcr.io/np-root-project/np-db-utils:f2141c4a bash -c "PIP_ROOT_USER_ACTION=ignore poetry install"
Installing dependencies from lock file

Package operations: 1 install, 27 updates, 0 removals

  • Updating protobuf (4.23.2 -> 4.23.3)
  • Updating exceptiongroup (1.1.1 -> 1.1.2)
  • Updating google-auth (2.19.1 -> 2.21.0)
  • Updating googleapis-common-protos (1.59.0 -> 1.59.1)
  • Updating grpcio (1.54.2 -> 1.56.0)
  • Updating pyparsing (3.0.9 -> 3.1.0)
  • Updating google-api-core (2.11.0 -> 2.11.1)
  • Updating grpcio-status (1.54.2 -> 1.56.0)
  • Updating typing-extensions (4.6.3 -> 4.7.1)
  • Updating platformdirs (3.5.1 -> 3.8.0)
  • Updating proto-plus (1.22.2 -> 1.22.3)
  • Updating pydantic (1.10.8 -> 1.10.10)
  • Updating redis (4.5.5 -> 4.6.0)
  • Updating setuptools (67.8.0 -> 68.0.0)

  CalledProcessError

  Command '['/usr/local/bin/python3.10', '-m', 'pip', 'uninstall', 'setuptools', '-y']' returned non-zero exit status 2.

  at /usr/local/lib/python3.10/subprocess.py:524 in run
       520│             # We don't call process.wait() as .__exit__ does that for us.
       521│             raise
       522│         retcode = process.poll()
       523│         if check and retcode:
    →  524│             raise CalledProcessError(retcode, process.args,
       525│                                      output=stdout, stderr=stderr)
       526│     return CompletedProcess(process.args, retcode, stdout, stderr)
       527│ 
       528│ 

The following error occurred when trying to handle this error:


  EnvCommandError

  Command ['/usr/local/bin/python3.10', '-m', 'pip', 'uninstall', 'setuptools', '-y'] errored with the following return code 2
  
  Output:
  ERROR: Exception:
  Traceback (most recent call last):
    File "/usr/local/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 167, in exc_logging_wrapper
      status = run_func(*args)
    File "/usr/local/lib/python3.10/site-packages/pip/_internal/commands/uninstall.py", line 60, in run
      session = self.get_default_session(options)
    File "/usr/local/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 75, in get_default_session
      self._session = self.enter_context(self._build_session(options))
    File "/usr/local/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 89, in _build_session
      session = PipSession(
    File "/usr/local/lib/python3.10/site-packages/pip/_internal/network/session.py", line 282, in __init__
      self.headers["User-Agent"] = user_agent()
    File "/usr/local/lib/python3.10/site-packages/pip/_internal/network/session.py", line 157, in user_agent
      setuptools_dist = get_default_environment().get_distribution("setuptools")
    File "/usr/local/lib/python3.10/site-packages/pip/_internal/metadata/__init__.py", line 24, in get_default_environment
      from .pkg_resources import Environment
    File "/usr/local/lib/python3.10/site-packages/pip/_internal/metadata/pkg_resources.py", line 9, in <module>
      from pip._vendor import pkg_resources
    File "/usr/local/lib/python3.10/site-packages/pip/_vendor/pkg_resources/__init__.py", line 3252, in <module>
      def _initialize_master_working_set():
    File "/usr/local/lib/python3.10/site-packages/pip/_vendor/pkg_resources/__init__.py", line 32[35](https://github.com/NetPurpose/np-v2/actions/runs/5465733537/jobs/9950196335#step:16:36), in _call_aside
      f(*args, **kwargs)
    File "/usr/local/lib/python3.10/site-packages/pip/_vendor/pkg_resources/__init__.py", line 3287, in _initialize_master_working_set
      list(map(working_set.add_entry, sys.path))
    File "/usr/local/lib/python3.10/site-packages/pip/_vendor/pkg_resources/__init__.py", line 623, in add_entry
      for dist in find_distributions(entry, True):
    File "/usr/local/lib/python3.10/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2065, in find_on_path
      for dist in factory(fullpath):
    File "/usr/local/lib/python3.10/site-packages/pip/_vendor/pkg_resources/__init__.py", line 2127, in distributions_from_metadata
      if len(os.listdir(path)) == 0:
  FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.10/site-packages/proto_plus-1.22.2.dist-info'
  

  at /opt/poetry/venv/lib/python3.10/site-packages/poetry/utils/env.py:1[52](https://github.com/NetPurpose/np-v2/actions/runs/5465733537/jobs/9950196335#step:16:53)5 in _run
      1521│                 output = ""
      1522│             else:
      1523│                 output = subprocess.check_output(cmd, stderr=stderr, env=env, **kwargs)
      1524│         except CalledProcessError as e:
    → 1525│             raise EnvCommandError(e, input=input_)
      1526│ 
      1527│         return decode(output)
      1528│ 
      1529│     def execute(self, bin: str, *args: str, **kwargs: Any) -> int:

Error: Process completed with exit code 1.
@andyhasit andyhasit added kind/bug Something isn't working as expected status/triage This issue needs to be triaged labels Jul 6, 2023
@dimbleby
Copy link
Contributor

dimbleby commented Jul 6, 2023

This is a duplicate of #6958 (and potentially others)

then please close as such!

This repository already has far more open issues than anyone can keep up with: scattering discussion of duplicates across multiple places does not help. It also leave a much messier trail for the next person who hits something similar, it becomes much harder for them to find whatever useful discussion there is.

@andyhasit
Copy link
Author

andyhasit commented Jul 6, 2023

@dimbleby well given this one narrows down the possible cause of the issue, and refers to a more recent version of poetry, perhaps we should close the other one instead?

@dimbleby
Copy link
Contributor

dimbleby commented Jul 6, 2023

and then next time someone makes further progress, close this one in favour of that? this seems to me silly

let's just have the discussion in the open issue that already tracks the problem

@andyhasit
Copy link
Author

And update that issue to state it happens on 1.5.1 too? Is there not a risk that ticket will get closed because it points to an old version?

@andyhasit
Copy link
Author

andyhasit commented Jul 6, 2023

Actually 6958 talks about a different use case. I'll update my ticket to state it's not a duplicate.

We have been encountering this error for a long time, and only just recently figured out it likely relates to concurrency, and are now able to demonstrate the behaviour on a recent version of poetry. This ticket should absolutely not be closed (unless we find an exact duplicate ) - no matter how many open issues this project is being overloaded with.

@andyhasit
Copy link
Author

andyhasit commented Jul 6, 2023

I have been reading through the source code to understand what's going on in more detail. Would I be correct in thinking that what happens is:

  1. Poetry identifies that packages A and B (1.0 > 1.1) need updated.
  2. Poetry issues pip commands to update packages A and B in parallel.
  3. The pip command for package B finds that depends on package A which is at version 1.0, and it remembers this.
  4. Both commands do work, but A completes first, and replaces site-packages/a-1.0.dist-info to site-packages/a-1.1.dist-info
  5. The command for package B finishes its downloads, goes back to write to site-packages/a-1.0.dist-info but finds it no longer exists.

I think the only way to avoid this situation would be for poetry to determine that pip was going to try to access package A as part of the operation on package B, and therefore not to allow these two operations to run in parallel.

My understanding was that poetry already does this by searching for the deepest dependency, but my guess is that serial installs perhaps hide some of the subtleties of this strategy, and parallel installs occasionally get caught short.

I'm thinking:

  • Multiple packages attempting to update a common dependency to different versions?
  • Not taking into account the change of dependencies between package versions?

I briefly looked at the puzzle code but couldn't figure out how it works.

@dimbleby
Copy link
Contributor

dimbleby commented Jul 6, 2023

per your stack, the error is happening while shelling out to pip uninstall: specifically you seem to be hitting rather a narrow window where a directory is deleted between these two lines: https://github.com/pypa/pip/blob/4734c4c735ce63664efd165a6f54fa0e957f13f7/src/pip/_vendor/pkg_resources/__init__.py#L2190-L2191

it looks as though pip uninstall is here in the middle of reading metadata for all packages in the environment: ie the deleted file has nothing much to do with the package that pip is uninstalling. So I guess that any two parallel uninstalls are potentially exposed to this.

(Upgrading a package requires first uninstalling the old version: so it's not only pure uninstalls)

ideally it's desirable not to shell out to pip here at all - poetry's modern installer shells out to pip only when uninstalling packages - but to have some in-process uninstaller, and for that not to care about packages B, C, D etc while it is uninstalling A

pragmatically I expect you should prefer simply to disable parallelism in your CI

@andyhasit
Copy link
Author

Agree with all the above. We'll disable parallel installs on CI. Do we know roughly what performance hit to expect from doing so? I measured locally and it's barely 15% slower across several runs, but then again I have everything in cache, which CI won't.

@zyv
Copy link
Contributor

zyv commented Jul 24, 2023

I'm hitting frequent failures upon parallel poetry install and have to re-run the command multiple times for it to succeed.

The stack trace looks completely different from what what discussed in this and the related tickets (ends up in pip at value = _canonicalize_regex.sub("-", name).lower()), but I wonder if the cause is still the same, or I should be opening another issue.

I can reliably reproduce it on my machine with Ubuntu 20.04 and Python 3.11 from Anaconda, but none of my colleagues has complained so far.

Any ideas (other than disabling parallel installation altogether?)

  • Updating typing-inspect (0.8.0 -> 0.9.0): Failed

  CalledProcessError

  Command '['/venv/bin/python', '-m', 'pip', 'uninstall', 'typing-inspect', '-y']' returned non-zero exit status 2.

  at /usr/lib/python3.8/subprocess.py:516 in run
       512│             # We don't call process.wait() as .__exit__ does that for us.
       513│             raise
       514│         retcode = process.poll()
       515│         if check and retcode:
    →  516│             raise CalledProcessError(retcode, process.args,
       517│                                      output=stdout, stderr=stderr)
       518│     return CompletedProcess(process.args, retcode, stdout, stderr)
       519│ 
       520│ 

The following error occurred when trying to handle this error:


  EnvCommandError

  Command ['/venv/bin/python', '-m', 'pip', 'uninstall', 'typing-inspect', '-y'] errored with the following return code 2
  
  Output:
  ERROR: Exception:
  Traceback (most recent call last):
    File "/venv/lib/python3.11/site-packages/pip/_internal/cli/base_command.py", line 167, in exc_logging_wrapper
      status = run_func(*args)
               ^^^^^^^^^^^^^^^
    File "/venv/lib/python3.11/site-packages/pip/_internal/commands/uninstall.py", line 61, in run
      session = self.get_default_session(options)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/venv/lib/python3.11/site-packages/pip/_internal/cli/req_command.py", line 98, in get_default_session
      self._session = self.enter_context(self._build_session(options))
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/venv/lib/python3.11/site-packages/pip/_internal/cli/req_command.py", line 125, in _build_session
      session = PipSession(
                ^^^^^^^^^^^
    File "/venv/lib/python3.11/site-packages/pip/_internal/network/session.py", line 343, in __init__
      self.headers["User-Agent"] = user_agent()
                                   ^^^^^^^^^^^^
    File "/venv/lib/python3.11/site-packages/pip/_internal/network/session.py", line 175, in user_agent
      setuptools_dist = get_default_environment().get_distribution("setuptools")
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/venv/lib/python3.11/site-packages/pip/_internal/metadata/importlib/_envs.py", line 180, in get_distribution
      return next(matches, None)
             ^^^^^^^^^^^^^^^^^^^
    File "/venv/lib/python3.11/site-packages/pip/_internal/metadata/importlib/_envs.py", line 175, in <genexpr>
      matches = (
                ^
    File "/venv/lib/python3.11/site-packages/pip/_internal/metadata/base.py", line 594, in iter_all_distributions
      for dist in self._iter_distributions():
    File "/venv/lib/python3.11/site-packages/pip/_internal/metadata/importlib/_envs.py", line 167, in _iter_distributions
      yield from finder.find(location)
    File "/venv/lib/python3.11/site-packages/pip/_internal/metadata/importlib/_envs.py", line 71, in find
      for dist, info_location in self._find_impl(location):
    File "/venv/lib/python3.11/site-packages/pip/_internal/metadata/importlib/_envs.py", line 59, in _find_impl
      normalized_name = canonicalize_name(get_dist_name(dist))
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/venv/lib/python3.11/site-packages/pip/_vendor/packaging/utils.py", line 34, in canonicalize_name
      value = _canonicalize_regex.sub("-", name).lower()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  TypeError: expected string or bytes-like object, got 'NoneType'

@sivanantha321
Copy link

sivanantha321 commented Feb 3, 2024

We are also facing the same exact issue in our github CI environment. We are using the latest poetry version 1.7.1.

 • Downgrading yarl (1.9.4 -> 1.9.3)
#18 5.128 
#18 5.129   CalledProcessError
#18 5.129 
#18 5.130   Command '['/prod_venv/bin/python', '-m', 'pip', 'uninstall', 'yarl', '-y']' returned non-zero exit status 2.
#18 5.130 
#18 5.130   at /usr/lib/python3.9/subprocess.py:528 in run
#18 5.305        524│             # We don't call process.wait() as .__exit__ does that for us.
#18 5.306        525│             raise
#18 5.306        526│         retcode = process.poll()
#18 5.307        527│         if check and retcode:
#18 5.307     →  528│             raise CalledProcessError(retcode, process.args,
#18 5.307        529│                                      output=stdout, stderr=stderr)
#18 5.308        530│     return CompletedProcess(process.args, retcode, stdout, stderr)
#18 5.326        531│ 
#18 5.327        532│ 
#18 5.327 
#18 5.327 The following error occurred when trying to handle this error:
#18 5.328 
#18 5.328 
#18 5.329   EnvCommandError
#18 5.329 
#18 5.329   Command ['/prod_venv/bin/python', '-m', 'pip', 'uninstall', 'yarl', '-y'] errored with the following return code 2
#18 5.329   
#18 5.329   Output:
#18 5.329   ERROR: Exception:
#18 5.329   Traceback (most recent call last):
#18 5.329     File "/prod_venv/lib/python3.9/site-packages/pip/_internal/cli/base_command.py", line 223, in _main
#18 5.329       status = self.run(options, args)
#18 5.329     File "/prod_venv/lib/python3.9/site-packages/pip/_internal/commands/uninstall.py", line 58, in run
#18 5.329       session = self.get_default_session(options)
#18 5.329     File "/prod_venv/lib/python3.9/site-packages/pip/_internal/cli/req_command.py", line 78, in get_default_session
#18 5.329       self._session = self.enter_context(self._build_session(options))
#18 5.329     File "/prod_venv/lib/python3.9/site-packages/pip/_internal/cli/req_command.py", line 88, in _build_session
#18 5.329       session = PipSession(
#18 5.329     File "/prod_venv/lib/python3.9/site-packages/pip/_internal/network/session.py", line 248, in __init__
#18 5.329       self.headers["User-Agent"] = user_agent()
#18 5.329     File "/prod_venv/lib/python3.9/site-packages/pip/_internal/network/session.py", line 158, in user_agent
#18 5.329       setuptools_version = get_installed_version("setuptools")
#18 5.329     File "/prod_venv/lib/python3.9/site-packages/pip/_internal/utils/misc.py", line 683, in get_installed_version
#18 5.329       working_set = pkg_resources.WorkingSet()
#18 5.329     File "/usr/share/python-wheels/pkg_resources-0.0.0-py2.py3-none-any.whl/pkg_resources/__init__.py", line 567, in __init__
#18 5.329       self.add_entry(entry)
#18 5.329     File "/usr/share/python-wheels/pkg_resources-0.0.0-py2.py3-none-any.whl/pkg_resources/__init__.py", line 623, in add_entry
#18 5.329       for dist in find_distributions(entry, True):
#18 5.329     File "/usr/share/python-wheels/pkg_resources-0.0.0-py2.py3-none-any.whl/pkg_resources/__init__.py", line 2065, in find_on_path
#18 5.329       for dist in factory(fullpath):
#18 5.329     File "/usr/share/python-wheels/pkg_resources-0.0.0-py2.py3-none-any.whl/pkg_resources/__init__.py", line 2127, in distributions_from_metadata
#18 5.329       if len(os.listdir(path)) == 0:
#18 5.329   FileNotFoundError: [Errno 2] No such file or directory: '/prod_venv/lib/python3.9/site-packages/google_auth-2.26.1.dist-info'
#18 5.329   
#18 5.330 
#18 5.330   at /opt/poetry/lib/python3.9/site-packages/poetry/utils/env/base_env.py:354 in _run
#18 5.376       350│                 output = subprocess.check_output(
#18 5.376       351│                     cmd, stderr=stderr, env=env, text=True, **kwargs
#18 5.376       352│                 )
#18 5.376       353│         except CalledProcessError as e:
#18 5.377     → 354│             raise EnvCommandError(e)
#18 5.377       355│ 
#18 5.377       356│         return output
#18 5.377       357│ 
#18 5.377       358│     def execute(self, bin: str, *args: str, **kwargs: Any) -> int:
#18 5.377 
#18 5.377 Cannot install yarl.
#18 5.377 
#18 ERROR: process "/bin/sh -c cd pmmlserver && poetry install --no-root --no-interaction --no-cache" did not complete successfully: exit code: 1

@xNinjaKittyx
Copy link

Still happens in 1.8.2 poetry

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working as expected status/triage This issue needs to be triaged
Projects
None yet
Development

No branches or pull requests

5 participants