pip install from requirements.txt on SageMaker training jobs does not install from github repo #4474

Open
rohit901 opened this issue Mar 4, 2024 · 2 comments
Labels
bug component: training Relates to the SageMaker Training Platform


rohit901 commented Mar 4, 2024

Describe the bug
I put a requirements.txt file in my code folder and specified a GitHub repository as the install source for one of the libraries. The training job then fails on SageMaker with a RemoteNotFoundError.

To reproduce
Add the following line to your requirements.txt so that one package is installed from git:

-e git+https://github.com/huggingface/diffusers.git@main#egg=diffusers
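For comparison, the line above uses pip's editable (`-e`) VCS form. pip also accepts a non-editable direct reference of the form `name @ git+URL@rev`. Whether that form avoids the error reported here is untested and only an assumption. The helper below (`to_direct_reference` is a hypothetical name, not part of pip or the SageMaker SDK) is a minimal sketch of rewriting one form into the other:

```python
def to_direct_reference(line: str) -> str:
    """Rewrite '-e git+URL@rev#egg=name' as 'name @ git+URL@rev'.

    Lines that are not editable git requirements with an egg name
    are returned unchanged.
    """
    stripped = line.strip()
    for prefix in ("-e ", "--editable "):
        if stripped.startswith(prefix):
            target = stripped[len(prefix):].strip()
            break
    else:
        return line  # not an editable requirement

    url, sep, fragment = target.partition("#")
    if not sep or not url.startswith("git+"):
        return line

    # Only the 'egg' key is used; other fragment keys (for example
    # 'subdirectory') are dropped in this sketch.
    params = dict(p.split("=", 1) for p in fragment.split("&") if "=" in p)
    name = params.get("egg")
    if not name:
        return line
    return f"{name} @ {url}"


print(to_direct_reference(
    "-e git+https://github.com/huggingface/diffusers.git@main#egg=diffusers"
))
# prints: diffusers @ git+https://github.com/huggingface/diffusers.git@main
```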

Expected behavior
Training should start

Screenshots or logs
logs:

/opt/conda/bin/python3.10 -m pip install -r requirements.txt

-- Obtaining diffusers from git+https://github.com/huggingface/diffusers.git@main#egg=diffusers (from -r requirements.txt (line 8))
ERROR: Exception:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/vcs/git.py", line 367, in get_remote_url
    found_remote = remotes[0]
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 180, in exc_logging_wrapper
    status = run_func(*args)
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 245, in wrapper
    return func(self, options, args)
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/commands/install.py", line 377, in run
    requirement_set = resolver.resolve(
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 76, in resolve
    collected = self.factory.collect_root_requirements(root_reqs)
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 534, in collect_root_requirements
    reqs = list(
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 490, in _make_requirements_from_install_req
    cand = self._make_base_candidate_from_link(
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 207, in _make_base_candidate_from_link
    self._editable_candidate_cache[link] = EditableCandidate(
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 318, in __init__
    super().__init__(
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 156, in __init__
    self.dist = self._prepare()
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 225, in _prepare
    dist = self._prepare_distribution()
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 328, in _prepare_distribution
    return self._factory.preparer.prepare_editable_requirement(self._ireq)
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 692, in prepare_editable_requirement
    req.update_editable()
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/req/req_install.py", line 699, in update_editable
    vcs_backend.obtain(self.source_dir, url=hidden_url, verbosity=0)
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/vcs/versioncontrol.py", line 526, in obtain
    existing_url = self.get_remote_url(dest)
  File "/opt/conda/lib/python3.10/site-packages/pip/_internal/vcs/git.py", line 369, in get_remote_url
    raise RemoteNotFoundError

pip._internal.vcs.versioncontrol.RemoteNotFoundError
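
Reading the traceback, `obtain` calls `get_remote_url(dest)` and `remotes[0]` raises IndexError, which suggests pip found a git checkout at the destination directory that has no remote configured. A rough way to check that by hand is sketched below. The `git config --get-regexp` call is assumed to resemble the query pip runs, and the checkout path (something like `src/diffusers` under the working directory) is also an assumption; both function names are hypothetical.

```python
import subprocess
from typing import List, Optional


def parse_remote_urls(config_output: str) -> List[str]:
    """Extract remote URLs from 'git config --get-regexp' output.

    Each line is expected to look like:
    'remote.origin.url https://example.com/repo.git'
    """
    urls = []
    for line in config_output.splitlines():
        parts = line.split(" ", 1)
        if len(parts) == 2 and parts[0].startswith("remote."):
            urls.append(parts[1].strip())
    return urls


def first_remote_url(checkout_dir: str) -> Optional[str]:
    """Return the first configured remote URL of a checkout, or None."""
    result = subprocess.run(
        ["git", "config", "--get-regexp", r"remote\..*\.url"],
        cwd=checkout_dir,
        capture_output=True,
        text=True,
        check=False,  # git exits with 1 when no key matches
    )
    urls = parse_remote_urls(result.stdout)
    return urls[0] if urls else None
```

If `first_remote_url` returns None for the directory pip is trying to reuse, that matches the empty `remotes` list in the traceback.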

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: '2.210.0'
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): HuggingFace (PyTorch)
  • Framework version: PyTorch 2.1.0
  • Python version: 3.10
  • CPU or GPU: GPU
  • Custom Docker image (Y/N): N
@rohit901 rohit901 added the bug label Mar 4, 2024
@omarkahwaji

What version of the SageMaker SDK are you using? I started experiencing similar issues, with packages from requirements.txt not getting installed at all. The SDK skips the installation altogether.

@knikure knikure added the component: training Relates to the SageMaker Training Platform label Mar 8, 2024

rohit901 commented Mar 9, 2024

Hi @omarkahwaji,
I'm using 2.210.0. Make sure you are passing the correct "source_dir" when using the SageMaker Estimator. Otherwise, ensure that your main training script and requirements.txt are in the same folder.
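
A quick local check of that layout before submitting a job could look like the sketch below. The function name and the example file names (`train.py`, `code`) are hypothetical; it only verifies that the entry script and requirements.txt both sit at the top level of the folder you intend to pass as "source_dir".

```python
from pathlib import Path
from typing import List


def missing_from_source_dir(source_dir: str, entry_point: str) -> List[str]:
    """Return the expected files that are absent from the top level of source_dir."""
    root = Path(source_dir)
    expected = [entry_point, "requirements.txt"]
    return [name for name in expected if not (root / name).is_file()]


# Example (hypothetical paths): an empty list means both files were found.
# missing_from_source_dir("code", "train.py")
```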

It should install packages from PyPI, but it does not seem to install from git.
