Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

locker: do not assign unrelated filenames/hashes to direct origin dependencies #6389

Conversation

radoering
Copy link
Member

@radoering radoering commented Sep 3, 2022

Fix for duplicate (or outdated) hashes of multiple constraint dependencies

Resolves: #6327
Resolves: #6349

  • Added tests for changed code.
  • Updated documentation for changed code.

Fixes a regression in poetry lock --no-update where filenames and hashes of multiple constraints dependencies are always kept/duplicated.

Background: There is only one metadata.files entry for all multiple constraints dependencies in poetry.lock. When reading the lock file we can't always decide for sure which filename/hash belongs to which of the multiple constraints dependencies so we just add all filenames/hashes for all packages. For non-direct origin dependencies, this is fixed later in complete_package() when the package is looked up in the repository. For direct origin dependencies this isn't fixed, so all entries are kept and duplicated once for each direct origin dependency. Even if one of the multiple constraints dependencies has been deleted from pyproject.toml, its filenames/hashes are kept if there is still one direct origin dependency.

(This issue did not occur in poetry 1.1.x because poetry was not able to recognize if a direct origin dependency had already been included in the lockfile and thus always updated it when running poetry lock --no-update.)

Update: Another manifestation of the fixed bug is that url, git or directory dependencies of multiple constraints dependencies can't be installed if there is a non-direct origin or file dependency among the multiple constraints dependencies.

@neersighted neersighted added area/installer Related to the dependency installer impact/backport Requires backport to stable branch impact/changelog Requires a changelog entry backport/1.2 labels Sep 3, 2022
src/poetry/puzzle/provider.py Outdated Show resolved Hide resolved
src/poetry/puzzle/provider.py Outdated Show resolved Hide resolved
@dimbleby
Copy link
Contributor

dimbleby commented Sep 3, 2022

isn't the real problem that all the files in the metadata in poetry.lock get attached to all packages?

What if we just... don't do that? ie remove the code in the locker that sets packages.files from metadata. Aren't the correct values going to be picked up later anyway?

(complete_package() is one of the biggest and ugliest methods in the codebase, this isn't helping that!)

@dimbleby
Copy link
Contributor

dimbleby commented Sep 3, 2022

hmm, ignoring the metadata might work during solving but it also means that we don't do any checking of hashes during install.

Well OK then, how about fixing the broken file format instead? We've learned that the files belong on the packages, let's write that in the lockfile?

Should be possible to do in a way that can still read old lockfiles: if the lockfile contains per-package files then use them, else fall back to the merged metadata

@neersighted
Copy link
Member

If we're talking about changing the lockfile, that would have to go into 1.3. This as a correctness fix for a regression would be eligible for 1.2... So maybe the correct solution is both?

I defer to you both for your expertise on the moving pieces of the solver -- my knowledge is still a combination of surface level and function level.

@dimbleby
Copy link
Contributor

dimbleby commented Sep 3, 2022

why would it have to go into 1.3, if the new code is capable of reading the old lockfile?

because it's a "large" change? or because of some compatibility rule that I don't know?

@neersighted
Copy link
Member

neersighted commented Sep 3, 2022

Because incrementing the lockfile version in a patch release is not backwards compatible or expected, before we even start considering the magnitude of risk for a 'stable' branch.

@dimbleby
Copy link
Contributor

dimbleby commented Sep 3, 2022

it can be made backwards compatible, I don't know about "expected"

@neersighted
Copy link
Member

neersighted commented Sep 3, 2022

Would we write the hashes twice, one to the new location, and one to the old? I still don't think that's appropriate for a patch release, especially given the intention is to release 1.3 with the backed up PRs as we can get them stabilized and over the finish line "soon..."

A new lock file format that is backwards-compatible in 1.3 (so that 1.2 can read them), followed by 1.4 generating lock files that 1.3 can read (but not 1.2) sounds like the preferred approach for this particular migration to me anyway.

@dimbleby
Copy link
Contributor

dimbleby commented Sep 3, 2022

yes, for backwards compatibility the approach would be to write in both places, new code would prefer to read from the right place, old code could continue to read from the less good place.

While I see what this fix is trying to do I am quite uncomfortable with it: conceptually it's patching up an earlier mistake instead of fixing that mistake at root, and practically it's a lot of ugly code in an already horribly ugly place.

I'm tempted to assert that the bug being fixed isn't important enough to be worth that, so long as there's a plan that does better in 1.3 and 1.4. But it would be reasonable to disagree.

@dimbleby
Copy link
Contributor

dimbleby commented Sep 4, 2022

for clarity, #6393 is the proposal I had in mind

@radoering
Copy link
Member Author

Well OK then, how about fixing the broken file format instead? We've learned that the files belong on the packages, let's write that in the lockfile?

Yes, we should fix the format long-term.

I'm tempted to assert that the bug being fixed isn't important enough to be worth that, so long as there's a plan that does better in 1.3 and 1.4. But it would be reasonable to disagree.

I disagree. The bug is a regression making poetry lock --no-update unusable for some (not so uncommon) use cases. I strongly advocate a backport and I think my fix is probably minimally invasive.

for clarity, #6393 is the proposal I had in mind

I haven't looked closely yet, but this is probably the solution for 1.3.

@neersighted If this workaround is only for 1.2, should we merge it into master first even if it is replaced by #6393 quickly after or should we only create a PR for 1.2?

@neersighted
Copy link
Member

I think it is important to keep a roughly unified history -- nothing should go directly into 1.2, except as a backport would be my general stance. I'd be fine merging this, and then immediately ripping it out.

That being said, I think we should try and establish some testing around lockfile compatibility before we bump the format, and also add some code to 1.2 that will print a useful error message when it encounters a lockfile beyond what is known to be compatible.

@dimbleby
Copy link
Contributor

dimbleby commented Sep 4, 2022

I think my fix is probably minimally invasive.

I don't think it is! #6393 suggests a different approach which seems like a truer halfway house: how about, when reading the lockfile, we don't attach files to url packages at all? (also other direct origin types)

That's a fix that's much closer to the root of the problem, if not as accurate as getting it right per package, and I think will be significantly more compact than this fix.

@dimbleby
Copy link
Contributor

dimbleby commented Sep 4, 2022

I think we should try and establish some testing around lockfile compatibility before we bump the format, and also add some code to 1.2 that will print a useful error message when it encounters a lockfile beyond what is known to be compatible.

This already exists (and is tested)

if lock_version_allowed and current_version < lock_version:
logger.warning(
"The lock file might not be compatible with the current version of"
" Poetry.\nUpgrade Poetry to ensure the lock file is read properly or,"
" alternatively, regenerate the lock file with the `poetry lock`"
" command."
)
elif not lock_version_allowed:
raise RuntimeError(
"The lock file is not compatible with the current version of Poetry.\n"
"Upgrade Poetry to be able to read the lock file or, alternatively, "
"regenerate the lock file with the `poetry lock` command."
)

@radoering
Copy link
Member Author

radoering commented Sep 4, 2022

how about, when reading the lockfile, we don't attach files to url packages at all? (also other direct origin types)

I also thought about that alternative and decided for changing complete_package() because that 's the place where files is fixed for non-direct origin dependencies. However, that was before the lockfile change loomed on the horizon. Now, I agree that handling it when reading the lockfile might be better. I'll convert the PR to that variant.

Btw, not attaching files is only correct for url, git and directory dependencies. For file dependencies the correct list item has to be found.

@radoering radoering marked this pull request as draft September 4, 2022 08:58
@dimbleby
Copy link
Contributor

dimbleby commented Sep 4, 2022

Btw, not attaching files is only correct for url, git and directory dependencies. For file dependencies the correct list item has to be found.

agreed, just lazy language on my part!

@dimbleby
Copy link
Contributor

dimbleby commented Sep 4, 2022

Also I don't know why poetry doesn't put a hash on url dependencies: if it's a good idea to calculate and store the hash for file dependencies then why not url dependencies too? But that can definitely wait for another day.

@radoering radoering force-pushed the fix-duplicate-and-outdated-hashes branch from 373f97b to 4401155 Compare September 4, 2022 12:08
@radoering radoering changed the title provider: remove unrelated filenames/hashes of direct origin dependencies in complete_package() locker: do not assign unrelated filenames/hashes to direct origin dependencies Sep 4, 2022
@radoering radoering marked this pull request as ready for review September 4, 2022 12:15
src/poetry/packages/locker.py Outdated Show resolved Hide resolved
src/poetry/packages/locker.py Outdated Show resolved Hide resolved
src/poetry/packages/locker.py Outdated Show resolved Hide resolved
@radoering radoering force-pushed the fix-duplicate-and-outdated-hashes branch 2 times, most recently from ee61e14 to a64f2f0 Compare September 4, 2022 13:40
@radoering radoering force-pushed the fix-duplicate-and-outdated-hashes branch from a64f2f0 to b183e83 Compare September 4, 2022 13:56
…endencies when used in multiple constraints dependencies
@radoering radoering force-pushed the fix-duplicate-and-outdated-hashes branch from b183e83 to 0921ff8 Compare September 4, 2022 14:47
@dimbleby
Copy link
Contributor

dimbleby commented Sep 4, 2022

macos pipeline failure looks like pypa/pip#11352, if so will presumably be fixed when a pip fix makes its way out

Copy link
Member

@neersighted neersighted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a whole lot cleaner -- thanks @radoering and @dimbleby for the excellent work cleaning up this change. I agree with #6393 being the best way forward long term, but I am very much in favor of not changing the lockfile format in 1.2 (especially given incidental/whatever possible compatibility with 1.1 is desirable).

@intgr
Copy link
Contributor

intgr commented Sep 19, 2022

Thanks! I tested Poetry 1.2.1 and both issues are fixed now.

jeffwidman added a commit to dependabot/smoke-tests that referenced this pull request Sep 21, 2022
`poetry` `1.2.1` no longer generates this line that was present in `1.2.0`.

I'm not 100% positive, but I _think_ that might be the result of python-poetry/poetry#6389.

Details here: dependabot/dependabot-core#5746 (comment)
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/installer Related to the dependency installer impact/backport Requires backport to stable branch impact/changelog Requires a changelog entry
Projects
None yet
4 participants