Idea to explore: incompatibility knowledge database #121

Open
mpizenberg opened this issue Oct 14, 2022 · 1 comment

Comments

@mpizenberg
Member

In some pathological cases, the solver can take much longer than usual to find a solution. For any given dataset of packages and versions, and a given dependency provider, it is easy to find those edge cases by trying to solve all packages.

What's interesting is that once a package is solved, we have at our disposal the full set of incompatibilities that were recorded. And since incompatibilities are supposed to always be true (at least during one solve call), we could try to identify whether injecting one or some of these incompatibilities directly into the dependency solver could dramatically improve the solving time. If so, it would be interesting to see how many of these could be stored in some kind of database available to the dependency solver. These "key" incompatibilities could be useful at the package level, or even at the ecosystem level.

Of course, if we want to record incompatibilities that outlive a single solve call, we'd have to be extra careful about what gets recorded. For example, we should avoid recording incompatibilities that reference any future version, as we don't know what the future may hold. But even those could be "truncated" if needed.
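
To make the "truncation" idea a bit more concrete, here is a rough sketch of one possible reading of it. Everything in it is hypothetical and not part of the solver's API: the `Range`, `Term`, and `Incompatibility` types, the use of plain `u32` versions, and the `truncate_to_known` function are all made up for illustration. It also assumes every term is positive ("package is in this range"). Narrowing a positive term only weakens the incompatibility, so it stays true; negative terms would need different handling.

```rust
use std::collections::HashMap;

/// Half-open version range `[start, end)`; `end == None` means unbounded above.
/// (Hypothetical type: versions are simplified to plain integers.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Range {
    pub start: u32,
    pub end: Option<u32>,
}

/// A positive term: "this package is selected at a version inside `range`".
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Term {
    pub package: String,
    pub range: Range,
}

/// A set of terms that can never all hold at the same time.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Incompatibility {
    pub terms: Vec<Term>,
}

/// Clamp every term so that it only covers versions known when the
/// incompatibility was recorded. Returns `None` when a package is unknown
/// or when a term would only cover future versions, in which case nothing
/// is worth storing.
pub fn truncate_to_known(
    incompat: &Incompatibility,
    highest_known: &HashMap<String, u32>,
) -> Option<Incompatibility> {
    let mut terms = Vec::with_capacity(incompat.terms.len());
    for term in &incompat.terms {
        let highest = *highest_known.get(&term.package)?;
        // Exclusive upper bound just past the highest known version.
        let cap = highest.saturating_add(1);
        let end = term.range.end.map_or(cap, |e| e.min(cap));
        if term.range.start >= end {
            // The term talks about future versions only.
            return None;
        }
        terms.push(Term {
            package: term.package.clone(),
            range: Range { start: term.range.start, end: Some(end) },
        });
    }
    Some(Incompatibility { terms })
}
```

With `a` known up to version 3, a term `a >= 1` would be stored as `1 <= a < 4`, while a term `a >= 7` would cause the whole incompatibility to be skipped.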

I just wanted to record this idea, for future self or anyone wanting to explore it.

@Eh2406
Member

Eh2406 commented Oct 14, 2022

I love this idea. Of course, subtle misuses of your knowledge database can lead to incorrect output. For example, if you are attempting to update the database because the "dependency provider" now has a new version of a package, and you do not correctly remove all derived incompatibilities, weird things will happen. Similarly, it is going to be hard to get comprehensive fuzz testing of all of the corner cases you can generate with such low-level access. I think the knowledge database API needs to be clearly marked as an "advanced", "low level", "use at your own risk" interface.
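
As a rough illustration of the invalidation problem, here is a minimal sketch, assuming the database keeps track of which incompatibilities each derived one was built from. The `KnowledgeDb` and `Entry` types and the `invalidate_package` method are hypothetical names invented for this example, not an existing API. The point it tries to show is that a derived incompatibility may no longer mention the updated package at all, so removal has to follow the derivation links transitively.

```rust
use std::collections::{HashMap, HashSet};

/// One stored incompatibility (hypothetical representation).
pub struct Entry {
    /// Packages mentioned by the incompatibility's terms.
    pub packages: HashSet<String>,
    /// Ids of the incompatibilities it was derived from
    /// (empty when it came straight from the dependency provider).
    pub causes: Vec<usize>,
}

/// A toy knowledge database keyed by incompatibility id.
pub struct KnowledgeDb {
    entries: HashMap<usize, Entry>,
    next_id: usize,
}

impl KnowledgeDb {
    pub fn new() -> Self {
        KnowledgeDb { entries: HashMap::new(), next_id: 0 }
    }

    /// Store an incompatibility and return its id.
    pub fn insert(&mut self, packages: &[&str], causes: &[usize]) -> usize {
        let id = self.next_id;
        self.next_id += 1;
        let packages = packages.iter().map(|p| p.to_string()).collect();
        self.entries.insert(id, Entry { packages, causes: causes.to_vec() });
        id
    }

    pub fn contains(&self, id: usize) -> bool {
        self.entries.contains_key(&id)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Remove every incompatibility that mentions `package`, plus everything
    /// derived (directly or indirectly) from a removed one.
    /// Returns how many entries were removed.
    pub fn invalidate_package(&mut self, package: &str) -> usize {
        // Seed: entries that mention the package directly.
        let mut stale: HashSet<usize> = self
            .entries
            .iter()
            .filter(|(_, e)| e.packages.contains(package))
            .map(|(id, _)| *id)
            .collect();
        // Fixpoint: keep adding entries whose causes are already stale.
        loop {
            let newly: Vec<usize> = self
                .entries
                .iter()
                .filter(|(id, e)| {
                    !stale.contains(*id) && e.causes.iter().any(|c| stale.contains(c))
                })
                .map(|(id, _)| *id)
                .collect();
            if newly.is_empty() {
                break;
            }
            stale.extend(newly);
        }
        for id in &stale {
            self.entries.remove(id);
        }
        stale.len()
    }
}
```

Invalidating by package is deliberately coarse here. A real implementation could probably keep more (for instance, incompatibilities that only concern older versions), but being too conservative only costs time, while being too permissive can produce wrong answers.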

Just looking at Cargo's use of a resolver, I can see two places where this interface could be really useful.

  1. As a solution to async advanced_dependency_providers#6. Cargo currently does resolution in a loop. If a dependency version has not yet been retrieved over the network, it records that version as having no dependencies and flags it to be filtered out later. If the solution includes any versions that have been flagged, it waits for all network transfers to complete and re-attempts resolution. Reusing a knowledge database between these resolutions would dramatically reduce redundant work.
  2. When handling workspaces, Cargo ends up needing to do resolution twice: the first time to figure out the interactions of all of the dependencies of all of the workspace members, the second time to figure out the dependencies of the actual member being built. There's probably significant sharing that can be done here.
