Idea to explore: incompatibility knowledge database #121

mpizenberg · 2022-10-14T07:18:47Z

In some pathological cases, the solver can take much longer to find a solution. For any given dataset of packages and versions, and a given dependency provider, it is easy to find those edge cases by trying to solve all packages.

What's interesting is that once a package is solved, we have at our disposal the full set of incompatibilities that were recorded. And since incompatibilities are supposed to be always true (at least during one solve call), we could try to identify whether injecting one or some of these incompatibilities directly into the dependency solver could dramatically improve the solving time. If so, it would be interesting to see how many of these could be stored in some kind of database available to the dependency solver. These "key" incompatibilities could be useful at the package level, or even at the ecosystem level.

Of course, if we want to record incompatibilities that outlive a single solve call, we'd have to be extra careful about what gets recorded. For example, we should avoid recording incompatibilities that reference any future version, as we don't know what the future may hold. But even those could be "truncated" if needed.
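To make the "truncation" idea a bit more concrete, here is a minimal sketch under heavy assumptions. It does not use pubgrub's real types: `Range`, `Incompat` and `truncate` are hypothetical names, versions are plain `u32`, and an incompatibility is modeled as positive terms only ("these packages cannot all be selected within these ranges at once"). Under that model, shrinking each range to the versions known at recording time only weakens the statement, so it should stay true. Negative terms would need separate treatment.

```rust
use std::collections::BTreeMap;

/// Half-open version range `[low, high)`; `high == None` means unbounded above.
/// Hypothetical type, not pubgrub's `Range`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Range {
    pub low: u32,
    pub high: Option<u32>,
}

/// Hypothetical simplified incompatibility (positive terms only): the listed
/// packages cannot all be selected within their ranges at the same time.
pub type Incompat = BTreeMap<String, Range>;

/// Clamp every range so that it covers no version newer than the latest one
/// known when the incompatibility was learned.
/// Returns `None` if a package is unknown or a clamped range becomes empty,
/// in which case there is nothing worth storing.
pub fn truncate(incompat: &Incompat, latest: &BTreeMap<String, u32>) -> Option<Incompat> {
    let mut out = Incompat::new();
    for (pkg, range) in incompat {
        let newest = *latest.get(pkg)?;
        // Exclusive upper bound that still includes `newest` itself.
        let cap = newest.checked_add(1)?;
        let high = range.high.map_or(cap, |h| h.min(cap));
        if range.low >= high {
            return None;
        }
        out.insert(pkg.clone(), Range { low: range.low, high: Some(high) });
    }
    Some(out)
}
```

With `a` known up to version 3, a term `a >= 1` would be stored as `1 <= a < 4`, so it says nothing about versions published later.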

I just wanted to record this idea, for future self or anyone wanting to explore it.

Eh2406 · 2022-10-14T15:09:04Z

I love this idea. Of course, subtle misuses of your knowledge database can lead to incorrect output. For example, if you are attempting to update the database because the "dependency provider" now has a new version of a package, and you do not correctly remove all derived incompatibilities, weird things will happen. Similarly, it is going to be hard to get comprehensive fuzz testing of all of the corner cases you can generate with such low-level access. I think the knowledge database API needs to be clearly marked as an "advanced", "low level", "use at your own risk" interface.
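As a rough illustration of the invalidation problem, here is a sketch that assumes each stored incompatibility remembers which entries it was derived from. `Entry` and `invalidate` are hypothetical names, not an existing pubgrub API, and the policy is deliberately conservative: every provider fact about the changed package is treated as stale, along with anything transitively derived from one.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical database entry, keyed by an integer id in the map below.
pub enum Entry {
    /// Fact reported by the dependency provider about one package.
    External { package: String },
    /// Incompatibility derived by the solver from earlier entries.
    Derived { parents: Vec<usize> },
}

/// Remove every provider fact about `changed` and everything transitively
/// derived from such a fact. Returns the ids that were removed.
pub fn invalidate(db: &mut HashMap<usize, Entry>, changed: &str) -> HashSet<usize> {
    let mut removed: HashSet<usize> = HashSet::new();
    // Repeat until no new stale entry is found (fixpoint over the derivation graph).
    loop {
        let mut grew = false;
        for (id, entry) in db.iter() {
            if removed.contains(id) {
                continue;
            }
            let stale = match entry {
                Entry::External { package } => package.as_str() == changed,
                Entry::Derived { parents } => parents.iter().any(|p| removed.contains(p)),
            };
            if stale {
                removed.insert(*id);
                grew = true;
            }
        }
        if !grew {
            break;
        }
    }
    db.retain(|id, _| !removed.contains(id));
    removed
}
```

A real implementation could likely keep more, for example dependency facts about versions that already existed, but forgetting to propagate the removal to derived entries is exactly the kind of mistake that would produce wrong results.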

Just looking at Cargo's use of a resolver, I can see two places where this interface could be really useful.

1. As a solution to async (advanced_dependency_providers#6), Cargo currently does resolution in a loop. If a dependency version has not yet been retrieved over the network, it records that version as having no dependencies and flags it to be filtered out later. If the solution includes any versions that have been flagged, it waits for all network transfers to complete and re-attempts resolution. Reusing a knowledge database between these resolutions would dramatically reduce redundant work.
2. When handling workspaces, Cargo ends up needing to do resolution twice: the first time to figure out the interactions of all of the dependencies of all of the workspace members, the second time to figure out the dependencies of the actual member being built. There's probably significant sharing that can be done here.

Eh2406 mentioned this issue Oct 14, 2022

Additional Constraints #120

Open

mpizenberg added this to the v0.4 milestone Oct 22, 2022

mpizenberg mentioned this issue Oct 18, 2023

Solving slows down dramatically when testing hundreds of versions #135

Closed
