Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[experiment] patch with patch files #13779

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

weihanglo
Copy link
Member

@weihanglo weihanglo commented Apr 18, 2024

What does this PR try to resolve?

An unstable -Zpatch-files flag is added to support patching your dependencies with patch files, with the patchtool you configure in [patchtool] config table.

This is one experiment based on rust-lang/rfcs#3177 that the team maybe be interested in.
See also #4648.

How should we test and review this PR?

Apparently commit by commit. You could start from test to understand the current user experience of it.

Some design decisions in this implementation:

  • Every patch is settled and applied before the dependency resolution. This implies
    • No re-resolve.
    • Cargo need to fetch a certain version of package beforehand.
      This is currently done by requiring an exact version requirement =,
      so Cargo knows what to download.
    • Due to the nature of patches applied before dependency resolution,
      we could patch whatever we want including dependencies and package.version.
    • The Summary registered to PacakgeRegistry is the source after patching.
  • { patches = [<path>] } field is only available for dependencies under [patch] table,
    and only allows patching registry dependencies for now.
  • A new source PatchedSource that fetches and applies patches to the original source code during block_until_ready.
    • Introduce a new protocol patched+, which compose the SourceId to patched.
    • Tracked what was patched in URL query string, for example patching serde@1.0.0 is like:
      patched+registry+https://github.com/rust-lang/crates.io-index?name=serde&version=1.0.0&patch=/path/to/my-bugfix.patch
      
      This SourceId representation also appears in Cargo.lock and Package ID Spec.
  • [patchtool] table in Config to control.
    I imagine it would be something like git mergetool for registering tools,
    though at this moment it only accepts one via path.
  • Cache of patched source resides in $CARGO_HOME/patched-src/<hash-of-original-source>/<pkg>-<version>/<cksum-of-patches>.
    Patched source rebuilds whenever content of patch files or the underlying source change.

You can find some more documentation under src/cargo/sources/patched.rs.

Additional information

I don't think patching before dependency resolution is ergonomic.
Your patched source code would be stuck in time and you need to eagerly update and rebuild them.
Some patches may be safely applied across major versions.
However with the current design you need to specify the same patch twice.

We may like to revisit patching during the resolution,
and reconsider the work of re-resolving dependencies.
That is a more user-friendly way for patching with patch files.

@rustbot
Copy link
Collaborator

rustbot commented Apr 18, 2024

r? @ehuss

rustbot has assigned @ehuss.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-configuration Area: cargo config files and env vars A-manifest Area: Cargo.toml issues A-testing-cargo-itself Area: cargo's tests A-unstable Area: nightly unstable support A-workspaces Area: workspaces S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 18, 2024
@weihanglo weihanglo force-pushed the unidiff branch 2 times, most recently from b8bb3e5 to 74c61c6 Compare April 19, 2024 01:08
/// A file indicates that if present, the patched source is ready to use.
const READY_LOCK: &str = ".cargo-ok";

/// `PatchedSource` is a source that, when fetching, it patches a paticular
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See this doc comment for some design decisions.

@epage
Copy link
Contributor

epage commented Apr 19, 2024

@weihanglo how would you like this reviewer?

As this is unstable and we previously agreed to the idea of experiments, in theory the bar is fairly low.

I will admit some hesitance on this experiment. In adding a new source kind, I'm assuming this can be a bit invasive (to be fair, I've not yet looked at the code). I'm also a bit surprised at the design direction described in the PR description.

So how much design discussion should we work out vs get in, try it out, and then iterate on alternative design ideas as we see needed?

@weihanglo weihanglo force-pushed the unidiff branch 2 times, most recently from a46ced8 to ef73e26 Compare April 19, 2024 02:01
@weihanglo
Copy link
Member Author

@epage, This is not intended to merge as-is. It's more like a proof of concept that this approach does works but with a not okay user experience.
To be honest, I lean toward a resolve-then-patch solution but this one is not. I am tempted to write another one that might introduce re-resolve.

About reviewing the pull request, I would recommend starting from tests to get an idea of how it looks like. I would suggest not getting into details until we feel better with the UI.

We could probably open a Zulip topic on t-cargo for design discussions.

@bors
Copy link
Collaborator

bors commented Apr 19, 2024

☔ The latest upstream changes (presumably #13778) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Collaborator

bors commented Apr 22, 2024

☔ The latest upstream changes (presumably #13783) made this pull request unmergeable. Please resolve the merge conflicts.

@Eh2406
Copy link
Contributor

Eh2406 commented Apr 26, 2024

Having look this over it seems pretty reasonable. If you would prefer to experiment with a "resolution loop" I'd be happy to compare cost-benefits, but I suspect this is simpler.

Cargo need to fetch a certain version of package beforehand. This is currently done by requiring an exact version requirement =, so Cargo knows what to download.

I don't understand why the equals requirement is needed. Say the patch was for >2, why couldn't we ask of the underlying source for a list of versions and then patch each one as they turn out to be needed? Well, I guess will we might then have to patch them all to see if the version numbers change as a result of the patching process. But that could be avoided by banning changing version numbers.

I'm not remembering the API at the moment, it may be that we need to return them all up front instead of "as they turn out to be needed"? If that's the problem, we could think about making a lazier API. (PubGrub would be happy with the lazy API, I'm not sure how much churn would be involved in moving the current resolver.)

I don't think patching before dependency resolution is ergonomic.
Your patched source code would be stuck in time and you need to eagerly update and rebuild them.

I don't think I follow, what ends up stuck in time?

Some patches may be safely applied across major versions.
However with the current design you need to specify the same patch twice.

This seems to be an important point. Especially as it's across "different versions" with the current = limitation.

@epage
Copy link
Contributor

epage commented Apr 26, 2024

Oh, I should capture some office hour thoughts

In my mind, no matter the type of patch, it should all act the same. It sounds like this patches before resolution, rather than during, and allows your patches to affect resolution unlike regular patches (iiuc). If this is the case for both kinds of patches, then that feels like a barrier to move forward.

In my mind, I see this PR introducing two things:

  • Registry [patch] entries
  • patches key for any kind of [patch] entry

I think we should decouple these conversations so we can make sure we are getting the right semantics for each or, at minimum, defer non-patches registry [patch] entries without backing ourselves into a corner on its design.

A thought of mine that didn't come up during office hours is me worrying that pluggable patch tools might be generalizing too much. I'd want to see a better explanation of why that extra complexity is needed and what the alternatives are.

@Eh2406
Copy link
Contributor

Eh2406 commented Apr 26, 2024

unlike regular patches (iiuc).

Regular patches can also affect resolution. It is replace that can not.

@weihanglo
Copy link
Member Author

weihanglo commented Apr 26, 2024

Reply to #13779 (comment):

I don't understand why the equals requirement is needed.
...it may be that we need to return them all up front instead of "as they turn out to be needed"? If that's the problem, we could think about making a lazier API.

= requirement is needed so that we know what to patch before resolution. Otherwise yeah we need to patch every version in that version range.

We could have lazy mechanism, but I would like to avoid patching “during resolution” at this moment. Patching before or after resolution is less complicated I believe.

I don't think patching before dependency resolution is ergonomic.
Your patched source code would be stuck in time and you need to eagerly update and rebuild them.

I don't think I follow, what ends up stuck in time?

The exact requirement issue.


Reply to #13779 (comment):

In my mind, I see this PR introducing two things:

  • Registry [patch] entries
  • patches key for any kind of [patch] entry

Actually this PR intentionally touches nothing in ops::resolve or core::resolver. The core idea is having a new source and prepare it for resolution. You can consider it as a new kind of VCS source. Hence, it is not really that controversial.

The one major thing to explore is the semantic of the new patched source kind when being in [patch].


Just talked to Jacob earlier. There are at least three timing for patching:

  • Registry [patch] before resolution
    • This is how today's [patch] work, as well as this PR.
    • Need to know every summary upfront.
  • Register [patch] on demand during resolution
    • This might be hard. [patch] can be added or removed depending on application of other [patch], so more complicated backtraces might be needed. (I didn't look into this route deep though)
  • Register [patch] after resolution, and re-resolve if needed
    • Re-resolve happens if any summary has changed due to patch, and is currently in use in resolve graph
    • If we go this route, we could first ban any change to package.{name,version}. By doing so [patch] should have the same semantic with the current patch kinds (name@version must match).
    • This is the next thing I would like to experiment on.

(By registering [patch] I mean making [patch] entry available for dependency resolver to select/query.)

@bors
Copy link
Collaborator

bors commented Apr 30, 2024

☔ The latest upstream changes (presumably #13813) made this pull request unmergeable. Please resolve the merge conflicts.

Since `[patch]` section exists also in config,
so have it inboth cargo-features and -Z flag.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-configuration Area: cargo config files and env vars A-manifest Area: Cargo.toml issues A-testing-cargo-itself Area: cargo's tests A-unstable Area: nightly unstable support A-workspaces Area: workspaces S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants