Built-in support for multiple Rust workspaces in same repo #487

Open
repi opened this issue May 30, 2023 · 3 comments


repi commented May 30, 2023

Some of our more complex repositories have multiple Rust workspaces to cover different isolated aspects of the project and we would like to audit the crates in all of the workspaces.

We can do this by running cargo vet init in each sub-directory and workspace, but that is tedious and error-prone to manage and update. Since the workspaces all live in the same repository, we would like a single record and ./supplychain/ folder for the audits of all of the crates in the workspaces.

In the longer term I'm hoping for native support in Cargo for nested workspaces (rust-lang/cargo#5042), but I don't think there is much active development towards that.

Could we consider adding to ./supplychain/config.toml a list of paths to additional Cargo.toml workspaces to include in the processing? Or is there any other way to configure and include multiple workspaces in a single auditing database?

Collaborator

bholley commented Jun 1, 2023

@mystor WDYT?

Collaborator

mystor commented Jun 5, 2023

This wouldn't be trivial to implement, but it wouldn't be too bad. We currently always invoke cargo metadata once, within the specified workspace. In order to get something like that to work, we'd need to also invoke it in the other specified workspaces, and update logic inspecting the metadata to support multiple metadata outputs.

In terms of where we'd want to add the config, perhaps it would make more sense to have this information in Cargo.toml rather than in the config.toml, as we generally haven't put paths in config.toml yet (though we could). It'd be similar to how we allow customizing the supply-chain path: https://mozilla.github.io/cargo-vet/config.html#location. Perhaps something like:

[package.metadata.vet]
additional-manifests = ["path/to/Cargo.toml", "path/to/other/Cargo.toml"]

We'd probably end up doing the check by looking for the configuration in the metacfg after parsing it. After that, we'd need to run an additional metadata command for each manifest path provided.
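As a rough sketch of the "one metadata command per manifest" step, the following builds (without running) a `cargo metadata` invocation for the root manifest and for each additional manifest. The function names `metadata_command` and `metadata_commands` are hypothetical, and resolving the additional paths relative to the root manifest's directory is an assumption; the discussion above does not settle that.

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) a `cargo metadata` invocation for one manifest.
/// Hypothetical helper, not part of cargo-vet.
fn metadata_command(manifest_path: &Path) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("metadata")
        .arg("--format-version")
        .arg("1")
        .arg("--manifest-path")
        .arg(manifest_path);
    cmd
}

/// One command for the root workspace, then one per additional manifest.
/// Assumption: additional paths are relative to the root manifest's directory.
fn metadata_commands(root_manifest: &Path, additional: &[&str]) -> Vec<Command> {
    let root_dir = root_manifest.parent().unwrap_or(Path::new(""));
    let mut cmds = vec![metadata_command(root_manifest)];
    for rel in additional {
        cmds.push(metadata_command(&root_dir.join(rel)));
    }
    cmds
}
```

Each returned `Command` could then be executed and its JSON output parsed separately, which keeps the results from different workspaces apart rather than merging them.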

We can't just naively merge the metadata together, as PackageIds would no longer be unique (for example, a package could exist in multiple namespaces with different dependencies/features/etc.). Fortunately, other than DepGraph building (which would need to keep track of which manifest each package comes from for dependency tracking), it seems like most code using the metadata just wants to iterate over every package.
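To illustrate the uniqueness problem, here is a minimal sketch that keys each package by the index of the manifest it came from together with its ID. `ResolvedPackage` and `index_by_scope` are hypothetical stand-ins for illustration, not cargo-vet or cargo_metadata types, and the ID strings are simplified.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a resolved package in one workspace (hypothetical).
struct ResolvedPackage {
    id: String,
    features: Vec<String>,
}

/// Indexes packages by (manifest index, package id) so that the same id
/// appearing in two workspaces yields two distinct entries.
fn index_by_scope(
    workspaces: &[Vec<ResolvedPackage>],
) -> HashMap<(usize, &str), &ResolvedPackage> {
    let mut map = HashMap::new();
    for (manifest_index, packages) in workspaces.iter().enumerate() {
        for package in packages {
            map.insert((manifest_index, package.id.as_str()), package);
        }
    }
    map
}
```

A map keyed by the ID alone would silently keep only one of two same-ID entries, losing the other workspace's feature set; the scoped key keeps both.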

I think a new type like MetadataSet could be used instead of Metadata in the functions which currently accept it, as a wrapper around a Vec<Metadata>. We'd then expose a .packages() iterator which chains together all of the package arrays for the existing callers which need it, and would manually fix up DepGraph building to support multiple metadata instances.
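The wrapper described above might look roughly like the following. `Package` and `Metadata` here are simplified, hypothetical stand-ins for the real metadata types (assumed only to expose a `packages` list), and `packages_by_manifest` is an invented name for the variant that DepGraph building would need.

```rust
/// Simplified stand-in for a package entry (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Package {
    name: String,
    version: String,
}

/// Simplified stand-in for the output of one `cargo metadata` run (hypothetical).
#[derive(Debug, Default)]
struct Metadata {
    packages: Vec<Package>,
}

/// Wrapper around the metadata of every workspace being vetted.
struct MetadataSet {
    metadatas: Vec<Metadata>,
}

impl MetadataSet {
    fn new(metadatas: Vec<Metadata>) -> Self {
        Self { metadatas }
    }

    /// Chains the package arrays of all workspaces, for callers that only
    /// want to visit every package.
    fn packages(&self) -> impl Iterator<Item = &Package> + '_ {
        self.metadatas.iter().flat_map(|m| m.packages.iter())
    }

    /// Same iteration, but tagged with the index of the originating metadata,
    /// which dependency-graph building would need to keep workspaces apart.
    fn packages_by_manifest(&self) -> impl Iterator<Item = (usize, &Package)> + '_ {
        self.metadatas
            .iter()
            .enumerate()
            .flat_map(|(i, m)| m.packages.iter().map(move |p| (i, p)))
    }
}
```

Note that `packages()` may yield the same crate more than once when it appears in several workspaces, so callers that need uniqueness would have to deduplicate themselves.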

Author

repi commented Jun 6, 2023

nice, sounds like a plan but yes not super trivial.

having it as metadata in the workspace Cargo.toml could work well, and feels closer to where the future built-in Cargo support for nested workspaces would be
