Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GitHub action #42

Open
thomaseizinger opened this issue Jul 26, 2021 · 14 comments
Open

GitHub action #42

thomaseizinger opened this issue Jul 26, 2021 · 14 comments
Labels
help wanted Extra attention is needed

Comments

@thomaseizinger
Copy link

It would be nice to have a Github action that runs dylint without having to manually manage the installation and caching.

@smoelius
Copy link
Collaborator

Hi, @thomaseizinger. Admittedly, I don't know a lot about GitHub actions. Could you say a little bit more about what you are imagining?

Does something analogous exist for Clippy?

@thomaseizinger
Copy link
Author

Sure! Sorry for being very brief in the initial issue!

Does something analogous exist for Clippy?

Yes. It is pretty easy to run clippy using GitHub actions.

  1. Rust is pre-installed in the container that the actions run in, so if you are not concerned with running against latest stable Rust, doing something like: - run: cargo clippy -- -D warnings will fail the CI checks on any emitted warnings.
  2. A specific version of Rust can very easily be installed via: https://github.com/actions-rs/toolchain

Could you say a little bit more about what you are imagining?

I want to run dylint as part of my CI workflow but installing it every time takes up quite a bit of time. Caching can obviously be utilized but that requires some setup as well. In an ideal world, a GitHub action would be provided that manages the installation / download / caching for me so I can simply say:

- uses: trailofbits/dylint-action@v1
  with:
    args: -- -D warnings

This would assume we would provide an action in a repository dylint-action in the trailofbits organization.

In an ideal world, https://github.com/actions-rs/install could be used to solve this in a more generic manner but development there has stalled unfortunately :(

@smoelius
Copy link
Collaborator

This would assume we would provide an action in a repository dylint-action in the trailofbits organization.

This sounds reasonable to me. I'll ask around to see if anyone here has the bandwidth and necessary skills to tackle this. I'll also attach help wanted to this issue, as I think we'd be open to open to an external contribution.

As a note to myself:

Caching can obviously be utilized but that requires some setup as well.

I assume you are referring to this: https://github.com/actions/cache/blob/main/examples.md#rust---cargo I haven't tried it (though I probably will).

Also, as an aside, we do run Dylint from within a GitHub action for a separate project: https://github.com/trailofbits/test-fuzz/blob/6a532b5f37c5b9a6737f99d437d4a183fa560f4d/.github/workflows/ci.yml#L29-L32

Personally, I find the time it requires to be tolerable. But I can see the points you've made.

@smoelius smoelius added the help wanted Extra attention is needed label Jul 29, 2021
@thomaseizinger
Copy link
Author

If we attach binaries to releases, we could also download them instead of building them from scratch every time, mitigating the need for caching.

@smoelius
Copy link
Collaborator

smoelius commented Jul 31, 2021

If we attach binaries to releases, we could also download them instead of building them from scratch every time, mitigating the need for caching.

Hmm, I'm not sure that I want to take on that responsibility.

Can I take it that this would solve a particular need of yours? What platform would you require a binary for?

@thomaseizinger
Copy link
Author

If we attach binaries to releases, we could also download them instead of building them from scratch every time, mitigating the need for caching.

Hmm, I'm not sure that I want to take on that responsibility.

Can I take it that this would solve a particular need of yours? What platform would you require a binary for?

If we could download a binary, the GitHub action wouldn't need to cache it. I would likely run the workflows only on ubuntu-latest.

@smoelius
Copy link
Collaborator

smoelius commented Aug 1, 2021

I could maybe be convinced to publish ubuntu-latest binaries, but I am not sure that would completely address the problem. If you install Dylint from scratch and run it, much of the time will be spent on building drivers, and possibly building lint libraries.

I experimented with using actions/cache to cache Dylint artifacts and the results were good!

Here specifically is what I tried (https://github.com/trailofbits/test-fuzz/blob/2cac66d43ffe0889430927fb838c37f906fae3c4/.github/workflows/ci.yml#L12-L28):

    - name: Dylint versions
      run: cargo search dylint | sort | tee dylint_versions

    - uses: actions/cache@v2
      with:
        path: |
          ~/.cargo/bin/
          ~/.cargo/registry/index/
          ~/.cargo/registry/cache/
          ~/.cargo/git/db/
          ~/.dylint_drivers/
          ~/.rustup/toolchains/
          target/dylint/
        key: ${{ runner.os }}-dylint-${{ hashFiles('dylint_versions') }}

This took a 21 minute job down to about six minutes!

Note that the project where this was tested runs only one lint library, and it comes from the Dylint repository itself. If the project ran other lint libraries, their versions should be accounted for in the cache key.

So, I agree with you, having a "Dylint action" would be most convenient. But, short of that, this seems to be a pretty good solution.

What do you think, @thomaseizinger?

@thomaseizinger
Copy link
Author

For normal caching in Rust projects, I usually use https://github.com/Swatinem/rust-cache because it makes it effectively a one-liner which is pretty nice.

I didn't actually realize that it might be worthwhile to cache the built libraries and drivers of dylint as well. Initially I was only thinking of the dylint binary itself.

Do I understand correctly that dylint stores things outside of the target directory? Can I ask why? If it were to store things within target, then we could make use of existing caching infrastructure like https://github.com/Swatinem/rust-cache.

In regards to versions, would it somehow be possible to add dylint libraries as dev-dependencies to a project and therefore have the exact version tracked in Cargo.lock?

@smoelius
Copy link
Collaborator

smoelius commented Aug 2, 2021

For normal caching in Rust projects, I usually use https://github.com/Swatinem/rust-cache because it makes it effectively a one-liner which is pretty nice.

That is nice!

Do I understand correctly that dylint stores things outside of the target directory? Can I ask why?

The drivers (essentially wrappers around the rust compiler) are stored in ~/.dylint_drivers by default. The only thing that distinguishes them is the version of the compiler they wrap. So it makes sense to share them across projects.

If it were to store things within target, then we could make use of existing caching infrastructure like https://github.com/Swatinem/rust-cache.

This should be achievable by setting the DYLINT_DRIVER_PATH environment variable (though an apparent bug means it only works on the first try):

mkdir $PWD/target/dylint_drivers
DYLINT_DRIVER_PATH=$PWD/target/dylint_drivers cargo dylint --all --workspace

Admittedly, we might want a more ergonomic solution than having to set DYLINT_DRIVER_PATH.

In regards to versions, would it somehow be possible to add dylint libraries as dev-dependencies to a project and therefore have the exact version tracked in Cargo.lock?

I don't think so, or at least I can't immediately see how.

The main problem is that all members of a workspace are expected to be compiled with the same compiler version. But a Dylint library is pinned to a specific compiler version, which isn't necessarily the same as that of the project its run against.

I can see the benefits of what you're suggesting. Maybe there's some "trick" we could use to make it work, but I can't see it right now.

@thomaseizinger
Copy link
Author

The drivers (essentially wrappers around the rust compiler) are stored in ~/.dylint_drivers by default. The only thing that distinguishes them is the version of the compiler they wrap. So it makes sense to share them across projects.

As soon as you change your compiler version, I believe rustc compiles all of your dependencies again. From that perspective, isn't a dylint driver really just another dependency?

If it were to store things within target, then we could make use of existing caching infrastructure like Swatinem/rust-cache.

This should be achievable by setting the DYLINT_DRIVER_PATH environment variable (though an apparent bug means it only works on the first try):

On my personal computer, I am setting target-dir to a directory under ~/.cache via build.target-dir in ~/.cargo/config.toml. This serves two purposes:

  1. Nuking the target directory of all my different Rust projects in one go.
  2. Re-using build artifacts of dependencies across projects.

If dylint were to store its drivers under CARGO_TARGET_DIR, this reuse could happen without dylint needing to do anything by itself I believe :)

The main problem is that all members of a workspace are expected to be compiled with the same compiler version. But a Dylint library is pinned to a specific compiler version, which isn't necessarily the same as that of the project its run against.

I am not sure I completely follow. When and where is this pinning happening?

@smoelius
Copy link
Collaborator

smoelius commented Aug 3, 2021

Let me answer your last question first:

I am not sure I completely follow. When and where is this pinning happening?

Essentially, when you build the library. A library gets a name like:

libtry_io_result@nightly-2021-06-03-x86_64-unknown-linux-gnu.so
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The reason is that the compiler APIs that lints use are unstable and could change at any time. Dylint needs to know which version of the Rust compiler a lint library uses, as this determines the driver needed to run the library.

Back to your first question:

As soon as you change your compiler version, I believe rustc compiles all of your dependencies again. From that perspective, isn't a dylint driver really just another dependency?

The problem is the compiler version that a lint library uses isn't necessarily the same as what an author wants to use to build their package, e.g., for production.

For example, an author might want to build their package using (stable) 1.44.0-x86_64-unknown-linux-gnu. But they might want to lint that package using a library that compiles using nightly-2021-03-11-x86_64-unknown-linux-gnu. We'd like to give them that flexibility.

Occasionally, this causes problems. It can happen that an author's package can't be compiled with the compiler that a lint library uses. Fortunately, I've only bumped in to this a handful of times.

On my personal computer, I am setting target-dir to a directory under ~/.cache via build.target-dir in ~/.cargo/config.toml. This serves two purposes:

  1. Nuking the target directory of all my different Rust projects in one go.
  2. Re-using build artifacts of dependencies across projects.

That's an interesting idea. If I'm understanding correctly, you're sharing a target directory across projects.

If dylint were to store its drivers under CARGO_TARGET_DIR, this reuse could happen without dylint needing to do anything by itself I believe :)

That might make sense for users who use a shared target directory, as you've suggested. But I don't know how common that is. Given that a workspaces's target directory appears in its root by default, I think Dylint's current behavior makes sense, as it allows drivers to be reused across different workspaces.

@thomaseizinger
Copy link
Author

Essentially, when you build the library. A library gets a name like:

libtry_io_result@nightly-2021-06-03-x86_64-unknown-linux-gnu.so
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The reason is that the compiler APIs that lints use are unstable and could change at any time. Dylint needs to know which version of the Rust compiler a lint library uses, as this determines the driver needed to run the library.

Back to your first question:

As soon as you change your compiler version, I believe rustc compiles all of your dependencies again. From that perspective, isn't a dylint driver really just another dependency?

The problem is the compiler version that a lint library uses isn't necessarily the same as what an author wants to use to build their package, e.g., for production.

Right, that makes a lot of sense. Thank you for explaining!

That's an interesting idea. If I'm understanding correctly, you're sharing a target directory across projects.

Yes, correct.

That might make sense for users who use a shared target directory, as you've suggested. But I don't know how common that is. Given that a workspaces's target directory appears in its root by default, I think Dylint's current behavior makes sense, as it allows drivers to be reused across different workspaces.

It does make sense. I would have probably just not gone through the effort of building that optimization into dylint itself. If people are willing to accept rebuild of the same dependencies (like syn, etc) across multiple workspaces, they are likely also fine with dylint having to rebuild drivers per workspace. If they are not, they can use a shared target directory.

Overall, this was very educational, thank you!
I think setting dylint's driver path and using https://github.com/Swatinem/rust-cache is going to be my way forward :)

@smoelius
Copy link
Collaborator

smoelius commented Aug 4, 2021

Right, that makes a lot of sense. Thank you for explaining!

No problem. It's helpful for me to talk through it now and then. :)

If they are not, they can use a shared target directory.

Is using a shared target directory common?

Separately and for my own reference, I wondered whether Cargo offered a way to store the download cache in the target directory (because the download cache is shared across workspaces kind of like how Dylint drivers are). This was the only relevant piece of information I could find: https://stackoverflow.com/questions/45222791/is-it-possible-to-install-cargo-dependencies-in-the-same-directory-as-my-project

And back to this topic:

In regards to versions, would it somehow be possible to add dylint libraries as dev-dependencies to a project and therefore have the exact version tracked in Cargo.lock?

I think I may have some ideas here. I am going to start playing with this once I am confident I have #54 fixed.

@thomaseizinger
Copy link
Author

I think the reason for why downloads are shared is that the source-code is always the same, regardless of the cargo/Rust version.

I am not sure how common a shared target directory is but I prefer it simply for the ease of freeing up storage :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants