Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce TRAP caching #1172

Merged
merged 2 commits into from
Aug 9, 2022
Merged

Introduce TRAP caching #1172

merged 2 commits into from
Aug 9, 2022

Conversation

edoardopirovano
Copy link
Contributor

@edoardopirovano edoardopirovano commented Aug 5, 2022

This PR introduces everything that needs to happen on the Action's side for TRAP caching. In particular:

  1. We run codeql resolve languages --format=betterjson to see the extractor options.
  2. If we see an extractor that supports TRAP caching then we:
    • If we're running on the default branch: Start from a fresh cache, tell the extractor to write up to 1GB to the cache (tbc: is this a sensible amount for size of the Actions cache and the disk space of the default runners?). Upload it near the end of the run (after we've sent results to Code Scanning and uploaded the DB to MRVA, since both of those are higher priority). (sample run with some now-removed debugging output)
    • If we're running on a PR branch: Download the cache for the base commit if we have one, or any other recent commit failing that at the init step. Then we use that cache in read-only mode during extraction. (sample run with some now-removed debugging output)

All of this is gated behind a feature flag, although the trap-caching parameter to the init step can be used to force this to be on or off irrespective of the feature flag. I expect a few users will want to disable this if they're already using their Actions cache for something else so we're not competing for space.

We should document this new feature, and the option to disable it, before we turn this on for any external users. But for now we're only going to turn on the feature flag for some internal repositories, so I think it is fine to not have it in the public facing documentation just yet.

Before we enabled this on more than one or two internal repos (which we can monitor by hand), I would also like to implement some telemetry so we can have some visibility over whether this is causing any issues. Again, I propose leaving this for a later PR.

Merge / deployment checklist

  • Confirm this change is backwards compatible with existing workflows.
  • Confirm the readme has been updated if necessary.
  • Confirm the changelog has been updated if necessary.

@edoardopirovano edoardopirovano requested a review from a team as a code owner August 5, 2022 15:32
src/trap-caching.ts Fixed Show resolved Hide resolved
src/trap-caching.ts Fixed Show resolved Hide resolved
Copy link
Contributor

@henrymercer henrymercer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tell the extractor to write up to 1GB to the cache (tbc: is this a sensible amount for size of the Actions cache and the disk space of the default runners?).

1 GB per language sounds reasonable to me as a default. I can think of cases where we might want to use more or less space however — perhaps we should make the TRAP cache size configurable? Question: Should the total size or the size per language be configurable?

We should document this new feature, and the option to disable it, before we turn this on for any external users. But for now we're only going to turn on the feature flag for some internal repositories, so I think it is fine to not have it in the public facing documentation just yet.

Sounds reasonable.

Before we enabled this on more than one or two internal repos (which we can monitor by hand), I would also like to implement some telemetry so we can have some visibility over whether this is causing any issues. Again, I propose leaving this for a later PR.

Also sounds reasonable 👍

src/config-utils.ts Outdated Show resolved Hide resolved
src/config-utils.test.ts Show resolved Hide resolved
src/config-utils.ts Outdated Show resolved Hide resolved
src/init-action.ts Outdated Show resolved Hide resolved
src/trap-caching.test.ts Outdated Show resolved Hide resolved
src/trap-caching.ts Show resolved Hide resolved
src/trap-caching.ts Outdated Show resolved Hide resolved
src/trap-caching.ts Outdated Show resolved Hide resolved
src/trap-caching.ts Outdated Show resolved Hide resolved
src/trap-caching.ts Show resolved Hide resolved
src/trap-caching.ts Show resolved Hide resolved
@edoardopirovano
Copy link
Contributor Author

perhaps we should make the TRAP cache size configurable? Question: Should the total size or the size per language be configurable?

Indeed, we may eventually want configuration options to fine-tune this. I don't think adding this should block getting this prototype out of the door, though. In the telemetry I will be adding, I will include a field that records the size of the cache so we can use that to get a feel for what the default should be and how configurable we need this to be before we properly declare this GA.

@asottile-sentry
Copy link

are there docs explaining what this is or why one would want to turn it on or off? I didn't see it mentioned in the docs linked from the README and this is chewing up our entire GHA cache since it seems to put many MB per commit

@adityasharad
Copy link
Contributor

@asottile-sentry I'm sorry to hear this feature has caused trouble for you. The changelog note here has some explanation of this feature and how to turn it off. The intended purpose is to speed up the "extraction" phase of CodeQL analysis (which processes your source code into a local database format so that it can be analysed), however this is completely optional and you are safe to turn it off in the workflow since it's using up your Actions cache quota. We have only rolled this out in a limited fashion for one language at the moment, so it is not in the public docs beyond the changelog.

To help us understand and investigate the problem better, would you able to point us to your repo, and let us know if you're using the Actions cache for any other purposes?

@asottile-sentry
Copy link

https://github.com/getsentry/sentry/actions/caches -- yes we use the cache for other things (I suspect this to be the common case -- so perhaps enabling this by default in codeql is the wrong default?)

@sayhiben
Copy link

In case the additional context helps, trap caching is also causing my organization's monorepo to exhaust our cache very rapidly. I had to turn off the feature, even though I want to use it. We use the GHA cache for other purposes as well and can't afford to let this churn our cache. I've opened another issue to hopefully move the cache storage location: #2030

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants