Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stablize and flip repository_cache_urls_as_default_canonical_id #19549

Conversation

sluongng
Copy link
Contributor

Introduced in #14268, this flag is very useful for bigger enterprise
context where folks often version bumping dependencies without
remembering to update the SHA256 of the downloaded file, leading to
Bazel picking up older download entries from the repository cache.

As we get more questions about this flag in Slack, marking it as
stable and flip the default seems to be the right move.

@github-actions github-actions bot added awaiting-review PR is awaiting review from an assigned reviewer team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. labels Sep 18, 2023
@sluongng
Copy link
Contributor Author

Not quite sure what's up with Bazel CI, a lot of failures with Unknown host: github.com and I don't have permission to retry on buildkite

@sluongng
Copy link
Contributor Author

@meteorcloudy pinging you for review since you approved the last PR.

I think all the repository downloads failures on CI are the ones that are fetching directly from github and do not have a mirror.

@meteorcloudy
Copy link
Member

We run most of our integration tests without internet access on BazelCI, to do that, we passed the repo cache generated by the outter Bazel (6.3.2) to the inner Bazel (HEAD) inside the integration test.

I guess this change caused the repo cache not to work when the URL differs (mirror.bazel.build vs github.com) for the same artifact.

The simplest work around would be to turn off this flag for integration tests at all places we set the repo cache:
https://cs.opensource.google/search?q=%22common%20--repository_cache%22&ss=bazel
Otherwise, we'll have to figure out a way to make sure the outter Bazel also fetches artifact with this flag on and at the same URL.

@sluongng sluongng force-pushed the sluongng/flip-repo-cache-urls-canonical-id branch from cc9566d to 1772d65 Compare September 19, 2023 15:24
@meteorcloudy
Copy link
Member

Probably also here:

--repository_cache=derived/repository_cache \

@sluongng sluongng force-pushed the sluongng/flip-repo-cache-urls-canonical-id branch from 1772d65 to f62031c Compare September 19, 2023 15:52
Introduced in bazelbuild#14268, this flag is very useful for bigger enterprise
context where folks often version bumping dependencies without
remembering to update the SHA256 of the downloaded file, leading to
Bazel picking up older download entries from the repository cache.

As we get more questions about this flag in Slack, marking it as
stable and flip the default seems to be the right move.
@sluongng sluongng force-pushed the sluongng/flip-repo-cache-urls-canonical-id branch from f62031c to f88bd2d Compare September 19, 2023 16:14
@sluongng
Copy link
Contributor Author

@meteorcloudy CI passed for now. Do let me know if you prefer to move the change from bazel_bootstrap_distfile_test.sh to bazel/scripts/bootstrap/bootstrap.sh! 🙇

@meteorcloudy
Copy link
Member

Yes, please. Because if users just run the bootstrap build following https://bazel.build/install/compile-source#bootstrap-unix, they won't have the flag set, so their build will likely to fail without internet access. Let's not break that use case.

@sluongng sluongng force-pushed the sluongng/flip-repo-cache-urls-canonical-id branch from f88bd2d to 0cbfb73 Compare September 19, 2023 17:02
@meteorcloudy meteorcloudy added awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally and removed awaiting-review PR is awaiting review from an assigned reviewer labels Sep 20, 2023
@github-actions github-actions bot removed the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label Sep 21, 2023
@thesayyn
Copy link
Contributor

thesayyn commented Oct 6, 2023

I believe this is a mistake as it will lead to more disk space usage and performance degradation as identical blobs are downloaded even though they exist in the cache.

See: #19749

@fmeum
Copy link
Collaborator

fmeum commented Oct 7, 2023

I believe this is a mistake as it will lead to more disk space usage and performance degradation as identical blobs are downloaded even though they exist in the cache.

I supported the flip as it would have caught mistakes I made at numerous occasions, but I can see how it can also be problematic. I would propose to revert the flip and stabilization and rethink the flag -are canonical IDs really the right approach to catch the common mistake of forgetting to update a hash after updating a URL? I could see possible answers being that we want to limit the reach of this flag to http_* rules only or come with a way to show a warning that doesn't result in cache entry duplication.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants