Bazel does not populate/sync all caches in a combined cache #22357

Open
mendozatudares opened this issue May 13, 2024 · 1 comment
Assignees
Labels
awaiting-user-response (Awaiting a response from the author); team-Documentation (Documentation improvements that cannot be directly linked to other team labels); team-Remote-Exec (Issues and PRs for the Execution (Remote) team); type: bug; untriaged

Comments

mendozatudares commented May 13, 2024

Description of the bug:

When attempting to speed up a remote-cache-enabled build (--remote_cache, HTTP) using a local disk cache (--disk_cache, same drive), I discovered Bazel would not populate the disk cache with artifacts it retrieved from the remote cache for faster access. Bazel also populated only the remote cache upon a cache miss/rebuild. This left us with an empty disk cache that was never accessed.

This seems to be undocumented or inconsistent with anecdotes claiming Bazel reads and writes to both caches as a single cache.

Which category does this issue belong to?

Documentation, Remote Execution

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I was able to repro this with just disk caches:

bazel build :target1 --disk_cache=~/.cache/disk1
bazel build :target2 --disk_cache=~/.cache/disk2

# disk1 already populated with target1, does not populate disk2
bazel clean && bazel build :target1 --disk_cache=~/.cache/disk1 --disk_cache=~/.cache/disk2

# disk2 already populated with target2, does not populate disk1
bazel clean && bazel build :target2 --disk_cache=~/.cache/disk1 --disk_cache=~/.cache/disk2

# neither disk1 nor disk2 contains target3 beforehand; after the build, only disk2 is populated
bazel clean && bazel build :target3 --disk_cache=~/.cache/disk1 --disk_cache=~/.cache/disk2

In this example, the :targetX targets do not share any dependencies.
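
To make "populated" checkable between steps, a small helper along these lines could count the entries in each cache directory. This is only a sketch: the function name `count_cache_entries` is made up for illustration, and it assumes the disk cache keeps its entries under `ac/` (action cache) and `cas/` (content-addressable store) subdirectories.

```shell
# Hypothetical helper: count the files under the ac/ and cas/ subdirectories
# of a Bazel disk cache directory. A missing subdirectory counts as 0.
count_cache_entries() {
  dir="$1"
  for sub in ac cas; do
    if [ -d "$dir/$sub" ]; then
      # tr strips the padding some wc implementations add
      n=$(find "$dir/$sub" -type f | wc -l | tr -d ' ')
    else
      n=0
    fi
    echo "$sub=$n"
  done
}

# Compare both caches after each step of the repro.
count_cache_entries "$HOME/.cache/disk1"
count_cache_entries "$HOME/.cache/disk2"
```

Running it before and after each `bazel build` shows which directory actually received new entries.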

Which operating system are you running Bazel on?

Linux, macOS, Windows

What is the output of bazel info release?

release 7.1.1

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

No response

Have you found anything relevant by searching the web?

  1. A BuildBuddy article, written sometime around Bazel 5.0-5.1, reports that this "combined cache" behavior should be the default but can be tuned with certain flags.
  2. The "remote_http_cache + experimental_remote_disk_cache" thread on the bazel-discuss group, where one suggested approach was to use a proxy to handle the caches.

Any other information, logs, or outputs that you want to share?

No response

github-actions bot added the team-Documentation and team-Remote-Exec labels May 13, 2024
Member

coeuvre commented May 14, 2024

The repro doesn't make sense to me: you cannot combine multiple disk caches; only the last --disk_cache is used.
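
The last-one-wins behavior described here can be illustrated with a toy model. This is not Bazel's option parser, and `effective_disk_cache` is a made-up name. It only shows which directory the commands in the original repro would end up using if the last `--disk_cache` value is the one that takes effect.

```shell
# Toy model (NOT Bazel's option parser): print the value of the last
# --disk_cache=... argument, i.e. the one that would take effect if a
# repeated flag simply overrides the earlier occurrences.
effective_disk_cache() {
  result=""
  for arg in "$@"; do
    case "$arg" in
      --disk_cache=*) result="${arg#--disk_cache=}" ;;
    esac
  done
  printf '%s\n' "$result"
}

# Same flags as the third step of the repro; only the disk2 path is printed.
effective_disk_cache build :target3 \
  "--disk_cache=$HOME/.cache/disk1" "--disk_cache=$HOME/.cache/disk2"
```

Under this reading, every command in the repro that passes both flags only ever touches disk2, which would match the observation that disk1 was not populated.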

Things have changed a lot since Bazel 5. For example, we enabled "Build without the Bytes" by default in Bazel 7, so Bazel no longer downloads intermediate outputs, which explains why some outputs are not downloaded to the disk cache.

However, I doubt these two claims:

"I discovered Bazel would not populate the disk cache with artifacts it retrieved from the remote cache for faster access."

and

"This left us with an empty disk cache that was never accessed."

Can you provide a proper repro?
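
One possible shape for such a repro, sketched under assumptions: a single `--disk_cache` next to `--remote_cache`, plus `--remote_download_outputs=all` to take "Build without the Bytes" out of the picture. That last flag is not part of the original report. The endpoint URL, the cache directory, and the `repro_cmd` name are placeholders. The helper only prints the command so it can be reviewed before running.

```shell
# Hypothetical endpoint and cache directory; substitute real values.
REMOTE_CACHE_URL="${REMOTE_CACHE_URL:-http://cache.example.com:8080}"
DISK_CACHE_DIR="${DISK_CACHE_DIR:-$HOME/.cache/bazel-disk}"

# Print (do not run) a single-disk-cache invocation for the given target.
# --remote_download_outputs=all is an assumption added for this sketch so
# that all outputs are downloaded; it does not come from the report.
repro_cmd() {
  printf 'bazel build %s --remote_cache=%s --disk_cache=%s --remote_download_outputs=all\n' \
    "$1" "$REMOTE_CACHE_URL" "$DISK_CACHE_DIR"
}

repro_cmd :target1
```

Running the printed command after a `bazel clean`, against a remote cache that already holds the target's outputs, and then inspecting the disk cache directory would show whether the disk cache stays empty as reported.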

zhengwei143 added the awaiting-user-response label May 21, 2024

6 participants