Bazel Windows Binary cannot run without embedded JDK #16613

Closed
meteorcloudy opened this issue Oct 31, 2022 · 12 comments
Labels
area-Windows Windows-specific issues and feature requests
P1 I'll work on this now. (Assignee required)
team-OSS Issues for the Bazel OSS team: installation, release process, Bazel packaging, website
type: bug

Comments

@meteorcloudy
Member

Description of the bug:

#16159 (comment)
#16159 (comment)

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Download https://releases.bazel.build/6.0.0/rc1/bazel_nojdk-6.0.0rc1-windows-x86_64.exe and run it on Windows

Which operating system are you running Bazel on?

Windows

What is the output of bazel info release?

No response

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@meteorcloudy meteorcloudy added type: bug P1 I'll work on this now. (Assignee required) area-Windows Windows-specific issues and feature requests team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone labels Oct 31, 2022
@meteorcloudy meteorcloudy added this to the 6.0.0 release blockers milestone Oct 31, 2022
@meteorcloudy meteorcloudy self-assigned this Oct 31, 2022
@meteorcloudy
Member Author

meteorcloudy commented Oct 31, 2022

I cannot reproduce #16159 (comment) when building the Windows Bazel nojdk binary on my Windows machine. Also, the nojdk Windows binary at last_green doesn't have this issue. I suspect there is some strange infrastructure flakiness.

@fmeum Did the rules_jni failure happen consistently after 902a0b5?

@meteorcloudy
Member Author

meteorcloudy commented Oct 31, 2022

I reran the release pipeline at the same commit for release-6.0.0rc1:

Previous:

2022-10-24 19:20:07 INFO   Uploading artifact 01840b70-1ba9-4b0f-97f9-ac34367b8e80 bazel_nojdk-6.0.0rc1-windows-x86_64.exe (30310526 bytes)

Now

2022-10-31 10:47:38 INFO   Uploading artifact 01842da7-6d67-4353-a8b2-1497e9d4df4e bazel_nojdk-6.0.0rc1-windows-x86_64.exe (31858134 bytes)

The two nojdk Windows binaries differ significantly in size, and the new one runs correctly. So there must be some infra problem while building the Windows binary.
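The size comparison between the two upload log lines above can be automated when checking reruns. This is a minimal sketch, assuming the Buildkite log lines keep the `Uploading artifact <id> <name> (<n> bytes)` shape quoted above; the function name `artifact_sizes` is illustrative only and not part of any Bazel or Buildkite tooling.

```python
import re

# Matches the "Uploading artifact <id> <name> (<n> bytes)" lines quoted above.
UPLOAD_RE = re.compile(r"Uploading artifact \S+ (\S+) \((\d+) bytes\)")


def artifact_sizes(log_text):
    """Return a list of (artifact name, size in bytes) found in the log text."""
    return [(m.group(1), int(m.group(2))) for m in UPLOAD_RE.finditer(log_text)]


previous = (
    "2022-10-24 19:20:07 INFO   Uploading artifact "
    "01840b70-1ba9-4b0f-97f9-ac34367b8e80 "
    "bazel_nojdk-6.0.0rc1-windows-x86_64.exe (30310526 bytes)"
)
rerun = (
    "2022-10-31 10:47:38 INFO   Uploading artifact "
    "01842da7-6d67-4353-a8b2-1497e9d4df4e "
    "bazel_nojdk-6.0.0rc1-windows-x86_64.exe (31858134 bytes)"
)

old_size = artifact_sizes(previous)[0][1]
new_size = artifact_sizes(rerun)[0][1]
delta = new_size - old_size
print(f"size difference: {delta} bytes ({delta / old_size:.1%} of the previous build)")
```

For the two lines above, the difference is 1547608 bytes, roughly 5% of the earlier artifact.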

@meteorcloudy
Member Author

Maybe the https://cs.opensource.google/bazel/bazel/+/master:src/package-bazel.sh script is not hermetic enough, at least on Windows? 🧐

@fmeum
Collaborator

fmeum commented Oct 31, 2022

@meteorcloudy Yes, it kept happening consistently. Note that I am not using the nojdk build, it's just that the regular with-JDK build doesn't seem to be able to find its embedded JDK.

@meteorcloudy
Member Author

@fmeum Can you provide a minimal reproduction case?

@fmeum
Collaborator

fmeum commented Oct 31, 2022

@meteorcloudy I don't have a Windows machine I could minimize this on, but this is the workflow in which the windows-2019 last_green JDK 17 job fails, both with and without Bzlmod. The job doesn't really do anything other than setting JAVA_HOME to a JDK 17 and running bazel test on the tests subdirectory. I also attached the logs.

Let me know if there is anything I could do to help.

@fmeum
Collaborator

fmeum commented Oct 31, 2022

@meteorcloudy I downloaded the Bazel binaries used in the runs:
https://storage.googleapis.com/bazel-builds/artifacts/windows/902a0b5763dddc445f75637c08a1396de8395411/bazel (broken)
https://storage.googleapis.com/bazel-builds/artifacts/windows/50b87c1ec0a5696e46dec7122de130d13112ffdd/bazel (functional)

The former is 30MiB, the latter 45MiB, but the commit history doesn't explain that drastic change at all. Maybe the JDK is no longer bundled correctly?
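One way to test the bundling hypothesis is to list what a binary actually contains. This is a sketch, assuming the Bazel executable is a launcher with a zip archive appended (Python's `zipfile` can read such files despite the leading bytes). The entry prefix to look for is passed in by the caller, because the exact path of the embedded JDK inside the archive is not stated in this thread; `has_entry_with_prefix` and the entry names in the demo are hypothetical.

```python
import io
import zipfile


def has_entry_with_prefix(binary, prefix):
    """Report whether the zip appended to `binary` has an entry starting with `prefix`.

    `binary` may be a filesystem path or a seekable file-like object.
    """
    with zipfile.ZipFile(binary) as archive:
        return any(name.startswith(prefix) for name in archive.namelist())


# Self-contained demo: build a fake "launcher + zip" in memory.
payload = io.BytesIO()
with zipfile.ZipFile(payload, "w") as z:
    z.writestr("jdk/bin/java.exe", b"stub")  # hypothetical entry name
    z.writestr("other/file.txt", b"stub")    # hypothetical entry name

# Prepend some bytes to stand in for the native launcher.
fake_binary = io.BytesIO(b"MZ" + b"\0" * 64 + payload.getvalue())

print(has_entry_with_prefix(fake_binary, "jdk/"))
print(has_entry_with_prefix(fake_binary, "missing/"))
```

Running the same check against the broken and functional downloads, with whatever prefix the real archive uses, would show whether the JDK entries are missing from the smaller file.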

@meteorcloudy
Member Author

I can reproduce the failure with 902a0b5763dddc445f75637c08a1396de8395411 by setting JAVA_HOME to a JDK 17 installation, but not with its parent commit or the latest last_green. I suspect this has something to do with bazelbuild/continuous-integration#1408. The temporary workaround is to rebuild the binaries at the same commit, but I'll need to dig deeper to investigate why the binaries are getting corrupted.

@meteorcloudy
Member Author

meteorcloudy commented Oct 31, 2022

@konste I have rebuilt and deployed the Bazel binaries for 6.0.0rc1; the nojdk Windows binary seems to work now.

@meteorcloudy
Member Author

@fmeum I also rebuilt the binaries for 902a0b5, and they look correct this time:

2022-10-31 13:03:41 INFO   Creating (0-1)/1 artifacts
2022-10-31 13:03:41 INFO   Uploading artifact 01842e23-ff45-46f4-9b1d-66617a065234 bazel.exe (46925059 bytes)
2022-10-31 13:03:42 INFO   Successfully uploaded artifact "bazel.exe"
2022-10-31 13:03:43 INFO   Artifact uploads completed successfully
buildkite-agent artifact upload bazel_nojdk.exe
2022-10-31 13:03:44 INFO   Found 1 files that match "bazel_nojdk.exe"
2022-10-31 13:03:44 INFO   Uploading to "gs://bazel-trusted-buildkite-artifacts/01842e1f-1e57-4763-b81c-0d6fbccca2d3", using your agent configuration
2022-10-31 13:03:44 INFO   Creating (0-1)/1 artifacts
2022-10-31 13:03:44 INFO   Uploading artifact 01842e24-0881-440b-8889-12c56c0ca4d4 bazel_nojdk.exe (31119419 bytes)
2022-10-31 13:03:44 INFO   Successfully uploaded artifact "bazel_nojdk.exe"
2022-10-31 13:03:46 INFO   Artifact uploads completed successfully

@meteorcloudy
Member Author

I'll close this one, but continue to investigate the issue in bazelbuild/continuous-integration#1408

@meteorcloudy
Member Author

@Wyverald and I debugged this and found that the root cause is a race condition in https://cs.opensource.google/bazel/bazel/+/master:src/package-bazel.sh. I'll send a fix now.

meteorcloudy added a commit to meteorcloudy/bazel that referenced this issue Oct 31, 2022
bazel build -c //src:bazel.exe //src:bazel_nojdk.exe sometimes outputs
corrupted binaries on Windows.

The reason is that when executing the genrule that packages the Bazel
zip files, we write a "file.list" file into the execroot, but there is
no sandbox on Windows, so two actions end up sharing the same path for
"file.list". This PR fixes the issue by writing the "file.list" file
under the tmp dir.

Fixes bazelbuild#16613
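
The race described in the commit message, and the shape of its fix, can be illustrated outside Bazel. This is a sketch in Python rather than the shell of `package-bazel.sh`, and every name in it is hypothetical. It runs the two "actions" one after the other as a deterministic stand-in for an unlucky interleaving: when both write their file list to one shared path, the second overwrites the first, while a private temporary directory per action keeps the lists separate.

```python
import os
import tempfile


def write_file_list(directory, entries):
    """Write entries to "file.list" in `directory` and return the file's path."""
    path = os.path.join(directory, "file.list")
    with open(path, "w") as f:
        f.write("\n".join(entries) + "\n")
    return path


def read_entries(path):
    """Read back the whitespace-separated entries of a file list."""
    with open(path) as f:
        return f.read().split()


# shared_root stands in for the unsandboxed execroot.
with tempfile.TemporaryDirectory() as shared_root:
    # Broken pattern: both actions use the same path, so the last writer wins.
    jdk_list = write_file_list(shared_root, ["server.jar", "jdk.zip"])
    nojdk_list = write_file_list(shared_root, ["server.jar"])
    shared_result = read_entries(jdk_list)  # the first action now sees the second's list

    # Fixed pattern: each action writes under its own temporary directory.
    with tempfile.TemporaryDirectory() as tmp_a, tempfile.TemporaryDirectory() as tmp_b:
        jdk_list = write_file_list(tmp_a, ["server.jar", "jdk.zip"])
        nojdk_list = write_file_list(tmp_b, ["server.jar"])
        isolated_result = read_entries(jdk_list)

print(shared_result)
print(isolated_result)
```

With the shared path, the first action reads back a list without its second entry; with private directories, it reads back exactly what it wrote.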
@meteorcloudy meteorcloudy removed the potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone label Oct 31, 2022
@meteorcloudy meteorcloudy removed this from the 6.0.0 release blockers milestone Oct 31, 2022
ShreeM01 added a commit that referenced this issue Nov 2, 2022
`bazel build -c //src:bazel.exe //src:bazel_nojdk.exe` sometimes outputs corrupted binaries on Windows.

The reason is that when executing the genrule that packages the Bazel zip files, we write a "file.list" file into the execroot, but there is no sandbox on Windows, so two actions end up sharing the same path for "file.list". This PR fixes the issue by writing the "file.list" file under the tmp dir.

Fixes #16613

Closes #16614.

PiperOrigin-RevId: 485578533
Change-Id: I74b69e58919a463d5cc40abaa6ae4ca36251cdac

Co-authored-by: Yun Peng <pcloudy@google.com>