Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for overlays in source.json #22349

Closed
wants to merge 13 commits into from

Conversation

fzakaria
Copy link
Contributor

@fzakaria fzakaria commented May 13, 2024

This is a continuation of #22155 that adds the newly added 'remote_files' attribute for http_archive to the bzlmod functionality.

The end goal is to then update BCR to this new functionality to overlay files rather than use patch files when providing MODULE/WORKSPACE/BUILD files.

bazelbuild/bazel-central-registry#1566 has a good discussion of the rationale.

Co-authored-by: Fabian Meumertzheim fabian@meumertzhe.im

This is a continuation of bazelbuild#22155 that adds the newly added
'remote_files' attribute for http_archive to the bzlmod functionality.

The end goal is to then update BCR to this new functionality to overlay
files rather than use patch files when providing MODULE/WORKSPACE/BUILD
files.

bazelbuild/bazel-central-registry#1566 has a
good discussion of the rationale.
@github-actions github-actions bot added team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. awaiting-review PR is awaiting review from an assigned reviewer labels May 13, 2024
Comment on lines 187 to 197
"/modules/baz/3.0/source.json",
"{",
" \"url\": \"https://example.com/archive.jar?with=query\",",
" \"integrity\": \"sha256-bleh\",",
" \"overlay\": {",
" \"BUILD.bazel\": {",
" \"urls\": [\"http://mirror1\", \"http://mirror2\"],",
" \"integrity\": \"sha256-bleh-overlay\"",
" }",
" }",
"}");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose a structure (overlay) slightly different than the http_archive to simplify it.
http_archive is restricted to certain attr type (I.e str_dict_list) but in JSON we can be a lot more ergonomic to combine the integrity and the mirror URLS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just driving by -- I think this is a great idea!

* offered at multiple urls (mirrors) with a single integrity.
* We split up the file here to simplify the dependency.
*/
class RemoteFile {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a record, then it's short enough to be moved into IndexRegistry as an inner class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't move it within IndexRegistry because I use it in ArchiveRepoSpecBuilder ; it would cause circular reference.
I can move it into ArchiveRepoSpecBuilder or keep as separate file?

I changed to record -- I wasn't aware Bazel is on a new JDK.
(How can I tell the version?)

I also had to write a custom serializer since GSON can't set the record fields as they were final
(I commented on the rationale in the code)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bazel uses Java/JDK 21. You can look for --java_language_version in .bazelrc. GSON 2.10.1 has support for records.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool -- I changed one of the tests to use Text blocks also to simplify the JSON processing;
Yay for new Java.

@fmeum
Copy link
Collaborator

fmeum commented May 13, 2024

The failing test (testShowModuleAndExtensionReposFromBaseModule) is pretty sensitive to any kind of changes to the spec builders, so you might have to adjust the number of skipped lines. Let me know if you have problems running the test to capture the full output.

@fzakaria
Copy link
Contributor Author

The failing test (testShowModuleAndExtensionReposFromBaseModule) is pretty sensitive to any kind of changes to the spec builders, so you might have to adjust the number of skipped lines. Let me know if you have problems running the test to capture the full output.

I printed the output but I'm not sure exactly which fields the test wants to validate.
Here is the stdout pretty printed:

['## @bar_from_foo2:',
 '# <builtin>',
 'http_archive(',
 '  name = "bar~",',
 '  urls = '
 '["file:/tmp/bazel-working-directory/_main/_tmp/95ebb5496eb54e209206860635685b44/tests_root/tmp9dqgcyja/tmplpgvxv_9/main/archives/bar.2.0.zip"],',
 '  integrity = "sha256-j3wLXYtBrK/WDy97QCU3oI1HB2+0AYPHyjO6mBlbAVw=",',
 '  strip_prefix = "",',
 '  remote_file_urls = {},',
 '  remote_file_integrity = {},',
 '  remote_patches = {},',
 '  remote_patch_strip = 0,',
 ')',
 '# Rule bar~ instantiated at (most recent call last):',
 '#   <builtin> in <toplevel>',
 '# Rule http_archive defined at (most recent call last):',
 '#   '
 '/tmp/bazel-working-directory/_main/_tmp/95ebb5496eb54e209206860635685b44/_bazel_fmzakari/627ca31f67019a002c2b9671f4bd938d/external/bazel_tools/tools/build_defs/repo/http.bzl:397:31 '
 'in <toplevel>',
 '',
 '## ext@1.0:',
 '# <builtin>',
 'local_repository(',
 '  name = "ext~",',
 '  path = '
 '"/tmp/bazel-working-directory/_main/_tmp/95ebb5496eb54e209206860635685b44/tests_root/tmp9dqgcyja/tmplpgvxv_9/main/projects/ext",',
 ')',
 '# Rule ext~ instantiated at (most recent call last):',
 '#   <builtin> in <toplevel>',
 '',
 '## @my_repo3:',
 '# <builtin>',
 'data_repo(',
 '  name = "ext~~ext~repo3",',
 '  data = "requested repo",',
 ')',
 '# Rule ext~~ext~repo3 instantiated at (most recent call last):',
 '#   <builtin> in <toplevel>',
 '# Rule data_repo defined at (most recent call last):',
 '#   '
 '/tmp/bazel-working-directory/_main/_tmp/95ebb5496eb54e209206860635685b44/_bazel_fmzakari/627ca31f67019a002c2b9671f4bd938d/external/ext~/ext.bzl:2:28 '
 'in <toplevel>',
 '',
 '## @my_repo4:',
 '# <builtin>',
 'data_repo(',
 '  name = "ext~~ext~repo4",',
 '  data = "requested repo",',
 ')',
 '# Rule ext~~ext~repo4 instantiated at (most recent call last):',
 '#   <builtin> in <toplevel>',
 '# Rule data_repo defined at (most recent call last):',
 '#   '
 '/tmp/bazel-working-directory/_main/_tmp/95ebb5496eb54e209206860635685b44/_bazel_fmzakari/627ca31f67019a002c2b9671f4bd938d/external/ext~/ext.bzl:2:28 '
 'in <toplevel>',
 '',
 '## bar@2.0:',
 '# <builtin>',
 'http_archive(',
 '  name = "bar~",',
 '  urls = '
 '["file:/tmp/bazel-working-directory/_main/_tmp/95ebb5496eb54e209206860635685b44/tests_root/tmp9dqgcyja/tmplpgvxv_9/main/archives/bar.2.0.zip"],',
 '  integrity = "sha256-j3wLXYtBrK/WDy97QCU3oI1HB2+0AYPHyjO6mBlbAVw=",',
 '  strip_prefix = "",',
 '  remote_file_urls = {},',
 '  remote_file_integrity = {},',
 '  remote_patches = {},',
 '  remote_patch_strip = 0,',
 ')',
 '# Rule bar~ instantiated at (most recent call last):',
 '#   <builtin> in <toplevel>',
 '# Rule http_archive defined at (most recent call last):',
 '#   '
 '/tmp/bazel-working-directory/_main/_tmp/95ebb5496eb54e209206860635685b44/_bazel_fmzakari/627ca31f67019a002c2b9671f4bd938d/external/bazel_tools/tools/build_defs/repo/http.bzl:397:31 '
 'in <toplevel>',
 '']

@fzakaria
Copy link
Contributor Author

Okay it should be passing now. That was a miserable test to update.

Target //src/test/py/bazel:mod_command_test up-to-date:
  bazel-bin/src/test/py/bazel/mod_command_test
INFO: Elapsed time: 11.439s, Critical Path: 11.14s
INFO: 2 processes: 1 internal, 1 linux-sandbox.
INFO: Build completed successfully, 2 total actions
//src/test/py/bazel:mod_command_test  

@fzakaria fzakaria requested a review from fmeum May 14, 2024 03:36
Copy link
Collaborator

@fmeum fmeum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few stylistic suggestions as well as a GSON update for record support at fzakaria#1, please take a look.

Otherwise LGTM.

Stylistic changes and GSON update
@fzakaria
Copy link
Contributor Author

I have a few stylistic suggestions as well as a GSON update for record support at fzakaria#1, please take a look.

Otherwise LGTM.

Done -- thank you for the contribution to simplify the process.

@fzakaria fzakaria requested a review from fmeum May 14, 2024 16:29
Copy link
Collaborator

@fmeum fmeum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Wyverald Could you review for approval?

"overlay": {
"BUILD.bazel": {
"urls": [
"http://mirror1",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Wyverald One thing I'm wondering about right now: There is no such thing for remote patches, but should we use the mirrors in bazel_registry.json to populate these URLs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see from the test is that mirrors is only used for the main URL of the module.
I also had the same thought that it was a bit confusing --

Since this is a distinct URL it actually doesn't even need to reside within BCR server so it probably needs it's own URLS.

in the simple/common case, it could be a relative file path and just use a single URL.

Maybe a follow up: if the URL is a file:// use the bazel_registry.json mirrors?

I say let's finish the complete end-to-end with BCR first and then do a UX fixup once we test it out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taking a more serious look now -- shouldn't the overlay files be served from the registry itself? Just like how today patch files are served from the registry, instead of some random other URL. The only thing we're fetching from arbitrary URLs is the source archive itself.

I'm thinking that the overlay property can just be a JSON object (i.e. a dict) that maps from overlay file path to integrity (just like how patches is a map from patch file name to integrity). The overlay files will reside in the overlay directory next to the patches directory. So if you have the source.json:

{
    "url": "...",
    "integrity": "...",
    "overlay": {
        "BUILD.bazel": "sha256-...",
        "mypkg/BUILD.bazel": "sha256-..."
    }
}

then your module version directory tree in the registry should look like:

|-- MODULE.bazel
|-- source.json
|-- overlay
    |-- BUILD.bazel
    |-- mypkg
        |-- BUILD.bazel

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see this and construct the URL similar to how it's done for patches?

* Used mirrors from bazel_registry.json to determine mirrors
* Changed overlay in source.json to be more simple
@fzakaria fzakaria requested a review from fmeum May 15, 2024 18:36
@fzakaria
Copy link
Contributor Author

@Wyverald I believe I've made the changes requested to simplify the archive JSON structure.

@fzakaria fzakaria closed this May 15, 2024
@fzakaria fzakaria reopened this May 15, 2024
@fzakaria
Copy link
Contributor Author

Ah -- I closed it by accident :(

I also updated the code as required @Wyverald

@github-actions github-actions bot removed the awaiting-review PR is awaiting review from an assigned reviewer label May 15, 2024
@Wyverald Wyverald added the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label May 15, 2024
@Wyverald Wyverald changed the title Add support for remote_files to bzlmod Add support for overlays in source.json May 15, 2024
@fmeum
Copy link
Collaborator

fmeum commented May 16, 2024

@bazel-io fork 7.2.0

@github-actions github-actions bot removed the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label May 23, 2024
bazel-io pushed a commit to bazel-io/bazel that referenced this pull request May 23, 2024
This is a continuation of bazelbuild#22155 that adds the newly added 'remote_files' attribute for http_archive to the bzlmod functionality.

The end goal is to then update BCR to this new functionality to overlay files rather than use patch files when providing MODULE/WORKSPACE/BUILD files.

bazelbuild/bazel-central-registry#1566 has a good discussion of the rationale.

Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>

Closes bazelbuild#22349.

PiperOrigin-RevId: 636682112
Change-Id: Ief070985598a7c0f427a98cd3daeb69a0984f7be
github-merge-queue bot pushed a commit that referenced this pull request May 23, 2024
This is a continuation of #22155 that adds the newly added
'remote_files' attribute for http_archive to the bzlmod functionality.

The end goal is to then update BCR to this new functionality to overlay
files rather than use patch files when providing MODULE/WORKSPACE/BUILD
files.

bazelbuild/bazel-central-registry#1566 has a
good discussion of the rationale.

Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>

Closes #22349.

PiperOrigin-RevId: 636682112
Change-Id: Ief070985598a7c0f427a98cd3daeb69a0984f7be

Commit
c4167e3

Co-authored-by: Farid Zakaria <fmzakari@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants