How to use cublas in a non-root bazel module? #238

Open
appthumb opened this issue Apr 18, 2024 · 7 comments · May be fixed by #239
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@appthumb

appthumb commented Apr 18, 2024

I'm using rules_cuda in a Bazel module A, and some of my cuda_library targets need to link with -lcublas and -lcublasLt.

Naturally, I'm defining local_cuda as in examples/cublas/BUILD.bazel. In the MODULE.bazel of module A:

bazel_dep(
    name = "rules_cuda",
    version = "0.2.1",
)

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")
cuda.local_toolchain(
    name = "local_cuda",
    toolkit_path = "",
)
use_repo(cuda, "local_cuda")

And in my BUILD.bazel, I have the cuda_library target:

cuda_library(
    name = ...,
    srcs = ...,
    deps = [
       "@local_cuda//:cublas",
    ],
    ...
)

This works fine when I build module A. However, when another Bazel module B depends on module A, I cannot build module B, because local_cuda can only be declared in the root module. I get this error:

ERROR: Traceback (most recent call last):
    File "/private/var/tmp/_bazel_dev/67de6cda420db4eb86e6ad3f1fd2b6e4/external/rules_cuda~/cuda/extensions.bzl", line 15, column 21, in _init
        fail("Only the root module may override the path for the local cuda toolchain")
Error in fail: Only the root module may override the path for the local cuda toolchain

This comes from this line in cuda/extensions.bzl:

fail("Only the root module may override the path for the local cuda toolchain")

Is it possible to use rules_cuda in a Bazel module that other modules can depend on? I don't need to customize my CUDA path, as the default path works fine for me. Is there a way to avoid the above error?

cloudhan added the enhancement label Apr 19, 2024
@cloudhan
Collaborator

With the current implementation, it is not possible.

But it is very reasonable to let upstream projects expose targets that depend on rules_cuda, say, kernels wrapped in a C library with a public C interface. The current project and downstream projects should be able to use them without pain.

I'll see how we should improve the situation.
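
A minimal sketch of the layout described above, where CUDA kernels are wrapped in a C-facing library. It only illustrates the target structure and does not by itself lift the root-module restriction on local_cuda. The file names kernel_c_api.cc and kernel_c_api.h and the target name kernel_c_api are hypothetical; cuda_library, its load path, and @local_cuda//:cublas are taken from this thread, and cc_library is assumed to be loaded from rules_cc.

```starlark
# BUILD.bazel in the upstream module (sketch, names are hypothetical)
load("@rules_cc//cc:defs.bzl", "cc_library")
load("@rules_cuda//cuda:defs.bzl", "cuda_library")

# CUDA kernels, kept as an implementation detail of this package.
cuda_library(
    name = "kernel",
    srcs = ["kernel.cu"],
    hdrs = ["kernel.h"],
    deps = ["@local_cuda//:cublas"],
)

# Plain C/C++ wrapper that exposes a C interface over the kernels.
# This is the only target made visible to other modules.
cc_library(
    name = "kernel_c_api",
    srcs = ["kernel_c_api.cc"],  # hypothetical wrapper source
    hdrs = ["kernel_c_api.h"],   # hypothetical public C header
    visibility = ["//visibility:public"],
    deps = [":kernel"],
)
```

Downstream code would then depend on :kernel_c_api and see only the C header.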

cloudhan added the help wanted label Apr 19, 2024
cloudhan linked a pull request Apr 19, 2024 that will close this issue
@appthumb
Author

Thanks for looking into this! Really appreciated.

Do you mean that my upstream module A uses cuda_library internally, and exposes it with cc_library, and my downstream module B depends on it?

This doesn't seem to work. It looks like as long as I use the cuda_library rule in module A, I need to add local_cuda to A's MODULE.bazel, and this forces A to be a top-level module.

I tried removing everything that refers to local_cuda in module A, so A's MODULE.bazel file only has:

bazel_dep(
    name = "rules_cuda",
    version = "0.2.1",
)

and I use cuda_library in A's BUILD.bazel file:

load("@rules_cuda//cuda:defs.bzl", "cuda_library")

cuda_library(
    name = "kernel",
    srcs = ["kernel.cu"],
    hdrs = ["kernel.h"],
)

Then A fails to build. Running bazel build kernel gives this error:

Analysis of target '//my_project:kernel' failed; build aborted: module extension "toolchain" from "@@rules_cuda~//cuda:extensions.bzl" does not generate repository "local_cuda", yet it is imported as "local_cuda" in the usage at https://bcr.bazel.build/modules/rules_cuda/0.2.1/MODULE.bazel:10:26

This is referring to

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")

Adding local_cuda to A's MODULE.bazel would make bazel compile A, but then module B cannot depend on it.

@appthumb
Author

Oh, didn't see a fix is in the making! Looking forward to it 👍

@jsharpe
Member

jsharpe commented Apr 19, 2024

I think it's possible to just do:

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")
use_repo(cuda, "local_cuda")

in B's MODULE.bazel, which will make local_cuda available to B. It's not ideal, but I think this works; I have something similar in one of my projects.

@appthumb
Author

Yes, this works for building B. The problem is that A itself won't build if it has any cuda_library target. This can be annoying, e.g., all the building and testing of the CUDA code in A now has to be done through module B.

@jsharpe
Member

jsharpe commented Apr 21, 2024

You would leave your A module with the use_repo that you had above. The snippet above makes local_cuda visible in B, and you can use cuda_library in A or B.

@appthumb
Author

Thanks for your response! I kind of got it to work for my purpose, by using different MODULE.bazel files in my local Bazel registry and in repo_a. Here's a summary of what I have found so far.

My setup:

  • A bazel_registry folder that serves as my local Bazel registry. It has a module repo_a using local_path (https://bazel.build/external/registry#index_registry), so that repo_b can find repo_a. The registry doesn't serve repo_a's source files; it just points to repo_a's actual source directory via the local_path feature. Specifically, bazel_registry contains the subfolder bazel_registry/modules/repo_a/1.0.0 as the registry entry for repo_a, and in particular the file bazel_registry/modules/repo_a/1.0.0/MODULE.bazel.

  • A repo_a, which has repo_a's source code, BUILD.bazel files, and a different MODULE.bazel file.

  • A repo_b, which depends on repo_a through my local bazel_registry.

Now this is the complete file content of bazel_registry/modules/repo_a/1.0.0/MODULE.bazel:

"""repo A."""

module(
    name = "repo_a",
    version = "1.0.0",
)

bazel_dep(
    name = "bazel_skylib",
    version = "1.5.0",
)

bazel_dep(
    name = "rules_cc",
    version = "0.0.9",
)

bazel_dep(
    name = "rules_cuda",
    version = "1.0.0",
)

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")
use_repo(cuda, "local_cuda")

This is the complete content of repo_b/MODULE.bazel:

"""repo_b"""

module(
    name = "repo_b",
    version = "1.0.0",
)

bazel_dep(
    name = "bazel_skylib",
    version = "1.5.0",
)

bazel_dep(
    name = "rules_cc",
    version = "0.0.9",
)

bazel_dep(
    name = "repo_a",
    version = "1.0.0",
)

bazel_dep(
    name = "rules_cuda",
    version = "1.0.0",
)

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")

cuda.local_toolchain(
    name = "local_cuda",
    toolkit_path = "",
)
use_repo(cuda, "local_cuda")

This works, and bazel build under repo_b succeeds. Note:

  • cuda.local_toolchain is declared in repo_b/MODULE.bazel, and not in bazel_registry/modules/repo_a/1.0.0/MODULE.bazel. Adding it to the latter leads to the fail("Only the root module may override the path for the local cuda toolchain") error.

  • Whether or not bazel_registry/modules/repo_a/1.0.0/MODULE.bazel has the line use_repo(cuda, "local_cuda") doesn't make a difference.
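
For illustration, a target in repo_b that consumes a library from repo_a could look like the sketch below. The package path my_project, the target name kernel, the binary name app, and the file main.cc are assumptions made for the example. The label only resolves if the target in repo_a is visible to other repositories (for example via visibility = ["//visibility:public"]).

```starlark
# repo_b/BUILD.bazel (hypothetical)
load("@rules_cc//cc:defs.bzl", "cc_binary")

cc_binary(
    name = "app",
    srcs = ["main.cc"],
    deps = [
        # Target exported by repo_a; this label is assumed for illustration.
        "@repo_a//my_project:kernel",
    ],
)
```

The @repo_a name is available here because repo_b/MODULE.bazel declares bazel_dep(name = "repo_a", ...).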

Now repo_b works fine, but I also want repo_a to work, i.e., doing bazel build under repo_a should succeed. If I use the exact content of bazel_registry/modules/repo_a/1.0.0/MODULE.bazel as repo_a/MODULE.bazel, I will get the following error when I bazel build under repo_a:

failed; build aborted: module extension "toolchain" from "@@rules_cuda~//cuda:extensions.bzl" does not generate repository "local_cuda", yet it is imported as "local_cuda" in the usage at /home/dev/temp/cuda_test/repo_a/MODULE.bazel:23:21

To work around this, I need to use a slightly different MODULE.bazel under repo_a. This is the complete content of repo_a/MODULE.bazel:

"""repo A."""

module(
    name = "repo_a",
    version = "1.0.0",
)

bazel_dep(
    name = "bazel_skylib",
    version = "1.5.0",
)

bazel_dep(
    name = "rules_cc",
    version = "0.0.9",
)

bazel_dep(
    name = "rules_cuda",
    version = "1.0.0",
)

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")

cuda.local_toolchain(
    name = "local_cuda",
    toolkit_path = "",
)
use_repo(cuda, "local_cuda")

Note that I added cuda.local_toolchain to it. This makes bazel build under repo_a work without any issue. It doesn't break the build under repo_b, since the latter uses a different MODULE.bazel file for repo_a (the one in the registry).

So far I have both repo_a and repo_b working, by using different MODULE.bazel files in my local Bazel registry and in the actual repo, to bypass the requirement that toolchains must be declared in the top-level module.

I'm not sure this is the canonical way of setting up local dependencies, and it feels like a hack. It would be nice to remove the limitation on toolchain declarations. That would avoid all these tricky situations, and the MODULE.bazel file in the repo would no longer have to diverge from the one in the Bazel registry.

(PS: I'm using the head version of rules_cuda in this GitHub repository. The rules_cuda in Bazel Central Registry https://registry.bazel.build/modules/rules_cuda is 5 months behind the head version here. To avoid confusion I point to the head version of rules_cuda in my local bazel registry as well, and this is why you can see the version of rules_cuda is 1.0.0 above. I also tried the published version 0.2.1, and the result is the same).


3 participants