
Any possible way to set the host_compiler explicitly instead of inferring from current toolchain? #194

wyhao31 opened this issue Nov 8, 2023 · 19 comments


wyhao31 commented Nov 8, 2023

This sounds like a weird request, and here's my use case.

I use a customized C++ toolchain in my project. In order to use ccache, I wrap the actual compile command in a script, and set the script's path as the tool_path in the toolchain definition.

tool_path(
    name = "gcc",
    path = "/path/to/wrap-script",
),

When using cuda_library, /path/to/wrap-script appears as the parameter after nvcc -ccbin, which causes nvcc to hang forever. The correct way to use ccache for CUDA libraries is to wrap the nvcc call with ccache, not the gcc call. This is why I want to set the host_compiler explicitly.
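
For illustration, a minimal sketch of the kind of wrapper I mean, assuming nvcc lives at /usr/local/cuda/bin/nvcc (the path is just an example):

#!/usr/bin/env bash
# Put ccache in front of the real nvcc invocation so the whole
# nvcc call (host + device compilation) goes through the cache.
exec ccache /usr/local/cuda/bin/nvcc "$@"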

I'm not sure whether I'm using it correctly, or whether there's an existing way to bypass this. Thanks in advance for any suggestions.


cloudhan commented Nov 8, 2023

You need to configure the cuda toolchain manually to achieve it.


cloudhan commented Nov 8, 2023

Out of curiosity, why do you want to use ccache with bazel?


wyhao31 commented Nov 9, 2023

> You need to configure the cuda toolchain manually to achieve it.

Could you please point out how to do it manually? I noticed that host_compiler comes from cc_toolchain.compiler_executable [1], and cc_toolchain comes from find_cpp_toolchain [2]. IIUC, it uses the currently configured C++ toolchain directly.

[1] https://github.com/bazel-contrib/rules_cuda/blob/main/cuda/private/actions/compile.bzl#L36
[2] https://github.com/bazel-contrib/rules_cuda/blob/main/cuda/private/rules/cuda_library.bzl#L18
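
In other words, a simplified sketch of my reading of [1] and [2] (not the actual rules_cuda code):

cc_toolchain = find_cpp_toolchain(ctx)            # resolves the currently configured cc toolchain
host_compiler = cc_toolchain.compiler_executable  # i.e. whatever tool_path(name = "gcc") points at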


wyhao31 commented Nov 9, 2023

> Out of curiosity, why do you want to use ccache with bazel?

We're migrating from CMake to Bazel, and we use ccache in our CMake setup. Besides the local cache, we also use ccache's secondary (remote) cache. I know Bazel also has remote cache support, but we'll keep using ccache at the beginning of the migration.


cloudhan commented Nov 9, 2023

The toolchains are instantiated from https://github.com/bazel-contrib/rules_cuda/tree/main/cuda/templates


cloudhan commented Nov 9, 2023

BTW, Bazel also has a local cache.


wyhao31 commented Nov 9, 2023

> The toolchains are instantiated from https://github.com/bazel-contrib/rules_cuda/tree/main/cuda/templates

I think you're talking about the CUDA toolchains. The host_compiler I mentioned above comes from the cc toolchain.


wyhao31 commented Nov 9, 2023

> BTW, Bazel also has a local cache.

Maybe I didn't say it clearly: we want to use ccache's secondary cache, and we don't want to spend effort setting up a Bazel remote cache, so we'll keep using ccache for some time.


cloudhan commented Nov 9, 2023

Oops, it seems I remembered it incorrectly. _cc_toolchain is an implicit attribute of all cuda rules, but it is not generally customizable. IIRC, there is no way to have more than one cc toolchain selected at runtime for cc rules, so there was no need to provide a way to configure it, because some actions of the cuda rules use cc actions to produce artifacts.


wudisheng commented Apr 12, 2024

I'm in a similar situation.

In our use case, we use clang as the C/C++ compiler (i.e. cc toolchain resolution chooses clang), but for CUDA code we'd like to use nvcc as the CUDA compiler. This results in errors, because it seems that nvcc expects gcc to be passed to -ccbin. This can be verified by using gcc as the C/C++ compiler: nvcc is happy in that case.

But we still want most of our code (the non-CUDA code) compiled by clang instead of gcc, so I'd like to find a way to force rules_cuda to use nvcc together with gcc for CUDA code, while using clang for everything else.

Currently I have a very ugly workaround: I define a series of special cc_toolchain_suite targets that resolve every possible pair of cpu and compiler to gcc (even if --compiler=clang, etc.), and replace current_cc_toolchain in the rules_cuda implementation with it (see the sketch below). But it feels too hacky, and more importantly, when I try to switch to the new platform-based cc toolchain resolution mechanism (where cc_toolchain_suite is deprecated and a no-op), I cannot figure out a way to do the same hacking.
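
Roughly, the suite looks like this (a simplified sketch; the cpu names and toolchain labels are placeholders for our real ones):

cc_toolchain_suite(
    name = "fake_suite",
    toolchains = {
        # Every cpu/compiler combination resolves to the same gcc
        # toolchain, no matter what --compiler says.
        "k8": ":gcc_toolchain",
        "k8|gcc": ":gcc_toolchain",
        "k8|clang": ":gcc_toolchain",
    },
)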

Any suggestion?


wyhao31 commented Apr 12, 2024

@wudisheng I think nvcc can accept clang passed to -ccbin.


cloudhan commented Apr 12, 2024

@wudisheng You are describing a different problem. To make nvcc use clang as the host compiler (note: you must ensure the host compiler for CUDA and the cc compiler for cc rules are the same), see:

https://github.com/cloudhan/rules_cuda_examples/blob/ebaf56457cb0cfc69ae0a10dfebd2e95fd109594/nccl/.bazelrc#L6-L8

build --flag_alias=cuda_compiler=@rules_cuda//cuda:compiler

# device: clang, host: clang, cc: clang
build:clang --repo_env=CC=clang
build:clang --cuda_compiler=clang

# device: nvcc, host: clang, cc: clang
build:nvcc_clang --repo_env=CC=clang
build:nvcc_clang --cuda_compiler=nvcc

# host: compiler_a, cc: compiler_b
# not supported
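
With these configs, a build such as bazel build --config=nvcc_clang //... (the target pattern is just an example) compiles device code with nvcc and host code with clang.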

Otherwise it should be a bug or misconfiguration.

host_compiler_feature = feature(
    name = "host_compiler_path",
    enabled = True,
    flag_sets = [
        flag_set(
            actions = [
                ACTION_NAMES.cuda_compile,
                ACTION_NAMES.device_link,
            ],
            flag_groups = [flag_group(flags = ["-ccbin", "%{host_compiler}"])],
        ),
    ],
)


wyhao31 commented Apr 12, 2024

If for any reason you want to use a different host_compiler, here's what I did in my project.

  1. Create a patch of rules_cuda that sets host_compiler [1] to the path you want, and change [2] and [3] to make sure the correct PATH environment variable is set.
  2. Apply the patch to rules_cuda in the WORKSPACE by using the patches attribute [4] (see the sketch at the end of this comment).

I know it's kind of hacky, and I don't recommend that others do it. It just fits my needs and works in my project.

[1] https://github.com/bazel-contrib/rules_cuda/blob/v0.2.1//cuda/private/actions/compile.bzl#L36
[2] https://github.com/bazel-contrib/rules_cuda/blob/v0.2.1/cuda/private/toolchain_configs/nvcc.bzl#L51
[3] https://github.com/bazel-contrib/rules_cuda/blob/v0.2.1/cuda/private/toolchain_configs/nvcc.bzl#L61
[4] https://bazel.build/rules/lib/repo/http#http_archive-patches
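
For step 2, the WORKSPACE part looks roughly like this (a sketch; the sha256, URL, and patch label are placeholders for your own values):

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "rules_cuda",
    sha256 = "...",  # placeholder
    urls = ["..."],  # placeholder: the rules_cuda release archive
    # The local patch that hardcodes host_compiler and fixes up PATH.
    patches = ["//third_party:rules_cuda_host_compiler.patch"],
    patch_args = ["-p1"],
)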


cloudhan commented Apr 12, 2024

@wyhao31 I think you can avoid a patch with bazel build --features=-host_compiler_path ...; this will (should, I am not quite sure =) ) prevent -ccbin <cc_toolchain.compiler_path> from being added to the command line. Then you can rely on the @rules_cuda//cuda:copts flag to pass it in as-is. :)

Warning: this is not recommended, especially on Windows!

Warning: this will break hermeticity.

@wudisheng

> @wudisheng I think nvcc can accept clang passed to -ccbin.

In our specific environment (with our particular versions), no. And even if it could, we'd still like nvcc+gcc; otherwise we could use clang alone, without nvcc.


wyhao31 commented Apr 12, 2024

> @wyhao31 I think you can avoid a patch with bazel build --features=-host_compiler_path ...; this will (should, I am not quite sure =) ) prevent -ccbin <cc_toolchain.compiler_path> from being added to the command line. Then you can rely on the @rules_cuda//cuda:copts flag to pass it in as-is. :)

Thanks! Actually, I tried this approach before, but it turned out not to work. The reason is that cuda_compile_action implies host_compiler_path [1], so it seems host_compiler_path cannot be disabled.

[1] https://github.com/bazel-contrib/rules_cuda/blob/v0.2.1/cuda/private/toolchain_configs/nvcc.bzl#L111

@wudisheng

> @wudisheng You are describing a different problem. To make nvcc use clang as the host compiler (note: you must ensure the host compiler for CUDA and the cc compiler for cc rules are the same), see the .bazelrc example above. Otherwise it should be a bug or misconfiguration.
>
>     host_compiler_feature = feature(
>         name = "host_compiler_path",
>         enabled = True,
>         flag_sets = [
>             flag_set(
>                 actions = [
>                     ACTION_NAMES.cuda_compile,
>                     ACTION_NAMES.device_link,
>                 ],
>                 flag_groups = [flag_group(flags = ["-ccbin", "%{host_compiler}"])],
>             ),
>         ],
>     )

The "%{host_compiler}" here is /.../clang, I'd like it because I want everything except cuda code compiled by clang, but for cuda code, I want nvcc -ccbin /.../gcc.

It seems this is not supported out of the box. I can hack around it, but I'm trying to find a better way to integrate it with the new platform-based toolchain resolution.

@cloudhan

I'm curious whether it is possible to let Bazel use two differently configured toolchains for the same rule (on different targets) in a single bazel build...?

If it is possible, we can implement a toolchain_type for the host compiler (defaulting to the cc toolchain, as is the status quo) and make it configurable.

If not, then it should be left as a hack, because it would easily break a lot of things...
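
Something along these lines (a hypothetical sketch; none of this exists in rules_cuda today):

# A dedicated toolchain_type for the CUDA host compiler.
toolchain_type(name = "host_compiler_toolchain_type")

toolchain(
    name = "gcc_host_compiler",
    # ":gcc_host_compiler_impl" would be some rule returning
    # platform_common.ToolchainInfo with the host compiler path.
    toolchain = ":gcc_host_compiler_impl",
    toolchain_type = ":host_compiler_toolchain_type",
)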

@wudisheng

Technically, yes. The official way is to use transitions, which is considerably difficult.
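
For example, under the legacy resolution, a transition that forces a gcc-only suite could look roughly like this (a sketch; the suite label is a placeholder):

def _force_gcc_impl(settings, attr):
    # Force a gcc-only cc toolchain for everything built beneath
    # a rule that carries this transition.
    return {"//command_line_option:crosstool_top": "//toolchain:fake_suite"}

force_gcc_transition = transition(
    implementation = _force_gcc_impl,
    inputs = [],
    outputs = ["//command_line_option:crosstool_top"],
)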

For the legacy --*crosstool_top cc toolchain resolution, I can manually generate a (different) cc_toolchain_suite that has fake mappings from all possible combinations of cpu and compiler to gcc, and assign it to the _cc_toolchain attribute in rules_cuda instead of @bazel_tools//tools/cpp:current_cc_toolchain.

This way I can get the behavior I described above, but it does not work if I enable platform-based toolchain resolution, because there is no way to use a particular rule as a constraint, and config_settings are generally global.
