
cuda: add module #422

Draft: wants to merge 3 commits into base: main
Conversation

@bobvanderlinden (Contributor)

As discussed on Discord, this configuration is needed to run pytorch in devenv on Linux. It was confirmed to work.

I don't have much knowledge of CUDA itself, so I'm unsure what exactly other libraries need. I did find that CUDA_HOME and CUDA_PATH are used by tensorflow.

Confirmations that this works for real projects are welcome!
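For reference, a minimal sketch of what such a module could export (assuming devenv's `env` and `packages` options and `pkgs.cudatoolkit`; the exact shape in this PR may differ):

```nix
{ pkgs, ... }:

{
  # Hypothetical sketch: make the toolkit available and expose the
  # variables tensorflow reportedly reads (CUDA_HOME, CUDA_PATH).
  packages = [ pkgs.cudatoolkit ];

  env.CUDA_HOME = "${pkgs.cudatoolkit}";
  env.CUDA_PATH = "${pkgs.cudatoolkit}";
}
```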

@domenkozar (Member)

We'll need to add toolkit folder to top-level.nix imports.

Is this ever supposed to work on macOS?

@bobvanderlinden (Contributor, Author)

> We'll need to add toolkit folder to top-level.nix imports.

👍

> Is this ever supposed to work on macOS?

I don't think so, but I'm not sure 😅
https://developer.nvidia.com/nvidia-cuda-toolkit-11_6_0-developer-tools-mac-hosts
Apparently it can be used remotely, so yes, the toolkit itself does support macOS. I doubt that works for pytorch, though, as it needs libcuda.so, which is in the nvidia_x11 package.

@domenkozar (Member)

Let's add an assertion then if !pkgs.stdenv.isLinux.
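A hedged sketch of such an assertion, using the `assertions` pattern common in NixOS-style modules (whether devenv exposes `assertions` in exactly this way is an assumption here):

```nix
{ pkgs, ... }:

{
  # Illustrative only: fail module evaluation on non-Linux platforms,
  # since CUDA in nixpkgs does not target Darwin.
  assertions = [
    {
      assertion = pkgs.stdenv.isLinux;
      message = "The cuda module is only supported on Linux.";
    }
  ];
}
```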

Signed-off-by: Bob van der Linden <bobvanderlinden@gmail.com>
@bobvanderlinden (Contributor, Author)

Requires #383, as cuda is unfree.

@domenkozar (Member)

Would we want to incorporate any feedback from NixOS/nixpkgs#217780 (comment)? Leaving it for the future is also fine :)

@SomeoneSerge

> Is this ever supposed to work on macOS?

Is CUDA available on macOS?
Darwin hasn't been included in meta.platforms for most of cudaPackages, but that's mostly because people focus on Linux.

@bobvanderlinden (Contributor, Author)

Hmm, this PR is not really ready. It probably shouldn't support macOS if CUDA in nixpkgs doesn't support it. On Discord it was mentioned that this method did work for pytorch, but because of the LD_LIBRARY_PATH change pulling in pkgs.gcc-unwrapped, the Rust compiler breaks down.

I think LD_LIBRARY_PATH is mostly a workaround for a pytorch problem, not so much a CUDA problem.

Using the pytorch package from nixpkgs (and thus the nixpkgs CUDA maintainers) doesn't play nicely with poetry (pyproject.toml), so there is no perfect solution yet.

I am interested in looking into this further to get a good setup for CUDA + pytorch + Rust, but it's not high on my todo list at the moment.

I can leave this PR in draft to keep the devenv discussion centralized, but I can also open a new issue if that's more appropriate.

@tfmoraes

Adding /run/opengl-driver/lib to $LD_LIBRARY_PATH makes CUDA work for me:

```
$ python -c "import torch; print(torch.cuda.is_available())"
True
```
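For anyone who wants to try this workaround locally, a minimal sketch of what it could look like in a `devenv.nix` (an assumption, not part of this PR; note that `/run/opengl-driver/lib` exists only on NixOS):

```nix
{ pkgs, lib, ... }:

{
  # NixOS-only workaround: the graphics driver's userspace libraries
  # (including libcuda.so) live under /run/opengl-driver/lib.
  env.LD_LIBRARY_PATH = lib.mkIf pkgs.stdenv.isLinux "/run/opengl-driver/lib";
}
```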

@bobvanderlinden (Contributor, Author)

> Adding /run/opengl-driver/lib to $LD_LIBRARY_PATH makes CUDA work for me

I think that is a fix that should be in NixOS itself. It makes no sense for other distros or macOS, so I'm not sure it belongs in devenv.

@SomeoneSerge

> I think that is a fix that should be in NixOS. (@bobvanderlinden)

There's no need for that on NixOS: as long as you use a nix-built pytorch, /run/opengl-driver/lib would already be in the binaries' Runpaths

@bobvanderlinden (Contributor, Author)

Indeed. However, most people want to use pytorch from poetry (having it be part of pyproject.toml). When doing so, you'll run into the LD_LIBRARY_PATH problem, but only on NixOS. Other systems have the OpenGL driver libraries (like CUDA) globally available.

The best of both worlds might be to use poetry2nix instead of poetry itself, to make all poetry-defined packages available as Nix packages. That way the torch package can be overridden to link against a different CUDA library explicitly.

This avoids the need for LD_LIBRARY_PATH as well as pkgs.gcc-unwrapped.lib.

It does have its own downside: it probably will not play very nicely with poetry CLI commands like poetry add. I haven't tried this yet, though; it might not be so bad when using direnv properly.
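A rough sketch of what that poetry2nix override could look like (the specific override here, swapping in the nix-built CUDA-enabled torch, is illustrative and untested):

```nix
{ pkgs, ... }:
let
  # Assumes poetry2nix is available as pkgs.poetry2nix.
  pythonEnv = pkgs.poetry2nix.mkPoetryEnv {
    projectDir = ./.;
    overrides = pkgs.poetry2nix.defaultPoetryOverrides.extend (final: prev: {
      # Illustrative: replace the wheel-based torch with a nix-built,
      # CUDA-enabled package whose runpath already points at the driver.
      torch = pkgs.python3Packages.torchWithCuda;
    });
  };
in
{
  packages = [ pythonEnv ];
}
```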

@SomeoneSerge (commented Apr 7, 2023)

> Indeed. However, most people want to use pytorch from poetry (having it be part of pyproject.toml)

Dunno, I haven't seen these people 😆

> Other systems have the OpenGL driver libraries (like Cuda) globally available

This is not exactly correct. Most other systems do indeed merge all libraries into one location. But the reason their pytorch manages to discover e.g. libcudart.so, and through it libcuda.so, is that their python binary has its ELF .interp header set to a system-specific path like /lib64/ld-linux-x86-64.so (or something), and that dynamic linker is configured to read /etc/ld.so.conf (or something), which is also a system-specific path. And that ld.so.conf specifically enumerates system-specific paths like /lib, /usr/lib, and /opt/some-nonsense/cuda/lib.

In other words, their libcuda.so is as "globally available" as ours. That said, maybe we could make integration easier, at the risk of occasionally facing library version mismatches.

@lizelive

> Dunno, I haven't seen these people

I use torch from poetry most of the time, because the CUDA libs are in PyPI now, and the fewer package managers, the better.

@lizelive

Also, because ML packages are updating so fast, it's not viable to do this with Nix system packages.

@domenkozar (Member)

Could we somehow detect whether the OpenGL stuff is wired up, and error out with a nice message about what to do?
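One hedged way to do that detection at shell entry (a sketch assuming devenv's `enterShell` hook; not something from this PR):

```nix
{
  enterShell = ''
    # Warn if the NixOS graphics-driver path is missing: libcuda.so comes
    # from the host driver, not from nix, so without it CUDA cannot load.
    if [ ! -e /run/opengl-driver/lib ]; then
      echo "warning: /run/opengl-driver/lib not found;" \
           "CUDA programs may fail to load libcuda.so." \
           "On NixOS, enable hardware.opengl and the NVIDIA driver." >&2
    fi
  '';
}
```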

Review comment on:

```nix
enable = lib.mkEnableOption "CUDA toolkit";

package = lib.mkOption {
  type = lib.types.package;
```

(Member) Sometimes CUDA can't come from Nix, so we'll have to allow a way to set up an FHS environment in those cases. It's tricky to get this right (and it won't work on macOS), but it's often required.

> Sometimes CUDA can't come from Nix

Interesting. Any specific examples in mind?

Review comment on:

```nix
package = lib.mkOption {
  type = lib.types.package;
  description = "Which package of the CUDA toolkit to use.";
  default = pkgs.cudatoolkit;
```

Note: this attribute is almost unmaintained; it's better to use the splayed packages.

Review comment on:

```nix
env.LD_LIBRARY_PATH = lib.mkIf pkgs.stdenv.isLinux (
  lib.makeLibraryPath [
    pkgs.gcc-unwrapped.lib
    pkgs.linuxPackages.nvidia_x11
```

Hm. How is this to be synchronized with config.boot.kernelPackages?
