Link Against tiny-cuda-nn in C++ Program #423

Open
half-potato opened this issue Mar 19, 2024 · 2 comments
@half-potato

I didn't see any instructions on how to link against tiny-cuda-nn; in fact, it doesn't even define an install target. I saw that iNGP just uses add_subdirectory, but I get the following errors when I do that in my program:

/home/amai/splinetracers/support/tiny-cuda-nn/include/tiny-cuda-nn/common_device.h(96): error: no suitable constructor exists to convert from "float" to "__half"
   return (half)relu<float>((float)val);
                ^

/home/amai/splinetracers/support/tiny-cuda-nn/include/tiny-cuda-nn/vec.h(214): error: no suitable conversion function from "__half" to "float" exists
   return fmaf(a, b, c);
               ^

Inspecting the code around this error shows some switches based on the CUDA arch, so maybe the arch isn't being set correctly:

#ifdef __CUDACC__
inline TCNN_DEVICE __half fma(__half a, __half b, __half c) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600
	return __hfma(a, b, c);
#else
	return fmaf(a, b, c);
#endif
}
#endif

Based on the error, I don't think __CUDA_ARCH__ is being set correctly. I have an NVIDIA RTX 3090 with CUDA 12.3, which corresponds to compute_86, so __CUDA_ARCH__ should be 860, well above the 600 threshold. I'm not sure why the arch isn't being picked up. My CMakeLists.txt file is pretty cursed, though, as it integrates OptiX, CUDA, and PyTorch.
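
To make the setup concrete, here is a stripped-down sketch of the kind of CMakeLists.txt I mean (the my_app target name and the dependencies/tiny-cuda-nn path are placeholders, not my real project; I believe linking against the tiny-cuda-nn target is what iNGP does):

cmake_minimum_required(VERSION 3.18)

# Pin the CUDA architectures up front so __CUDA_ARCH__ ends up >= 600
# when tiny-cuda-nn's sources are compiled.
set(CMAKE_CUDA_ARCHITECTURES 86)

project(my_app LANGUAGES CXX CUDA)

# tiny-cuda-nn vendored as a submodule (path is a placeholder).
add_subdirectory(dependencies/tiny-cuda-nn)

add_executable(my_app main.cu)
target_link_libraries(my_app PRIVATE tiny-cuda-nn)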

@half-potato (Author)

I just wanted to add that tiny-cuda-nn does build outside of my project, just not inside it.

@cogwheel

cogwheel commented Jun 7, 2024

You need to set a minimum architecture with TCNN_MIN_GPU_ARCH. E.g., I used -DTCNN_MIN_GPU_ARCH=75 to enable features on RTX 2000 series and later cards. Then make sure your CUDA compiler is set to build for at least that instruction set.
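
In a CMake project that pulls the library in through add_subdirectory, a rough equivalent would look something like this (a sketch only, not from my Bazel setup; as far as I can tell tiny-cuda-nn's own CMake build derives this define from the detected architectures, so forcing it there may be redundant):

# Build for at least sm_75 so the __CUDA_ARCH__ >= 600 code paths are compiled in.
set(CMAKE_CUDA_ARCHITECTURES 75)

# Roughly what -DTCNN_MIN_GPU_ARCH=75 does in my Bazel build: a preprocessor
# define that enables features for RTX 2000 series and later cards.
add_compile_definitions(TCNN_MIN_GPU_ARCH=75)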

I recently got TCNN building in our Bazel project. See here for details: fbriggs/lifecast_public@aaa1000#diff-aef1b984940ceb407dca09e98a080c07a1cecbb1ae6b386caa0028e03e45bc48. Note that this does not build any of the MLP code, only the encoders. It should just be a matter of adding the other .cu sources.

Also note: I basically had to reimplement all the functionality from the Python bindings in C++ in order to interface with the C++ version of torch.
