
TST/MAINT: special: test with CuPy, make some CUDA fixes #20701

Open · wants to merge 14 commits into main
Conversation

@steppi (Contributor) commented May 12, 2024

What does this implement/fix?

This PR makes a handful of fixes that are needed for some of the C++ special functions in scipy/special/special/ to work in CuPy. I've also added a framework for testing that special functions work in CuPy; it autogenerates CuPy ElementwiseKernels. It is incomplete in that it cannot handle ufuncs with signatures like dd*dd->*i, which return multiple outputs, and it cannot handle gufuncs. I have only added tests for a handful of functions, but I plan to follow up with tests for more functions (and likely more CUDA fixes).
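To give a flavour of what the generated tests do, here is a minimal hand-written sketch of the same idea; the choice of binom, the header path, the include handling, and the tolerance are illustrative rather than the framework's actual output:

```python
import cupy as cp
import numpy as np
import scipy.special as sc

# Illustrative: assumes the scipy/special/special headers are on the
# compiler's include path (the framework in this PR handles that part).
binom_kernel = cp.ElementwiseKernel(
    "float64 n, float64 k",         # input parameters
    "float64 out",                  # output parameter
    "out = special::binom(n, k)",   # per-element body calling the C++ header
    name="test_binom",
    preamble='#include "special/binom.h"',
)

n = cp.linspace(1.0, 20.0, 64)
k = cp.linspace(0.0, 10.0, 64)
gpu = cp.asnumpy(binom_kernel(n, k))

# Compare GPU results against the existing CPU ufunc (loose tolerance,
# since the device libm need not match the host's bit for bit).
np.testing.assert_allclose(gpu, sc.binom(cp.asnumpy(n), cp.asnumpy(k)), rtol=1e-10)
```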

Additional information

@rlucas7, I whipped this up because you had mentioned you have some bandwidth to work on special this weekend, and I currently have no outstanding PRs. @izaid, you may be interested in this as well, but I know you are very busy at the moment.

@steppi requested a review from person142 as a code owner on May 12, 2024
@github-actions bot added the scipy.special, C/C++, Meson, and maintenance labels on May 12, 2024
@izaid (Contributor) commented May 12, 2024

This is great! Very useful, glad you went ahead and did it.

At some point, we may want to hook this into the normal tests. Perhaps there is a way to create a mock array API special extension or something. In any case, great.

@steppi (Contributor, Author) commented May 12, 2024

At some point, we may want to hook this into the normal tests. Perhaps there is a way to create a mock array API special extension or something. In any case, great.

Unfortunately, these tests are truly deserving of the xslow mark. Even just for the handful of functions here, it takes over 2 minutes for these tests to complete on my machine.
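For reference, SciPy's conftest skips tests carrying the xslow marker unless SCIPY_XSLOW=1 is set in the environment, so the gating would look roughly like this (a sketch with a hypothetical test name and an assumed module path, not the literal contents of test_xsf_cuda.py):

```python
import pytest

# Skip the whole module when CuPy is unavailable.
cp = pytest.importorskip("cupy")


# Opt in with (module path assumed):
#   SCIPY_XSLOW=1 python dev.py test -t scipy.special.tests.test_xsf_cuda -v
@pytest.mark.xslow
def test_binom_cuda():
    # Hypothetical placeholder; the real tests autogenerate ElementwiseKernels
    # and compare GPU output against the CPU ufuncs from scipy.special.
    ...
```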

I also want to point out to everyone other than @izaid that the name test_xsf_cuda.py is because xsf is the name @izaid and I have settled on for the C++ special function library we are working on, which we intend to split off from SciPy. We plan to change the namespace and folder for this library to xsf shortly. At the moment it is called special, which is a bit confusing.

@rlucas7 (Member) commented May 13, 2024

Thanks @steppi, I only saw the email notice for this one late yesterday evening. I will take a look one evening this week if I have bandwidth; otherwise it may need to wait until next weekend.

@steppi (Contributor, Author) commented May 15, 2024

@fancidev, I've made iv_ratio work on CUDA here, following up on #20457 (review).

@lucascolley changed the title from "TST:MAINT Add tests that special functions work in CuPy and make some CUDA fixes" to "TST/MAINT: special: test with CuPy, make some CUDA fixes" on May 15, 2024
@fancidev (Contributor) commented May 15, 2024

@fancidev, I've made iv_ratio work on CUDA here, following up on #20457 (review).

Thank you very much, @steppi!

Are the CUDA tests added by this PR going to be included in the CI workflow? If included, it could be helpful for people without CUDA (like me) to fix the code errors by checking the error messages in the CI logs.

Btw, if you split the code into a separate C++ library for a wider audience, I expect some C++ folks will frown upon changing things in the std namespace (when using CUDA), because that leads to undefined behavior. For better cooperation with other potential user code, it may be better to make an xsf-specific namespace (such as xsf::std) to do what you now do with std. In addition, if someone wants to call the CUDA version and the CPU version of a special function in the same binary, is it possible for them to do so?

@steppi (Contributor, Author) commented May 15, 2024

Are the CUDA tests added by this PR going to be included in the CI workflow? If included, it could be helpful for people without CUDA (like me) to fix the code errors by checking the error messages in the CI logs.

Not at the moment: 1) the tests are very slow, 2) they require a machine with an NVIDIA GPU, and 3) SciPy's test suite already takes a very long time to finish. We've discussed adding GPU CI after splitting the library off, though.

Btw, if you split the code into a separate C++ library for a wider audience, I expect some C++ folks will frown upon changing things in the std namespace (when using CUDA), because that leads to undefined behavior. For better cooperation with other potential user code, it may be better to make an xsf-specific namespace (such as xsf::std) to do what you now do with std. In addition, if someone wants to call the CUDA version and the CPU version of a special function in the same binary, is it possible for them to do so?

You're preaching to the choir about injecting things into std; this has already been discussed extensively, and it's just a temporary expedient @izaid proposed as a shortcut while we get things off the ground. Right now we are just trying to ensure things can be used in both SciPy and CuPy, and are focusing on the functions themselves.

In addition, if someone wants to call the CUDA version and the CPU version of a special function in the same binary, is it possible for them to do so?

When compiling for CUDA, everything is marked __host__ __device__, and libcudacxx is cross-platform, so I think this should work. No one has tested it yet, though.

@izaid (Contributor) commented May 16, 2024

Btw, if you split the code into a separate C++ library for a wider audience, I expect some C++ folks will frown upon changing things in the std namespace (when using CUDA), because that leads to undefined behavior. For better cooperation with other potential user code, it may be better to make an xsf-specific namespace (such as xsf::std) to do what you now do with std. In addition, if someone wants to call the CUDA version and the CPU version of a special function in the same binary, is it possible for them to do so?

Yeah, I'm not really bothered by this. Yes, theoretically it's undefined behaviour; in practice, I've not seen it be an issue. I think we should try to make things as easy as possible for people developing the library, and if that means injecting things into std because CUDA is missing (say) std::complex, then so be it. We can reevaluate this as we go, but for now the effort should be expended on the numerical algorithms.

To be clear, we are simply putting things into std which are missing on other platforms. Of course no one should just put random custom things into std.

In addition, if someone wants to call the CUDA version and the CPU version of a special function in the same binary, is it possible for them to do so?

Yes, that is what __host__ __device__ means generally in CUDA. Also the library is header-only, so there's no binary.

@steppi (Contributor, Author) commented May 16, 2024

To be clear, we are simply putting things into std which are missing on other platforms. Of course no one should just put random custom things into std.

Yeah, that's an important point I should have mentioned. When compiling for CUDA, there is no std; we're injecting things there so we can use the same namespace in both CUDA and regular C++ while making the C++ code look "normal". I think the only potential issue here is not playing nice with other code which also injects things into std, like cuda_workaround.h in CuPy. This for instance means that our C++ special functions cannot currently be used in most other CUDA code in CuPy outside of special.
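To make the pattern concrete, here is a standalone sketch of what that kind of injection looks like when compiling device code through CuPy; the kernel, the aliases chosen, and the libcudacxx include path are illustrative, not the contents of the actual special/ headers:

```python
import cupy as cp

# Device code has no <complex>, so the missing pieces are pulled from
# libcudacxx and aliased into namespace std; the rest of the code can then
# be written as if ordinary std facilities were available.
code = r"""
#include <cuda/std/complex>

namespace std {
    using cuda::std::complex;
    using cuda::std::abs;
}

extern "C" __global__ void abs_of_complex(const double* re, const double* im,
                                          double* out, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) {
        std::complex<double> z(re[i], im[i]);
        out[i] = std::abs(z);  // reads like ordinary C++ std usage
    }
}
"""

# The include path for libcudacxx is an assumption; it ships with the CUDA
# toolkit and its location varies by installation.
module = cp.RawModule(code=code, options=("--std=c++17", "-I/usr/local/cuda/include"))
kernel = module.get_function("abs_of_complex")
```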

@rlucas7 (Member) commented May 19, 2024

I took a look today. I was able to run the tests, which all pass, but the tests for CuPy are skipped unless run on a GPU device (as described already in the thread). I was going to try to set up a GPU device to run the tests, but I'm not able to access the account with GPU devices that @steppi and I have been sharing. Albert, are you able to access it?

If so, that means it is something on my end that I can work to resolve.

@steppi (Contributor, Author) commented May 19, 2024

I took a look today. I was able to run the tests, which all pass, but the tests for CuPy are skipped unless run on a GPU device (as described already in the thread). I was going to try to set up a GPU device to run the tests, but I'm not able to access the account with GPU devices that @steppi and I have been sharing. Albert, are you able to access it?

If so, that means it is something on my end that I can work to resolve.

I'm able to access the account without issue. I can DM you on the SciPy community Slack so we can troubleshoot.

@rlucas7 (Member) commented May 20, 2024

I took a look today. I was able to run the tests, which all pass, but the tests for CuPy are skipped unless run on a GPU device (as described already in the thread). I was going to try to set up a GPU device to run the tests, but I'm not able to access the account with GPU devices that @steppi and I have been sharing. Albert, are you able to access it?
If so, that means it is something on my end that I can work to resolve.

I'm able to access the account without issue. I can DM you on the SciPy community Slack so we can troubleshoot.

I think I figured it out this morning; I sent the details in response to your DM. Thanks! When I get some time this week, I'll try to test it with CUDA.

@rlucas7 (Member) commented May 28, 2024

When I get some time this week, I'll try to test it with CUDA.

I set up a clean env using the DL AMI in AWS, and I'm not able to get the test module test_xsf_cuda.py to be executed by the test harness; there must be something I'm missing, @steppi?

What I've done:

  1. Clean install of this PR with CuPy and CUDA 12.1 on the AWS DL AMI
  2. Reran a clean build after git clean -xdf

Is there an env var that maybe needs to be set for the test_xsf_cuda.py module to execute?

I'm also not sure how to confirm that the CUDA kernels are getting properly compiled with the python dev.py test harness here.

@rlucas7 (Member) commented May 28, 2024

When I get some time this week, I'll try to test it with CUDA.

I set up a clean env using the DL AMI in AWS, and I'm not able to get the test module test_xsf_cuda.py to be executed by the test harness; there must be something I'm missing, @steppi?

What I've done:

  1. Clean install of this PR with CuPy and CUDA 12.1 on the AWS DL AMI
  2. Reran a clean build after git clean -xdf

Is there an env var that maybe needs to be set for the test_xsf_cuda.py module to execute?

I'm also not sure how to confirm that the CUDA kernels are getting properly compiled with the python dev.py test harness here.

Ah, spoke/wrote too soon: I think I figured it out. @steppi, I sent you a DM with the particulars; let me know if that looks correct to you, and if so I can post the specifics.

Labels: scipy.special, C/C++, Meson, maintenance
Projects: None yet
4 participants