CUDA Dispatcher refactor 2: inherit from dispatcher.Dispatcher
#7815
Conversation
Current test status:
```
Ran 1266 tests in 102.919s
FAILED (failures=1, errors=5, skipped=12, expected failures=7)
```

Test results:
```
Ran 1266 tests in 127.002s
FAILED (failures=66, errors=17, skipped=12, expected failures=7)
```

Test results now:
```
Ran 1266 tests in 105.203s
OK (skipped=12, expected failures=7)
```
gpuci run tests

gpuci run tests
Regarding this, my present thinking is that the way to go is to give `_Kernel` objects a dummy `entry_point`:

- For device functions, this works exactly like on the CPU target, because device functions are `CompileResult` objects that are inserted into the target context (and therefore need removing by the finalizer).
- For kernels, we give them a dummy entry point because they were never inserted and don't need removing (similar to object mode functions).
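A rough sketch of how the finalizer could then treat both cases uniformly. The shape below is an assumption for illustration, not the code in the PR: the name `_make_finalizer` comes from the discussion, and `remove_user_function` is the target context's removal hook.

```python
def _make_finalizer(overloads, targetctx):
    """Build a finalizer that removes compiled functions from the target
    context when the dispatcher is garbage-collected."""
    # Capture only the entry points; capturing the dispatcher itself would
    # create a reference cycle and keep it alive.
    entry_points = [cres.entry_point for cres in overloads.values()]

    def finalizer():
        for entry_point in entry_points:
            try:
                # Device functions were inserted into the target context,
                # so this removes them. Kernels (like object mode
                # functions) carry a dummy entry point that was never
                # inserted, so the lookup fails harmlessly and they are
                # skipped.
                targetctx.remove_user_function(entry_point)
            except KeyError:
                pass

    return finalizer
```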
gpuci run tests

gpuci run tests
@stuartarchibald I've merged main into this PR so the diff is now straightforward (compared to what it was 😆)

gpuci run tests
Thanks @gmarkall, the patch is much easier to look at now!
Thanks for the patch, great to see all the prior refactoring effort finally manifest in this PR!
Given the logical changes herein were discussed extensively OOB whilst implementing this, I'm of the view that they are correct! The only comments I have are that there are a couple of typos to look at, but otherwise this looks good.
I think due to the nature of the change, it'd be good to build this PR locally and run the RAPIDS test suite against it just to provide another layer of confidence on top of GPU CI and the Numba build farm. Thanks again!!
Co-authored-by: stuartarchibald <stuartarchibald@users.noreply.github.com>
gpuci run tests

Status marked as:

gpuci run tests
Thanks for the patch and fixes.
This passed a run with RAPIDS, so I'll remove "Waiting on author" - I think it's only waiting on the BuildFarm run now?

Buildfarm ID:

This failed, as mainline is currently failing on CUDA due to issues hopefully resolved in #7846.

gpuci run tests
@stuartarchibald I've merged from master and re-tested with CI and gpuCI now that #7846 is merged. Could this have another buildfarm run please?

Buildfarm ID:

Passed!
@gmarkall Thanks for testing this against RAPIDS; as this patch is passing the test suite on an external code base, it increases my confidence that we got this right! The Numba buildfarm also passed, so I've marked this as ready to merge!
(Note: based on #7814 - it should be easier to tell the differences once that is merged; in the meantime, this comparison shows the changes unique to this PR.)

The main accomplishment of this PR is to make the CUDA dispatcher inherit from `numba.core.dispatcher.Dispatcher` and reuse as much of its logic as possible, rather than duplicating it (and also to serve as a base for implementing on-disk caching, AOT compilation, better support for the high-level extension API, etc.). This necessitates a set of individual changes that accompany the main change:

- Adding `CUDACompileResult`, which reports its entry point as its `id` (sketched after this list) - this is because certain functions expect compile results to have entry points; however, this isn't true for CUDA, because you can't enter a CUDA kernel through the C-layer dispatcher. Using the `id` gives each CUDA compile result a unique identifier.
- Changing `compile_result` to factor out sanitizing the entries that go into the compile result, so that this logic can be shared between the core and the CUDA target.
- Changing `numba.cuda.decorators` so it is responsible for handling signatures and compilation - the CUDA dispatcher no longer needs to be aware of signatures, and this behaviour mirrors the CPU target. CUDA still only supports one signature, but this can now be more easily changed in future.
- Adding the `extensions` target option, so that the CUDA dispatcher no longer needs to make a strange "defensive copy".
- Changing `_Kernel` objects so that they have the `objectmode` and `entry_point` properties, as if they were compile results - this is because overloads are expected to be compile results by some of the core dispatcher logic.
- Renaming in `ForAll` to make the terminology more representative (no functional change).
- Fixing `resolve_value_type`, where it was wrapping a CPU dispatcher in a CUDA dispatcher instead of wrapping the Python function in a CUDA dispatcher when calling an `@njit` function from a `@cuda.jit` function.

Fixes #7741.
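To make the shape of these changes concrete, here is a condensed sketch of the two central pieces: the id-based compile result and the dispatcher inheritance. The class names come from the description above, but the bodies are simplified assumptions rather than the actual patch.

```python
# Illustrative sketch only; plumbing such as target descriptors, locking,
# and typing contexts is omitted.
from numba.core import dispatcher
from numba.core.compiler import CompileResult


class CUDACompileResult(CompileResult):
    """Compile result whose entry point is its own id().

    CUDA kernels cannot be entered through the C-layer dispatcher, so
    there is no real entry point; id(self) still provides the unique
    identifier that the core dispatcher logic expects.
    """

    @property
    def entry_point(self):
        return id(self)


class CUDADispatcher(dispatcher.Dispatcher):
    """Sketch of the refactored dispatcher: the core Dispatcher logic is
    inherited rather than duplicated, and signature handling now lives
    in numba.cuda.decorators instead of here."""
```

With signature handling moved to the decorator, eager compilation still looks the same to users - e.g. `@cuda.jit("void(float32[:])")` compiles for that single signature - but the dispatcher itself no longer needs to know about signatures.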
Description prior to editing, kept for reference:

Fixes #5902.

To do:

- `_make_finalizer` - possibly one of: having `_Kernel` pretend to be a `CompileResult` a bit more by giving it an `entry_point`.
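That to-do appears to be the option the refactor settled on, per the description above. As a reference, a minimal sketch of the compile-result-like surface on `_Kernel` - an assumed simplification, not the actual implementation:

```python
class _Kernel:
    """Sketch of the compile-result-like properties added to kernels so
    that core dispatcher logic can treat them as overloads."""

    @property
    def objectmode(self):
        # Kernels are never compiled in object mode.
        return False

    @property
    def entry_point(self):
        # There is no C-layer entry point for a kernel; a unique dummy
        # value satisfies core dispatcher code that expects one.
        return id(self)
```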