
Add Vulkan driver API #624

Draft · wants to merge 54 commits into develop

Conversation

@dcmvdbekerom (Member) commented Oct 9, 2023

Description

This pull request is to add Vulkan driver API. This adds support for all vendors (Nvidia/AMD/Intel/etc.) on all platforms (Windows/Linux/Mac).

For current PR:

  • Add vulkan_compute_lib.py for the Vulkan backend
  • Add VkFFT for Vulkan FFTs
  • Rewrite GPU code in Vulkan format
  • Add timing back in (possibly using timers from the Vulkan API)
    • Fix timers (times are currently underreported)
  • Add Linux support
  • Recompile vkFFT for older Linux versions (hard!)
  • Add Vulkan files to manifest
  • Fix travis test issues
  • Fix memory leaks
  • Update docs
  • Fix ExoMol broadening issue
  • Make keywords accessible (deviceID, T_max, p_max, etc.)
    • device_id
    • T_max & p_max

CUDA functionality (technically not urgently needed):

  • Homogenize the Vulkan/CUDA backends to allow easy switching
    • Make iso uint32
    • Make database one big array
    • Move context into app
    • Move timer into app
    • Make class for constants
    • Move FFT into app
    • Remove dynamic resizing in favor of max T / max p (for now)
    • Remake command buffer equivalent
  • Check effect of strides on shader/kernel performance

Lower priority:

  • Convert GLSL to C++/CUDA
  • Replace CuFFT with VkFFT with CUDA backend
  • Implement fft_backend
  • Use VkFFT for compilation of GPU code
  • Get bindings from GLSL (as opposed to hardcoded in gpu.py)
  • Bind descriptor set only once (as opposed to with every pipeline)
  • Dynamic scaling / remove max T and p / rewrite command buffer on the fly
  • Add pocketfft binary for multithreaded CPU FFTs

@dcmvdbekerom marked this pull request as draft on October 9, 2023 16:36
@dcmvdbekerom (Member Author)

Vulkan is working! It is still quite a bit slower, ~100 ms (Vulkan) vs ~4 ms (CUDA) on an Nvidia card, but I'm quite confident this can be solved through a better understanding of Vulkan and how to run pipelines efficiently.

There are still lots of open points, but the basic functionality is there.

You can change the GPU that is used by changing the deviceID keyword on line 108 of radis/gpu/gpu.py (in the future, changing the device will of course be possible through keywords to SpectrumFactory).

At least for a dedicated card:

  • Nvidia RTX 2070: 4 ms

Integrated cards are still much slower:

  • Radeon Vega 8: 30 ms
  • Intel UHD 630: 80 ms

@minouHub (Collaborator)

Here is a test I tried.

from radis import SpectrumFactory, plot_diff

sf = SpectrumFactory(
    2150,
    2450,  # cm-1
    molecule="CO2",
    isotope="1,2,3",
    wstep=0.002,
)

sf.fetch_databank("hitemp")

T = 1500.0  # K
p = 1.0  # bar
x = 0.8
l = 0.2  # cm
w_slit = 0.5  # cm-1

# s_cpu = sf.eq_spectrum(
#     name="CPU",
#     Tgas=T,
#     pressure=p,
#     mole_fraction=x,
#     path_length=l,
# )
# s_cpu.apply_slit(w_slit, unit="cm-1")

s_gpu = sf.eq_spectrum_gpu(
    name="GPU",
    Tgas=T,
    pressure=p,
    mole_fraction=x,
    path_length=l,
    backend="gpu-cuda",
)
s_gpu.apply_slit(w_slit, unit="cm-1")
# plot_diff(s_cpu, s_gpu, var="emissivity", wunit="nm", method="diff")

In ``C:\Users\Nicolas Minesi\.radisdb\hitemp`` I kept only the relevant input files:
CO2-02_02125-02250_HITEMP2010.hdf5
CO2-02_02250-02500_HITEMP2010.hdf5
Selected card (deviceID=0):
[X] 0: Intel(R) UHD Graphics
[ ] 1: NVIDIA RTX A1000 Laptop GPU

Traceback (most recent call last):

  File C:\Anaconda\envs\radis-vulkan2\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File c:\users\nicolas minesi\python\examples\plot_gpu.py:49
    s_gpu = sf.eq_spectrum_gpu(

  File c:\users\nicolas minesi\python\radis\lbl\factory.py:1144 in eq_spectrum_gpu
    gpu_init(

  File c:\users\nicolas minesi\python\radis\gpu\gpu.py:214 in gpu_init
    app.schedule_shader(

  File c:\users\nicolas minesi\python\radis\gpu\vulkan\vulkan_compute_lib.py:84 in schedule_shader
    pipeline, pipelineLayout, computeShaderModule = self.createComputePipeline(

  File c:\users\nicolas minesi\python\radis\gpu\vulkan\vulkan_compute_lib.py:426 in createComputePipeline
    if len(pipelines) == 1:

TypeError: cdata of type 'struct VkPipeline_T *' has no len()

I really like the list of available GPUs in the prompt, btw. I tried with both GPUs and got the same output.
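As a side note on the traceback: the ``TypeError`` suggests a single ``VkPipeline`` cdata handle (which has no ``len()``) reached code that expected a sequence of pipelines. One defensive pattern for this, sketched below as a hypothetical helper (not the actual fix in this PR), is to normalize the value before calling ``len()``:

```python
def as_pipeline_list(obj):
    """Normalize a value that may be a single handle or a sequence of handles.

    cffi-style bindings can return a bare cdata object (no __len__) when a
    single pipeline is created; downstream code calling len() then breaks
    with exactly the TypeError seen above.
    """
    try:
        len(obj)
    except TypeError:
        return [obj]  # single handle: wrap it in a list
    return list(obj)  # already a sequence: keep as list
```

Whether this helper belongs in ``createComputePipeline`` or whether the binding call itself should be changed depends on how the Vulkan wrapper is invoked, so treat it as a sketch only.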

@dcmvdbekerom (Member Author)

It looks like the driver wasn't able to load (one of) the shaders (this is what kernels are called in Vulkan). This can happen because Vulkan uses extensions that may or may not be present on your particular device. In the future we can test for this at runtime, but only once we know what the problem is. Which extensions are present depends on both the hardware capabilities and the Vulkan version.

My suspicion is that EXT_scalar_block_layout is the culprit. To test this, we first need to check your capabilities with the Vulkan Hardware Capability Viewer.

Could you then report back the value of Properties->Core 1.0->apiVersion, and the value of Features->Core 1.2->scalarBlockLayout (the latter should be True or False)? Please check these values for both of your cards (switch with "Select device").
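For anyone reading the raw numbers: Vulkan reports ``apiVersion`` as a packed 32-bit integer (in the classic ``VK_MAKE_VERSION`` layout, ``major << 22 | minor << 12 | patch``; Vulkan 1.1+ additionally reserves the top bits for a variant field, ignored here). A small helper to decode it:

```python
def decode_vk_version(packed):
    """Decode a Vulkan packed version integer (VK_MAKE_VERSION layout):
    bits 31..22 = major, 21..12 = minor, 11..0 = patch."""
    major = (packed >> 22) & 0x3FF
    minor = (packed >> 12) & 0x3FF
    patch = packed & 0xFFF
    return major, minor, patch

print(decode_vk_version(4202496))  # → (1, 2, 0), i.e. Vulkan 1.2.0
```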

@dcmvdbekerom (Member Author)

It now also works on Linux! Unfortunately not on Travis :(

@erwanp (Member) commented Nov 4, 2023

@dcmvdbekerom aren't the tests failing just because you aren't running on a GPU-enabled Travis instance?

It can be activated here:

https://docs.travis-ci.com/user/reference/overview/#gpu-vm-instance-size

@dcmvdbekerom (Member Author) commented Nov 5, 2023

Linux has a software (=CPU) renderer called llvmpipe, which is installed automatically with the Vulkan driver package mesa-vulkan-drivers. The problem was that llvmpipe is only available for Ubuntu 20.04 and up, whereas Travis CI was running Ubuntu 16.04. After spending a full day trying to compile llvmpipe for 16.04, I found you can simply switch Travis to 20.04 🙈. A silver lining is that the compiled library vkfft_vulkan, needed for the GPU FFT, can now be used down to 16.04 as long as it is paired with a hardware GPU.
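For reference, the distribution switch is a one-line change in ``.travis.yml`` (a fragment only; the actual file sets other keys too):

```yaml
# .travis.yml (fragment)
os: linux
dist: focal   # Ubuntu 20.04, whose Mesa packages include the llvmpipe software renderer
```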

So in short... the tests are now running! They still fail for the moment, but at least they are now exercising the actual code. The first GPU test passes; the second (multiple GPU plots) still has issues, likely related to memory leaks.

After moving the CI server from Ubuntu 16.04 to 20.04, Cantera doesn't work anymore. Hopefully adding it back in will fix the issue...

@codecov-commenter commented Nov 13, 2023

Codecov Report

Merging #624 (8704926) into develop (cefd9a2) will increase coverage by 0.48%.
Report is 5 commits behind head on develop.
The diff coverage is 69.84%.


Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #624      +/-   ##
===========================================
+ Coverage    72.98%   73.46%   +0.48%     
===========================================
  Files          148      150       +2     
  Lines        21066    21464     +398     
===========================================
+ Hits         15374    15769     +395     
- Misses        5692     5695       +3     

Commit messages:

  • this is needed when a single valued dataframe is converted to float
  • It turns out the FFT output buffer needs to be zeroed before use. Also extended the test to N plots; sometimes the error didn't appear, so more plots were needed.

@dcmvdbekerom (Member Author) commented Nov 14, 2023

Tests are now passing! This means that the Vulkan driver is working on Linux and tested on the CI server.
As briefly mentioned above, in the absence of GPU hardware the Linux Vulkan drivers default to the CPU renderer llvmpipe, which Travis uses to run the GPU tests.

There was an issue with erratic output that appeared about 50% of the time. This was caused by not zeroing the buffers before executing the FFTs. Currently the buffers are zeroed during every iteration, which may be excessive; zeroing during initialization might suffice.
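The effect is easy to reproduce on the CPU: if a zero-padded FFT input buffer is reused without clearing it, stale data in the padding corrupts the transform. An illustrative NumPy sketch (not the actual Vulkan code):

```python
import numpy as np

N = 8
buf = np.empty(2 * N)      # reused padded buffer
buf[:] = 123.0             # simulate stale data left over from a previous iteration

sig = np.ones(N)

# Wrong: write the signal but leave the padding region dirty
buf[:N] = sig
dirty = np.fft.rfft(buf)

# Right: zero the whole buffer before (re)filling it
buf[:] = 0.0
buf[:N] = sig
clean = np.fft.rfft(buf)

print(np.allclose(dirty, clean))  # → False: the stale padding changed the spectrum
```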

@minouHub (Collaborator)

I tried to do a GPU computation with ExoMol. It failed because the self-broadening is not imported correctly. @dcmvdbekerom, can you add that to the to-do list? I'll take a look if you want me to.

@minouHub modified the milestones: 0.15, 0.16 (Mar 30, 2024)