[BUG] gpu_info incorrect if statement for specific driver version #22

Open
austinmw opened this issue May 2, 2022 · 2 comments

austinmw commented May 2, 2022

With my combination of NVIDIA driver and CUDA version, the if-else statement in the function gpu_info takes the wrong branch. In this setup, nvidia-smi prints the following:

Mon May  2 18:25:57 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:1E.0 Off |                    0 |
| N/A   31C    P0    23W / 300W |     11MiB / 16160MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+  

This output should therefore be handled by the if branch rather than the else branch.

@Felix-Petersen
Contributor

I introduced this logic in the update from 0.8.0 to 0.9.0 because it is typically correct on other systems (see the printout below).
For your specific case, I suggest installing version 0.8.0. Once we know the criterion that distinguishes the two-line printout from the three-line printout, we can adjust the condition accordingly. Maybe you get the two-line version because of an older GPU, or maybe the format changed with a more recent CUDA version?

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04    Driver Version: 460.27.04    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  RTX A6000           Off  | 00000000:01:00.0 Off |                  Off |
| 30%   49C    P2   109W / 300W |   6122MiB / 48685MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
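
One way to avoid depending on the version criterion is to not assume a fixed number of rows per GPU at all. The following is a minimal sketch of that idea, not the package's actual gpu_info code: the function name `parse_gpu_table`, the regular expressions, and the returned keys are all assumptions made for illustration. It reads the index from the first row of each GPU block and the memory and utilization values from whichever later row contains them, so the extra MIG row is simply ignored.

```python
import re

# First row of a GPU block, e.g. "|   0  Tesla V100-SXM2...  On   | ..."
GPU_ROW = re.compile(r"^\|\s+(\d+)\s+\S")
# Memory and utilization columns, e.g. "11MiB / 16160MiB |      0%"
STATS = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB\s*\|\s*(\d+)%")


def parse_gpu_table(text):
    """Parse nvidia-smi table output with two or three rows per GPU.

    Hypothetical helper: returns one dict per GPU found in the first table.
    """
    gpus = []
    index = None
    in_body = False
    for line in text.splitlines():
        s = line.strip()
        if not in_body:
            # The GPU rows start after the "|====...|" header separator.
            in_body = s.startswith("|===")
            continue
        if not s:
            # A blank line ends the GPU table (a process table may follow).
            break
        row = GPU_ROW.match(s)
        if row:
            index = int(row.group(1))
            continue
        stats = STATS.search(s)
        if stats and index is not None:
            used, total, util = map(int, stats.groups())
            gpus.append({
                "index": index,
                "memory_used_mib": used,
                "memory_total_mib": total,
                "utilization_percent": util,
            })
            index = None  # ignore any further rows of this block (e.g. MIG M.)
    return gpus
```

This sketch only covers the two layouts shown in this thread; a GPU that reports N/A for memory or utilization would be skipped rather than parsed.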


austinmw commented Jun 14, 2022

I think it may have changed with more recent versions. I'm using a V100 GPU. Also, I think the NVIDIA driver is currently at v512.95.
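
If the table layout does vary between versions, another option is to skip the human-readable table and use nvidia-smi's CSV query mode (`--query-gpu` with `--format=csv,noheader,nounits`). The sketch below is only an illustration under that assumption and is not how gpu_info is currently implemented: the helper names `parse_query_output` and `query_gpus` and the chosen field list are hypothetical.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; this selection is an assumption for the sketch.
QUERY_FIELDS = ["index", "name", "memory.used", "memory.total", "utilization.gpu"]


def parse_query_output(text):
    """Turn CSV text from nvidia-smi's query mode into one dict per GPU."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [
        dict(zip(QUERY_FIELDS, (cell.strip() for cell in row)))
        for row in reader
        if row  # skip empty lines
    ]


def query_gpus():
    """Run nvidia-smi in CSV query mode and parse its output (values stay strings)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_query_output(out)
```

Values are left as strings here because some fields can be reported as non-numeric on some systems; converting them is left to the caller.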
