RTX3080 / CUDA 11.0 support #34

Open
mavanmanen opened this issue Jan 13, 2021 · 10 comments · May be fixed by #146

Comments

@mavanmanen

Really wanted to try this out but sadly won't be able to because only CUDA 10.x is supported. Has anyone found a way to get this working with CUDA 11?

@simon-rob

simon-rob commented Jan 14, 2021

@mavanmanen I got it working on Ubuntu with an RTX 3090 and CUDA 11.1 using the following steps.

Assuming you have successfully compiled TensorFlow 2.x locally

In all .py files, change:

import tensorflow as tf

To:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
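
The import swap above can also be scripted rather than done by hand. The following is a minimal sketch, assuming the old import always appears on a line of its own; the helper names `rewrite_imports` and `patch_tree` are hypothetical and not part of StyleFlow.

```python
import re
from pathlib import Path

# Match the old import only when it occupies a whole line, keeping indentation.
_PATTERN = re.compile(r"^([ \t]*)import tensorflow as tf[ \t]*$", re.MULTILINE)


def rewrite_imports(source: str) -> str:
    """Replace `import tensorflow as tf` with the compat.v1 import pair."""
    def _swap(match):
        indent = match.group(1)
        return (f"{indent}import tensorflow.compat.v1 as tf\n"
                f"{indent}tf.disable_v2_behavior()")
    return _PATTERN.sub(_swap, source)


def patch_tree(root: str) -> list:
    """Rewrite every .py file under `root` in place; return the changed paths."""
    changed = []
    for path in Path(root).rglob("*.py"):
        original = path.read_text(encoding="utf-8")
        patched = rewrite_imports(original)
        if patched != original:
            path.write_text(patched, encoding="utf-8")
            changed.append(str(path))
    return sorted(changed)
```

Calling `patch_tree(".")` from the StyleFlow folder would apply the change to every .py file below it. Running it a second time changes nothing, because the rewritten import no longer matches the pattern.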

To convert the pkl files to TensorFlow 2.x, run the following from the StyleFlow folder:

sed -i -e 's/import tensorflow as tf/import tensorflow.compat.v1 as tf\ntf.disable_v2_behavior()/g' .stylegan2-cache/62233eb618af2672cd321cfb46c72c6a_http___d36zk2xti64re0.cloudfront.net_stylegan2_networks_stylegan2-ffhq-config-f.pkl

sed -i -e 's/Network architectures used in the StyleGAN2 paper./tyleGAN2 paper./g' .stylegan2-cache/62233eb618af2672cd321cfb46c72c6a_http___d36zk2xti64re0.cloudfront.net_stylegan2_networks_stylegan2-ffhq-config-f.pkl
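
The second sed command looks like a typo but appears to be deliberate. Pickle stores strings with a length prefix, so the bytes added by the longer import have to be removed somewhere else in the same embedded source string. The sketch below only checks that arithmetic; it assumes each embedded source string contains the import and the docstring exactly once, which the thread does not confirm.

```python
# Byte strings taken from the two sed commands above.
OLD_IMPORT = b"import tensorflow as tf"
NEW_IMPORT = b"import tensorflow.compat.v1 as tf\ntf.disable_v2_behavior()"
OLD_DOC = b"Network architectures used in the StyleGAN2 paper."
NEW_DOC = b"tyleGAN2 paper."


def net_length_change() -> int:
    """Bytes added by the import edit minus bytes removed by the docstring edit."""
    grown = len(NEW_IMPORT) - len(OLD_IMPORT)
    trimmed = len(OLD_DOC) - len(NEW_DOC)
    return grown - trimmed
```

A result of zero from `net_length_change()` means the two edits cancel out, so the length prefix recorded in the pickle would still match the patched string.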

Change line 133 of dnnlib/tflib/custom_ops.py from:

compile_opts += ' --compiler-options \'-fPIC -D_GLIBCXX_USE_CXX11_ABI=0

To:

compile_opts += ' --compiler-options \'-fPIC -D_GLIBCXX_USE_CXX11_ABI=1
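
Which value is correct presumably depends on how the installed TensorFlow was built, since custom ops need to be compiled with the same C++ ABI setting as the TensorFlow binary. A hedged way to check is to read the flag from `tf.sysconfig.get_compile_flags()`. The helper names below are hypothetical.

```python
def cxx11_abi_from_flags(flags):
    """Return the _GLIBCXX_USE_CXX11_ABI value found in a list of compiler flags."""
    prefix = "-D_GLIBCXX_USE_CXX11_ABI="
    for flag in flags:
        if flag.startswith(prefix):
            return int(flag[len(prefix):])
    return None  # flag not present in the list


def installed_tensorflow_abi():
    """Look up the ABI value the installed TensorFlow reports."""
    # Imported lazily so the parser above can be used without TensorFlow.
    import tensorflow as tf
    return cxx11_abi_from_flags(tf.sysconfig.get_compile_flags())
```

If `installed_tensorflow_abi()` returns 1, the edited line above matches the local build; if it returns 0, the original line is probably the one to keep.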

Comment out the following in dnnlib/tflib/tfutil.py:

#import tensorflow.contrib   # requires TensorFlow 1.x!
#tf.contrib = tensorflow.contrib


@mavanmanen
Author

@simon-rob That's great. Any idea if this will work on Windows too?

@justinjohn0306

justinjohn0306 commented Jan 17, 2021 via email

@mavanmanen
Author

mavanmanen commented Jan 20, 2021

Sadly I am now stuck with the following error:

C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu(174): error: expected an expression

C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu(174): error: no instance of constructor "tensorflow::register_op::OpDefBuilderWrapper::OpDefBuilderWrapper" matches the argument list
            argument types are: (const char [13], __nv_bool)

C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu(175): error: expected an expression

C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu(175): error: expected an expression

C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu(175): error: expected a type specifier

C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu(175): error: expected an expression

C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu(176): error: expected an expression

C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu(176): error: expected an expression

C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu(176): error: expected a type specifier

C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu(176): error: expected an expression

10 errors detected in the compilation of "C:/Users/mavan/Downloads/StyleFlow-Windows-10-master/dnnlib/tflib/ops/fused_bias_act.cu".
_pywrap_tensorflow_internal.lib
fused_bias_act.cu

@JakeHeintz

I'm getting the same error as @mavanmanen.

@dmalyavin

@simon-rob Could you, or anyone else who got this working on a 3090, post the sequence of steps you used to set up your environment? Thank you!

@frosty3907

Also interested in this.

@fungtion

fungtion commented Apr 9, 2021

@dmalyavin You can try replacing the TensorFlow StyleGAN2 with the PyTorch StyleGAN2.

@Minabsapi

(Quoting @simon-rob's instructions from Jan 14, 2021, above.)

This trick didn't work for me on CUDA 11.4:

2021-08-07 23:42:29.450490: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:From /home/minabsapi/anaconda3/envs/StyleFlow/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
----------------- Options ---------------
                batchSize: 1                             
          checkpoints_dir: ./checkpoints                 
                 dataroot: ./data/datasetX               
                  gpu_ids: 0                             
     max_result_snapshots: 30                            
                    model: xxxx                          
                     name: XXXX                          
              network_pkl: gdrive:networks/stylegan2-ffhq-config-f.pkl
            only_for_test: ...                           
                    phase: test                          
----------------- End -------------------
Loading networks from "gdrive:networks/stylegan2-ffhq-config-f.pkl"...
Downloading http://d36zk2xti64re0.cloudfront.net/stylegan2/networks/stylegan2-ffhq-config-f.pkl ... done
2021-08-07 23:46:33.377141: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-08-07 23:46:33.473857: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-07 23:46:33.474461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce GT 730 computeCapability: 3.5
coreClock: 0.9015GHz coreCount: 2 deviceMemorySize: 1,95GiB deviceMemoryBandwidth: 13,41GiB/s
2021-08-07 23:46:33.474609: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-08-07 23:46:34.353215: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-08-07 23:46:34.353418: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-08-07 23:46:34.529012: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-08-07 23:46:34.625099: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-08-07 23:46:34.784685: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-08-07 23:46:34.878989: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-08-07 23:46:34.953630: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-08-07 23:46:34.953909: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-07 23:46:34.954475: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-07 23:46:34.966209: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-08-07 23:46:34.981201: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-08-07 23:46:41.748710: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-07 23:46:41.748773: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-08-07 23:46:41.748796: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-08-07 23:46:41.799672: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-07 23:46:41.800308: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-07 23:46:41.800839: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-07 23:46:41.801291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 801 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GT 730, pci bus id: 0000:01:00.0, compute capability: 3.5)
Traceback (most recent call last):
  File "main.py", line 367, in <module>
    ex = ExWindow(opt)
  File "main.py", line 42, in __init__
    self.EX = Ex(opt)
  File "main.py", line 72, in __init__
    self.init_deep_model(opt)
  File "main.py", line 107, in init_deep_model
    self.model = Build_model(self.opt)
  File "/home/minabsapi/Téléchargements/StyleFlow/utils.py", line 202, in __init__
    _G, _D, Gs = pretrained_networks.load_networks(network_pkl)
  File "/home/minabsapi/Téléchargements/StyleFlow/pretrained_networks.py", line 76, in load_networks
    G, D, Gs = pickle.load(stream, encoding='latin1')
  File "/home/minabsapi/Téléchargements/StyleFlow/dnnlib/tflib/network.py", line 299, in __setstate__
    self._init_graph()
  File "/home/minabsapi/Téléchargements/StyleFlow/dnnlib/tflib/network.py", line 156, in _init_graph
    out_expr = self._build_func(*self.input_templates, **build_kwargs)
  File "<string>", line 451, in G_synthesis_stylegan2
AttributeError: module 'tensorflow' has no attribute 'get_variable'

GNU/Linux Ubuntu 20.04
tensorflow 2.5.0
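
The traceback ends in `File "<string>", line 451`, which suggests the failing code is the network source embedded in the pickle rather than a .py file on disk. The log also shows the pkl being downloaded just before the failure. One possible explanation, which is an assumption and not confirmed in the thread, is that the freshly downloaded cache file never received the sed patch. A minimal sketch for checking which import a pkl file contains (the helper name `pkl_import_status` is hypothetical):

```python
from pathlib import Path

V1_IMPORT = b"import tensorflow.compat.v1 as tf"
PLAIN_IMPORT = b"import tensorflow as tf"


def pkl_import_status(pkl_path):
    """Count patched and unpatched TensorFlow imports in the raw pkl bytes."""
    data = Path(pkl_path).read_bytes()
    return {
        "compat_v1": data.count(V1_IMPORT),
        "plain": data.count(PLAIN_IMPORT),
    }
```

A non-zero `plain` count for the file under .stylegan2-cache would mean the embedded source still uses the TensorFlow 2.x namespace, where `tf.get_variable` does not exist.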

@alicedingyueming


I have the same problem when using StyleGAN2 with TensorFlow 2.
