cublas runtime error : library not initialized at /home/username/torch/extra/cutorch/lib/THC/THCGeneral.c:405 #147

Open
flyingintoskyq opened this issue May 16, 2021 · 0 comments


Hi! When I apply a pre-trained model with the command DATA_ROOT=./datasets/ae_photos name=style_cezanne_pretrained model=one_direction_test phase=test loadSize=256 fineSize=256 resize_or_crop="scale_width" th test.lua, I get this error: cublas runtime error : library not initialized at /home/myuser/torch/extra/cutorch/lib/THC/THCGeneral.c:405.

The full output is below:

------------------- Options -------------------	
                DATA_ROOT: ./datasets/ae_photos	
               align_data: 0	
             aspect_ratio: 1	
                batchSize: 1	
                cache_dir: ./cache	
          checkpoints_dir: ./checkpoints	
           continue_train: 1	
                    cudnn: 1	
                  display: 1	
               display_id: 200	
                 fineSize: 256	
                     flip: 0	
                      gpu: 1	
                 how_many: all	
                 input_nc: 3	
                 loadSize: 256	
                    model: one_direction_test	
                 nThreads: 1	
                     name: style_cezanne_pretrained	
                     norm: instance	
                output_nc: 3	
                    phase: test	
           resize_or_crop: scale_width	
              results_dir: ./results/	
           serial_batches: 1	
                     test: 1	
          which_direction: AtoB	
              which_epoch: latest	
-----------------------------------------------	
GPU Mode	
{
  cudnn : 1
  results_dir : "./results/"
  resize_or_crop : "scale_width"
  name : "style_cezanne_pretrained"
  which_direction : "AtoB"
  visual_dir : "/home/flyintoskyq/Desktop/CycleGAN-master/checkpoints/style_cezanne_pretrained/visuals"
  phase : "test"
  batchSize : 1
  fineSize : 256
  continue_train : 1
  nThreads : 1
  aspect_ratio : 1
  loadSize : 256
  gpu : 1
  test : 1
  DATA_ROOT : "./datasets/ae_photos"
  align_data : 0
  which_epoch : "latest"
  model : "one_direction_test"
  cache_dir : "./cache"
  norm : "instance"
  how_many : "all"
  input_nc : 3
  display : 1
  output_nc : 3
  flip : 0
  checkpoints_dir : "./checkpoints"
  display_id : 200
  serial_batches : 1
}
DataLoader UnalignedDataLoader was created.	
Starting donkey with id: 1 seed: 8350
table: 0x401f9d88
table: 0x419bc8a0
running "find" on each class directory, and concatenate all those filenames into a single file containing all image paths for a given class
now combine all the files to a single large file
load the large concatenated list of sample paths to self.imagePath
cmd..wc -L '/tmp/lua_KFQ4nU' |cut -f1 -d' '
205 samples found......................... 0/205 .......................................]  ETA: 0ms | Step: 0ms         
Updating classList and imageClass appropriately
 [======================================== 1/1 ========================================>]  Tot: 0ms | Step: 0ms         
Cleaning up temporary files
Dataset Size A: 	205	
Starting donkey with id: 1 seed: 7589
table: 0x41cbb000
table: 0x4143a480
running "find" on each class directory, and concatenate all those filenames into a single file containing all image paths for a given class
now combine all the files to a single large file
load the large concatenated list of sample paths to self.imagePath
cmd..wc -L '/tmp/lua_TAkCVd' |cut -f1 -d' '
205 samples found......................... 0/205 .......................................]  ETA: 0ms | Step: 0ms         
Updating classList and imageClass appropriately
 [======================================== 1/1 ========================================>]  Tot: 0ms | Step: 0ms         
Cleaning up temporary files
Dataset Size B: 	205	
use InstanceNormalization	
loading previously trained model (/home/flyintoskyq/Desktop/CycleGAN-master/checkpoints/style_cezanne_pretrained/latest_net_G.t7)	
use InstanceNormalization	
---------- # Learnable Parameters --------------	
G_A = 2855811	
------------------------------------------------
processing batch 1	
pathsA	{
  1 : "40.jpg"
}
pathsB	nil	
/home/flyintoskyq/torch/install/bin/luajit: ...flyintoskyq/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 2 module of nn.Sequential:
/home/flyintoskyq/torch/install/share/lua/5.1/nn/THNN.lua:110: cublas runtime error : library not initialized at /home/flyintoskyq/torch/extra/cutorch/lib/THC/THCGeneral.c:405
stack traceback:
	[C]: in function 'v'
	/home/flyintoskyq/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'SpatialConvolutionMM_updateOutput'
	...yq/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:79: in function <...yq/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:76>
	[C]: in function 'xpcall'
	...flyintoskyq/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	...lyintoskyq/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	./models/one_direction_test_model.lua:52: in function 'Forward'
	test.lua:100: in main chunk
	[C]: in function 'dofile'
	...skyq/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	...flyintoskyq/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	...lyintoskyq/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	./models/one_direction_test_model.lua:52: in function 'Forward'
	test.lua:100: in main chunk
	[C]: in function 'dofile'
	...skyq/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

However, if I run in CPU mode instead of GPU mode, it works properly.
So, is my GPU memory insufficient? How can I solve this problem? Could you please give me some advice?
My environment: Ubuntu 16.04, NVIDIA GeForce RTX 2060 (5896 MB GPU memory), CUDA 10.1, cuDNN 7.6.4.
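
To help tell a memory shortage apart from a cuBLAS setup problem, a small standalone check along these lines could be run outside CycleGAN. This is only a sketch: it assumes the stock cutorch API (cutorch.getDeviceProperties, cutorch.getMemoryUsage, and torch.mm on a CudaTensor), and the 4x4 matrix size is an arbitrary choice for illustration.

```lua
-- Standalone check, independent of CycleGAN (a sketch, not a definitive diagnosis).
require 'cutorch'

-- Report which GPU cutorch is using and its compute capability.
local dev = cutorch.getDevice()
local props = cutorch.getDeviceProperties(dev)
print(string.format('device %d: %s (compute capability %d.%d)',
                    dev, props.name, props.major, props.minor))

-- Report free and total GPU memory in MB (getMemoryUsage returns bytes).
local free, total = cutorch.getMemoryUsage(dev)
print(string.format('free / total memory: %.0f MB / %.0f MB',
                    free / 2^20, total / 2^20))

-- A tiny matrix product on the GPU; torch.mm on CudaTensors goes through cuBLAS.
local a = torch.CudaTensor(4, 4):fill(1)
local b = torch.CudaTensor(4, 4):fill(2)
local ok, result = pcall(function() return torch.mm(a, b):sum() end)

-- ok is false and result holds the error message if the cuBLAS call fails.
print(ok, result)
```

If this tiny product fails with the same "library not initialized" message while most of the memory is reported free, the GPU memory size is probably not the cause, and the cutorch build or CUDA setup would be the next thing to look at.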

Thanks a lot! :)
