RuntimeError: (NotFound) The kernel (fused_conv2d_add_act) with key (GPU, Undefined(AnyLayout), uint8) is not found and GPU kernel cannot fallback to CPU one. (at ../paddle/fluid/framework/phi_utils.cc:140) #675

Open
CodeRic28 opened this issue Jan 27, 2024 · 1 comment

CodeRic28 commented Jan 27, 2024

I am trying to run action recognition in real time using a webcam or external camera, but I'm getting this error.

Code:


from ppvideo import PaddleVideo
import cv2
import numpy as np


clas = PaddleVideo(model_file= './inference/TSM/TSM.pdmodel',params_file = './inference/TSM/TSM.pdiparams',
                   label_name_path='./data/marshall101/annotations/classInd.txt',use_gpu=True)
cap = cv2.VideoCapture(0)
while True:
    success, frame = cap.read()

    if not success:
        break
    resized = np.array(cv2.resize(frame, (448, 448)))
    result = clas.predict(resized)
    print(result)

Error:

RuntimeError: (NotFound) The kernel (fused_conv2d_add_act) with key (GPU, Undefined(AnyLayout), uint8) is not found and GPU kernel cannot fallback to CPU one. (at ../paddle/fluid/framework/phi_utils.cc:140)
  [operator < fused_conv2d_add_act > error]

Complete information:

  warnings.warn("Setuptools is replacing distutils.")
Warning! No module named 'ppdet', [paddledet] package and it's dependencies is required for AVA.
Inference models that Paddle provides are listed as follows:

{'ppTSM', 'TSM', 'TSN'} 

Using user-specified model and params!
process params are as follows: 
Namespace(model_name='', video_file='', use_gpu=True, num_seg=8, seg_len=1, short_size=256, target_size=224, normalize=True, model_file='./inference/TSM/TSM.pdmodel', params_file='./inference/TSM/TSM.pdiparams', batch_size=1, use_fp16=False, ir_optim=True, use_tensorrt=False, gpu_mem=8000, top_k=1, enable_mkldnn=False, label_name_path='./data/marshall101/annotations/classInd.txt')
E0127 13:22:03.130740 139652 analysis_predictor.cc:1894] Allocate too much memory for the GPU memory pool, assigned 8000 MB
E0127 13:22:03.130764 139652 analysis_predictor.cc:1897] Try to shink the value by setting AnalysisConfig::EnableUseGpu(...)
--- Running analysis [ir_graph_build_pass]
I0127 13:22:03.141615 139652 executor.cc:187] Old Executor is Running.
--- Running analysis [ir_analysis_pass]
--- Running IR pass [map_op_to_another_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [constant_folding_pass]
I0127 13:22:03.204072 139652 fuse_pass_base.cc:59] ---  detected 1 subgraphs
--- Running IR pass [silu_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0127 13:22:03.240100 139652 fuse_pass_base.cc:59] ---  detected 53 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [vit_attention_fuse_pass]
--- Running IR pass [fused_multi_transformer_encoder_pass]
--- Running IR pass [fused_multi_transformer_decoder_pass]
--- Running IR pass [fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [fuse_multi_transformer_layer_pass]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
I0127 13:22:03.545256 139652 fuse_pass_base.cc:59] ---  detected 1 subgraphs
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
--- Running IR pass [matmul_scale_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
I0127 13:22:03.557291 139652 fuse_pass_base.cc:59] ---  detected 1 subgraphs
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
I0127 13:22:03.573035 139652 fuse_pass_base.cc:59] ---  detected 33 subgraphs
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
I0127 13:22:03.580469 139652 fuse_pass_base.cc:59] ---  detected 16 subgraphs
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0127 13:22:03.581303 139652 fuse_pass_base.cc:59] ---  detected 4 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [fused_conv2d_add_act_layout_transfer_pass]
--- Running IR pass [transfer_layout_elim_pass]
I0127 13:22:03.582509 139652 transfer_layout_elim_pass.cc:346] move down 0 transfer_layout
I0127 13:22:03.582515 139652 transfer_layout_elim_pass.cc:347] eliminate 0 pair of transfer_layout
--- Running IR pass [auto_mixed_precision_pass]
--- Running IR pass [identity_op_clean_pass]
I0127 13:22:03.583701 139652 fuse_pass_base.cc:59] ---  detected 1 subgraphs
--- Running IR pass [inplace_op_var_pass]
I0127 13:22:03.583911 139652 fuse_pass_base.cc:59] ---  detected 2 subgraphs
--- Running analysis [save_optimized_model_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I0127 13:22:03.584363 139652 ir_params_sync_among_devices_pass.cc:53] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I0127 13:22:03.648994 139652 memory_optimize_pass.cc:118] The persistable params in main graph are : 90.3642MB
I0127 13:22:03.649510 139652 memory_optimize_pass.cc:246] Cluster name : relu_2.tmp_0  size: 802816
I0127 13:22:03.649515 139652 memory_optimize_pass.cc:246] Cluster name : relu_6.tmp_0  size: 3211264
I0127 13:22:03.649518 139652 memory_optimize_pass.cc:246] Cluster name : relu_10.tmp_0  size: 1605632
I0127 13:22:03.649520 139652 memory_optimize_pass.cc:246] Cluster name : relu_9.tmp_0  size: 3211264
I0127 13:22:03.649523 139652 memory_optimize_pass.cc:246] Cluster name : data_batch_0  size: 4816896
--- Running analysis [ir_graph_to_program_pass]
I0127 13:22:03.660600 139652 analysis_predictor.cc:1838] ======= optimize end =======
I0127 13:22:03.661157 139652 naive_executor.cc:200] ---  skip [feed], feed -> data_batch_0
I0127 13:22:03.661641 139652 naive_executor.cc:200] ---  skip [mean_0.tmp_0], fetch -> fetch
W0127 13:22:03.996502 139652 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.0, Runtime API Version: 12.0
W0127 13:22:03.997149 139652 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.

Traceback (most recent call last):
  File "/home/[user]/projects/action_paddle/paddle_video/PaddleVideo/my_predict.py", line 38, in <module>
    result = clas.predict(resized)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/[user]/projects/action_paddle/venv/lib/python3.11/site-packages/ppvideo/tools/paddlevideo_clas.py", line 311, in predict
    self.predictor.run()
RuntimeError: (NotFound) The kernel (fused_conv2d_add_act) with key (GPU, Undefined(AnyLayout), uint8) is not found and GPU kernel cannot fallback to CPU one. (at ../paddle/fluid/framework/phi_utils.cc:140)
  [operator < fused_conv2d_add_act > error]

Error when use_gpu=False:

             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/[user]/projects/action_paddle/venv/lib/python3.11/site-packages/ppvideo/tools/paddlevideo_clas.py", line 311, in predict
    self.predictor.run()
NotImplementedError: In user code:

    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/tools/export_model.py", line 267, in <module>
      main()
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/tools/export_model.py", line 258, in main
      paddle.jit.save(
    File "<decorator-gen-383>", line 2, in save
      
    File "/usr/local/lib/python3.10/dist-packages/paddle/base/wrapped_decorator.py", line 26, in __impl__
      return wrapped_func(*args, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/api.py", line 809, in wrapper
      func(layer, path, input_spec, **configs)
    File "<decorator-gen-382>", line 2, in save
      
    File "/usr/local/lib/python3.10/dist-packages/paddle/base/wrapped_decorator.py", line 26, in __impl__
      return wrapped_func(*args, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/base.py", line 68, in __impl__
      return func(*args, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/api.py", line 1104, in save
      static_func.concrete_program_specify_input_spec(
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 986, in concrete_program_specify_input_spec
      concrete_program, _ = self.get_concrete_program(
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 875, in get_concrete_program
      concrete_program, partial_program_layer = self._program_cache[
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 1648, in __getitem__
      self._caches[item_id] = self._build_once(item)
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 1575, in _build_once
      concrete_program = ConcreteProgram.from_func_spec(
    File "<decorator-gen-378>", line 2, in from_func_spec
      
    File "/usr/local/lib/python3.10/dist-packages/paddle/base/wrapped_decorator.py", line 26, in __impl__
      return wrapped_func(*args, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/base.py", line 68, in __impl__
      return func(*args, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 1339, in from_func_spec
      outputs = static_func(*inputs)
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/base.py", line 48, in forward
      if mode == 'train':
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
      out = _run_py_ifelse(
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
      py_outs = true_fn() if pred else false_fn()
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/base.py", line 50, in forward
      elif mode == 'valid':
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
      out = _run_py_ifelse(
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
      py_outs = true_fn() if pred else false_fn()
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/base.py", line 52, in forward
      elif mode == 'test':
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
      out = _run_py_ifelse(
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
      py_outs = true_fn() if pred else false_fn()
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/base.py", line 54, in forward
      elif mode == 'infer':
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
      out = _run_py_ifelse(
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
      py_outs = true_fn() if pred else false_fn()
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/base.py", line 55, in forward
      return self.infer_step(data_batch)
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/recognizer2d.py", line 68, in infer_step
      cls_score = self.forward_net(imgs)
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/recognizer2d.py", line 30, in forward_net
      if self.backbone is not None:
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
      out = _run_py_ifelse(
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
      py_outs = true_fn() if pred else false_fn()
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/recognizer2d.py", line 31, in forward_net
      feature = self.backbone(imgs)
    File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1431, in __call__
      return self._dygraph_call_func(*inputs, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1410, in _dygraph_call_func
      outputs = self.forward(*inputs, **kwargs)
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/backbones/resnet_tsm.py", line 351, in forward
      for block in self.block_list:
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 162, in convert_while_loop
      _run_py_while(cond, body, getter, setter)
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 231, in _run_py_while
      body()
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/backbones/resnet_tsm.py", line 352, in forward
      y = block(y)
    File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1431, in __call__
      return self._dygraph_call_func(*inputs, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1410, in _dygraph_call_func
      outputs = self.forward(*inputs, **kwargs)
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/backbones/resnet_tsm.py", line 127, in forward
      if paddle.is_compiled_with_custom_device('npu'):
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
      out = _run_py_ifelse(
    File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
      py_outs = true_fn() if pred else false_fn()
    File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/backbones/resnet_tsm.py", line 156, in forward
      shifts = F.temporal_shift(inputs,
    File "/usr/local/lib/python3.10/dist-packages/paddle/nn/functional/extension.py", line 317, in temporal_shift
      helper.append_op(
    File "/usr/local/lib/python3.10/dist-packages/paddle/base/layer_helper.py", line 44, in append_op
      return self.main_program.current_block().append_op(*args, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/paddle/base/framework.py", line 4467, in append_op
      op = Operator(
    File "/usr/local/lib/python3.10/dist-packages/paddle/base/framework.py", line 3016, in __init__
      for frame in traceback.extract_stack():

    UnimplementedError: There are no kernels which are registered in the temporal_shift operator.
      [Hint: Expected kernels_iter != all_op_kernels.end(), but received kernels_iter == all_op_kernels.end().] (at ../paddle/fluid/framework/operator.cc:2268)
      [operator < temporal_shift > error]

Note: I trained the model in Google Colab and am trying to run inference in real time locally.
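
The tracebacks above show the export running under Python 3.10 in Colab (`/usr/local/lib/python3.10/dist-packages/paddle/...`) and inference running in a local Python 3.11 venv. A PaddlePaddle version or build mismatch between the two machines is one plausible cause of the missing-kernel errors, though nothing in this thread confirms it. A minimal sketch for collecting the facts worth comparing on both sides; the helper name `describe_paddle_env` is hypothetical, and it falls back gracefully when `paddle` is not installed:

```python
import importlib
import platform


def describe_paddle_env():
    """Collect version facts to compare between the export and inference machines."""
    info = {"python": platform.python_version(), "paddle": None, "cuda_build": None}
    try:
        paddle = importlib.import_module("paddle")
    except ImportError:
        return info  # paddle is not installed in this interpreter
    info["paddle"] = paddle.__version__                  # installed PaddlePaddle version
    info["cuda_build"] = paddle.is_compiled_with_cuda()  # False for a CPU-only wheel
    return info


env = describe_paddle_env()
print(env)
```

Running this in both Colab and the local venv and comparing the output is a cheap first check. `paddle.utils.run_check()` can additionally verify that the local installation is able to run on the GPU.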

Collaborator

westfish commented Feb 2, 2024

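This error indicates that your environment may lack support for the input data type or operation. Make sure the correct version of PaddlePaddle is installed in your environment and that your GPU supports the required operations.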
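
On the data type point: the kernel key in the error is `(GPU, Undefined(AnyLayout), uint8)`, and frames returned by `cv2.VideoCapture.read()` are `uint8` arrays, so the raw frame may be reaching the first convolution unconverted. The log line `Cluster name : data_batch_0  size: 4816896` matches a float32 tensor of 8 x 3 x 224 x 224 elements (`num_seg=8`, `target_size=224`). A minimal sketch of that conversion, using NumPy only; the helper name `frames_to_clip` is hypothetical, the ImageNet mean and std values are an assumption, and how (or whether) `clas.predict` accepts such an array is not verified here:

```python
import numpy as np

# Assumed normalization constants (ImageNet mean/std); not confirmed by this issue.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def frames_to_clip(frames_bgr, num_seg=8):
    """Turn num_seg uint8 BGR frames (H, W, 3) into a float32 clip (1, num_seg, 3, H, W)."""
    if len(frames_bgr) != num_seg:
        raise ValueError(f"expected {num_seg} frames, got {len(frames_bgr)}")
    clip = np.stack(frames_bgr).astype(np.float32) / 255.0  # (T, H, W, 3), values in [0, 1]
    clip = clip[..., ::-1]                                   # BGR (OpenCV order) -> RGB
    clip = (clip - MEAN) / STD                               # per-channel normalization
    clip = clip.transpose(0, 3, 1, 2)                        # (T, 3, H, W)
    return np.ascontiguousarray(clip[np.newaxis])            # (1, T, 3, H, W)


# Synthetic stand-in for eight webcam frames already resized to 224x224.
frames = [np.zeros((224, 224, 3), dtype=np.uint8) for _ in range(8)]
clip = frames_to_clip(frames)
print(clip.dtype, clip.shape, clip.nbytes)  # float32 (1, 8, 3, 224, 224) 4816896
```

In a real-time loop this would mean buffering eight resized frames before each prediction instead of passing a single 448 x 448 `uint8` frame.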

Also, if you need image or video understanding and generation, you can use our new cross-modal toolkit: https://github.com/PaddlePaddle/PaddleMIX/tree/develop
