RuntimeError: (NotFound) The kernel (fused_conv2d_add_act) with key (GPU, Undefined(AnyLayout), uint8) is not found and GPU kernel cannot fallback to CPU one. (at ../paddle/fluid/framework/phi_utils.cc:140)
#675 · Open · CodeRic28 opened this issue on Jan 27, 2024 · 1 comment
I am trying to run action recognition in real time from a webcam or external camera, but I get the error below.
Code:
from ppvideo import PaddleVideo
import cv2
import numpy as np

clas = PaddleVideo(model_file='./inference/TSM/TSM.pdmodel',
                   params_file='./inference/TSM/TSM.pdiparams',
                   label_name_path='./data/marshall101/annotations/classInd.txt',
                   use_gpu=True)

cap = cv2.VideoCapture(0)
while True:
    success, frame = cap.read()
    if not success:
        break
    resized = np.array(cv2.resize(frame, (448, 448)))
    result = clas.predict(resized)
    print(result)
Error:
RuntimeError: (NotFound) The kernel (fused_conv2d_add_act) with key (GPU, Undefined(AnyLayout), uint8) is not found and GPU kernel cannot fallback to CPU one. (at ../paddle/fluid/framework/phi_utils.cc:140)
[operator < fused_conv2d_add_act > error]
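The kernel key in the error names `uint8`, which is also the dtype of a frame returned by OpenCV. One plausible reading is therefore that the raw frame reached the predictor without any preprocessing. The sketch below shows how a rolling buffer of frames could be turned into a float32 clip, using `num_seg=8` and `target_size=224` from the parameter dump further down in this report. The mean and standard deviation are assumed ImageNet values, the helper names are hypothetical, and whether `predict` accepts such an array has not been confirmed here.

```python
from collections import deque

import numpy as np

NUM_SEG = 8        # num_seg from the parameter dump in this report
TARGET_SIZE = 224  # target_size from the same dump

# Assumed ImageNet statistics; the values used at training time may differ.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def frame_to_chw(frame_bgr):
    """Convert one HxWx3 uint8 BGR frame to a normalized 3xHxW float32 array."""
    rgb = frame_bgr[:, :, ::-1].astype(np.float32) / 255.0
    normalized = (rgb - MEAN) / STD
    return np.ascontiguousarray(normalized.transpose(2, 0, 1))


def clip_from_buffer(frames):
    """Stack NUM_SEG preprocessed frames into a (1, NUM_SEG, 3, H, W) batch."""
    if len(frames) != NUM_SEG:
        raise ValueError(f"expected {NUM_SEG} frames, got {len(frames)}")
    return np.stack(list(frames), axis=0)[np.newaxis, ...]


# Synthetic frames stand in for the output of cv2.resize(frame, (224, 224)).
buffer = deque(maxlen=NUM_SEG)
for _ in range(NUM_SEG):
    fake_frame = np.zeros((TARGET_SIZE, TARGET_SIZE, 3), dtype=np.uint8)
    buffer.append(frame_to_chw(fake_frame))

clip = clip_from_buffer(buffer)
print(clip.shape, clip.dtype)
```

In a capture loop, each resized frame would be appended to the buffer, and a clip would be built only once the buffer holds `NUM_SEG` frames.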
Complete information:
warnings.warn("Setuptools is replacing distutils.")
Warning! No module named 'ppdet', [paddledet] package and it's dependencies is required for AVA.
Inference models that Paddle provides are listed as follows:
{'ppTSM', 'TSM', 'TSN'}
Using user-specified model and params!
process params are as follows:
Namespace(model_name='', video_file='', use_gpu=True, num_seg=8, seg_len=1, short_size=256, target_size=224, normalize=True, model_file='./inference/TSM/TSM.pdmodel', params_file='./inference/TSM/TSM.pdiparams', batch_size=1, use_fp16=False, ir_optim=True, use_tensorrt=False, gpu_mem=8000, top_k=1, enable_mkldnn=False, label_name_path='./data/marshall101/annotations/classInd.txt')
E0127 13:22:03.130740 139652 analysis_predictor.cc:1894] Allocate too much memory for the GPU memory pool, assigned 8000 MB
E0127 13:22:03.130764 139652 analysis_predictor.cc:1897] Try to shink the value by setting AnalysisConfig::EnableUseGpu(...)
--- Running analysis [ir_graph_build_pass]
I0127 13:22:03.141615 139652 executor.cc:187] Old Executor is Running.
--- Running analysis [ir_analysis_pass]
--- Running IR pass [map_op_to_another_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [constant_folding_pass]
I0127 13:22:03.204072 139652 fuse_pass_base.cc:59] --- detected 1 subgraphs
--- Running IR pass [silu_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0127 13:22:03.240100 139652 fuse_pass_base.cc:59] --- detected 53 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [vit_attention_fuse_pass]
--- Running IR pass [fused_multi_transformer_encoder_pass]
--- Running IR pass [fused_multi_transformer_decoder_pass]
--- Running IR pass [fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [fuse_multi_transformer_layer_pass]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
I0127 13:22:03.545256 139652 fuse_pass_base.cc:59] --- detected 1 subgraphs
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
--- Running IR pass [matmul_scale_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
I0127 13:22:03.557291 139652 fuse_pass_base.cc:59] --- detected 1 subgraphs
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
I0127 13:22:03.573035 139652 fuse_pass_base.cc:59] --- detected 33 subgraphs
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
I0127 13:22:03.580469 139652 fuse_pass_base.cc:59] --- detected 16 subgraphs
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0127 13:22:03.581303 139652 fuse_pass_base.cc:59] --- detected 4 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [fused_conv2d_add_act_layout_transfer_pass]
--- Running IR pass [transfer_layout_elim_pass]
I0127 13:22:03.582509 139652 transfer_layout_elim_pass.cc:346] move down 0 transfer_layout
I0127 13:22:03.582515 139652 transfer_layout_elim_pass.cc:347] eliminate 0 pair of transfer_layout
--- Running IR pass [auto_mixed_precision_pass]
--- Running IR pass [identity_op_clean_pass]
I0127 13:22:03.583701 139652 fuse_pass_base.cc:59] --- detected 1 subgraphs
--- Running IR pass [inplace_op_var_pass]
I0127 13:22:03.583911 139652 fuse_pass_base.cc:59] --- detected 2 subgraphs
--- Running analysis [save_optimized_model_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I0127 13:22:03.584363 139652 ir_params_sync_among_devices_pass.cc:53] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I0127 13:22:03.648994 139652 memory_optimize_pass.cc:118] The persistable params in main graph are : 90.3642MB
I0127 13:22:03.649510 139652 memory_optimize_pass.cc:246] Cluster name : relu_2.tmp_0 size: 802816
I0127 13:22:03.649515 139652 memory_optimize_pass.cc:246] Cluster name : relu_6.tmp_0 size: 3211264
I0127 13:22:03.649518 139652 memory_optimize_pass.cc:246] Cluster name : relu_10.tmp_0 size: 1605632
I0127 13:22:03.649520 139652 memory_optimize_pass.cc:246] Cluster name : relu_9.tmp_0 size: 3211264
I0127 13:22:03.649523 139652 memory_optimize_pass.cc:246] Cluster name : data_batch_0 size: 4816896
--- Running analysis [ir_graph_to_program_pass]
I0127 13:22:03.660600 139652 analysis_predictor.cc:1838] ======= optimize end =======
I0127 13:22:03.661157 139652 naive_executor.cc:200] --- skip [feed], feed -> data_batch_0
I0127 13:22:03.661641 139652 naive_executor.cc:200] --- skip [mean_0.tmp_0], fetch -> fetch
W0127 13:22:03.996502 139652 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.0, Runtime API Version: 12.0
W0127 13:22:03.997149 139652 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
Traceback (most recent call last):
File "/home/[user]/projects/action_paddle/paddle_video/PaddleVideo/my_predict.py", line 38, in <module>
result = clas.predict(resized)
^^^^^^^^^^^^^^^^^^^^^
File "/home/[user]/projects/action_paddle/venv/lib/python3.11/site-packages/ppvideo/tools/paddlevideo_clas.py", line 311, in predict
self.predictor.run()
RuntimeError: (NotFound) The kernel (fused_conv2d_add_act) with key (GPU, Undefined(AnyLayout), uint8) is not found and GPU kernel cannot fallback to CPU one. (at ../paddle/fluid/framework/phi_utils.cc:140)
[operator < fused_conv2d_add_act > error]
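The log above reports `data_batch_0 size: 4816896`. If that figure is in bytes, it matches a float32 tensor of shape (1, 8, 3, 224, 224), which lines up with `batch_size=1`, `num_seg=8` and `target_size=224` in the parameter dump. The short check below works through that arithmetic and compares it with the single 448×448 frame the script passes in. The shape interpretation is an inference from the log, not something the log states.

```python
from math import prod

FLOAT32_BYTES = 4

# Values taken from the log: batch_size=1, num_seg=8, target_size=224.
expected_shape = (1, 8, 3, 224, 224)
expected_bytes = prod(expected_shape) * FLOAT32_BYTES

# Shape of the array the script passes to predict(): one resized BGR frame.
frame_shape = (448, 448, 3)
frame_elements = prod(frame_shape)

print(expected_bytes)   # 4816896, the same number as the data_batch_0 cluster size
print(frame_elements)   # 602112
```

Under that reading, a single uint8 frame differs from the expected input in dtype, rank and element count.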
Error when use_gpu=False:
^^^^^^^^^^^^^^^^^^^^^
File "/home/[user]/projects/action_paddle/venv/lib/python3.11/site-packages/ppvideo/tools/paddlevideo_clas.py", line 311, in predict
self.predictor.run()
NotImplementedError: In user code:
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/tools/export_model.py", line 267, in <module>
main()
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/tools/export_model.py", line 258, in main
paddle.jit.save(
File "<decorator-gen-383>", line 2, in save
File "/usr/local/lib/python3.10/dist-packages/paddle/base/wrapped_decorator.py", line 26, in __impl__
return wrapped_func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/api.py", line 809, in wrapper
func(layer, path, input_spec, **configs)
File "<decorator-gen-382>", line 2, in save
File "/usr/local/lib/python3.10/dist-packages/paddle/base/wrapped_decorator.py", line 26, in __impl__
return wrapped_func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/base.py", line 68, in __impl__
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/api.py", line 1104, in save
static_func.concrete_program_specify_input_spec(
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 986, in concrete_program_specify_input_spec
concrete_program, _ = self.get_concrete_program(
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 875, in get_concrete_program
concrete_program, partial_program_layer = self._program_cache[
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 1648, in __getitem__
self._caches[item_id] = self._build_once(item)
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 1575, in _build_once
concrete_program = ConcreteProgram.from_func_spec(
File "<decorator-gen-378>", line 2, in from_func_spec
File "/usr/local/lib/python3.10/dist-packages/paddle/base/wrapped_decorator.py", line 26, in __impl__
return wrapped_func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/base.py", line 68, in __impl__
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/program_translator.py", line 1339, in from_func_spec
outputs = static_func(*inputs)
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/base.py", line 48, in forward
if mode == 'train':
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
out = _run_py_ifelse(
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
py_outs = true_fn() if pred else false_fn()
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/base.py", line 50, in forward
elif mode == 'valid':
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
out = _run_py_ifelse(
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
py_outs = true_fn() if pred else false_fn()
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/base.py", line 52, in forward
elif mode == 'test':
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
out = _run_py_ifelse(
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
py_outs = true_fn() if pred else false_fn()
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/base.py", line 54, in forward
elif mode == 'infer':
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
out = _run_py_ifelse(
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
py_outs = true_fn() if pred else false_fn()
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/base.py", line 55, in forward
return self.infer_step(data_batch)
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/recognizer2d.py", line 68, in infer_step
cls_score = self.forward_net(imgs)
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/recognizer2d.py", line 30, in forward_net
if self.backbone is not None:
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
out = _run_py_ifelse(
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
py_outs = true_fn() if pred else false_fn()
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/framework/recognizers/recognizer2d.py", line 31, in forward_net
feature = self.backbone(imgs)
File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1431, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1410, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/backbones/resnet_tsm.py", line 351, in forward
for block in self.block_list:
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 162, in convert_while_loop
_run_py_while(cond, body, getter, setter)
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 231, in _run_py_while
body()
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/backbones/resnet_tsm.py", line 352, in forward
y = block(y)
File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1431, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1410, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/backbones/resnet_tsm.py", line 127, in forward
if paddle.is_compiled_with_custom_device('npu'):
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 398, in convert_ifelse
out = _run_py_ifelse(
File "/usr/local/lib/python3.10/dist-packages/paddle/jit/dy2static/convert_operators.py", line 487, in _run_py_ifelse
py_outs = true_fn() if pred else false_fn()
File "/content/gdrive/MyDrive/paddle_video/PaddleVideo/paddlevideo/modeling/backbones/resnet_tsm.py", line 156, in forward
shifts = F.temporal_shift(inputs,
File "/usr/local/lib/python3.10/dist-packages/paddle/nn/functional/extension.py", line 317, in temporal_shift
helper.append_op(
File "/usr/local/lib/python3.10/dist-packages/paddle/base/layer_helper.py", line 44, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/paddle/base/framework.py", line 4467, in append_op
op = Operator(
File "/usr/local/lib/python3.10/dist-packages/paddle/base/framework.py", line 3016, in __init__
for frame in traceback.extract_stack():
UnimplementedError: There are no kernels which are registered in the temporal_shift operator.
[Hint: Expected kernels_iter != all_op_kernels.end(), but received kernels_iter == all_op_kernels.end().] (at ../paddle/fluid/framework/operator.cc:2268)
[operator < temporal_shift > error]
Note: I trained the model in Google Colab and am trying to run inference in real time locally.
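The tracebacks show the export running under Python 3.10 in Colab and inference running under Python 3.11 in a local venv, so one thing worth ruling out is a difference between the Paddle builds in the two environments. A minimal sketch for collecting versions follows. It uses only the standard library, and the list of distribution names is an assumption that may need adjusting to what is actually installed.

```python
import platform
from importlib import metadata

# Distribution names assumed to be relevant; adjust to what is installed.
PACKAGES = ("paddlepaddle", "paddlepaddle-gpu", "ppvideo", "opencv-python", "numpy")


def environment_report(packages=PACKAGES):
    """Return the Python version and installed versions of selected packages."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


report = environment_report()
for name, version in report.items():
    print(f"{name}: {version}")
```

Running the same snippet in Colab and locally gives two reports that can be compared line by line.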