Describe the bug:
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
When I use NNI to prune my custom transformer, it looks fine at first, with logs like these:
[2024-03-08 11:11:12] Update indirect mask for call_function: truediv,
[2024-03-08 11:11:12] Update indirect mask for call_function: sqrt,
[2024-03-08 11:11:12] Update indirect mask for call_function: getitem_13,
[2024-03-08 11:11:12] Update indirect mask for call_function: getattr_3,
[2024-03-08 11:11:12] Update indirect mask for call_method: transpose_2, output mask: 0.0000
[2024-03-08 11:11:12] Update indirect mask for call_method: view_2, output mask: 0.0000
[2024-03-08 11:11:12] Update indirect mask for call_module: encoder_encoder_layers_0_attention_value_projection, weight: 0.0000 bias: 0.0000 , output mask: 0.0000
until it throws this error:
Traceback (most recent call last):
File "F:\研究生学习文件\研二\时序预测算法\transformer\pythonProject2\0305\4.py", line 219, in
ModelSpeedup(model, dummy_input, masks).speedup_model()
File "E:\ANACONDA\Anaconda\envs\torch\lib\site-packages\nni\compression\speedup\model_speedup.py", line 435, in speedup_model
self.update_indirect_sparsity()
File "E:\ANACONDA\Anaconda\envs\torch\lib\site-packages\nni\compression\speedup\model_speedup.py", line 306, in update_indirect_sparsity
self.node_infos[node].mask_updater.indirect_update_process(self, node)
File "E:\ANACONDA\Anaconda\envs\torch\lib\site-packages\nni\compression\speedup\mask_updater.py", line 160, in indirect_update_process
output = getattr(model_speedup, node.op)(node.target, args_cloned, kwargs_cloned)
File "E:\ANACONDA\Anaconda\envs\torch\lib\site-packages\torch\fx\interpreter.py", line 289, in call_method
return getattr(self_obj, target)(*args_tail, **kwargs)
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
I get this error when using NNI to prune a transformer model I defined myself. I tried both L1NormPruner and MovementPruner, and followed NNI's official transformer pruning example (without the knowledge distillation used in that example), but every attempt fails with the error above during the speedup step. I cannot tell whether my NNI configuration is wrong or whether my custom transformer does not meet NNI's requirements, so I am asking for help.
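For context, this PyTorch error is not specific to NNI: `requires_grad` can only be toggled on leaf tensors (tensors created directly rather than produced by an autograd operation). A minimal, self-contained sketch of the failure mode and of the `detach()` workaround that the error message itself suggests:

```python
import torch

x = torch.randn(3, requires_grad=True)   # leaf tensor: its requires_grad flag may be changed
y = x * 2                                 # non-leaf tensor: produced by an autograd operation

try:
    y.requires_grad_(False)               # raises the same RuntimeError as in the traceback above
except RuntimeError as e:
    print(e)

y_no_grad = y.detach()                    # the workaround suggested by the error message
print(y_no_grad.requires_grad)            # False: safe to use in a subgraph without differentiation
```

In the traceback the failing call happens while NNI's speedup replays a call_method node of the traced model with cloned arguments, so the tensor whose flag is being changed at that point is presumably such a non-leaf intermediate.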
Environment:
Reproduce the problem
Model pruning:
import torch

from nni.compression.pruning import MovementPruner
from nni.compression.speedup import ModelSpeedup
from nni.compression.utils.external.external_replacer import TransformersAttentionReplacer

# model, evaluator and device are defined earlier in the script
print(model)

# prune the Linear layers of the first encoder attention block with 4x4 block granularity
config_list = [{
    'op_types': ['Linear'],
    'op_names_re': ['encoder.encoder_layers.0.attention.*'],
    'sparse_threshold': 0.1,
    'granularity': [4, 4]
}]
pruner = MovementPruner(model, config_list, evaluator, warmup_step=10, cooldown_begin_step=20, regular_scale=20)
pruner.compress(40, 4)
print(model)

# collect the masks and remove the pruning wrappers before speedup
pruner.unwrap_model()
masks = pruner.get_masks()

dummy_input = (torch.randint(0, 1, (32, 16, 1)).to(device).float(), torch.randint(0, 1, (32, 16, 1)).to(device).float())
replacer = TransformersAttentionReplacer(model)
ModelSpeedup(model, dummy_input, masks).speedup_model()
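A quick way to confirm that non-trivial masks were actually produced before speedup (a debugging sketch, not part of the original script; it assumes the usual layout of `pruner.get_masks()`, i.e. module name -> parameter name -> 0/1 mask tensor, so adjust if your NNI version returns a different structure):

```python
# debugging sketch: inspect mask sparsity before calling ModelSpeedup
for module_name, target_masks in masks.items():
    for target_name, mask in target_masks.items():
        total = mask.numel()
        kept = int(mask.sum().item())
        print(f"{module_name}.{target_name}: {kept}/{total} kept "
              f"({1 - kept / total:.2%} pruned)")
```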
print(model) shows the following structure:

CustomTransformer(
  (embedding): Linear(in_features=1, out_features=64, bias=True)
  (positional_encoding): PositionalEncoding(
    (dropout): Dropout(p=0, inplace=False)
  )
  (encoder): Encoder(
    (encoder_layers): ModuleList(
      (0): Encoderlayer(
        (attention): AttentionLayer(
          (inner_attention): FullAttention(
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (query_projection): Linear(in_features=64, out_features=64, bias=True)
          (key_projection): Linear(in_features=64, out_features=64, bias=True)
          (value_projection): Linear(in_features=64, out_features=64, bias=True)
          (out_projection): Linear(in_features=64, out_features=64, bias=True)
        )
        (norm1): LayerNorm((64,), eps=1e-05, elementwise_affine=True)
        (norm2): LayerNorm((64,), eps=1e-05, elementwise_affine=True)
        (linear): Linear(in_features=64, out_features=64, bias=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
    (linear_layers): ModuleList(
      (0): Linear(in_features=64, out_features=64, bias=True)
    )
  )
  (decoder): Decoder(
    (decoder_layers): ModuleList(
      (0): Decoderlayer(
        (self_attention): AttentionLayer(
          (inner_attention): FullAttention(
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (query_projection): Linear(in_features=64, out_features=64, bias=True)
          (key_projection): Linear(in_features=64, out_features=64, bias=True)
          (value_projection): Linear(in_features=64, out_features=64, bias=True)
          (out_projection): Linear(in_features=64, out_features=64, bias=True)
        )
        (cross_attention): AttentionLayer(
          (inner_attention): FullAttention(
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (query_projection): Linear(in_features=64, out_features=64, bias=True)
          (key_projection): Linear(in_features=64, out_features=64, bias=True)
          (value_projection): Linear(in_features=64, out_features=64, bias=True)
          (out_projection): Linear(in_features=64, out_features=64, bias=True)
        )
        (norm1): LayerNorm((64,), eps=1e-05, elementwise_affine=True)
        (norm2): LayerNorm((64,), eps=1e-05, elementwise_affine=True)
        (norm3): LayerNorm((64,), eps=1e-05, elementwise_affine=True)
        (linear1): Linear(in_features=64, out_features=256, bias=True)
        (linear2): Linear(in_features=256, out_features=64, bias=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
    (norm): LayerNorm((64,), eps=1e-05, elementwise_affine=True)
  )
  (fc_in): Linear(in_features=64, out_features=64, bias=True)
  (relu): ReLU()
  (dropout): Dropout(p=0.1, inplace=False)
  (fc_out): Linear(in_features=64, out_features=1, bias=True)
)
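Since the traceback ends in `torch/fx/interpreter.py`, the speedup pass is replaying a torch.fx graph of this model. A rough way to check whether the custom transformer itself is awkward to trace, independent of any pruner settings, is a plain symbolic trace; this is only an approximation, because `ModelSpeedup` uses its own concrete tracer, so success or failure here is indicative rather than conclusive:

```python
import torch.fx

# sanity check: can the custom transformer be turned into an fx graph at all?
try:
    traced = torch.fx.symbolic_trace(model)
    print(traced.graph)   # the call_module / call_method / call_function nodes speedup walks over
except Exception as exc:  # data-dependent control flow or shape-dependent view()s often fail here
    print(f"symbolic_trace failed: {exc}")
```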