Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoder中不使用mask?而且自注意力计算中的mask计算方式是不是有误? #10

Open
wjx-git opened this issue Dec 3, 2019 · 1 comment

Comments

@wjx-git
Copy link

wjx-git commented Dec 3, 2019

在 bert_model.py 中第92行,
encoder_class = Encoder(self.d_model, self.d_k, self.d_v, self.sequence_length, self.h, self.batch_size,
self.num_layer, self.input_representation, self.input_representation,
dropout_keep_prob=self.dropout_keep_prob,
use_residual_conn=self.use_residual_conn)
参数mask为何没有赋值,意思是默认不用掩模?但编码器中掩模操作是必须的吧。

在 multi_head_attention.py中第82行,
mask = tf.expand_dims(self.mask, axis=-1) # [batch,sequence_length,1]
mask = tf.expand_dims(mask, axis=1) # [batch,1,sequence_length,1]
dot_product = dot_product + mask # [batch,h,sequence_length,1]

掩模操作怎么会是直接相加呢?

@brightmart
Copy link
Owner

能否提交上来一个正确的呢?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants