
bert processor #143

Open
xuezzz opened this issue May 18, 2020 · 4 comments
Labels
question Further information is requested

Comments

@xuezzz

xuezzz commented May 18, 2020

When I use the bert processor to transform my dataset, the following warning appears:

Token indices sequence length is longer than the specified maximum sequence length for this model (694 > 512). Running this sequence through the model will result in indexing errors.

But my dataset doesn't contain sequences that long!
It also leads to an error during training. Do you know how to solve it? Thanks!
Matchzoo version 1.1.1
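
One thing worth noting: the warning counts WordPiece subword tokens, not whitespace-separated words, so a text that looks well under 512 words can still exceed 512 tokens after tokenization. A minimal sketch for checking the actual token lengths, assuming the Hugging Face `transformers` BertTokenizer (or the equivalent tokenizer MatchZoo wraps); `texts` is a hypothetical stand-in for the dataset's text column:

```python
# Hedged sketch: count subword tokens per text to find over-length inputs.
# Assumes the Hugging Face `transformers` BertTokenizer; `texts` is a
# placeholder for the dataset's text column.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

texts = ["an example document from the dataset ..."]

for i, text in enumerate(texts):
    # WordPiece may split one word into several subword tokens, so this
    # count is usually larger than the whitespace word count.
    n_tokens = len(tokenizer.tokenize(text))
    if n_tokens > 512:
        print(f"text {i}: {n_tokens} tokens > 512 (triggers the warning)")
```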

@RoshanGurung93

BERT has a maximum sequence length of 512 tokens, and I think your input exceeds it. Check the token lengths. A quick solution: if you have an 800-token paragraph, split it into two 400-token paragraphs and then tokenize each.
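
A minimal sketch of both fixes (truncation and chunking), assuming the Hugging Face `transformers` BertTokenizer rather than MatchZoo's own preprocessor; the 510-token chunk size leaves room for the [CLS] and [SEP] special tokens:

```python
# Hedged sketch: keep inputs within BERT's 512-token limit, either by
# truncating or by splitting into chunks. Assumes the Hugging Face
# `transformers` BertTokenizer; `text` is a placeholder document.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
text = "a very long document ..."

# Option 1: let the tokenizer truncate to the model maximum.
ids = tokenizer.encode(text, max_length=512, truncation=True)

# Option 2: split the token sequence into <=510-token chunks so that,
# e.g., an 800-token paragraph becomes two ~400-token inputs.
tokens = tokenizer.tokenize(text)
chunk_size = 510  # leave room for [CLS] and [SEP]
chunks = [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]
chunk_ids = [
    tokenizer.build_inputs_with_special_tokens(
        tokenizer.convert_tokens_to_ids(chunk)
    )
    for chunk in chunks
]
```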

@xumingying0612

Hi, I'd like to ask: when I train a model with bert, the final trainer.run() call keeps raising "Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.cuda.IntTensor instead (while checking arguments for embedding)". I haven't been able to figure out what to change. Could you help me fix it? Thank you very much!

@Chriskuei
Member

> Hi, I'd like to ask: when I train a model with bert, the final trainer.run() call keeps raising "Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.cuda.IntTensor instead (while checking arguments for embedding)". I haven't been able to figure out what to change. Could you help me fix it? Thank you very much!

Please provide more details, e.g. code snippets
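
For reference, this error means the index tensor reaching `nn.Embedding` is `int32` (`torch.cuda.IntTensor`), while PyTorch's embedding lookup requires `int64` (`Long`). A minimal sketch of the usual fix, casting the indices before the lookup; the vocabulary size and token ids below are hypothetical:

```python
# Hedged sketch: reproduce and fix the IntTensor/Long mismatch raised by
# nn.Embedding. Vocabulary size and token ids are placeholders; .cuda()
# matches the torch.cuda.IntTensor in the error message.
import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=30522, embedding_dim=768).cuda()

# Indices built from int32 data (e.g. a NumPy int32 array) become IntTensor.
token_ids = torch.tensor([[101, 2023, 2003, 102]], dtype=torch.int32).cuda()

# embedding(token_ids) would raise:
#   Expected tensor for argument #1 'indices' to have scalar type Long;
#   but got torch.cuda.IntTensor instead (while checking arguments for embedding)
out = embedding(token_ids.long())  # cast indices to int64 before the lookup
```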
