Translation outputs differ with different batch sizes #2208

Open
anderleich opened this issue Sep 22, 2022 · 5 comments

@anderleich
Contributor

Hi,

I've recently noticed that I get slightly different translation results when translating with different batch sizes. I guess this is not expected...

For example,

Batch size 100 --> PRED SCORE: -6.0490, PRED PPL: 423.68 NB SENTENCES: 5000
Batch size 150 --> PRED SCORE: -5.2232, PRED PPL: 185.54 NB SENTENCES: 5000

I'm using the latest version of OpenNMT-py

Thanks
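
A quick sanity check on the figures above: the two reported PRED PPL values are consistent with PPL = exp(-PRED SCORE). This relation is inferred only from the numbers quoted in this thread, not from the OpenNMT-py source, and the helper name `ppl_from_score` is hypothetical. A minimal sketch:

```python
import math


def ppl_from_score(pred_score: float) -> float:
    """Perplexity implied by an average log-probability score.

    Assumption: PRED PPL = exp(-PRED SCORE), which matches the two
    data points quoted in this issue up to rounding.
    """
    return math.exp(-pred_score)


# Values copied from the report above: (batch size, PRED SCORE, PRED PPL)
reported = [(100, -6.0490, 423.68), (150, -5.2232, 185.54)]

for batch_size, score, ppl in reported:
    implied = ppl_from_score(score)
    print(f"batch size {batch_size}: reported {ppl}, implied {implied:.2f}")
```

If the relation holds, the two numbers on each line carry the same information, so the score gap alone already shows how far apart the two runs are.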

@vince62s
Member

Are you using a length penalty?
I think this is expected with the new exit condition, but is the actual translation text different?

@anderleich
Contributor Author

anderleich commented Sep 22, 2022

Hi @vince62s ,

I'm not using a length penalty.
Yes, the actual translated text is different. There are slight differences in BLEU scores too.
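
To quantify how many sentences actually change between two runs, the prediction files can be compared line by line. The sketch below assumes one translation per line and the same input order in both files; the function names and the file names in the usage comment are hypothetical, not OpenNMT-py output conventions.

```python
from pathlib import Path
from typing import List


def read_predictions(path: str) -> List[str]:
    """Read a prediction file, one translation per line."""
    return Path(path).read_text(encoding="utf-8").splitlines()


def differing_lines(preds_a: List[str], preds_b: List[str]) -> List[int]:
    """Return 0-based indices of sentences whose translations differ."""
    if len(preds_a) != len(preds_b):
        raise ValueError(
            f"line counts differ: {len(preds_a)} vs {len(preds_b)}"
        )
    return [
        i
        for i, (a, b) in enumerate(zip(preds_a, preds_b))
        if a.strip() != b.strip()  # ignore trailing whitespace only
    ]


# Usage (hypothetical file names, one per batch size):
#   a = read_predictions("pred.bs100.txt")
#   b = read_predictions("pred.bs150.txt")
#   changed = differing_lines(a, b)
#   print(f"{len(changed)} of {len(a)} sentences differ")
```

Listing the differing indices makes it easier to check whether the changes cluster in particular sentences, for example the longest ones in each batch.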

@vince62s
Member

If you want to be sure, change the exit condition here:
https://github.com/OpenNMT/OpenNMT-py/blob/master/onmt/translate/beam_search.py#L192
Replace self.beam_size with self.n_best.
In theory, with the new condition, you should get slightly better scores.

@anderleich
Contributor Author

What I don't really understand is why I get different translations depending on the batch_size. Does this mean that a given sample is influenced by the other samples in its batch? I don't think this should be the expected behaviour, since comparing systems fairly would then depend on using the same batch_size.

@robertBrnnn
Contributor

@anderleich I noticed this some time ago; see this comment: #2039 (comment)
