About computing the evaluation metrics #56

Open
huanghonggit opened this issue Feb 23, 2021 · 4 comments
Comments

@huanghonggit

I have already generated STC_result.txt and would like to produce the values shown in the README's evaluation table: PPL | BLEU-2 | BLEU-4 | Dist-1 | Dist-2 | Greedy Matching | Embedding Average. How did you compute them?

@lemon234071
Member

lemon234071 commented Feb 23, 2021

Please take a look at the earlier issues. How these metrics are computed was asked before, in #53.
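For readers who want a starting point before opening that issue, here is a minimal sketch of one of the listed metrics, Dist-n, under its common definition: the number of unique n-grams divided by the total number of n-grams across all generated responses. This is an assumption for illustration, not necessarily the script behind the README table. The function name `distinct_n` and the sample tokens are hypothetical, and the responses are assumed to be tokenized already.

```python
from collections import Counter


def distinct_n(responses, n):
    """Corpus-level Dist-n: unique n-grams / total n-grams.

    `responses` is a list of token lists (tokenization is assumed
    to have been done beforehand; the scheme is not specified here).
    """
    ngrams = Counter()
    for tokens in responses:
        # Slide a window of size n over each response.
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    # Guard against empty input or responses shorter than n.
    return len(ngrams) / total if total else 0.0


# Hypothetical tokenized responses, for illustration only.
responses = [["a", "b", "c"], ["a", "b", "d"]]
dist1 = distinct_n(responses, 1)  # 4 unique unigrams out of 6
dist2 = distinct_n(responses, 2)  # 3 unique bigrams out of 4
```

Other definitions exist (for example, averaging per-response ratios), so results can differ from the reported numbers if the definition or tokenization differs.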

@Ultraman-Orb

Ultraman-Orb commented Mar 21, 2021

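Hello, which hyperparameters did you use in infer.py when computing the evaluation metrics, and in particular what was max_history set to? In my evaluation I set max_history to 30, top_p to 0, and temperature to 1, which gives a BLEU-2 of a bit over 34 and a BLEU-4 of a bit over 17. So I would like to know what max_history you used, and to confirm the other parameters as well, to see whether the parameters are the cause. The test set I used is stc_test.json.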

@lemon234071
Member

These are reported in the paper. The STC data we used contains only pair-level data, so max_history does not affect the results. We used top-p 0.9 and temperature 0.7.
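To make the two decoding settings concrete, here is a minimal sketch of temperature scaling followed by top-p (nucleus) filtering over a single step's logits. It is written in plain Python for illustration and is not the code in infer.py. The names `top_p_filter` and `sample_token` and the example logits are hypothetical, and the sketch assumes 0 < top_p <= 1.

```python
import math
import random


def top_p_filter(logits, temperature=0.7, top_p=0.9):
    """Return {token_id: prob} for the smallest set of tokens whose
    cumulative probability reaches top_p, renormalized to sum to 1."""
    # Temperature < 1 sharpens the distribution, > 1 flattens it.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]

    # Keep the most likely tokens until their mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.7, top_p=0.9, rng=random):
    """Sample one token id from the filtered distribution."""
    dist = top_p_filter(logits, temperature, top_p)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]


# Hypothetical logits for a 4-token vocabulary.
dist = top_p_filter([2.0, 1.0, 0.0, -1.0], temperature=0.7, top_p=0.9)
token = sample_token([2.0, 1.0, 0.0, -1.0])
```

Because sampling is random, BLEU and Dist scores can vary between runs unless the random seed is fixed.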

@zhao1402072392

Do the results from your evaluation code match the paper's numbers now? Could you share the code? Mine keeps running into problems.
