How are the three files ccksminutf.txt, pkubase-mention2ent.txt, and pkubase-paraphrase.txt generated? #53

Open
haozheng61 opened this issue Dec 4, 2020 · 2 comments

Comments

@haozheng61

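I have been using the very small subgraph dataset that you provide in the GitHub repository for gAnswer Chinese question answering, specifically the three files under the data\pkubase\paraphrase folder: ccksminutf.txt, pkubase-mention2ent.txt, and pkubase-paraphrase.txt.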

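Question 1:

What is the ccksminutf.txt file used for, and how is it generated?

Question 2:

I know that pkubase-mention2ent.txt and pkubase-paraphrase.txt are the entity linking file and the predicate linking file, respectively. How are these two files generated?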

@haozheng61
Author

Could someone please help answer this?

@nicklin96
Collaborator

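Hello!
The contents of ccksminutf.txt are obtained by running word segmentation on the CCKS question set and then counting word frequencies.
pkubase-mention2ent.txt ships with the pkubase dataset itself, but it also contains manually added entries.
The contents of pkubase-paraphrase.txt are written by hand.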
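
The "segment, then count word frequencies" step described for ccksminutf.txt could be sketched roughly as below. This is only an illustration under stated assumptions: the thread does not say which segmenter was used or what the file's exact line format is, so the function names (`word_frequencies`, `format_frequencies`), the placeholder whitespace tokenizer, and the tab-separated `word<TAB>count` output are all hypothetical.

```python
from collections import Counter
from typing import Callable, Iterable, List


def whitespace_tokenize(text: str) -> List[str]:
    """Placeholder tokenizer (assumption): split on whitespace.

    Chinese text needs a real word segmenter instead; any callable
    that maps a string to a list of tokens can be plugged in.
    """
    return text.split()


def word_frequencies(
    questions: Iterable[str],
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
) -> Counter:
    """Segment every question and count how often each token occurs."""
    counts: Counter = Counter()
    for question in questions:
        counts.update(tokenize(question))
    return counts


def format_frequencies(counts: Counter) -> str:
    """Render counts as 'token<TAB>count' lines (hypothetical format),
    sorted by descending count, then alphabetically by token."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{token}\t{n}" for token, n in ordered)


if __name__ == "__main__":
    # Toy input standing in for the CCKS question set.
    sample = ["who wrote this book", "who directed this film"]
    print(format_frequencies(word_frequencies(sample)))
```

For Chinese questions, a segmenter such as `jieba.lcut` from the third-party jieba package could be passed as the `tokenize` argument. Whether the maintainers used that tool is not stated in this thread.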
