
Question about wiki qa dataset #129

Open
RenShuhuai-Andy opened this issue Jan 17, 2020 · 0 comments
Labels
question Further information is requested

Comments


RenShuhuai-Andy commented Jan 17, 2020

I did some analysis on the WikiQA dataset:

  • training set:
    Left num: 2118; Right num: 18841; Relation num: 20360; positive example (with label 1) num: 1040 (5.1%)
  • dev set:
    Left num: 296; Right num: 2708; Relation num: 2733; positive example num: 140 (5.12%)
  • test set:
    Left num: 633; Right num: 5961; Relation num: 6165; positive example num: 293 (4.75%)

I wonder if this is the official way to combine questions and answers, because the proportion of positive examples in all three sets is only about 5%. That means a model that always outputs 0 would achieve roughly 95% accuracy, and the best performance of BERT on this dataset is just 95%. Isn't the ratio of positive to negative examples too imbalanced?
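To make the imbalance concrete, here is a minimal sketch that recomputes the positive-example proportions from the counts quoted above and the accuracy of a trivial all-zero baseline (the split names and dictionary layout are just for illustration):

```python
# Counts taken from the analysis in this issue:
# "Relation num" = number of question-answer pairs per split.
splits = {
    "train": {"pairs": 20360, "positive": 1040},
    "dev":   {"pairs": 2733,  "positive": 140},
    "test":  {"pairs": 6165,  "positive": 293},
}

for name, s in splits.items():
    pos_ratio = s["positive"] / s["pairs"]
    # A model that always predicts 0 is correct on every negative pair,
    # so its accuracy equals the proportion of negative examples.
    all_zero_accuracy = 1 - pos_ratio
    print(f"{name}: positive={pos_ratio:.2%}, all-zero accuracy={all_zero_accuracy:.2%}")
```

This gives an all-zero accuracy of about 95% on every split, which is the comparison point for the reported BERT numbers.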

@RenShuhuai-Andy RenShuhuai-Andy added the question Further information is requested label Jan 17, 2020