First, I want to thank you for this great resource!
As a way to further improve it, I wanted to point out some cases that I observed in the dataset.
Some questions mark only part of a token (often a single character of a word) as the answer.
I find this odd, especially because the word that contains the answer is often out of context.
Some examples from train-v2.json:
Question ID ||| Question ||| Answer ||| Word that contains the answer
56cf609aaab44d1400b89187 ||| At what age did Chopin start playing publicly? ||| 7 ||| 1817
56ce750daab44d1400b887b4 ||| In how many colors is the current iPod Touch available? ||| 5 ||| 2015
56d0f47a17492d1400aab69d ||| How many total CDs has Kanye West released in his career so far? ||| 7 ||| 2007
56d1042317492d1400aab72f ||| How many times was The College Dropout's release put off? ||| 3 ||| 2003
Here the annotators took a single digit out of a date and marked it as the answer.
I gave only 4 examples, but there are many more.
I can give you the entire list, or help you fix it, if you are interested in correcting this issue!
Moreover, I believe this raises the question of whether out-of-context answers should be accepted at all.
If I ask "How many total CDs has Kanye West released in his career so far?"
and the paragraph says "Kanye has 7 cars", I believe marking 7 as correct is wrong, even though the strings match exactly.
I was wondering, how do you currently address these issues?
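For reference, here is a minimal sketch of how such cases could be listed. It is not necessarily how the examples above were found. It assumes the usual SQuAD JSON layout (`data` → `paragraphs` → `context` / `qas` → `answers` with `text` and `answer_start`), and it treats any run of alphanumeric characters as one token, which is a simplification. The function names and the toy dataset are illustrative only.

```python
def is_partial_token(context, start, text):
    """Return True if the answer span starts or ends inside an alphanumeric token."""
    end = start + len(text)
    if not text or context[start:end] != text:
        return False  # skip empty or misaligned spans
    # The span is cut on the left/right if an alphanumeric character touches it.
    cut_left = start > 0 and context[start - 1].isalnum() and text[0].isalnum()
    cut_right = end < len(context) and context[end].isalnum() and text[-1].isalnum()
    return cut_left or cut_right


def containing_token(context, start, end):
    """Expand the span [start, end) to the full alphanumeric token around it."""
    while start > 0 and context[start - 1].isalnum():
        start -= 1
    while end < len(context) and context[end].isalnum():
        end += 1
    return context[start:end]


def find_partial_token_answers(dataset):
    """Collect (question id, answer text, containing word) for partial-token answers."""
    hits = []
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for answer in qa.get("answers", []):
                    start, text = answer["answer_start"], answer["text"]
                    if is_partial_token(context, start, text):
                        word = containing_token(context, start, start + len(text))
                        hits.append((qa["id"], text, word))
    return hits


# Toy data in the assumed layout (made up for illustration, not taken from the dataset).
toy_context = "The album was released in 2007."
toy_dataset = {"data": [{"paragraphs": [{
    "context": toy_context,
    "qas": [
        # Answer "7" sits inside the token "2007": should be reported.
        {"id": "toy-1", "question": "How many CDs?",
         "answers": [{"text": "7", "answer_start": toy_context.index("2007") + 3}]},
        # Answer "2007" is a whole token: should not be reported.
        {"id": "toy-2", "question": "Which year?",
         "answers": [{"text": "2007", "answer_start": toy_context.index("2007")}]},
    ],
}]}]}

print(find_partial_token_answers(toy_dataset))
```

To run it on the real file, load it with `json.load` and pass the resulting dictionary to `find_partial_token_answers`. Note that this only catches answers cut out of a larger token; it cannot detect a whole-token answer that is out of context.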
Thank you very much,