TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

  • This repo contains code for the paper:

Mandar Joshi, Eunsol Choi, Daniel Weld, and Luke Zettlemoyer. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. In Association for Computational Linguistics (ACL) 2017, Vancouver, Canada.

Requirements

General

  • Python 3. The evaluation scripts should also run under Python 2.7 if you take care of unicode handling in utils/utils.py.
  • BiDAF requires Python 3 -- check the original repository for more details.

Python Packages

  • tensorflow (only if you want to run BiDAF, verified on r0.11)
  • nltk
  • tqdm

Evaluation

The --dataset_file argument refers to files in the qa directory of the data (e.g., wikipedia-dev.json). For the file format, check out the samples directory in this repo.

python3 -m evaluation.triviaqa_evaluation --dataset_file samples/triviaqa_sample.json --prediction_file samples/sample_predictions.json
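
For reference, below is a minimal sketch of writing a prediction file. It assumes, following samples/sample_predictions.json, that predictions are a single JSON object mapping each question ID to the predicted answer string; check the sample file for the exact key format.

```python
import json

# Hypothetical predictions keyed by question ID, mirroring the layout of
# samples/sample_predictions.json (the exact key format may differ -- check the sample).
predictions = {
    "tc_1": "Sunset Boulevard",
    "tc_2": "1989",
}

# Write the predictions as JSON for evaluation.triviaqa_evaluation.
with open("my_predictions.json", "w") as f:
    json.dump(predictions, f)
```

You can then pass my_predictions.json to the command above via --prediction_file.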

Miscellaneous

  • If you have a SQuAD model and want to run it on TriviaQA, see utils/convert_to_squad_format.py, which converts TriviaQA data into the SQuAD format.
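
The converter's command-line flags are not documented here; assuming it uses argparse, running python3 -m utils.convert_to_squad_format --help will list them. For orientation, the sketch below shows the general shape of the SQuAD v1.1 JSON that such a conversion targets; the exact fields the script emits may differ.

```python
import json

# Minimal sketch of the SQuAD v1.1 JSON layout that a TriviaQA -> SQuAD
# conversion targets. All values below are placeholders, not real data.
squad_style = {
    "version": "1.1",
    "data": [
        {
            "title": "Some_Wikipedia_Page",
            "paragraphs": [
                {
                    "context": "Evidence text containing the answer ...",
                    "qas": [
                        {
                            "id": "some_question_id",
                            "question": "A trivia question?",
                            "answers": [
                                # answer_start is the character offset of the answer in the context
                                {"text": "the answer", "answer_start": 0},
                            ],
                        }
                    ],
                }
            ],
        }
    ],
}

with open("triviaqa_in_squad_format.json", "w") as f:
    json.dump(squad_style, f)
```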