
PyEvaluator

A library that helps you evaluate the performance of systems such as text annotators by calculating evaluation metrics like precision, recall and F1-score.

Imagine that you are developing a system to automatically detect mentions of animals in text. To test this system, you compare the terms annotated by your system with the terms that should have been annotated (the gold standard). Assume that you are doing this for just one document and that your gold standard is:

gold_standard = ['dolphin', 'parrot', 'spider', 'gorilla', 'cats']

And that your system annotated the following terms:

test_annotations = ['parrot', 'banana', 'gorilla', 'basket']

Then you can test your system this way:

from __future__ import print_function

from pyEvaluator.Evaluator import Evaluator

# Arguments should be sets
ev = Evaluator(gold_terms=set(gold_standard), pred_terms=set(test_annotations))

print("Precision: {}".format(ev.precision()))
print("Recall: {}".format(ev.recall()))
print("F1-Score: {}".format(ev.f1_score()))
print()
print("True Positives: {}".format(ev.true_positives()))
print("False Positives: {}".format(ev.false_positives()))
print("False Negatives: {}".format(ev.false_negatives()))

The output will be:

Precision: 0.5
Recall: 0.4
F1-Score: 0.444444444444

True Positives: set(['gorilla', 'parrot'])
False Positives: set(['basket', 'banana'])
False Negatives: set(['cats', 'dolphin', 'spider'])
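
For reference, these values follow the standard set-based definitions of the metrics. The snippet below is a minimal standalone sketch (it does not use pyEvaluator and is not its implementation) that reproduces the same numbers with plain Python sets:

gold = set(['dolphin', 'parrot', 'spider', 'gorilla', 'cats'])
pred = set(['parrot', 'banana', 'gorilla', 'basket'])

true_positives = gold & pred    # annotated terms that are in the gold standard
false_positives = pred - gold   # annotated terms that are not in the gold standard
false_negatives = gold - pred   # gold-standard terms the system missed

precision = float(len(true_positives)) / len(pred)        # 2 / 4 = 0.5
recall = float(len(true_positives)) / len(gold)           # 2 / 5 = 0.4
f1_score = 2 * precision * recall / (precision + recall)  # ~0.444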

Installation

pip install git+https://github.com/LLCampos/pyEvaluator
