The “Celebrity” corpus consists of 150 news articles annotated with three semantic relations from the biographical domain. The corpus is provided in two formats: a CoNLL-like format (plain-text files with tab-separated values) and an XML-based format. Files in the XML-based format can be loaded with https://github.com/DFKI-NLP/recon.

DFKI-NLP/celebrity-corpus

Celebrity Corpus

The “Celebrity” corpus consists of 150 news articles annotated with three semantic relations from the biographical domain. The corpus is provided in two formats: a CoNLL-like format (plain-text files with tab-separated values) and an XML-based format. Files in the XML-based format can be loaded with the Recon tool (https://github.com/DFKI-NLP/recon).
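As a rough illustration of working with the CoNLL-like files, the sketch below reads tab-separated rows and groups them into sentences. It makes two assumptions that this README does not confirm: that there is one token per line, and that blank lines separate sentences (the usual CoNLL convention). The function name `read_conll_like` is hypothetical, and the meaning of each column is not interpreted here; check the corpus files for the actual column layout.

```python
import io
from typing import Iterator, List, TextIO


def read_conll_like(handle: TextIO) -> Iterator[List[List[str]]]:
    """Yield sentences as lists of rows; each row is a list of column strings.

    Assumption: one token per line, columns separated by tabs, and
    sentences separated by blank lines. Columns are returned uninterpreted.
    """
    sentence: List[List[str]] = []
    for raw in handle:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # A blank line closes the current sentence, if there is one.
            if sentence:
                yield sentence
                sentence = []
            continue
        sentence.append(line.split("\t"))
    # Emit the last sentence when the file does not end with a blank line.
    if sentence:
        yield sentence


# Made-up two-column sample, only to show the shape of the result.
sample = io.StringIO("a\tX\nb\tY\n\nc\tZ\n")
for rows in read_conll_like(sample):
    print(len(rows), rows)
```

For a real file, the same function can be called on a handle opened with `open(path, encoding="utf-8")`, where `path` points to one of the corpus's plain-text files.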

Use

The DFKI Celebrity Corpus is released under the CC BY-NC 4.0 license. If you use this data, please cite the accompanying paper:

Hong Li, Sebastian Krause, Feiyu Xu, and Hans Uszkoreit. Annotating Relation Mentions in Tabloid Press. Proceedings of LREC, 2014.
