TeDDi

This is the repository for the Text Data Diversity Sample (TeDDi Sample), a part of the Swiss National Science Foundation funded project: Non-randomness in Morphological Diversity: A Computational Approach Based on Multilingual Corpora.

This repository contains the corpus data and the code that processes and analyzes it. It is currently a work in progress.

If you use TeDDi, please cite as:

Steven Moran, Christian Bentz, Ximena Gutierrez-Vasques, Olga Pelloni, and Tanja Samardzic. 2022. TeDDi Sample: Text Data Diversity Sample for Language Comparison and Multilingual NLP. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1150–1158, Marseille, France. European Language Resources Association. Online: https://aclanthology.org/2022.lrec-1.123/

To contribute code or data to the repository, please first refer to our guidelines on contributing.

Different data formats are available for direct download.

Main Contributors (alphabetical order):

Bentz, Christian
Gutierrez-Vasques, Ximena
Moran, Steven
Samardžić, Tanja
Sozinova, Olga

Language-specific contributors (alphabetical order):

Kalessa, Jule (Paiwan)
Mächler, Alina
Rood, David S. (Wichita)
Roth, Rainer (Wari')

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
