[BENCHMARK DATASET REQUEST] dutch-cola #419

Open
1 of 8 tasks
BramVanroy opened this issue Apr 24, 2024 · 0 comments · May be fixed by #421
Labels
benchmark dataset request: Request to add a new benchmark dataset

Comments

@BramVanroy
Contributor

Dataset name

GroNLP/dutch-cola

Dataset link

https://huggingface.co/datasets/GroNLP/dutch-cola

Dataset languages

  • [ ] Danish
  • [ ] Swedish
  • [ ] Norwegian (Bokmål or Nynorsk)
  • [ ] Icelandic
  • [ ] Faroese
  • [ ] German
  • [x] Dutch
  • [ ] English

Describe the dataset

Dutch CoLA is a corpus of linguistic acceptability for Dutch: a dataset consisting of sentences in Dutch, each marked as either acceptable (class 1) or unacceptable (class 0). These sentences are collected from existing descriptions of Dutch grammar (see sources below) with expert-annotated acceptability labels.
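For anyone who wants to inspect the label balance before adding the dataset, here is a minimal sketch. It assumes the `datasets` library is installed and that the Hub repository loads without a configuration name. The helper names `label_distribution` and `load_dutch_cola` are hypothetical, and the column that holds the 0/1 labels is not stated in this issue, so check the dataset card before relying on any column name.

```python
from collections import Counter

# Label scheme as described in this issue: 1 = acceptable, 0 = unacceptable.
LABEL_NAMES = {0: "unacceptable", 1: "acceptable"}


def label_distribution(labels):
    """Count how many sentences fall into each acceptability class."""
    counts = Counter(labels)
    return {name: counts.get(label, 0) for label, name in LABEL_NAMES.items()}


def load_dutch_cola():
    """Load the dataset from the Hugging Face Hub (requires network access).

    Assumption: the repository loads without a configuration name.
    """
    from datasets import load_dataset  # imported lazily; only needed here

    return load_dataset("GroNLP/dutch-cola")
```

After calling `load_dutch_cola()`, pass the values of the label column of a split to `label_distribution` to see how many acceptable and unacceptable sentences it contains.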

I might add it through a PR when I find the time.

@BramVanroy added the "benchmark dataset request" label on Apr 24, 2024
@BramVanroy linked a pull request on Apr 24, 2024 that will close this issue