Code for Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation
Updated May 25, 2024 - Python
Corpus processing library
Bitextor generates translation memories from multilingual websites
A sentence splitting (sentence boundary disambiguation) library for Go. It is rule-based and works out-of-the-box.
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
Solves basic Russian NLP tasks, API for lower level Natasha projects
Several benchmarks on sentence splitting and language identification
Document preprocessing scripts for the Nature of EU Rules project
A flexible sentence segmentation library using CRF model and regex rules
Sentence segmenter for legal texts
NLP tools: word segmentation, sentence segmentation, new-word discovery
A sentence segmentation library with wide language support optimized for speed and utility.
Underthesea - Vietnamese NLP Toolkit
AIHub Korean data preprocessing: Korean sentence segmentation