Implement Vietnamese tokenizer for Meilisearch #174

Open
kimyvgy opened this issue Dec 26, 2022 · 2 comments

Comments

kimyvgy commented Dec 26, 2022

Hello. It seems Meilisearch doesn’t have a tokenizer for Vietnamese, does it? I would like to implement one.

anle-ct commented May 4, 2023

I'm waiting for the Vietnamese tokenizer :D

ManyTheFish (Member) commented

Hello @kimyvgy and @anle-ct,
If you know of any Rust library that could improve Vietnamese language support, I'd be interested in your feedback on it.

This repository is very open to contributions. For instance, another contributor is currently implementing a Khmer segmenter, so don't hesitate to do the same for your own language; I'd be pleased to help you with your work! Below is the link to the contributing file, where you can find tutorials on implementing a specialized normalizer or segmenter:
https://github.com/meilisearch/charabia/blob/main/CONTRIBUTING.md
