This repository has been archived by the owner on Mar 19, 2024. It is now read-only.

the number of words in a one-million-token corpus is only 15173? #1355

Open
MengfeiShen opened this issue Dec 12, 2023 · 0 comments

Comments

@MengfeiShen

When I use the fasttext.train_unsupervised function to learn word vectors, the training output reports that the number of words is 15173.
However, my training texts contain more than one million tokens.
I don't understand why the two numbers differ so much.
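
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the reported "number of words" is likely the vocabulary size (distinct words), not the token count, and `train_unsupervised` drops words that occur fewer than `minCount` times (default 5). The helper below, `vocab_stats`, is a hypothetical name and uses plain whitespace splitting, which only approximates fastText's own tokenization (for example, its handling of line ends).

```python
from collections import Counter


def vocab_stats(text, min_count=5):
    """Return (token count, distinct words, distinct words with count >= min_count)."""
    # Whitespace splitting is an assumption; fastText tokenizes slightly differently.
    counts = Counter(text.split())
    n_tokens = sum(counts.values())  # every occurrence counts as a token
    n_types = len(counts)            # each distinct word counts once
    n_kept = sum(1 for c in counts.values() if c >= min_count)
    return n_tokens, n_types, n_kept


# Toy corpus: five frequent words repeated, plus five words seen only once.
sample = "the cat sat on the mat " * 5 + "a rare word appears once"
tokens, types, kept = vocab_stats(sample, min_count=5)
print(tokens, types, kept)  # 35 10 5
```

If this is the cause, running the same count over the real training file should give a figure close to 15173 for the last value. Passing `minCount=1` to `fasttext.train_unsupervised` should then keep every distinct word in the vocabulary.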
