
Replicating Results #1336

Open
MHDBST opened this issue Jun 7, 2023 · 3 comments

Comments

@MHDBST

MHDBST commented Jun 7, 2023

I'm working on a classification task with the fastText library, and I'm trying to replicate the same results over different runs. I have set the following parameters, with the seed set to 40, but different runs produce different accuracies on the dev set. The difference is significant: in one run the accuracy is 90%, while in another it is 75%. I'm not sure whether this is because I'm running on CPU with the multi-thread functionality, or whether there is some other way to replicate the results. Any guidance on this?

fasttext.train_supervised(input=train_path, minCount=3, wordNgrams=4, minn=1, maxn=6, lr=0.001, dim=300, epoch=50, seed=40)
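For reference, a likely cause is fastText's multi-threaded, Hogwild-style asynchronous SGD: with more than one thread, updates race and runs diverge even with a fixed seed. A minimal sketch of a repeatable run (assumes fastText >= 0.9.2, where both seed and thread are accepted; train_path is the same file as above):

```python
import fasttext

# With thread > 1, fastText trains with asynchronous (Hogwild-style) SGD,
# so runs differ even with a fixed seed. Restricting training to a single
# thread should make supervised training repeatable.
model = fasttext.train_supervised(
    input=train_path,
    minCount=3, wordNgrams=4, minn=1, maxn=6,
    lr=0.001, dim=300, epoch=50,
    seed=40,
    thread=1,  # deterministic, but slower than the default multi-threaded run
)
```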

@SDAravind

Yes, I have the same issue. Are you using autotune with the validation file parameter?

FYI - there is no seed parameter in fastText.

@MHDBST
Author

MHDBST commented Jul 18, 2023

@SDAravind maybe it's not mentioned on the wiki page for some reason, but this parameter is defined:

'seed', 'autotuneValidationFile', 'autotuneMetric',

@SDAravind

SDAravind commented Aug 2, 2023

@MHDBST - Setting the seed parameter resulted in an error for me.

As an alternative approach, I would use fastText's get_sentence_vector method for text vectorisation, together with scikit-learn's MLPClassifier or any other estimator, for consistent results (set random_state to a value of your choice).
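A minimal sketch of this approach (the file name and the texts/labels variables are hypothetical; assumes the fasttext and scikit-learn Python packages):

```python
import fasttext
import numpy as np
from sklearn.neural_network import MLPClassifier

# Train (or load) a fastText model to use as a sentence encoder.
# "unlabeled.txt" is a hypothetical corpus; a supervised model works too.
ft = fasttext.train_unsupervised("unlabeled.txt", dim=300)

# texts / labels are hypothetical lists of documents and class labels.
# get_sentence_vector rejects strings containing "\n", so strip newlines.
X = np.array([ft.get_sentence_vector(t.replace("\n", " ")) for t in texts])

# Fixing random_state makes the classifier step reproducible; note the
# fastText embeddings above are still stochastic unless the embedding
# model is trained with thread=1 or loaded from a fixed pretrained file.
clf = MLPClassifier(random_state=40, max_iter=500)
clf.fit(X, labels)
```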
