Use of trainer/runner and number of training epochs #141

Open
littlewine opened this issue May 7, 2020 · 1 comment
Labels
question Further information is requested

Comments

@littlewine

Hi, I have a question regarding choosing the epochs and doing hyperparameter tuning in general.

I am currently using matchzoo.trainers.trainer to train my models with the default number of epochs (10).

Does training always end at epoch 10, or does the trainer keep checkpoints and then restore the model from the epoch where the validation result was best? This is not clear to me from the documentation, and it is confusing that matchzoo and matchzoo-py have different tutorials and documentation.

Apart from that, my question is:

  • If training always stops on the 10th epoch, how can I make it stop and restore the model that achieves the best result on a validation metric? Ideally, I would like to do this with checkpoints, rather than using matchzoo.auto.tuner.tuner and re-training the model over and over, or some other hacky solution. I assume something is already in place for this.

  • If the trainer does restore the checkpoint with the highest score once the 10 epochs have finished running, which metric is used to determine the highest score? Is it just the first metric in the list of task.metrics?

Thank you for your help!

@littlewine added the "question" (Further information is requested) label on May 7, 2020
@faneshion
Member

@littlewine Have you resolved this issue? In fact, the epoch at which checkpoints are saved can be set in advance.
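
The behaviour asked about in this thread (stop when validation stops improving, then restore the epoch with the best validation score) can be sketched without MatchZoo. The block below is a minimal, library-agnostic sketch under stated assumptions, not MatchZoo's implementation. The names train_with_best_checkpoint, toy_train, toy_validate, SCORES, and the metric name "map" are all hypothetical and introduced only for illustration, and the "model" is a plain dict rather than real weights.

```python
import copy
from typing import Any, Callable, Dict, Optional, Tuple


def train_with_best_checkpoint(
    state: Dict[str, Any],
    train_one_epoch: Callable[[Dict[str, Any], int], None],
    validate: Callable[[Dict[str, Any]], Dict[str, float]],
    metric_key: str,
    epochs: int = 10,
    patience: Optional[int] = None,
) -> Tuple[Dict[str, Any], int, float]:
    """Train for up to `epochs` and return (best_state, best_epoch, best_score).

    `metric_key` names the validation metric that decides what "best" means
    (higher is assumed to be better). If `patience` is set, training stops
    after that many consecutive epochs without improvement.
    """
    best_state = copy.deepcopy(state)
    best_epoch = 0
    best_score = float("-inf")
    epochs_without_improvement = 0

    for epoch in range(1, epochs + 1):
        train_one_epoch(state, epoch)
        score = validate(state)[metric_key]

        if score > best_score:
            # New best: snapshot the state (a stand-in for saving a checkpoint).
            best_state = copy.deepcopy(state)
            best_epoch, best_score = epoch, score
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if patience is not None and epochs_without_improvement >= patience:
                break  # early stopping

    return best_state, best_epoch, best_score


# Hypothetical per-epoch validation scores, made up for this demonstration.
SCORES = [0.50, 0.62, 0.71, 0.69, 0.68, 0.70, 0.66, 0.65, 0.64, 0.63]


def toy_train(state: Dict[str, Any], epoch: int) -> None:
    # Stand-in for a real training step: only records the epoch number.
    state["epoch"] = epoch


def toy_validate(state: Dict[str, Any]) -> Dict[str, float]:
    # Stand-in for real evaluation: looks up the made-up score for this epoch.
    return {"map": SCORES[state["epoch"] - 1]}


live_state = {"epoch": 0}
best_state, best_epoch, best_score = train_with_best_checkpoint(
    live_state, toy_train, toy_validate, metric_key="map", epochs=10, patience=3
)
print(best_epoch, best_score, live_state["epoch"])
```

In a real setup the deep copy would typically be replaced by writing the weights to disk and reloading the best file after training. Passing metric_key explicitly also removes any ambiguity about which metric selects the best epoch, which is the second point raised in the question.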
