
Custom callbacks for metrics, saving checkpoints #2575

Open
Garfounkel opened this issue Mar 22, 2024 · 3 comments


Garfounkel commented Mar 22, 2024

I need to compute custom metrics during training. I first thought it would be as easy as adding my own metric function to some callback, but I couldn't find anything like that in the docs or in existing issues. I would be fine with just having a callback when a new checkpoint is saved, or when the validation step runs.

Workaround

My current workaround is to run a second process that constantly watches the model checkpoint directory for new checkpoints. Whenever a new one appears, it runs my metric computation (a rough sketch of this watcher is below the list).

It works, but it's really not ideal for multiple reasons:

  • The evaluation process is not synced with the training process, so if the training process is killed, the evaluation process may keep running. That means extra work to manage the evaluation process's lifetime as well.
  • Because the evaluation process runs while training steps are in progress, it cannot use the same GPU, so it either needs its own dedicated GPU or has to run on CPU. It would be much better to run it on the same GPU, in between training steps; that would also be faster because the model wouldn't have to be reloaded into memory every time.
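For context, here is a minimal sketch of that watcher. The checkpoint directory, the `*.pt` pattern, and `compute_my_metrics` are assumptions for illustration, not OpenNMT-py code:

```python
import glob
import os
import time

def watch_checkpoints(ckpt_dir, poll_seconds=60):
    """Poll ckpt_dir forever and yield each new checkpoint path exactly once."""
    seen = set()
    while True:
        for path in sorted(glob.glob(os.path.join(ckpt_dir, "*.pt"))):
            if path not in seen:
                seen.add(path)
                yield path
        time.sleep(poll_seconds)

def compute_my_metrics(checkpoint_path):
    # Placeholder for the user-defined evaluation: load the checkpoint,
    # score the custom test set, and push the results where needed.
    print(f"evaluating {checkpoint_path}")

for ckpt in watch_checkpoints("models/"):
    compute_my_metrics(ckpt)
```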

Question

How can I add a callback during the validation step or after a new checkpoint is saved? If there is no out-of-the-box solution at the moment, I would be happy to submit a PR if you can give me some pointers on what would need to change.

@vince62s
Member

Look at this: https://github.com/OpenNMT/OpenNMT-py/blob/master/onmt/utils/scoring_utils.py
and the way it is instantiated and called from Trainer.py.
This is what we use to compute the BLEU score at validation time.
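In rough terms, the pattern being pointed at looks like the sketch below. The real interfaces live in onmt/utils/scoring_utils.py and onmt/trainer.py; every name here other than ScoringPreparator is simplified and illustrative, not the actual OpenNMT-py API:

```python
# Illustrative sketch only -- not the real OpenNMT-py signatures.

class MyScorer:
    """Computes a custom corpus-level metric from predictions and references."""
    def compute_score(self, preds, refs):
        # Replace with the real metric (BLEU in the built-in case).
        return sum(p == r for p, r in zip(preds, refs)) / max(len(refs), 1)

def validate_with_metrics(model, valid_iter, preparator, scorers):
    # `preparator` stands in for a ScoringPreparator-like object that runs the
    # model over the validation data and returns (predictions, references).
    preds, refs = preparator.translate(model, valid_iter)
    return {name: scorer.compute_score(preds, refs)
            for name, scorer in scorers.items()}
```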

@Garfounkel
Author

Got it, thanks. This gives me a good idea of how I could hack my own custom evaluation mechanism directly into the ScoringPreparator. But just to be clear, you're suggesting that because there's no callback system at the moment?

Thinking out loud, it seems that ScoringPreparator is indeed the right place for me to inject my logic, but it feels very hacky. I don't want to compute my metric on the valid set; I have my own test set that I use for this metric. I also need to communicate with a web service to push scores and predictions for further analysis.

Basically my point is: sure, I could inject all of that into ScoringPreparator, but what makes more sense to me is a custom callback that I could declare in the config file, similar to how custom transforms are implemented.
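To make that concrete, here is a purely hypothetical sketch of what such an API could look like, mirroring how custom transforms are registered. Nothing like this exists in OpenNMT-py today; all names are invented:

```python
# Hypothetical callback API, analogous to the custom-transform registry.
AVAILABLE_CALLBACKS = {}

def register_callback(name):
    """Class decorator that makes a callback selectable by name from the config."""
    def wrapper(cls):
        AVAILABLE_CALLBACKS[name] = cls
        return cls
    return wrapper

class TrainingCallback:
    """Hooks the trainer would invoke at well-defined points."""
    def on_validation_end(self, step, model, stats):
        pass
    def on_checkpoint_saved(self, step, checkpoint_path):
        pass

@register_callback("my_test_set_metric")
class MyTestSetMetric(TrainingCallback):
    def on_checkpoint_saved(self, step, checkpoint_path):
        # Run my own test set through the freshly saved model and push the
        # scores/predictions to an external web service.
        ...
```

A user would then enable it from the YAML config with something like `callbacks: [my_test_set_metric]` (again, a hypothetical option).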

Does that make sense? Do you think there's a need for such a feature?

@vince62s
Member

I am not sure I understand what exactly you are looking for. You mentioned earlier that you want to compute a metric based on a "saved checkpoint", meaning it has to happen whenever you save a checkpoint, but compute your metric on what data?
If it is not based on a checkpoint but must happen during training, then you can look at where we report the stats (ppl, accuracy, ...).
Again, I am not clear on what you are asking.
