Releases: joeynmt/joeynmt

v2.3

25 Jan 21:19

v2.2

15 Jan 19:16
c1226c7

v2.1

18 Sep 19:17
32eef89
  • upgrade to Python 3.10, torch 1.12
  • switch automatic mixed precision (AMP) from NVIDIA's amp to PyTorch's built-in amp package
  • replace discord.py with pycord in the Discord bot demo
  • data iterator refactoring (#189, #190, #191)
  • migrate to PyTorch's torch.testing.assert_close for tensor comparisons in unit tests (see the sketch after this list)
  • add WMT14 en-de / de-en benchmarks trained from scratch with Joey NMT v2
  • bug fixes
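
A minimal sketch of the torch.testing.assert_close usage mentioned above, with made-up tensors rather than an actual Joey NMT unit test:

```python
import unittest

import torch


class TestTensorComparison(unittest.TestCase):
    def test_outputs_close(self):
        # illustrative values only; real tests would compare model outputs
        expected = torch.tensor([0.1, 0.2, 0.3])
        actual = expected + 1e-6
        # assert_close checks shape, dtype, and values within rtol/atol tolerances
        torch.testing.assert_close(actual, expected, rtol=1e-4, atol=1e-4)


if __name__ == "__main__":
    unittest.main()
```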

Joey NMT 2.0

02 Jun 17:28
e0cfa7d

Breaking changes:

  • upgrade to Python 3.9, torch 1.11
  • torchtext.legacy dependencies are completely replaced by torch.utils.data
  • joeynmt/tokenizers.py: handles tokenization internally (also supports BPE-dropout!)
  • joeynmt/datasets.py: loads data from plaintext, TSV, and Hugging Face's datasets library
  • scripts/build_vocab.py: trains subword models and creates a joint vocabulary
  • enhancements in decoding:
    • scoring with hypotheses or references
    • repetition penalty, n-gram blocker (a generic sketch follows this list)
    • attention plots for Transformer models
  • yapf, isort, and flake8 introduced for formatting and linting
  • bug fixes, minor refactoring
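
The repetition penalty and n-gram blocker are standard decoding tricks that rescale or mask next-token logits. Below is a generic sketch of the idea, not Joey NMT's actual implementation; all function and variable names are illustrative:

```python
import torch


def penalize_repetition(logits: torch.Tensor, generated: torch.Tensor,
                        penalty: float = 1.2) -> torch.Tensor:
    """Down-weight tokens that already occur in the generated prefix.

    logits: (vocab_size,) next-token scores for one hypothesis
    generated: (seq_len,) token ids produced so far
    """
    logits = logits.clone()
    prev = generated.unique()
    scores = logits[prev]
    # divide positive scores, multiply negative ones, so repeats always lose mass
    logits[prev] = torch.where(scores > 0, scores / penalty, scores * penalty)
    return logits


def block_repeated_ngrams(logits: torch.Tensor, generated: list[int],
                          n: int = 3) -> torch.Tensor:
    """Forbid any token that would complete an n-gram already present in the prefix."""
    logits = logits.clone()
    if len(generated) < n - 1:
        return logits
    prefix = tuple(generated[-(n - 1):])
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            logits[generated[i + n - 1]] = float("-inf")
    return logits
```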

Requirements update

18 Jan 02:25

six >= 1.12

Beam search & checkpointing improvements, dependency update

18 Jan 02:17
  • upgrade to sacrebleu 2.0, Python 3.7, torch 1.8
  • bug fixes:
    • heaps in checkpoint maintenance #153 (see the sketch after this list)
    • beam search stopping criterion #149
    • removing final BPE merge markers in hypotheses (dsfsi/masakhane-web#33)
    • keeping best and last checkpoints #136
    • using UTF-8 encoding when opening files #150
  • f-string formatting
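
The heap-based checkpoint bookkeeping referenced in #153 boils down to keeping the k best checkpoints and deleting the worst one whenever a new checkpoint is saved. A generic sketch of that pattern (illustrative only, not the exact Joey NMT code):

```python
import heapq
import os


def update_best_ckpts(best: list, score: float, path: str, keep: int = 5) -> list:
    """Maintain a min-heap of (score, path) so best[0] is the worst kept checkpoint.

    Assumes higher score = better (e.g. BLEU); for metrics where lower is better
    (e.g. perplexity), push the negated score instead.
    """
    heapq.heappush(best, (score, path))
    if len(best) > keep:
        _, worst_path = heapq.heappop(best)
        if os.path.exists(worst_path):
            os.remove(worst_path)
    return best
```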

n-best decoding, checkpointing, dependency updates

14 Apr 03:40

You can now retrieve the n-best outputs during inference (rather than only the single best translation; a rough sketch follows below) and track the latest checkpoint (for continuing training). We also added a Colab notebook for training a small translation model on the Tatoeba task. Joey NMT now runs on torch v1.8.0 and uses the deprecated torchtext dataset implementations from torchtext v0.9.
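
What n-best retrieval means here, in a minimal sketch: instead of returning only the highest-scoring hypothesis, the finished beam candidates are ranked and the top n are kept. Names are illustrative, not Joey NMT's API:

```python
def n_best_hypotheses(finished, n=5):
    """finished: list of (score, token_ids) pairs for completed beam hypotheses."""
    ranked = sorted(finished, key=lambda h: h[0], reverse=True)
    return ranked[:n]
```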

1.0

31 Oct 12:54
7da62dc

Additions:

  • Multi-GPU processing
  • Data loading improvements
  • Tokenizer integration
  • Japanese benchmarks

Pre-release v0.9

28 Jul 10:59
e55b615

Stable recurrent and Transformer models. Minor changes and refactoring might happen before v1.0.