dl4nlp-rg/search-with-dense-vectors

Search with dense vectors

Final project for the course on deep learning for NLP (IA376E/1s2020 @ Unicamp). This is an implementation of a two-tower model for document retrieval (and passage ranking) on the MS MARCO dataset. The project also uses queries generated with the doc2query algorithm. It is implemented with PyTorch and PyTorch Lightning, deep learning frameworks for Python.
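A two-tower model encodes queries and documents separately into dense vectors and ranks documents by how similar their vectors are to the query vector. The snippet below is a minimal sketch of that ranking step only, written in plain Python. The toy vectors stand in for tower outputs, and the function name rank_documents and the choice of cosine similarity are assumptions for illustration, not taken from this repository.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return the top_k (doc_id, score) pairs by cosine similarity.

    Hypothetical helper: the real model may score with a plain dot
    product or another similarity function.
    """
    q = normalize(query_vec)
    scored = [(doc_id, dot(q, normalize(vec))) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 2-dimensional vectors standing in for the outputs of the two towers.
doc_vecs = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 1.0]}
query_vec = [2.0, 0.1]

ranking = rank_documents(query_vec, doc_vecs, top_k=2)
print(ranking)
```

Because document vectors do not depend on the query, they can be computed once and reused for every search, which is the main practical appeal of this design.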

Docs (Portuguese)

The final article and the work plan are in docs/.

Usage

The model can be imported in Python or run as a script.

Training

Example of training using the model as a module:

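from pytorch_lightning import Trainer

from src.model import TwoTower

# model_args and trainer_args are dicts of keyword arguments that you
# define for your setup; their contents are not specified here.
model = TwoTower(**model_args)

trainer = Trainer(**trainer_args)
trainer.fit(model)  # runs the training loop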

Example of training using the train script:

   python -m src.train --gpus 1 --batch_size 32
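For readers unfamiliar with command-line training scripts, the following is a hypothetical sketch of how flags such as --gpus and --batch_size could be parsed with the standard argparse module. It is not the actual code of src.train. The function name build_parser, the integer types, and the default values are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the two flags shown in the command above.

    Hypothetical: types and defaults are assumptions, not repository code.
    """
    parser = argparse.ArgumentParser(description="Train the two-tower model")
    parser.add_argument("--gpus", type=int, default=0,
                        help="number of GPUs to train on (assumed default: 0)")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="examples per training batch (assumed default: 32)")
    return parser


# Parse the same flags used in the example command.
args = build_parser().parse_args(["--gpus", "1", "--batch_size", "32"])
print(args.gpus, args.batch_size)
```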

Colab notebooks showing the usage are available in notebooks/train.ipynb and notebooks/example.ipynb.
