Neural-Machine-Translation_Transliteration

An Intelligent Approach for Translation / Transliteration using Neural Networks

This translation approach is based on Recurrent Neural Networks (RNNs), a type of neural network suited to sequential input such as video, sound or, as in our case, text.

For the data, I used the bible-corpus. Download the corresponding raw XML files and place them in the directory (data/bible-corpus/raw/), then extract the text from these files; the Jupyter Notebook (word-character embedding/XMLparser.ipynb) can help with this task. Save the results in the directory (data/bible-corpus/pre-processed/), and finally run the script (createEmbeddings.sh) to generate the embeddings in the directory (data/bible-corpus/processed/).
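The text-extraction step can be sketched roughly as follows. This is not the code from XMLparser.ipynb: the function name `extract_verses` is hypothetical, and the sketch assumes that each verse sits in its own `seg` element, which should be checked against the downloaded files (the tag name is a parameter for that reason).

```python
import xml.etree.ElementTree as ET


def extract_verses(xml_text, tag="seg"):
    """Return the non-empty text of every <tag> element, in document order.

    `tag="seg"` is an assumption about the corpus layout; pass another
    tag name if the raw files use a different element for verses.
    """
    root = ET.fromstring(xml_text)
    verses = []
    for element in root.iter(tag):
        # itertext() also collects text nested inside child elements;
        # split()/join() collapses line breaks and repeated spaces.
        text = " ".join("".join(element.itertext()).split())
        if text:
            verses.append(text)
    return verses


# Tiny hand-written sample, not taken from the corpus.
SAMPLE = """<doc><body>
  <seg id="v1">  In the beginning  </seg>
  <seg id="v2"></seg>
  <seg id="v3">And there was
     light</seg>
</body></doc>"""

print(extract_verses(SAMPLE))
```

For a real file, the same function could be fed the file contents read from data/bible-corpus/raw/, and the returned lines written one per line into data/bible-corpus/pre-processed/.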

I used fastText for the embeddings.

The script (word-character embedding/getEmbedding.py) reads a word or a character from the user and checks whether its embedding is already saved in the SQLite database (word-character embedding/embeddingDB.db). If it is not, the script computes the embedding using fastText. This works even when the word does not appear in the training corpus: in that case, fastText generates the closest embedding based on the word's characters.
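The lookup-then-compute behaviour described above might look like the sketch below. It is an illustration, not the contents of getEmbedding.py or db.py: the table name `embeddings`, its columns, and the helpers `open_cache` and `get_embedding` are all hypothetical, and `toy_compute` is a stand-in for the fastText call (with the `fasttext` Python package, a loaded model's `get_word_vector(word)` would be the natural candidate).

```python
import json
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the cache database. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings ("
        "token TEXT PRIMARY KEY, vector TEXT NOT NULL)"
    )
    return conn


def get_embedding(conn, token, compute):
    """Return the cached vector for `token`; compute and store it on a miss."""
    row = conn.execute(
        "SELECT vector FROM embeddings WHERE token = ?", (token,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])  # cache hit: no model call needed
    vector = [float(x) for x in compute(token)]
    conn.execute(
        "INSERT INTO embeddings (token, vector) VALUES (?, ?)",
        (token, json.dumps(vector)),
    )
    conn.commit()
    return vector


calls = []


def toy_compute(token):
    # Stand-in for the embedding model; records each invocation so the
    # caching behaviour is visible. The returned values are meaningless.
    calls.append(token)
    return [float(len(token)), float(ord(token[0]))]


conn = open_cache()
first = get_embedding(conn, "light", toy_compute)   # computed, then stored
second = get_embedding(conn, "light", toy_compute)  # served from SQLite
print(first == second, calls)
```

Storing vectors as JSON text keeps the sketch short; a binary column would be more compact for real embedding sizes.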

The Jupyter Notebook translate_dev.ipynb explains the whole pipeline: reading in the training data, tokenization, embedding, then building and training the model.
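As a rough illustration of the first stages of that pipeline, the sketch below tokenizes sentences, builds a vocabulary and turns a sentence into a fixed-length list of ids. It is not taken from the notebook: whitespace tokenization, the `<pad>` and `<unk>` symbols, and the function names are all assumptions made for the example.

```python
PAD, UNK = "<pad>", "<unk>"


def tokenize(sentence):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return sentence.lower().split()


def build_vocab(sentences):
    """Map each distinct token to an integer id, in order of first appearance."""
    vocab = {PAD: 0, UNK: 1}
    for sentence in sentences:
        for token in tokenize(sentence):
            # setdefault leaves existing ids untouched and assigns the
            # next free id to a token seen for the first time.
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(sentence, vocab, length):
    """Convert a sentence to exactly `length` ids, truncating or padding."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokenize(sentence)][:length]
    return ids + [vocab[PAD]] * (length - len(ids))


corpus = ["In the beginning", "the light"]
vocab = build_vocab(corpus)
print(encode("the light shines", vocab, 4))
```

In a full pipeline, each id (or the token behind it) would then be mapped to its embedding before being fed to the RNN.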

stoufa/Neural-Machine-Translation_Transliteration
