
MC-OCR

Introduction

In this project, we aim to extract the required fields from Vietnamese receipts captured by mobile devices 😄 We built the full flow, containerized it, and deployed the system as a web page using Streamlit. Everything is ready to use now! The image below shows our main pipeline, which includes background subtraction (Mask R-CNN), invoice alignment (CRAFT + ResNet34), text detection (CRAFT), text recognition (VietOCR), and key information extraction (GraphSAGE).

[Image: main pipeline overview]
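The pipeline is a sequential composition of these five stages: each stage consumes the previous stage's output. Below is a minimal sketch of that composition; the stage function names are hypothetical placeholders for illustration, not the repository's actual API:

from typing import Any, Callable, List

# Hypothetical stage interface: each stage takes the previous stage's
# output and returns its own result. Names are illustrative only.
Stage = Callable[[Any], Any]

def run_pipeline(image: Any, stages: List[Stage]) -> Any:
    """Thread a raw receipt photo through the stages in order."""
    result = image
    for stage in stages:
        result = stage(result)
    return result

# Usage, with real models substituted for these hypothetical names:
# fields = run_pipeline(img, [subtract_background,   # Mask R-CNN
#                             align_invoice,         # CRAFT + ResNet34
#                             detect_text,           # CRAFT
#                             recognize_text,        # VietOCR
#                             extract_key_info])     # GraphSAGE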

About the dataset, we used MC-OCR 2021. The training set contains 1,155 images with the corresponding key fields and texts as labels. This dataset is quite challenging, with varied backgrounds and many low-quality images, so EDA and preprocessing are required to get good model performance!

More about the GraphSAGE model: it is an improved version of the original graph neural network that not only leverages node attribute information from adjacent nodes but also generates representations for new data that has never been seen before. In detail, the neighborhood of a node is first split into k levels based on distance from the current node. The node's feature is then updated by summarizing the embeddings of its neighbors; we used the mean aggregation operator for this step. Repeating this update multiple times lets information propagate back from the furthest level. In our deployed model, we stacked 5 consecutive GraphSAGE layers with ReLU activations, and a simple fully connected layer on top of the model predicts a probability vector for key classification. The update process and node classification are illustrated in the following image:

[Image: GraphSAGE update process and node classification]
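For concreteness, here is a minimal sketch of such a stack written with PyTorch Geometric's SAGEConv layer. Whether the repository actually uses this library, as well as the hidden size and the number of key classes, are assumptions for illustration:

import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv

class KeyFieldClassifier(torch.nn.Module):
    """Five stacked GraphSAGE layers (mean aggregation) plus a linear
    head, mirroring the architecture described above. The hidden size
    and class count are illustrative assumptions, not the repo's values."""

    def __init__(self, in_dim: int, hidden_dim: int = 128, num_classes: int = 5):
        super().__init__()
        dims = [in_dim] + [hidden_dim] * 5
        self.convs = torch.nn.ModuleList(
            [SAGEConv(dims[i], dims[i + 1], aggr="mean") for i in range(5)]
        )
        self.head = torch.nn.Linear(hidden_dim, num_classes)

    def forward(self, x, edge_index):
        # Each layer updates a node by mean-aggregating its neighbors'
        # embeddings, so stacking 5 layers propagates information from
        # nodes up to 5 hops away.
        for conv in self.convs:
            x = F.relu(conv(x, edge_index))
        # Per-node class scores for key classification.
        return F.log_softmax(self.head(x), dim=-1)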

Usage

You can easily run the project with the commands below, provided Docker is already installed on your computer:

git clone https://github.com/manhph2211/MC-OCR.git
cd MC-OCR
docker build -t "app" .
docker run app
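Since the web page is served with Streamlit, whose default port is 8501, you may need to publish that port to reach the app from your browser (the exact port used inside this image is an assumption):

docker run -p 8501:8501 app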

Demo

[Image: demo of the deployed web page]

References

Thanks to the authors: