UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition

[ Paper ] [ Website ] [ Dataset (OpenDataLab)] [ Dataset (Hugging Face) ] [Models (Hugging Face)]

Welcome to the official repository of UniMERNet, a solution that converts images of mathematical expressions into LaTeX, suitable for a wide range of real-world scenarios.

News 🚀🚀🚀

2024.05.06 🎉🎉 Open-sourced UniMER dataset, including UniMER-1M for model training and UniMER-Test for MER evaluation.
2024.05.06 🎉🎉 Added Streamlit formula recognition demo and provided local deployment App.
2024.04.24 🎉🎉 Paper now available on arXiv.
2024.04.24 🎉🎉 Inference code and checkpoints have been released.

Demo Video

DirectRecognition.mp4

MunualSelection.mp4

Quick Start

Clone the repo and download the model

git clone https://github.com/opendatalab/UniMERNet.git

cd UniMERNet/models
# Download the model and tokenizer individually or use git-lfs
git lfs install
git clone https://huggingface.co/wanderkid/unimernet
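If `git lfs install` was skipped or failed, the clone can succeed while leaving small Git LFS pointer stubs in place of the real weight files. The following is a minimal sketch for spotting that situation, not part of the UniMERNet codebase. The function name `find_lfs_pointers` is hypothetical; the only fact it relies on is that every Git LFS pointer file starts with the line `version https://git-lfs.github.com/spec/v1`.

```python
from pathlib import Path

# Every Git LFS pointer file begins with this line (Git LFS pointer spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are still Git LFS pointer stubs.

    Hypothetical helper for illustration; not part of UniMERNet.
    """
    pointers = []
    for path in sorted(Path(model_dir).rglob("*")):
        # Skip directories and anything inside the .git metadata folder.
        if not path.is_file() or ".git" in path.parts:
            continue
        with path.open("rb") as fh:
            head = fh.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            pointers.append(path)
    return pointers
```

Calling `find_lfs_pointers("unimernet")` from inside `UniMERNet/models` (the directory produced by the commands above) should return an empty list when the download is complete. Any path it returns is a file that still needs to be fetched, for example with `git lfs pull`.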

Installation

conda create -n unimernet python=3.10

conda activate unimernet

pip install --upgrade unimernet
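To confirm that the package landed in the active conda environment, the installed version can be queried with the standard library. This is a small sketch; the helper name `installed_version` is an assumption made for illustration, and the only package name it uses is the `unimernet` distribution installed above.

```python
from importlib import metadata


def installed_version(package="unimernet"):
    """Return the installed version string of a package, or None if absent.

    Hypothetical helper for illustration; uses only the standard library.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

`installed_version()` returns `None` when the package is missing from the current environment, which usually means the `unimernet` environment has not been activated.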

Running UniMERNet

Streamlit Application: For an interactive and user-friendly experience, use our Streamlit-based GUI. This application allows real-time formula recognition and rendering.
```
unimernet_gui
```
The Streamlit GUI requires the latest version of UniMERNet; upgrade with pip install --upgrade unimernet if needed.
Command-line Demo: Predict LaTeX code from an image.
```
python demo.py
```
Jupyter Notebook Demo: Recognize and render a formula from an image.
```
jupyter-lab ./demo.ipynb
```

Performance Comparison (BLEU) with SOTA Methods.

UniMERNet achieves higher BLEU scores than mainstream models on all four categories of real-world mathematical expressions: Simple Printed Expressions (SPE), Complex Printed Expressions (CPE), Screen-Captured Expressions (SCE), and Handwritten Expressions (HWE).
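For readers unfamiliar with the metric, BLEU measures n-gram overlap between a predicted LaTeX string and the ground truth. The sketch below is a simplified, unsmoothed, sentence-level BLEU over whitespace-separated tokens. It is not the evaluation script behind the reported numbers, whose tokenization and corpus-level aggregation may differ; the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU on whitespace tokens (illustrative only)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp, n)
        ref_counts = ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)
```

With the reference `r"\frac { a } { b } + c"`, an identical prediction scores 1.0, while a prediction that replaces `+` with `-` scores strictly between 0 and 1, since some but not all n-grams still match.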

Visualization Result with Different Methods.

UniMERNet excels in visual recognition of challenging samples, outperforming other methods.

UniMER Dataset

Introduction

The UniMER dataset is a specialized collection curated to advance the field of Mathematical Expression Recognition (MER). It encompasses the comprehensive UniMER-1M training set, featuring over one million instances that represent a diverse and intricate range of mathematical expressions, coupled with the UniMER Test Set, meticulously designed to benchmark MER models against real-world scenarios. The dataset details are as follows:

UniMER-1M Training Set:

Total Samples: 1,061,791 LaTeX-image pairs
Composition: A balanced mix of concise and complex, extended formula expressions
Aim: To train robust, high-accuracy MER models, enhancing recognition precision and generalization

UniMER Test Set:

Total Samples: 23,757, categorized into four types of expressions:
- Simple Printed Expressions (SPE): 6,762 samples
- Complex Printed Expressions (CPE): 5,921 samples
- Screen-Captured Expressions (SCE): 4,742 samples
- Handwritten Expressions (HWE): 6,332 samples
Purpose: To provide a thorough evaluation of MER models across a spectrum of real-world conditions
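The subset sizes listed above can be sanity-checked against the stated total, and their relative shares computed, with a few lines of Python. This is a small illustrative sketch; the counts come from the list above, and the names `TEST_SUBSETS`, `REPORTED_TOTAL`, and `subset_shares` are made up for the example.

```python
# Sample counts per UniMER-Test subset, copied from the list above.
TEST_SUBSETS = {"SPE": 6762, "CPE": 5921, "SCE": 4742, "HWE": 6332}
REPORTED_TOTAL = 23757


def subset_shares(counts):
    """Return each subset's fraction of the total sample count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# The four subsets add up exactly to the reported total.
assert sum(TEST_SUBSETS.values()) == REPORTED_TOTAL
```

`subset_shares(TEST_SUBSETS)` shows that no single category dominates the benchmark; each subset accounts for roughly a fifth to a little over a quarter of the samples.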

Dataset Download

You can download the dataset from OpenDataLab (recommended for users in China) or HuggingFace.

TODO

- [x] Release inference code and checkpoints of UniMERNet.
- [x] Release UniMER-1M and UniMER-Test.
- [x] Open-source the Streamlit formula recognition GUI application.
- [ ] Release the training code for UniMERNet.

Citation

If you find our models / code / papers useful in your research, please consider giving us a star ⭐ and citing our work 📝, thank you :)

@misc{wang2024unimernet,
      title={UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition}, 
      author={Bin Wang and Zhuangcheng Gu and Chao Xu and Bo Zhang and Botian Shi and Conghui He},
      year={2024},
      eprint={2404.15254},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgements

VIGC. The model framework depends on VIGC.
Texify. A mainstream MER algorithm; UniMERNet's data processing follows Texify.
Latex-OCR. Another mainstream MER algorithm.
Donut. UniMERNet's Transformer encoder-decoder is based on Donut.
Nougat. UniMERNet uses Nougat's tokenizer.

Contact Us

If you have any questions, comments, or suggestions, please do not hesitate to contact us at wangbin@pjlab.org.cn.

License

Apache License 2.0
