A toolbox of OCR models, algorithms, and pipelines based on MindSpore
Updated Jun 7, 2024 - Python
A Repo For Document AI
Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser
Integrate AI-powered Document Analysis Pipelines
Compute benchmarks for table structure recognition.
Layout analysis | Table recognition | Document orientation classification
Build a RAG preprocessing pipeline
Deep learning, Convolutional neural networks, Image processing, Document processing, Table detection, Page object detection, Table classification. https://www.sciencedirect.com/science/article/pii/S0925231221018142
Uses Swin-Unet (Swin Transformer Unet) to recognize table structure in document images
A curated list of resources dedicated to table recognition
Extract tabular data from images to Excel files
ACM Multimedia 2023: DocDiff: Document Enhancement via Residual Diffusion Models. Also contains 1597 red seals in Chinese scenes, along with their corresponding binary masks.
This is a survey on the topic of table recognition
GNN-based program that extracts information (structure + data) from a table image.
Table Structure Recognition
Table Detection and Extraction Using Deep Learning (built in Python, using Luminoth, TensorFlow<2.0 and Sonnet).
A data extraction web app that converts images to tables.
Table Structure Recognition (TSR) solution