OCR

Project

OCR stands for Optical Character Recognition. As part of a student project, our objective is to build a program that extracts text from images. It must be written in C, rely on a neural network, and be usable through a GUI. To learn more, please take a look at the book of specifications.

Usage

Dependencies: SDL2, SDL2_image, GTK+ 3 and Hunspell.

Clone this repository with git clone git@github.com:NoneOfAllOfTheAbove/OCR.git.
Compile the project by running the command make in the project folder.
Execute the program with ./bin/OCR.

Features

Currently implemented:

Advanced preprocessing (efficient binarization, noise canceling, contrast enhancement)
Detection of paragraphs, lines, words and characters
A pretrained neural network to recognize characters
Simple GUI to load an image and export its extracted text
Postprocessing step (spell check)
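
To give an idea of what the binarization step in the preprocessing stage might look like, here is a minimal sketch in C. It assumes Otsu's method on an 8-bit grayscale buffer, which is a common choice but not necessarily the algorithm this project uses. The function names `otsu_threshold` and `binarize` are hypothetical and do not come from the source tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from this repository): returns the gray level
 * that maximizes the between-class variance of the image (Otsu's method). */
uint8_t otsu_threshold(const uint8_t *pixels, size_t count)
{
    size_t histogram[256] = {0};
    for (size_t i = 0; i < count; i++)
        histogram[pixels[i]]++;

    /* Sum of all gray levels, used to derive the foreground mean. */
    double total_sum = 0.0;
    for (int level = 0; level < 256; level++)
        total_sum += (double)level * (double)histogram[level];

    double background_sum = 0.0;
    size_t background_count = 0;
    double best_variance = -1.0;
    uint8_t best_threshold = 0;

    for (int level = 0; level < 256; level++) {
        background_count += histogram[level];
        if (background_count == 0)
            continue;               /* no pixel at or below this level yet */

        size_t foreground_count = count - background_count;
        if (foreground_count == 0)
            break;                  /* every pixel is already background */

        background_sum += (double)level * (double)histogram[level];
        double mean_bg = background_sum / (double)background_count;
        double mean_fg = (total_sum - background_sum) / (double)foreground_count;
        double diff = mean_bg - mean_fg;

        /* Between-class variance, up to a constant factor. */
        double variance = (double)background_count *
                          (double)foreground_count * diff * diff;
        if (variance > best_variance) {
            best_variance = variance;
            best_threshold = (uint8_t)level;
        }
    }
    return best_threshold;
}

/* Hypothetical helper: turns a grayscale buffer into pure black and white
 * in place. Pixels above the threshold become 255, the others become 0. */
void binarize(uint8_t *pixels, size_t count)
{
    uint8_t threshold = otsu_threshold(pixels, count);
    for (size_t i = 0; i < count; i++)
        pixels[i] = (pixels[i] > threshold) ? 255 : 0;
}
```

In this sketch the caller is expected to convert the loaded image to grayscale first, then pass the pixel buffer and its length to `binarize`. A global threshold like this one is simple, but it can struggle with unevenly lit scans, where an adaptive (local) threshold may work better.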

Features we are working on:

De-skew
Improve segmentation (export as HTML)
Retrain the neural network

Contributing

Refer to CONTRIBUTING.md.
