text-extraction

Retrieve data from two different websites, load it into a PostgreSQL database using Python, and combine the two datasets to derive and present new information.

postgresql text-extraction constraints data-statictics extract-data data-conversion python-scrapy python-connector categorize-products join-query

Updated Dec 5, 2023
Python

Jaha96 / tesseract-quick-implementation

A quick Tesseract OCR implementation, linked to a Stack Overflow question.

tesseract text-extraction tesseract-ocr pyinstaller tesseract-4 tesseract-python

Updated Nov 26, 2019
HTML

Aalaa4444 / Text_Processing-and-Unique_Word_Extraction_fromHTML

Extract text content from an HTML page, process it, and extract the unique words from the processed text. This notebook uses several text-processing techniques, including cleaning, normalization, tokenization, lemmatization or stemming, and stop-word removal.

tokenizer text-extraction requests data-extraction beautifulsoup text-processing tokenization stemming lemmatization stopwords-removal text-cleaning text-normalization extract-html text-tokenization text-lemmatization

Updated Apr 5, 2024
Jupyter Notebook

Lanjkn / Text-Extractor

API for extracting text from multiple file types.

api text-extraction file-processing

Updated Mar 14, 2024
Python

nikolay-malygin / snap-text

A simple web application built with React that lets users upload images containing text, select the language to use for recognition, and extract the text from the image. As quick as a finger snap - SnapText.

react reactjs web-application text-extraction text-recognition copy-to-clipboard multi-language-support simple-app copy-text-to-clipboard text-extraction-from-image copy-result

Updated Dec 10, 2023
HTML

swingfox / ViTeX

[Thesis] Video Text Extraction

image-processing text-extraction ocr-recognition image-filters blob-detection

Updated Mar 6, 2016
C#

prateeksahu147 / OCR-PDF-Web-Scraper

Engine that automates scraping PDFs to local storage and converting them to text using OCR.

opencv ocr python3 text-extraction data-preprocessing webscraping data-preparation

Updated Jul 14, 2022

jhw296 / BookScanner

A simple book scanner project built with PyQt5 (looks up and displays book information using barcode recognition and text extraction).

opencv pyqt5 image-processing text-extraction recognizes-images barcode-scanner

Updated Jun 15, 2023
Python

pedrocardoz0 / body-snatcher

Custom GitHub Action to parse an issue body.

workflow automation text-extraction issue-parser github-actions

Updated Feb 12, 2023
TypeScript

Asraf2asif / SummifyAI

Harnesses the power of OpenAI to revolutionize the way you consume information. Say goodbye to information overload and hello to quick and comprehensive understanding. Let our AI-Powered Content Summarizer extract the key insights from any text, allowing you to focus on what matters most.

nodejs app ai postcss chatbot text-extraction openai summarizer summarize vitejs openai-api ai-powered content-summarizer summifyai

Updated Aug 17, 2023
JavaScript

andrea-liliana / gato-encerrado-Hackathon-RIIAA

AI solution that analyses thousands of typewritten documents in order to solve forced disappearances in Mexico.

ai text-extraction image-segmentation entity-recognition

Updated Aug 24, 2021
Jupyter Notebook

RoshanJulius / Business-Card-Reader

The Business Card Reader is a Python application that utilizes computer vision techniques and optical character recognition (OCR) to extract text information from business cards. It provides an intuitive interface to capture an image of the business card, process it, and extract the text for further use.

opencv computer-vision text-extraction opencv-python pytesseract