PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
Updated May 23, 2024 - Python
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Export definitions, and notes regarding how they work, for extracting data from MySchoolSask (an implementation of Follett Aspen)
Singer Tap for dbt API v2 built with the Meltano SDK
PDFix SDK samples for Java Maven. PDF manipulation, content extraction, conversion, accessibility and more...
Ready-to-use social network data scraper. Scrapes 10,000+ email messages in one hour and extracts data from LinkedIn, Facebook, Instagram, YouTube, Pinterest, and Twitter, with search by specific keywords. Includes source code and an install file.
Basic data extraction from website GEIPAN
Singer tap for the StackExchange API
Web Crawler/Spider for NodeJS + server-side jQuery ;-)
Web scraping to extract product data, translation using LibreTranslate, data cleaning, and classification of products into categories using an AI model trained with TensorFlow.
Extracts data from a spreadsheet and writes its contents to a '.SQL' file. A data extraction tool for people using SQL Server Express without access to the SSMS add-on and its import wizard.
Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...
SQLiteDiskExplorer enables you to explore, catalog, and batch extract SQLite files from disks and removable media.
Crawly, a high-level web crawling & scraping framework for Elixir.
A tool to replace data in a Unity Asset Bundle from modified files.
Mercy is an open-source Rust crate and CLI designed for building cybersecurity utilities and projects.
This UiPath project automates the process of extracting data from an Excel sheet and filling out a Google Form with the extracted information.
Extract structured data from any unstructured web page