Interact with, analyze and structure massive text, image, embedding, audio and video datasets
Updated Jun 5, 2024 - Python
Remove duplicates from MASSIVE wordlist, without sorting it (for dictionary-based password cracking)
Interactive code for image similarity using the SIFT algorithm
Near Duplicate Video Detection (Perceptual Video Hashing) - Get a 64-bit comparable hash-value for any video.
The Panako acoustic fingerprinting system.
A collection of free-text bug reports for duplicate issue identification
Duplicates finder for various source code formats.
Fast Near-Duplicate Image Search and Delete using pHash, t-SNE and KDTree.
OpenStaticAnalyzer is a source code analyzer tool, which can perform deep static analysis of the source code of complex systems.
Detecting near-duplicate videos by aggregating features from intermediate CNN layers
Advanced similarity and duplicate source code proof of concept for our research efforts.
Vidupe is a program that can find duplicate and similar video files. V1.211 was released on 2019-09-18 (Windows exe available).
Advanced Duplicate File Finder for Python
Advanced similarity and duplicate source code at scale.
CLI utility to find near duplicate images and remove all but the best copy.
Filter, Sort & Delete Duplicate Files Recursively
Find similar audio files easily
An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.
A uniquely crafted image viewer and editor with options to organize files, and maintain large lists of image files for slideshows, dupes detection or other purposes.
File Checksum Integrity Verifier & Duplicate File Finder written in C++ Qt
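
Several entries above revolve around removing duplicates without sorting, as in the wordlist tool. As a rough illustration only, a minimal Python sketch of order-preserving deduplication might look like the following. It is not taken from any of the listed projects: the function name `unique_lines` and the choice to remember a 16-byte BLAKE2b digest per line (instead of the line itself, to bound memory per entry) are assumptions made for this example.

```python
import hashlib
from typing import Iterable, Iterator


def unique_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each distinct line once, keeping first-seen order (no sorting).

    Hypothetical helper for illustration. Only a fixed-size digest of each
    line is remembered, so memory grows with the number of distinct lines,
    not with their length. A digest collision would wrongly drop a line;
    with 16-byte digests this is extremely unlikely but not impossible.
    """
    seen = set()
    for line in lines:
        key = hashlib.blake2b(line.encode("utf-8"), digest_size=16).digest()
        if key not in seen:
            seen.add(key)
            yield line


if __name__ == "__main__":
    words = ["hunter2", "password", "hunter2", "letmein", "password"]
    for word in unique_lines(words):
        print(word)
```

Because the function is a generator, it can be fed a file object line by line, so the input never has to be loaded into memory at once. For wordlists with more distinct entries than fit in memory, an on-disk or probabilistic structure would be needed instead, which this sketch does not attempt.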