Print FastCDC rolling hash chunks and checksums.
Updated Nov 27, 2022 - Python
Python script to analyze dedup usage in Btrfs
Yet another tool to find and remove duplicate files.
Find (partial content) duplicate files.
A CLI tool for image analysis: image integrity checking, image deduplication, and image retrieval.
BenSP is a suite of parameterizable benchmarks for stream parallelism which is used to evaluate stream processing characteristics.
Detect and optionally delete duplicate files in a directory tree
Analyse two paths to find identical files and hard-link them to save space
Remove local files that are duplicates of files in another path
Sift duplicate whitespaces away!
Project to take two similar zip files and dedupe files that have the same timestamp in the older file.
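📄 [优爱酷 visual web page data collection system] Uses visual scraping technology to identify web page element types such as images, text, links, HTML, and files. Supports running JavaScript, applying regular expressions, automatic scrolling, automatic pagination, and opening pop-up windows to collect data. Collection settings include automatic data deduplication, human-like intermittent pauses to avoid IP blocking, and auto-save. Also supports browser settings such as cookies and cache, proxy rotation for collecting through network restrictions, "category/keyword" collection, and image renaming. Advanced options include multi-threaded collection, and the VIP edition adds scheduled collection.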
Golang structured logging (slog) deduplication and sorting for use with json logging
Distill large-scale web page text
String deduplication package for Go