Skip to content

Useful PDF tools to work with PDF translation platforms.

Notifications You must be signed in to change notification settings

XieJiSS/pdf-tools

Repository files navigation

pdf-tools

Install PyPDF2 and pdfrw via python3 -m pip install -r requirements.txt.

Also, install qpdf before using any script. Make sure qpdf can be found directly in your PATH.

If you need to use to_image.py, you'll also need to have poppler installed in your PATH as well.

remove_image.py combines PyPDF2 and pdfrw's functionalities to achieve a better result. All scripts uses qpdf to produce a linearized PDF result that displays faster under bad network condition.

$ python3 remove_image.py "./pdfs/Space OCR.pdf"
$ ls -alh ./pdfs
-rw-r--r--   1 host  staff    25M Nov  3 14:24 Space OCR.pdf
-rw-r--r--   1 host  staff   9.7M Nov  7 11:20 image_removed_pypdf2_only.pdf
-rw-r--r--   1 host  staff   956K Nov  7 11:20 image_removed_pypdf2_pdfrw.pdf
$ python3 remove_image.py --aggressive "./pdfs/Space OCR.pdf"
$ ls -alh ./pdfs/image_removed_pypdf2_aggressive_pdfrw.pdf
-rw-r--r--   1 host  staff   699K Nov  7 11:21 image_removed_pypdf2_aggressive_pdfrw.pdf

Note that the PDF produced by side_by_side.py works better with PDF readers that supports Two-Page View, e.g. Preview.app.

About

Useful PDF tools to work with PDF translation platforms.

Topics

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages