When I convert TIFF to PDF, the PDF size is 10 times that of TIFF #6453
(Translated from Chinese) When I convert a TIFF to PDF, the PDF is 10 times the size of the TIFF.
from PIL import Image
# TIFF path
path = 'XXX'
image = Image.open(path)
image.save(path, save_all=True) |
A note for others - according to Google, the previous comment just translates to the first comment. If I run your code over one of our test images, https://github.com/python-pillow/Pillow/blob/main/Tests/images/hopper.tif, I get a PDF that is almost 10 times smaller. So your situation does not apply to all TIFFs. Could you upload a copy of your image? |
from PIL import Image
image = Image.open('tif\\20220720170924738.TIF')
image.save('tif\\dst\\20220720170924738.pdf', save_all=True)
Source: 20220720170924738.TIF, 8.51kb
Files: https://github.com/344672699/Pillow/blob/main/20220720170924738.rar |
Thank you for your help |
python: 3.8 |
The compression used in your TIFF image is "group4". https://en.wikipedia.org/wiki/Group_4_compression
When the PDF is saved by Pillow, the "DCTDecode" filter is used. https://www.gemboxsoftware.com/pdf/docs/GemBox.Pdf.Filters.PdfDCTDecodeFilter.html
For comparison, I converted your 9kb TIFF to PDF using ImageMagick. It came out as 12kb, rather than Pillow's 185kb. That PDF uses the "CCITTFaxDecode" filter, which looks to also be using group4 compression, so its size stays close to that of your original image. Because "group4" compression is dedicated to black-and-white images, it is not surprising that the result is smaller.
DCTDecode, on the other hand, is the filter for JPEG data: Pillow is converting your image to JPEG before saving it in the PDF file. If I convert your TIFF to JPEG images using ImageMagick, they come out as a 63kb and a 113kb image. 63kb + 113kb = 176kb, close to the 185kb of the final PDF.
So the answer to your question is that Pillow is not using the compression method dedicated to black-and-white images, but one that allows for more colours. |
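The filter comparison above can be repeated on any PDF by scanning its raw bytes for `/Filter` entries. The following is only a rough sketch: `count_pdf_filters` is a hypothetical helper, not part of Pillow or ImageMagick, and it assumes the stream dictionaries are stored uncompressed in the file (it will miss dictionaries packed into compressed object streams).

```python
import re
from collections import Counter

# Matches "/Filter /Name" as well as "/Filter [ /Name1 /Name2 ]" entries.
_FILTER_RE = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
_NAME_RE = re.compile(rb"/([A-Za-z0-9]+)")


def count_pdf_filters(data: bytes) -> Counter:
    """Count stream filter names found in raw PDF bytes (hypothetical helper)."""
    counts = Counter()
    for match in _FILTER_RE.finditer(data):
        # A filter entry may hold a single name or an array of names.
        for name in _NAME_RE.findall(match.group(1)):
            counts[name.decode("ascii")] += 1
    return counts
```

Calling it on the bytes of a PDF, for example `count_pdf_filters(open("out.pdf", "rb").read())`, should show whether image data is stored with DCTDecode (JPEG) or CCITTFaxDecode.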
Thank you for your help. The explanations were very detailed. However, I still have a question. When using Pillow, can I select the "CCITTFaxDecode" filter? Or is there another setting that would reduce my PDF to 12kb like ImageMagick, or make it as small as the PDF converted by iText? If not, does that mean I can't use Pillow for this, or that I have to accept a PDF more than 10 times the size of the original? |
At the moment, that option doesn't exist in Pillow, no. If you would like, this issue can be left open as a request for someone to add the feature. |
Alternatively, you may be interested in img2pdf. The following generates a 10kb file.
import img2pdf
with open("out.pdf", "wb") as f:
    f.write(img2pdf.convert('20220720170924738.tif')) |
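To check in advance whether a TIFF is Group 4 compressed like the one in this thread, and so likely to grow when re-encoded as JPEG, the Compression tag (259) can be read straight from the file header. This is a minimal sketch using only the standard library. `tiff_compression` and `COMPRESSION_NAMES` are hypothetical names, only the first page of a classic TIFF is inspected, and BigTIFF is not handled.

```python
import struct

# A few values of the TIFF Compression tag (259) from the TIFF 6.0 specification.
COMPRESSION_NAMES = {
    1: "none",
    2: "CCITT RLE",
    3: "CCITT Group 3",
    4: "CCITT Group 4",
    5: "LZW",
    7: "JPEG",
    32773: "PackBits",
}


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value of the first page (hypothetical helper)."""
    if data[:2] == b"II":
        order = "<"  # little-endian
    elif data[:2] == b"MM":
        order = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("unsupported TIFF variant")
    (entry_count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(entry_count):
        # Each directory entry is 12 bytes: tag, type, count, value/offset.
        start = ifd_offset + 2 + 12 * i
        tag, field_type, _count = struct.unpack(order + "HHI", data[start:start + 8])
        if tag == 259:
            fmt = "I" if field_type == 4 else "H"  # LONG or SHORT
            size = struct.calcsize(order + fmt)
            (value,) = struct.unpack(order + fmt, data[start + 8:start + 8 + size])
            return value
    return 1  # the specification's default when the tag is absent
```

For a file like the one discussed here, `COMPRESSION_NAMES.get(tiff_compression(open(path, "rb").read()))` would be expected to give "CCITT Group 4", where `path` is a placeholder for the TIFF's location.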
I've created PR #6470 to resolve this. |
When I convert TIFF to PDF, the PDF size is 10 times that of TIFF
Why is that?