PDFs and 1 bit pixels #1775

Closed
wfbradley opened this issue Mar 21, 2016 · 5 comments · Fixed by #5430
Labels
Bug, Conversion

Comments

@wfbradley

wfbradley commented Mar 21, 2016

I posted this question on Stack Overflow before realizing that perhaps I should post it here.

I'm trying to create a PDF of a complicated black-and-white rectangular grid, nominally 7 inches wide by 10 inches tall. I'm using Pillow to do the conversion. If I specify a PNG, everything works fine (that is, the aspect ratio is 7 by 10). If I specify a PDF, the output is 200 inches wide and 0.04 inches tall. What's going on?

Incidentally, if I set mode='RGB', then the output PDF dimensions are correct, but Pillow performs a JPEG encoding of the image. JPEG converts my sharp black and white grid into smoother shades of gray (which is anathema for my application).

import PIL
from PIL import Image

# Let's consider a 7" x 10" rectangle of pixels
inches_wide = 7
inches_tall = 10
dpi = 200
num_pixel_rows = inches_tall * dpi
num_pixel_cols = inches_wide * dpi

# Make an image
mode = '1'
img = Image.new(mode, (num_pixel_cols, num_pixel_rows))
img.info['dpi'] = (dpi, dpi)

# Saving it as PNG, we get a black rectangle, as expected.
img.save('big_rectangle.png', 'png', resolution=dpi)
# Saving it as a PDF, we get a 200" x 0.04" all-white line?!
img.save('big_rectangle.pdf', 'pdf', resolution=dpi)

# Versioning info:
print('PILLOW ' + PIL.PILLOW_VERSION)
print('PIL ' + PIL.VERSION)
#  My output:
#  PILLOW 3.1.1
#  PIL 1.1.7
@aclark4life
Member

Not sure what's going on… should we call this a bug?

@aclark4life added the Bug label Jan 7, 2017
@aclark4life added this to the Future milestone Jan 7, 2017
@nawagers

nawagers commented Nov 7, 2017

I investigated this a bit and, as best I can tell, it is a bug that already existed before the fork. The lines in question are in PdfImagePlugin.py (line 173):

if filter == "/ASCIIHexDecode":
    if bits == 1:
        # FIXME: the hex encoder doesn't support packed 1-bit
        # images; do things the hard way...
        data = im.tobytes("raw", "1")
        im = Image.new("L", (len(data), 1), None)
        im.putdata(data)
    ImageFile._save(im, op, [("hex", (0, 0)+im.size, 0, im.mode)])

The problem lies in the im = Image.new() call. The size is specified as (len(data), 1), which is (350000, 1) in the example code. Changing it to im.size works, but the output then has no compression.
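
To make the (350000, 1) figure concrete, here is a minimal sketch of the arithmetic in plain Python. It assumes the raw "1" packing stores 8 pixels per byte and pads each row to a whole byte; the helper name packed_row_bytes is made up for illustration and is not part of Pillow.

```python
# Sketch of where (350000, 1) comes from; no Pillow needed.
# Assumption: raw "1" data packs 8 pixels per byte, each row padded to a whole byte.

def packed_row_bytes(width_px):
    """Bytes needed for one packed 1-bit row (hypothetical helper)."""
    return (width_px + 7) // 8

dpi = 200
width_px, height_px = 7 * dpi, 10 * dpi      # 1400 x 2000 pixels

data_len = packed_row_bytes(width_px) * height_px

# The quoted code builds the replacement image from the byte count alone:
buggy_size = (data_len, 1)
# whereas the page should keep the original pixel dimensions:
intended_size = (width_px, height_px)

print(buggy_size)      # (350000, 1)
print(intended_size)   # (1400, 2000)
```

So the replacement image is one pixel tall and as wide as the packed byte count, which matches the extremely wide, flat page described in the original report.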

I would assume anyone using mode 1 probably wants to keep their output file very small. The ASCIIHex filter applies neither image compression nor general data compression, and it generates a 5.5 MB file in the example. The equivalent PNG is 421 bytes. Ideally, someone would implement one of the lossless algorithms and add a flag to the save() function to choose between DCT and lossless encoding. Options are:

  1. Change to im.size and convert to mode "L" as now
  2. Convert to mode "P" further upstream (uses ASCIIHex instead of DCT)
  3. Generate some sort of error on mode "1", forcing the user to convert to "P"
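
As a rough check on the size figure quoted above, the following sketch estimates the hex-encoded payload for option 1, assuming one byte per pixel in mode "L" and two ASCII hex characters per byte. It uses binascii.hexlify from the standard library as a stand-in for the encoder and ignores PDF structure, line breaks and any other overhead, so it gives only a ballpark.

```python
import binascii

width_px, height_px = 1400, 2000

# Mode "L" stores one byte per pixel, so one row is width_px bytes.
sample_row = bytes(width_px)               # an all-black row
hex_row = binascii.hexlify(sample_row)     # two ASCII characters per byte

estimated_payload = len(hex_row) * height_px
print(estimated_payload)                   # 5600000
```

That is about 5.6 million bytes of image data, the same ballpark as the 5.5 MB file mentioned above, which is why hex encoding an unpacked 1-bit image is so much larger than the PNG.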

@radarhere
Member

#3827 has been merged, so the dimensions problem should now be fixed. However, there seems to also be a request for compression here.

@radarhere
Member

radarhere commented Apr 23, 2021

I've created PR #5430 to add DCTDecode compression for 1-bit PDFs, reducing the 5 MB PDF here to 34 KB.

Pillow automation moved this from In progress to Closed Apr 25, 2021
@radarhere
Member

I've created PR #6470, which uses CCITTFaxDecode compression, reducing the PDF to 1 KB.
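
For anyone who wants to check the result locally, here is a minimal sketch that saves the 1-bit image from the original report to an in-memory PDF and prints its size. It assumes a recent Pillow that includes the fixes mentioned in this thread (and, for the CCITT path, a build with libtiff support); the exact byte count and the filter that appears in the output depend on the installed version, so no numbers are asserted here.

```python
import io

from PIL import Image

dpi = 200
img = Image.new("1", (7 * dpi, 10 * dpi))   # 1400 x 2000 px, all black

# Save to memory instead of disk so the example leaves no files behind.
buf = io.BytesIO()
img.save(buf, "PDF", resolution=dpi)
pdf_bytes = buf.getvalue()

# Page size in PDF points (1/72 inch) that a 7" x 10" page should have.
expected_points = (img.width * 72 / dpi, img.height * 72 / dpi)

print(len(pdf_bytes), "bytes")
print("expected page size in points:", expected_points)
print("uses CCITTFaxDecode:", b"CCITTFaxDecode" in pdf_bytes)
```

Opening the saved file in a PDF viewer and checking the page properties is the simplest way to confirm the 7 by 10 inch dimensions.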
