Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_file_blp.py fails with libjpeg 9e since 9.3.0 #6741

Closed
kulikjak opened this issue Nov 15, 2022 · 13 comments · Fixed by #6767
Closed

test_file_blp.py fails with libjpeg 9e since 9.3.0 #6741

kulikjak opened this issue Nov 15, 2022 · 13 comments · Fixed by #6767
Labels

Comments

@kulikjak
Copy link
Contributor

What did you do?

I built Pillow 9.3.0 as downloaded from PyPI and ran the test suite with the following command:
/usr/bin/python3.X -m pytest Tests/test_file_blp.py

What did you expect to happen?

Every test passes.

What actually happened?

test_load_blp1 failed with the following error:

    def test_load_blp1():
        with Image.open("Tests/images/blp/blp1_jpeg.blp") as im:
>           assert_image_equal_tofile(im, "Tests/images/blp/blp1_jpeg.png")

Tests/test_file_blp.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
Tests/helper.py:105: in assert_image_equal_tofile
    assert_image_equal(a, img, msg)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

a = <PIL.BlpImagePlugin.BlpImageFile image mode=RGB size=256x256 at 0x7FF952EF3AD10>
b = <PIL.PngImagePlugin.PngImageFile image mode=RGB size=256x256 at 0x7FF952EF3AF10>, msg = None

    def assert_image_equal(a, b, msg=None):
        assert a.mode == b.mode, msg or f"got mode {repr(a.mode)}, expected {repr(b.mode)}"
        assert a.size == b.size, msg or f"got size {repr(a.size)}, expected {repr(b.size)}"
        if a.tobytes() != b.tobytes():
            if HAS_UPLOADER:
                try:
                    url = test_image_results.upload(a, b)
                    logger.error(f"Url for test images: {url}")
                except Exception:
                    pass
    
>           assert False, msg or "got different content"
E           AssertionError: got different content
E           assert False

I don't see this failure in 9.2.0 - it seems that f7363c1 caused this issue. When I revert it, the issue is gone (when I revert only the BlpImagePlugin.py change, pillow segfaults on newly added test_file_blp.py code, but it passes the assert_image_equal_tofile part this failure is about.

This happens on both SPARC and Intel meaning than endianness is most likely not the issue.

What are your OS, Python and Pillow versions?

  • OS: Oracle Solaris 11.4 (both on SPARC and x86)
  • Python: 3.7.14, 3.9.14 and 3.11.0
  • Pillow: 9.3.0
@radarhere
Copy link
Member

Hi. Just to help us reproduce this, what version of libjpeg are you using? python3.X -m PIL | grep "JPEG support"

@kulikjak
Copy link
Contributor Author

JPEG support ok, compiled for 9.0 - specifically it's 9e.

I will try running it with a different libjpeg and see if that helps things.

@radarhere
Copy link
Member

No need to do that.

JPEG images are allowed to be loaded slightly differently. So presuming that the difference is not too great, the test just needs to be relaxed to allow for variation.

If you change test_file_blp.py, replacing

assert_image_equal_tofile(im, "Tests/images/blp/blp1_jpeg.png")

with

with Image.open("Tests/images/blp/blp1_jpeg.png") as expected:
    assert_image_similar(im, expected., 0.001)

the error should change to report an average pixel difference. What is that number?

@kulikjak
Copy link
Contributor Author

kulikjak commented Nov 16, 2022

I am getting the following:

E           AssertionError:  average pixel value difference 537.3142 > epsilon 0.0010
E           assert 0.001 >= 537.3141937255859

Doesn't seem like a small difference :(.

@kulikjak
Copy link
Contributor Author

I just tested it on older machine where we still have libjpeg v6b2 and I see no issue there.

@radarhere
Copy link
Member

radarhere commented Nov 17, 2022

Testing, I find that 9d is fine, but 9e is not. Interesting. The problem is not Solaris at least.

These are the images with 9d and 9e.

9d 9e
d e

Here is the extracted JPEG

in

And here is what it is resaved by Pillow. So this is essentially a JPEG problem.

9d 9e
d e

@radarhere
Copy link
Member

The reason that this wasn't detected sooner is that Pillow tests with libjpeg-turbo.

@kulikjak
Copy link
Contributor Author

Thanks for the investigation! I renamed the issue to reflect that it's not Solaris but the latest libjpeg that causes this.

Interesting that when I revert f7363c1, it works with 9e as well. But I guess that that might be just a coincidence.

So, should this issue be reported to libjpeg maintainers instead?

@kulikjak kulikjak changed the title test_file_blp.py fails on Solaris since 9.3.0 test_file_blp.py fails with libjpeg 9e since 9.3.0 Nov 22, 2022
@radarhere
Copy link
Member

Do you know where to do that? https://sourceforge.net/p/libjpeg/bugs/ hardly seems active.

@radarhere
Copy link
Member

I've created a reproduction not using Pillow, but jpeglib - https://github.com/radarhere/Pillow/tree/6741. This is a library that allow easy switching between libjpeg versions.

import jpeglib
jpeglib.version.set('9d')
im = jpeglib.read_spatial("extracted.jpg")
print("9d", im.spatial[0, 0])
jpeglib.version.set('9e')
im = jpeglib.read_spatial("extracted.jpg")
print("9e", im.spatial[0, 0])

gives

9d [  1  50   1 255]
9e [255 136 255 255]

@radarhere
Copy link
Member

I've submitted an enquiry using the contact form at https://jpegclub.org/reference/contact/

@kulikjak
Copy link
Contributor Author

Thanks - I also don't know what the correct way to report issues is - https://sourceforge.net/p/libjpeg/bugs/ seems to have one bug filed not that long ago (and with a reaction), but the contact form might also be a good idea.

@radarhere
Copy link
Member

I received a reply. In essence, they have decided to no longer support this particular type of JPEG.

I've created PR #6767 to workaround this, by saying that if we determine that a JPEG is CMYK when reading BLP, then it is CMYK.

Thank you for feedback.

There has been old criticism of the JPEG format that it would not specify the colorspace properly.
In some prior version this issue has been fixed for the 3-component colorspaces (YCC/RGB) in file jdapimin.c, function default_decompress_parms():

  switch (cinfo->num_components) {
  ...

  case 3:
    ...

    /* For robust detection of standard colorspaces
     * regardless of the presence of special markers,
     * check component IDs from SOF marker first.
     */
    ...

After this fix I also received false color reports.
I changed the component IDs in the transmitted file to the correct values by a binary (hexadecimal) editor and then the file displayd correctly.
What has happened? Previously JPEG files with RGB colorspace were primarily recognized by the presence of a custom Adobe marker. RGB JPEG files generated by Adobe software use the component identifiers 1,2,3 which are reserved by JFIF (JPEG File Interchange Format) conventions for the YCbCr colorspace. This Adobe method is therefore wrong. I suspect that Adobe programmers used some "black box" JPEG module from third parties and thus could not access the SOF component IDs, and their "solution" was to introduce the custom marker.
IJG libjpeg software, on the other hand, has always written RGB JPEG files with special SOF component IDs ('R', 'G', 'B')
in file jcparam.c function jpeg_set_colorspace(), beside emitting the custom Adobe marker, so that the files are marked correctly and also work with Adobe software.
Note: we do NOT want to depend on custom Adobe markers for proper standard colorspace detection, that's why the bogus Adobe files are no longer supported.

The new libjpeg version 9e simply applies the same fix to the 4-component colorspaces (YCCK/CMYK).
So you get different colors with wrong SOF component IDs (files generated by bogus Adobe software, for example).
In your case the image was created by apparently bogus "Intel JPEG Library" according to the COM (Comment) marker in the file, and I have attached the modified version with corrected component IDs in the SOF (Start Of Frame) and SOS (Start Of Scan) markers.
Here you can find CMYK JPEG images on the Internet with correct IDs:

Pink Carpet Magazine's

https://pinkcarpetmagazine.wordpress.com/2010/05/15/63-festival-de-cannes/_cmo2106-jpg_cmyk/
https://pinkcarpetmagazine.wordpress.com/2010/05/15/63-festival-de-cannes/_cmo2294-jpg_cmyk/

63 Festival de Cannes 12-23 Mai 2010

These images display the same colors with any libjpeg version, and such kind of images are generated by any libjpeg version when selecting CMYK JPEG output.

Regards
Guido Vollbeding
Organizer Independent JPEG Group

CMYK_corrected

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants