Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tp_richcompare handler for Imaging_Type/ImagingCore #7260

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

Yay295
Copy link
Contributor

@Yay295 Yay295 commented Jul 5, 2023

Imaging_Type/ImagingCore is exposed by Image.getdata(), so this allows you to properly compare the result of that method without it being an identity comparison. This should also speed up comparison between Image objects, because now the comparison is done in C instead of passing the data back to Python for it to be compared.

Tests/test_lib_image.py Outdated Show resolved Hide resolved
@radarhere
Copy link
Member

Valgrind is currently failing - https://github.com/python-pillow/Pillow/actions/runs/5460694184/jobs/9937936242?pr=7260

Also, I went to test the speed of image equality with this, radarhere@98e02aa... and found https://github.com/radarhere/Pillow/actions/runs/5461893489/jobs/9940492884#step:8:4787

Fatal Python error: bool_dealloc: deallocating True or False: bug likely caused by a refcount error in a C extension

@Yay295
Copy link
Contributor Author

Yay295 commented Jul 5, 2023

oops, Py_NewRef wasn't added until Python 3.10.

} else if (!strcmp(mode, "LA") || !strcmp(mode, "La") || !strcmp(mode, "PA")) {
// These modes have two channels in four bytes,
// so we have to ignore the middle two bytes.
mask = 0xff0000ff;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't need to ignore the middle two bytes, they should be equal to the first two. See

static void
rgb2la(UINT8 *out, const UINT8 *in, int xsize) {
int x;
for (x = 0; x < xsize; x++, in += 4, out += 4) {
/* ITU-R Recommendation 601-2 (assuming nonlinear RGB) */
out[0] = out[1] = out[2] = L24(in) >> 16;
out[3] = 255;
}
}
for example.

I initially wondered if you were adding this mask for speed, but looking at the next block, removing this mask would allow memcmp to be used, which I expect would be faster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"they should be equal to the first two" You'd think so, but they're not: https://github.com/Yay295/Pillow/actions/runs/5471546420/jobs/9962793523#step:8:1757

This is why I added those last two tests. Except I couldn't find a way to set this up from Python so I marked them to skip.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some logging: https://github.com/Yay295/Pillow/actions/runs/5472920560/jobs/9965699214#step:8:1759

After im.reduce() the data is 0xffb8b8b8, but after im.crop().reduce() the data is 0xff0000b8.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they should be equal to the first two

They set to the same with rgb2la conversion, but this doesn't mean that they should be equal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new method to ImagingCore to allow directly setting the internal bytes. All of the tests work now.

also, RGBX technically has 4 channels, so it shouldn't be masked.
also, fix the mask for modes with three channels.

@pytest.mark.parametrize(
("mode", "rawmode"),
(("RGB", "RGBX"), ("YCbCr", "YCbCrX"), ("HSV", None), ("LAB", None)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more semantically.

Suggested change
(("RGB", "RGBX"), ("YCbCr", "YCbCrX"), ("HSV", None), ("LAB", None)),
[("RGB", "RGBX"), ("YCbCr", "YCbCrX"), ("HSV", None), ("LAB", None)],

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A previous decision was to use tuples. #6525 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've provided a detailed answer there. Hope @radarhere will agree with me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example you can look at the Django settings documentation. Settings are almost always immutable, but both lists and tuples are used for different things. Tuples are used for ADMINS items (first value is a name, second is an email), LANGUAGES items, SECURE_PROXY_SSL_HEADER (first value is a name, second is a value of header). While lists are used for all sorts of… well… lists.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not overly concerned about which is used, and I don't think performance considerations are exactly a priority for tests, so I'm happy to go with @homm's preference here.



@pytest.mark.skip(reason="no way to directly set C bytes from Python")
@pytest.mark.parametrize("mode", ("LA", "La", "PA"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@pytest.mark.parametrize("mode", ("LA", "La", "PA"))
@pytest.mark.parametrize("mode", ["LA", "La", "PA"])

src/_imaging.c Outdated
@@ -3534,6 +3534,34 @@ _save_ppm(ImagingObject *self, PyObject *args) {
return Py_None;
}

static PyObject *
_set_internal_pixel_bytes(ImagingObject *self, PyObject *args) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't think this is a good idea to add core method just for testing. The problems are:

  • It needs maintenance in future
  • It exposes internals which will be harder to change
  • This method itself doesn't tested

However, for a long time I suffer from lack of functionality just to change internal model of the image without copying. Probably we can add something like ImagingCore.rewrite_mode() which will work within the same pixelsize.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It exposes internals which will be harder to change

I don't think it exposes anything that wouldn't also be exposed by ImagingCore.rewrite_mode().

This method itself doesn't tested

That can be added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we can add something like ImagingCore.rewrite_mode() which will work within the same pixelsize.

Unfortunately, pixelsize/linesize are not available anywhere except on an image - there is no list to loop through that has this information.

Copy link
Member

@homm homm Jul 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn, right. Maybe create minimal image in the target mode just to check it's pixel size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would work. I still don't see how this is any better than the function I already added though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can make Image.mode a property, but Image.mode is currently writable, while Image.im.mode is not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created #7271 to look into that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To unblock this PR I'd be happy with tests for modes we can test without core changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With ImagingCore.rewrite_mode() there are none, because ImagingCore.rewrite_mode() cannot update the mode that is currently returned from Image.mode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created Yay295#8 as a suggestion.

Use single integer color instead of adding set_internal_pixel_bytes()
assert img_a.im != img_b.im


@pytest.mark.parametrize("mode", [mode for mode in mode_names_not_bgr if mode != "1"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is 1 excluded here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because if you scroll up a bit, there's test_not_equal_mode_1 just for mode 1 images, with comments explaining why it's different.

Basically, it's easier to create a mode 1 image from bytes if we use rawmode "1;8".

Comment on lines +68 to +69
# alternatively, random.randbytes() in Python 3.9
data = secrets.token_bytes(num_img_bytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from oss-fuzz, Pillow doesn't use random data in the test suite.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different image modes need different amounts of bytes to create them, so this seemed like the best way to get enough bytes, and then get a second set of bytes that are different.

|| _compare_pixels(
palette_a->mode,
1,
palette_a->size * 4,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this multiplied by 4?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The third parameter is the line size, but palettes don't have a line size, so it has to be calculated from the number of pixels and the pixel size. Palettes also don't have a stored pixel size, but currently it's always 4. It's not good to hardcode it like this, but there isn't anywhere to actually get it from.

Comment on lines +24 to +28
# Image.frombytes() doesn't work with BGR modes:
# unknown raw mode for given image mode
# "BGR;15",
# "BGR;16",
# "BGR;24",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed these errors in #7303, but have since discovered that is not sufficient to get this PR working for those modes. The problem is that those modes are using their lines efficiently, rather than spacing out their data into 4 bytes per pixel.

im->pixelsize = 3;
im->linesize = (xsize * 3 + 3) & -4;

@radarhere radarhere mentioned this pull request Apr 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants