[Discussion] Support storing reference images as hashes #177

Open
JoseAlcerreca opened this issue Oct 9, 2023 · 6 comments

Comments

@JoseAlcerreca

JoseAlcerreca commented Oct 9, 2023

Not really a feature request yet, but I think this would be a good place to discuss this idea by Alex Vanyo:

If you use a diff threshold of 0, instead of storing the reference images as PNG, store a hash of the file.

The way it works is:

  • The record task takes new screenshots and stores their hashes in their corresponding files. For example in the same folder with an md5 extension: screenshots/ForYouScreenPopulatedAndLoading_foldable.png.md5
  • The verification task takes new screenshots, hashes them and compares with the existing files
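The two steps above can be sketched roughly as follows. This is only an illustration, not Roborazzi's actual tasks: the helper names (`md5Hex`, `hashFileFor`, `recordHash`, `verifyHash`) are hypothetical, and it assumes the hash is taken over the raw bytes of the PNG file.

```kotlin
import java.io.File
import java.security.MessageDigest

// Hex-encoded MD5 of a file's raw bytes.
fun md5Hex(file: File): String =
    MessageDigest.getInstance("MD5")
        .digest(file.readBytes())
        .joinToString("") { "%02x".format(it) }

// Hypothetical sidecar location: screenshots/Foo.png -> screenshots/Foo.png.md5
fun hashFileFor(screenshot: File): File = File(screenshot.path + ".md5")

// Record: store the hash of a freshly captured screenshot next to it.
fun recordHash(screenshot: File) {
    hashFileFor(screenshot).writeText(md5Hex(screenshot))
}

// Verify: hash the new screenshot and compare it with the stored hash.
// A missing hash file counts as a failure.
fun verifyHash(screenshot: File): Boolean {
    val golden = hashFileFor(screenshot)
    return golden.exists() && golden.readText().trim() == md5Hex(screenshot)
}

fun main() {
    val shot = File.createTempFile("ForYouScreenPopulatedAndLoading_foldable", ".png")
    shot.writeBytes(byteArrayOf(1, 2, 3))   // stand-in for real PNG bytes
    recordHash(shot)
    println(verifyHash(shot))               // true
    shot.writeBytes(byteArrayOf(1, 2, 4))   // simulate a changed screenshot
    println(verifyHash(shot))               // false
}
```

With this layout only the small `.md5` files would need to be committed; the PNGs themselves could stay out of version control.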

👍 It eliminates the problem with large files and how to store them (git LFS, cloud buckets...)
👎 It makes development less intuitive because screenshots no longer live alongside the code (but this is true with cloud buckets, different branches and arguably git LFS)
👎 Reports are tricky because Roborazzi can't know what the base branch is (the commit that was used to generate the existing reference images).

This is doable on CI (see my prototype and example PR), but:

👎 there's no easy way to run screenshot tests locally
👎 👎 the PR doesn't show any screenshots so you would have to do something like takahirom/roborazzi-compare-on-github-comment-sample#1 which complicates the workflow even more.

Some very crude ideas:

  • Store the commit ID that was used to generate each screenshot. Roborazzi could run a command to check out the commit to generate the reports. This is not great.
  • Create a GitHub Action (and Bitrise, Bitbucket...?) that takes care of everything, so that at least on CI the developer experience is decent.
@takahirom
Owner

takahirom commented Oct 9, 2023

As you mentioned in this issue, in general app development, it's challenging to manage changes if you can't see what's different between images. Therefore, storing images as hashes might not be practical. However, for UI library development where not a single pixel changes, hashing could be useful. Even so, having some way to view the diff when changes do occur would be beneficial. 👀
Hashing could be offered as an option through gradle.properties and record options. I'm open to adding this feature, although I can't quite envision how it would work in practice.

@alexvanyo

It might be interesting if the hash output can be done by Roborazzi in addition to storing the image itself, as opposed to it being an exclusive choice.

I could see an approach where Roborazzi records both the image and the hash (or just one or the other, depending on configuration). Then during verification for a test with a diff threshold of 0, it can use all available golden information:

  • if a golden image is present, then it is used directly for verification, and for the visual report if the test fails. If a golden hash is also present, maybe sanity check that the golden hash is the hash for the golden image?
  • if the golden image isn't present, but a golden hash is, then run the test based on the resulting hash value. If the check fails, the test will fail, but there's no "before" image to nicely use in a report
  • if neither the golden hash nor the golden image is present, then the test fails due to missing goldens
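A minimal sketch of that decision logic, assuming a diff threshold of 0 and a hash over the encoded file bytes. The names `VerifyResult` and `verify`, and the extra "inconsistent goldens" outcome for the sanity check, are hypothetical and only illustrate the three cases above.

```kotlin
import java.security.MessageDigest

// Hypothetical outcome type for the three cases described above.
sealed interface VerifyResult {
    object Passed : VerifyResult
    data class Failed(val hasBeforeImage: Boolean) : VerifyResult
    object MissingGoldens : VerifyResult
    object InconsistentGoldens : VerifyResult // golden hash does not match golden image
}

fun md5Hex(bytes: ByteArray): String =
    MessageDigest.getInstance("MD5").digest(bytes)
        .joinToString("") { "%02x".format(it) }

// goldenImage / goldenHash are null when the corresponding file is absent.
fun verify(actual: ByteArray, goldenImage: ByteArray?, goldenHash: String?): VerifyResult {
    val actualHash = md5Hex(actual)
    return when {
        goldenImage != null -> {
            val imageHash = md5Hex(goldenImage)
            when {
                // Optional sanity check when both goldens exist.
                goldenHash != null && goldenHash != imageHash -> VerifyResult.InconsistentGoldens
                imageHash == actualHash -> VerifyResult.Passed
                // A "before" image exists, so a visual report is possible.
                else -> VerifyResult.Failed(hasBeforeImage = true)
            }
        }
        goldenHash != null ->
            if (goldenHash == actualHash) VerifyResult.Passed
            else VerifyResult.Failed(hasBeforeImage = false) // no image for a report
        else -> VerifyResult.MissingGoldens
    }
}

fun main() {
    val shot = byteArrayOf(1, 2, 3)
    println(verify(shot, goldenImage = null, goldenHash = md5Hex(shot)) == VerifyResult.Passed)
    println(verify(shot, goldenImage = null, goldenHash = null) == VerifyResult.MissingGoldens)
}
```

Keeping the result type explicit about whether a "before" image exists would let a report step decide whether it can render a visual diff.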

What happens to the generated golden images in the version control system could be left up to the project: maybe they choose to check in the raw images directly anyway, maybe they use Git LFS, maybe they prevent the images from being committed using .gitignore or a similar mechanism (and then undertake a more complicated CI setup to regenerate the old images using a base branch, and additional work to make it visible in the pull request).

@takahirom
Owner

takahirom commented Oct 12, 2023

Thank you. I'll think about it while making a prototype. I'm wondering whether to base the MD5 calculation on the image's pixels or the file's binary. I'm unsure about what to use as the seed. Do you have any recommendations?
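To make the two options concrete, here is a rough sketch using the JDK's `BufferedImage` and `ImageIO`. The function names `fileHash` and `pixelHash` are hypothetical. The trade-off it illustrates: a hash of the file's binary depends on the PNG encoder, compression settings, and metadata, while a hash of the decoded pixels depends only on the dimensions and ARGB values.

```kotlin
import java.awt.image.BufferedImage
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer
import java.security.MessageDigest
import javax.imageio.ImageIO

fun md5Hex(bytes: ByteArray): String =
    MessageDigest.getInstance("MD5").digest(bytes)
        .joinToString("") { "%02x".format(it) }

// Option 1: hash the encoded file. Changes whenever the encoder output changes,
// even if every pixel is identical.
fun fileHash(pngBytes: ByteArray): String = md5Hex(pngBytes)

// Option 2: hash the decoded pixels (width, height, then ARGB values row by row).
fun pixelHash(image: BufferedImage): String {
    val buffer = ByteBuffer.allocate(8 + 4 * image.width * image.height)
    buffer.putInt(image.width).putInt(image.height)
    for (y in 0 until image.height) {
        for (x in 0 until image.width) {
            buffer.putInt(image.getRGB(x, y))
        }
    }
    return md5Hex(buffer.array())
}

fun main() {
    val image = BufferedImage(2, 2, BufferedImage.TYPE_INT_ARGB)
    image.setRGB(0, 0, 0xFFFF0000.toInt()) // one opaque red pixel

    // Encode to PNG in memory, then decode again.
    val png = ByteArrayOutputStream().also { ImageIO.write(image, "png", it) }.toByteArray()
    val decoded = ImageIO.read(png.inputStream())

    println("file hash:  ${fileHash(png)}")
    println("pixel hash: ${pixelHash(image)}")
    println("pixel hash survives PNG round trip: ${pixelHash(image) == pixelHash(decoded)}")
}
```

Including the width and height in the pixel hash keeps two images with the same pixel sequence but different shapes from colliding.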

@JoseAlcerreca changed the title from "[Discussion] Support storing references images as hashes" to "[Discussion] Support storing reference images as hashes" on Oct 19, 2023
@takahirom
Owner

takahirom commented Oct 20, 2023

I've noticed that the environment can greatly affect image pixels. Thus, it would be ideal if we could customize the hashing in a manner similar to Dropbox's ImageComparator, especially since we already use the maxDistance from Dropbox's SimpleImageComparator.

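data class Color(val r: Float, val g: Float, val b: Float, val a: Float = 1.0f)

interface Image {
  val width: Int
  val height: Int
  fun getPixel(x: Int, y: Int): Color
}

// Placeholder for the region to ignore while hashing; its shape is not specified in this sketch.
interface Mask

interface ImageHashComparator {

  data class HashResult(
    val hashString: String
  )

  // Computes a hash for the image, optionally ignoring the masked region.
  fun hash(image: Image, mask: Mask? = null): HashResult

  // Default comparison: the two hashes must match exactly.
  fun areSimilar(hashResultA: HashResult, hashResultB: HashResult): Boolean = hashResultA == hashResultB
}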

@takahirom
Owner

I'm not sure if it's feasible, but I've heard that by using pHash (perceptual hash), you can determine if images are similar based on how close their hash values are. Additionally, with the advancements in generative AI and compression algorithms, there might be other possibilities.
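As an illustration of the "closeness of hash values" idea, here is a sketch of an average hash (aHash), a simpler relative of pHash that skips the DCT step. It is not pHash itself. The function names and the `maxDistance` default are assumptions for illustration; the input is a grayscale grid with values 0..255.

```kotlin
// Downscale a grayscale image (values 0..255) to 8x8 by averaging blocks.
fun downscaleTo8x8(gray: Array<IntArray>): IntArray {
    val h = gray.size
    val w = gray[0].size
    val out = IntArray(64)
    for (by in 0 until 8) {
        for (bx in 0 until 8) {
            val y0 = by * h / 8
            val y1 = maxOf((by + 1) * h / 8, y0 + 1)
            val x0 = bx * w / 8
            val x1 = maxOf((bx + 1) * w / 8, x0 + 1)
            var sum = 0
            var n = 0
            for (y in y0 until y1) for (x in x0 until x1) { sum += gray[y][x]; n++ }
            out[by * 8 + bx] = sum / n
        }
    }
    return out
}

// Average hash: one bit per cell, set when the cell is brighter than the mean.
fun averageHash(gray: Array<IntArray>): Long {
    val small = downscaleTo8x8(gray)
    val mean = small.average()
    var bits = 0L
    for (i in 0 until 64) if (small[i] > mean) bits = bits or (1L shl i)
    return bits
}

// Number of differing bits between two hashes.
fun hammingDistance(a: Long, b: Long): Int = java.lang.Long.bitCount(a xor b)

// Hypothetical threshold: hashes within maxDistance bits count as similar.
fun areSimilar(a: Long, b: Long, maxDistance: Int = 5): Boolean =
    hammingDistance(a, b) <= maxDistance

fun main() {
    // 16x16 horizontal gradient, a slightly brighter copy, and an inverted copy.
    val base = Array(16) { IntArray(16) { x -> (x / 2) * 16 } }
    val brighter = Array(16) { y -> IntArray(16) { x -> base[y][x] + 2 } }
    val inverted = Array(16) { y -> IntArray(16) { x -> 255 - base[y][x] } }

    println(hammingDistance(averageHash(base), averageHash(brighter))) // 0
    println(hammingDistance(averageHash(base), averageHash(inverted))) // 64
}
```

A uniform brightness shift leaves the hash unchanged because every cell and the mean move together, which is the kind of environment noise an exact MD5 comparison cannot tolerate.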

@takahirom
Owner

I've created a prototype of this feature. Some tests still need to be added, but I believe the implementation will resemble the current one. The logic is somewhat complex; however, I'm open to introducing it if there's a team interested. Without users for this feature, I'm hesitant to integrate it. If you're considering it, please leave a reaction.
https://github.com/takahirom/roborazzi/pull/204/files#diff-7ba98c05ac15006c23589beeee900e7836c1b28e167b6cd85e55effe788c2736R6


3 participants