Invalid CRC for framing modes #25

Open
monoxgas opened this issue Oct 15, 2019 · 3 comments

monoxgas commented Oct 15, 2019

snzip/crc32.h

Line 15 in 809c6f2

unsigned int crc = ~calculate_crc32c(~0, (const unsigned char *)buf, len);

It appears that, when calculating the masked CRC in the function above, you apply a bitwise NOT to the result before masking. I'm unsure which formats require this, if any (possibly snappy-in-java), but it produces a CRC that does not match most other public implementations of framing2. Specifically:

I was hesitant to make a PR until I verified which formats might require it, but it should be an easy fix (I'd be happy to do it). Possibly two different functions, one which inverts and one which does not.

@monoxgas

Also, for quick reference, the CRC-32 section of the current framing2 spec: https://github.com/google/snappy/blob/e9e11b84e629c3e06fbaa4f0a86de02ceb9d6992/framing_format.txt#L39
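
For readers without the spec open, the masking step it describes can be sketched as follows. This assumes the wording of the framing format text: the stored value is the CRC-32C rotated right by 15 bits, plus the constant 0xa282ead8 with ordinary 32-bit wraparound. The name `unmask_checksum` is a hypothetical helper added for illustration; neither function is taken from snzip.

```c
#include <stdint.h>

/* Mask a CRC-32C as the framing format describes:
 * rotate right by 15 bits, then add a constant modulo 2^32. */
static uint32_t mask_checksum(uint32_t crc)
{
    return (uint32_t)(((crc >> 15) | (crc << 17)) + UINT32_C(0xa282ead8));
}

/* Hypothetical inverse: subtract the constant, then rotate left by 15 bits. */
static uint32_t unmask_checksum(uint32_t masked)
{
    uint32_t rot = (uint32_t)(masked - UINT32_C(0xa282ead8));
    return (uint32_t)((rot << 15) | (rot >> 17));
}
```

Because the mask is applied after the CRC is finalized, any disagreement about the final complement of the CRC carries through to the stored value.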

monoxgas commented Oct 15, 2019

The plot thickens.

Upon further inspection, I believe the implementation here is indeed correct and it's the Mozilla repo which is at fault. I also erroneously claimed that python-snappy disagrees with the CRC checksums currently being computed; in fact, they match.

Specifically, the formal spec of CRC-32/ISCSI (CRC-32C) states that the initial value should be 0xffffffff (~0) and that the final value should be complemented: x ^ 0xffffffff, i.e. x = ~x. This is exactly what is happening here:

snzip/crc32.h

Line 15 in 809c6f2

unsigned int crc = ~calculate_crc32c(~0, (const unsigned char *)buf, len);
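
To make the role of the two complements concrete, here is a minimal bitwise sketch of CRC-32C. It is not snzip's implementation (which is not shown in this thread); `crc32c_update`, `crc32c` and `crc32c_no_final_xor` are illustrative names, and the loop uses the reflected Castagnoli polynomial 0x82F63B78. The widely published check value for CRC-32/ISCSI over the ASCII string "123456789" is 0xE3069283, which is only reached when both the ~0 initial value and the final complement are applied.

```c
#include <stddef.h>
#include <stdint.h>

/* Feed bytes into a running CRC-32C register (no init, no final XOR). */
static uint32_t crc32c_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++) {
            /* Shift out one bit; XOR in the polynomial if that bit was set. */
            uint32_t low_bit_set = crc & 1u;
            crc >>= 1;
            if (low_bit_set)
                crc ^= UINT32_C(0x82F63B78);
        }
    }
    return crc;
}

/* Standard CRC-32C: initial value ~0, final value complemented. */
static uint32_t crc32c(const unsigned char *buf, size_t len)
{
    return crc32c_update(UINT32_C(0xffffffff), buf, len) ^ UINT32_C(0xffffffff);
}

/* Variant that skips the final complement, shown only for comparison. */
static uint32_t crc32c_no_final_xor(const unsigned char *buf, size_t len)
{
    return crc32c_update(UINT32_C(0xffffffff), buf, len);
}
```

Calling `crc32c((const unsigned char *)"123456789", 9)` should yield 0xE3069283, while the variant without the final XOR yields its bitwise complement.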

References:

However, this still creates a small pickle, as there is now technically another "flavour" of this single compression format. Given that the difference is so small, I would hesitate to create an entirely new format such as mozilla-framing, but whatever fits the project best. An alternative solution might be to compare against both the correct checksum and its complement when verifying. It's technically inaccurate, but might save some headache and should never cause any real issues.
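
The second idea could look roughly like the sketch below. It assumes that the non-conforming writers store the masked form of the CRC without its final complement; that is inferred from the discussion above, not verified against those implementations. All names (`check_masked_crc`, `enum crc_match`, and so on) are hypothetical and do not exist in snzip.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C: init ~0, reflected polynomial 0x82F63B78, final complement. */
static uint32_t crc32c(const unsigned char *buf, size_t len)
{
    uint32_t crc = UINT32_C(0xffffffff);
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc & 1u) ? (crc >> 1) ^ UINT32_C(0x82F63B78) : (crc >> 1);
    }
    return crc ^ UINT32_C(0xffffffff);
}

/* Framing-format mask: rotate right by 15 bits, add constant modulo 2^32. */
static uint32_t mask_checksum(uint32_t crc)
{
    return (uint32_t)(((crc >> 15) | (crc << 17)) + UINT32_C(0xa282ead8));
}

enum crc_match {
    CRC_MISMATCH = 0,         /* neither variant matches */
    CRC_MATCH_STRICT = 1,     /* matches the standard CRC-32C */
    CRC_MATCH_COMPLEMENT = 2  /* matches the CRC without its final complement */
};

/* Accept the stored masked checksum if it matches either variant. */
static enum crc_match check_masked_crc(uint32_t stored,
                                       const unsigned char *buf, size_t len)
{
    uint32_t crc = crc32c(buf, len);
    if (stored == mask_checksum(crc))
        return CRC_MATCH_STRICT;
    if (stored == mask_checksum(crc ^ UINT32_C(0xffffffff)))
        return CRC_MATCH_COMPLEMENT;
    return CRC_MISMATCH;
}
```

Returning which variant matched, rather than a plain boolean, would let a decompressor warn about non-conforming input while still decoding it.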

kubo (Owner) commented Oct 16, 2019

I made a small program to compare the results of the implementation with CRC examples.

$ git clone https://github.com/kubo/snzip.git
$ cd snzip
$ ./autogen.sh
$ ./configure
$ wget https://gist.githubusercontent.com/kubo/c6bc9f038e3dd66bfb37da4f8596d119/raw/b97e2a7d325fc15edbc149a036f0605f79a2e22d/crc32_test.c
  ...omitted...
$ gcc -o crc32_test crc32_test.c crc32.c crc32_sse4_2.c -msse4.2
$ ./crc32_test 
32 bytes of zeroes: OK
32 bytes of ones: OK
32 bytes of incrementing 00..1f: OK
32 bytes of decrementing 1f..00: OK
An iSCSI - SCSI Read (10) Command PDU: OK

The results match the CRC examples.

As for your two ideas, I prefer the former. However, I will postpone my decision for a while.
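
For anyone who wants to repeat the comparison without the linked gist, a self-contained sketch is given below. It is not the crc32_test.c program above (whose contents are not reproduced in this thread), and it uses a plain bitwise CRC-32C instead of snzip's crc32.c. The expected values are the commonly cited CRC-32C results for the four fixed 32-byte patterns, which the test names above appear to correspond to (the CRC examples in RFC 3720, Appendix B.4). The iSCSI PDU case is left out because its input bytes are not given here. `check_crc32c_examples` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32C: init ~0, reflected polynomial 0x82F63B78, final complement. */
static uint32_t crc32c(const unsigned char *buf, size_t len)
{
    uint32_t crc = UINT32_C(0xffffffff);
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc & 1u) ? (crc >> 1) ^ UINT32_C(0x82F63B78) : (crc >> 1);
    }
    return crc ^ UINT32_C(0xffffffff);
}

/* Returns the number of fixed-pattern examples whose CRC does not match. */
static int check_crc32c_examples(void)
{
    unsigned char buf[32];
    int failures = 0;
    size_t i;

    memset(buf, 0x00, sizeof buf);                 /* 32 bytes of zeroes */
    failures += crc32c(buf, sizeof buf) != UINT32_C(0x8a9136aa);

    memset(buf, 0xff, sizeof buf);                 /* 32 bytes of ones */
    failures += crc32c(buf, sizeof buf) != UINT32_C(0x62a8ab43);

    for (i = 0; i < sizeof buf; i++)               /* incrementing 00..1f */
        buf[i] = (unsigned char)i;
    failures += crc32c(buf, sizeof buf) != UINT32_C(0x46dd794e);

    for (i = 0; i < sizeof buf; i++)               /* decrementing 1f..00 */
        buf[i] = (unsigned char)(31 - i);
    failures += crc32c(buf, sizeof buf) != UINT32_C(0x113fdb5c);

    return failures;
}
```

A return value of 0 from `check_crc32c_examples()` means all four patterns agree with the expected values, which in turn depends on both the ~0 initial value and the final complement being applied.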
