Change Instance-ID algorithm to BLAKE3 #87

Open
titusz opened this issue Jun 30, 2020 · 5 comments

@titusz
Member

titusz commented Jun 30, 2020

BLAKE3 turns out to be the ideal cryptographic hash for the Instance-ID. As stated by its developers, BLAKE3 is:

  • Much faster than MD5, SHA-1, SHA-2, SHA-3, and BLAKE2 (about 10x the speed of SHA-256 in our tests).
  • Secure, unlike MD5 and SHA-1. And secure against length extension, unlike SHA-2.
  • Highly parallelizable across any number of threads and SIMD lanes, because it's a Merkle tree on the inside.
  • Capable of verified streaming and incremental updates, again because it's a Merkle tree.
  • A PRF, MAC, KDF, and XOF, as well as a regular hash.
  • One algorithm with no variants, which is fast on x86-64 and also on smaller architectures.

For details see: https://github.com/BLAKE3-team/BLAKE3-specs/blob/master/blake3.pdf
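
The incremental-update property listed above can be sketched as follows. This is a minimal illustration, not the ISCC Instance-ID algorithm itself: it assumes the third-party `blake3` Python package is installed and falls back to `hashlib.blake2b` (a different hash, used only as a stand-in) when it is not. The helper name `instance_digest` is hypothetical.

```python
import hashlib

try:
    # Third-party package (assumed installed via `pip install blake3`).
    from blake3 import blake3 as new_hasher
except ImportError:
    # Stand-in so the sketch still runs; this is BLAKE2b, NOT BLAKE3.
    def new_hasher():
        return hashlib.blake2b(digest_size=32)


def instance_digest(chunks):
    """Hash an iterable of byte chunks incrementally and return a 32-byte digest.

    Hypothetical helper for illustration only.
    """
    hasher = new_hasher()
    for chunk in chunks:
        hasher.update(chunk)  # feed data piece by piece, e.g. while streaming a file
    return hasher.digest()


# Feeding the data in pieces yields the same digest as feeding it in one go.
streamed = instance_digest([b"hello ", b"world"])
one_shot = instance_digest([b"hello world"])
```

Because the digest does not depend on how the input is chunked, large files can be hashed while they are being read, without loading them into memory first.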

@titusz titusz added this to the Version 1.1 milestone Jun 30, 2020
@titusz titusz self-assigned this Jun 30, 2020
@titusz titusz changed the title from "Change Instance-ID algorithm to BLAKE2" to "Change Instance-ID algorithm to BLAKE3" Jun 30, 2020
@lrosenthol

What about supporting multihash (https://multiformats.io/multihash/) which would (a) allow implementors to choose the right algorithm for their implementation and (b) support forward thinking (since all hashes will be broken at some point)?

@titusz
Member Author

titusz commented Aug 30, 2020

Yes, we should think carefully about self-descriptiveness and forward compatibility. The ISCC is a composition of multiple hashes that can also be used separately if required. One way would be to give each component a 2-byte header in which we can encode the type, version, length, and eventually type-specific header information. Something like this:
[Image: ISCC-Component-Structure]
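
A 2-byte component header of this kind could be packed roughly as below. The field layout here (four 4-bit fields for type, version, length code, and flags) is purely an assumption for illustration; the actual layout proposed in the thread is in the image and is not reproduced here. The function names are hypothetical.

```python
def pack_header(ctype, version, length_code, flags=0):
    """Pack four 4-bit fields into a 2-byte big-endian header.

    Hypothetical layout: [type:4][version:4][length_code:4][flags:4].
    """
    fields = (("ctype", ctype), ("version", version),
              ("length_code", length_code), ("flags", flags))
    for name, value in fields:
        if not 0 <= value <= 0xF:
            raise ValueError(f"{name} must fit in 4 bits, got {value}")
    packed = (ctype << 12) | (version << 8) | (length_code << 4) | flags
    return packed.to_bytes(2, "big")


def unpack_header(header):
    """Inverse of pack_header: return (ctype, version, length_code, flags)."""
    if len(header) != 2:
        raise ValueError("header must be exactly 2 bytes")
    packed = int.from_bytes(header, "big")
    return (packed >> 12) & 0xF, (packed >> 8) & 0xF, (packed >> 4) & 0xF, packed & 0xF


header = pack_header(ctype=1, version=0, length_code=2, flags=3)
fields = unpack_header(header)
```

A fixed-width header like this keeps each component self-describing while adding a constant, small overhead, which is the trade-off the comment is weighing.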

@lrosenthol

My point, though, @titusz, is that there is already a standard for this; see my link in the previous comment. There is no reason to reinvent the wheel.

@titusz
Member Author

titusz commented Aug 31, 2020

@lrosenthol thank you for pointing this out. Adopting existing standards is indeed preferable where it makes sense. I am following the development of multiformats closely and have also been experimenting with multihash.

On the ISCC component level we currently have a 1-byte header plus 8-byte body structure. To conform to multihash we would need to add a minimum of 2 bytes of header data per component to indicate type and length (we also need to indicate version and subtype-specific flags). A multihash representation at the level of the full ISCC (4 components combined) might be a good idea.
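
To make the 2-byte minimum concrete, here is a minimal sketch of multihash framing: a varint hash-function code, a varint digest length, then the digest bytes. It uses SHA-256 from the standard library (multihash code `0x12`) rather than an ISCC component hash, and the helper names are hypothetical.

```python
import hashlib

SHA2_256_CODE = 0x12  # multihash code for sha2-256


def encode_varint(n):
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def to_multihash(code, digest):
    """Prefix a digest with its function code and length, multihash style."""
    return encode_varint(code) + encode_varint(len(digest)) + digest


digest = hashlib.sha256(b"hello").digest()
full = to_multihash(SHA2_256_CODE, digest)           # 2 + 32 bytes
truncated = to_multihash(SHA2_256_CODE, digest[:8])  # 2 + 8 bytes
```

For an 8-byte body, the two prefix bytes already add 25% to the size, before any version or subtype information is encoded.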

Multihashes are presented in base16 (hex) encoding. For the printable representation, ISCC currently uses a more compact base58 encoding with a custom alphabet for human readability of the component type. So we would need to add the ISCC encoding to the multibase table and prepend another character per component. That brings us to at least 3 bytes of overhead per component while still missing the required version and subtype information.

Code compactness is a crucial design target for the ISCC. We have been collecting feedback on the expectations and requirements for the ISCC from a broad community. There are still some open questions about the final byte structure layout of the ISCC, and we need to get those right first. Once that is stable, there can be support for different printable encodings, including multihash.

@titusz
Member Author

titusz commented Dec 17, 2021

@lrosenthol I have opened an issue to make the next version of ISCC multiformats compatible: multiformats/multicodec#252 (comment)
