We can't figure out how to have the python implementation find the same key for LongHashFunction.xx().hashChars("test")
You probably need to encode the string as UTF-16, because that is essentially what Java's character arrays are. You would also need to match the byte order, though.
leventov changed the title from "xxHash implementation produces inconsistent results compared to python" to "Document a way for other languages to get the same output for a given input string" on Jul 19, 2019
The default encodings in different languages are not the same. To get the same hash result, the binary layout of the input must be exactly the same, so select a well-defined encoding before hashing.
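To make the byte-layout point concrete, here is a minimal Python sketch of what the UTF-16 suggestion above implies. It assumes that hashChars hashes the string's 16-bit code units in the machine's native byte order, which this thread suggests but does not confirm. The helper names java_chars_layout and to_signed64 are hypothetical and belong to neither library.

```python
import sys


def java_chars_layout(text: str, byteorder: str = sys.byteorder) -> bytes:
    """Encode text as UTF-16 code units without a BOM (hypothetical helper).

    Assumption: this is the layout hashChars sees on a machine with the
    given byte order.
    """
    codec = "utf-16-le" if byteorder == "little" else "utf-16-be"
    return text.encode(codec)


def to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a signed Java-style long."""
    return value - (1 << 64) if value >= (1 << 63) else value


# The same four characters give different input bytes per encoding.
print("test".encode("utf-8").hex())              # 74657374
print(java_chars_layout("test", "little").hex())  # 7400650073007400
print(java_chars_layout("test", "big").hex())     # 0074006500730074
```

Under that assumption, passing java_chars_layout("test") to xxhash.xxh64(...).intdigest() would be the candidate for matching hashChars("test"). Because intdigest() returns an unsigned integer and a Java long is signed, to_signed64 may be needed before comparing values. On the Java side, "test".getBytes() uses the platform default charset, so naming the charset explicitly keeps both sides on the same bytes.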
The xxHash implementation doesn't produce predictable results when compared to Python for strings (as in the given example).
I believe this is due to Java's handling of character arrays: https://codeahoy.com/2016/05/08/the-char-type-in-java-is-broken/
We get predictable results when we dump to a byte array instead:
LongHashFunction.xx().hashBytes("test".getBytes())
gets the same output as
xxhash.xxh64('test').intdigest()
which is the result that we'd expect.
We can't figure out how to have the Python implementation find the same key for
LongHashFunction.xx().hashChars("test")
If there's a way for other languages to get the same output for a given input string, it'd be nice to have it documented.