Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve perfect hashing #411

Open
tdewolff opened this issue May 11, 2021 · 0 comments
Open

Improve perfect hashing #411

tdewolff opened this issue May 11, 2021 · 0 comments

Comments

@tdewolff
Copy link
Owner

From tdewolff/parse#62

Quickly map []byte to numbers from 0 to N, for a pre-defined set of N strings. Make []byte => Hash fast. Move the Hasher code into this repository.

Hash the []byte, check if it is lower than N, if so verify that it is the same string using bytes.Equal. The latter could be skipped if we are sure there are no collisions below a certain length of input bytes (can we optimize the search for hash seeds for this?).

Promising hash functions are https://github.com/dgryski/go-metro, https://github.com/dgryski/go-farm, and https://github.com/cespare/xxhash. Especially farm seems to be fast for short strings (len < 50).

See http://aras-p.info/blog/2016/08/09/More-Hash-Function-Tests/ for comparisons.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant