Allow hashers other than SipHasher #88
Comments
It's definitely possible, though it'll be a bit interesting to see how to configure it at compile time. It is worth noting that the current algorithm appears to need a very high quality hash function. xxhash works, and we used to use it, but some lower quality functions prevent the algorithm from finding a solution. I know I tried Java-style hashes at one point that didn't work, and I may have tried FNV as well.
Oh, I see, the algorithm needs specific hash functions too. I guess a more specific trait (more specific than
Today on IRC:
Would seahash work for this purpose?
I haven't tried, but I think SeaHash is supposed to produce pretty high quality output so I wouldn't be surprised if it did.
Is PCG hash another alternative?
In many cases it is possible to find a very simple minimal hash function. For example, when looking up an ECC public key by value in a table of ~200 public keys, you can probably find a hash function that only looks at a single byte or a few bytes of the key and does 0, 1, or a few bitwise operations on those bytes. IMO it would be ideal if I could pass my own hash function body to the generator, and have the generator verify that it is indeed a (minimal) perfect hash function for the input data, and then have it use my function.
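The verification step described above can be sketched in a few lines. This is only an illustration, not anything phf provides: `is_minimal_perfect_hash` is a hypothetical helper, and it assumes the user's function already returns a slot index rather than a raw hash value.

```rust
/// Hypothetical helper (not part of phf): returns true if `hash` maps the
/// given keys one-to-one onto the slots 0..keys.len(), i.e. it is a minimal
/// perfect hash function for exactly this key set.
fn is_minimal_perfect_hash<K, F>(keys: &[K], hash: F) -> bool
where
    F: Fn(&K) -> usize,
{
    // One flag per slot; a slot may be claimed by at most one key.
    let mut seen = vec![false; keys.len()];
    for key in keys {
        let slot = hash(key);
        // Out of range means not minimal; already taken means not perfect.
        if slot >= keys.len() || seen[slot] {
            return false;
        }
        seen[slot] = true;
    }
    // n keys landed in n distinct in-range slots, so the mapping is a bijection.
    true
}
```

For instance, with four 4-byte keys whose first bytes are `0x10`, `0x21`, `0x32`, and `0x43`, the candidate `|k| (k[0] & 0x03) as usize` passes this check, while a candidate that looks at a byte shared by all keys fails it.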
Out of curiosity I tried out swapping in the FxHash and Seahash crates here. FxHash did not finish compiling the tests; Seahash works but is slower on the benchmarks here (takes about 60ns instead of about 30ns on my laptop). If that's not too disappointing to folks interested in trying another hash--it looks like the main complication to making this generic would be making the
@djudd Odd; I had the impression that seahash was supposed to be faster.
With the fixes in #164 I was able to get FNVHasher solving 1M entries, but it was slower than SipHasher. |
@abonander do you mean slower at constructing the map, or looking up one key in it?
@SimonSapin constructing the map; I haven't benchmarked accessing it, but I suspect it would be a relatively small difference since the hash function is only executed once per lookup.
Hello from the future. The abandoned #236 has scaffolded the internal support for a generic hash function. To solve this, it seems to me that the user would need to have an additional proc-macro intermediary crate for their project. Am I wrong/missing a better way to do this?
Solves part of rust-phf#88 by providing a separate function to customize the hashing function during generation. That way, one can avoid usage of the default SipHash as well as avoid having a `PhfHash` bound on the keys. This custom hash function can be used in combination with `get_index` directly.
With a compile-time hashmap, having a cryptographically secure hash is less important since there are no pathological cases to worry about -- everything is done at compile time.
IIRC SipHasher isn't that secure anyway, but there are options like FNV available too.
Perhaps we should allow for these to be selected instead, using defaulted generics?
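As a rough sketch of what "defaulted generics" could look like, the type below takes a hasher type parameter that defaults to std's SipHash-based `DefaultHasher`, following the same pattern `HashMap` uses. Everything here is an assumption for illustration: `KeyHasher`, `hash_key`, and the hand-written `Fnv1a` are hypothetical names, not phf types.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::marker::PhantomData;

/// Minimal 64-bit FNV-1a hasher, written out so the sketch has no dependencies.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hypothetical key hasher (not a phf type). `H` defaults to `DefaultHasher`,
/// so existing users would keep SipHash unless they opt in to something else.
struct KeyHasher<H = DefaultHasher> {
    _hasher: PhantomData<H>,
}

// `new` exists only for the default hasher, as with `HashMap::new`,
// so `KeyHasher::new()` needs no type annotation.
impl KeyHasher {
    fn new() -> Self {
        KeyHasher { _hasher: PhantomData }
    }
}

impl<H: Hasher + Default> KeyHasher<H> {
    /// Builds a key hasher for an explicitly chosen hasher type.
    fn with_hasher() -> Self {
        KeyHasher { _hasher: PhantomData }
    }

    /// Hashes one key with a fresh `H`.
    fn hash_key<K: Hash + ?Sized>(&self, key: &K) -> u64 {
        let mut state = H::default();
        key.hash(&mut state);
        state.finish()
    }
}
```

Callers who want the default would write `KeyHasher::new()`, and those who want FNV would write `KeyHasher::<Fnv1a>::with_hasher()`. As noted earlier in the thread, a weaker hash may keep the generator from finding a solution, so a swap like this would still need testing against real key sets.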