Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(pageserver): use fnv hash for aux file encoding #7742

Merged
merged 3 commits into from
May 15, 2024

Conversation

skyzh
Copy link
Member

@skyzh skyzh commented May 13, 2024

Problem

FNV hash is simple, portable, and stable. This pull request vendors the FNV hash implementation from servo and modified it to use the u128 variant.

replaces #7644

ref #7462

Summary of changes

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

Signed-off-by: Alex Chi Z <chi@neon.tech>
@skyzh skyzh requested review from hlinnaka and koivunej May 13, 2024 16:13
@skyzh skyzh requested a review from a team as a code owner May 13, 2024 16:13
Copy link

github-actions bot commented May 13, 2024

3060 tests run: 2933 passed, 0 failed, 127 skipped (full report)


Code coverage* (full report)

  • functions: 31.4% (6339 of 20190 functions)
  • lines: 47.4% (47964 of 101267 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
c5f03a8 at 2024-05-14T15:46:38.218Z :recycle:

@hlinnaka
Copy link
Contributor

We can modify the algorithm to use 128b instead, but I didn't do it in this pull request, as I don't know if using 128b still preserves the properties of the hash algorithm.

128 bit and other length variants exist, see https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function. That particular crate only implements the 64 bit variant but there are others, and it's not hard to roll your own.

I'd assume 64 bits to be enough, though.

@hlinnaka
Copy link
Contributor

Do we need to worry about malicious actors creating intentional collisions here? What happens if there's a ton of collisions? Does it only cause trouble for the single tenant, or the whole pageserver?

Signed-off-by: Alex Chi Z <chi@neon.tech>
@skyzh
Copy link
Member Author

skyzh commented May 14, 2024

switched to FNV128 using the prime + offset in the wiki, better to get a review again, or I'll merge it later today.

Signed-off-by: Alex Chi Z <chi@neon.tech>
@skyzh
Copy link
Member Author

skyzh commented May 14, 2024

Okay it still requires a review from code owner 🤣 @koivunej do you have time to take a look? Thanks :)

@skyzh
Copy link
Member Author

skyzh commented May 14, 2024

Do we need to worry about malicious actors creating intentional collisions here? What happens if there's a ton of collisions? Does it only cause trouble for the single tenant, or the whole pageserver?

Creating intentional collision might cause write amplifications like before John’s one-file-one-delta-record PR, and it will only cause trouble for a single tenant. We can change the encoding of aux file value to delta + image in the future if it indeed becomes a problem.

@skyzh skyzh merged commit 4b97683 into main May 15, 2024
55 checks passed
@skyzh skyzh deleted the skyzh/fnv-hash-aux-file branch May 15, 2024 17:17
@skyzh skyzh mentioned this pull request May 15, 2024
24 tasks
a-masterov pushed a commit that referenced this pull request May 20, 2024
FNV hash is simple, portable, and stable. This pull request vendors the
FNV hash implementation from servo and modified it to use the u128
variant.

replaces #7644

ref #7462

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants