Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make hash collisions in the registry much less likely #657

Merged
merged 3 commits into from Oct 15, 2019
Merged

Conversation

beorn7
Copy link
Member

@beorn7 beorn7 commented Oct 14, 2019

Sadly, that's not yet a complete solution (like the one planned for v2, see last item in #222). However, using xxhash for the registry improves things a lot. It also moves from summing to XOR'ing for the collectorID. (Hashing the hashes would be an option, too, but it would require sorting of the hashes. I didn't want to go down that rabbit hole and rather stick with a commutative and associative operation.)

(Interestingly, just moving from sum to XOR solved #633 already, but of course, other hash collisions might happen in practice despite the XOR. In fact, hash collisions might still occur. This PR is mostly there to make this sufficiently unlikely so that the world will survive until v2 is released.)

Signed-off-by: beorn7 <beorn@grafana.com>
This makes the collisions a bit less likely by XOR'ing descIDs rather
than adding them up for the collectorID.

Signed-off-by: beorn7 <beorn@grafana.com>
This is a much stronger hash function than fnv64a and comparably fast
(with super-fast assembly implementation for amd64).

Performance is not critical here anyway.

The old fnv64a is kept for vectors, where collision detection is in
place and the weakness of the hashing doesn't matter that much. I
implemented a vector version with xxhash and found that xxhash is
slower in all cases except very very high cardinality (where it is
only slightly faster). Also, ``xxhash.New`` comes with an allocation
of 80 bytes. Thus, to keep vectors alloc-free, we needed to add a
`sync.Pool`, which would have an additional performance overhead.

Signed-off-by: beorn7 <beorn@grafana.com>
@beorn7
Copy link
Member Author

beorn7 commented Oct 14, 2019

@awilliams this will be of interest for you.

@beorn7
Copy link
Member Author

beorn7 commented Oct 14, 2019

Id' consider this a fix for #633. The "proper" solution is contained in #222.

@awilliams
Copy link

Thanks @beorn7, this looks like a great bridge between now and v2. Nice to see the collision test case as well!

Copy link
Member

@simonpasquier simonpasquier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@beorn7
Copy link
Member Author

beorn7 commented Oct 15, 2019

Thanks for having a look.

@beorn7 beorn7 merged commit 8b3e008 into master Oct 15, 2019
@beorn7 beorn7 deleted the beorn7/registry branch October 15, 2019 09:43
@@ -24,6 +24,8 @@ import (

const separatorByte byte = 255

var separatorByteSlice = []byte{255} // For convenient use with xxhash.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not that it's likely to change, but should this not be []byte{separatorByte}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's better. (Originally, I wanted to remove separatorByte altogether, but then realized I want to keep the old hashing for vectors.) Will sent you a PR.

thaJeztah added a commit to thaJeztah/containerd that referenced this pull request Jan 13, 2020
full diff: prometheus/client_golang@v0.9.4...v1.1.0

Using v1.1.0, because version v1.2.0 and up use versioned import paths for the
github.com/cespare/xxhash/v2 dependency (prometheus/client_golang#657), which
causes vendoring with vndr to break due to the v2 in the import-path.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
thaJeztah added a commit to thaJeztah/docker that referenced this pull request Jan 16, 2020
full diff: prometheus/client_golang@v0.9.4...v1.1.0

Using v1.1.0, because version v1.2.0 and up use versioned import paths for the
github.com/cespare/xxhash/v2 dependency (prometheus/client_golang#657), which
causes vendoring with vndr to break due to the v2 in the import-path.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
docker-jenkins pushed a commit to docker/docker-ce that referenced this pull request Jan 17, 2020
full diff: prometheus/client_golang@v0.9.4...v1.1.0

Using v1.1.0, because version v1.2.0 and up use versioned import paths for the
github.com/cespare/xxhash/v2 dependency (prometheus/client_golang#657), which
causes vendoring with vndr to break due to the v2 in the import-path.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 34a65cb3bad3cfc626bf0b15715680ed9633e455
Component: engine
tussennet pushed a commit to tussennet/containerd that referenced this pull request Sep 11, 2020
full diff: prometheus/client_golang@v0.9.4...v1.1.0

Using v1.1.0, because version v1.2.0 and up use versioned import paths for the
github.com/cespare/xxhash/v2 dependency (prometheus/client_golang#657), which
causes vendoring with vndr to break due to the v2 in the import-path.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants