Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(core): simplify bloom bucket computation #40489

Closed
wants to merge 3 commits into from

Conversation

JoostK
Copy link
Member

@JoostK JoostK commented Jan 19, 2021

The injector system uses a bloom filter to determine if a token is
possibly defined in the node injector tree, which is stored across
multiple bloom buckets that each represent 32 bits of the full 256-bit
wide bloom hash. This means that a computation is required to determine
the exact bloom bucket which is responsible for storing any given 32-bit
interval, which was previously computed using three bitmask operations
and three branches to derive the bloom bucket offset.

This commit exploits the observation that all bits beyond the low 5 bits
of the bloom hash are an accurate representation for the bucket offset,
if shifted right such that those bits become the least significant bits.
This reduces the three bitmask operations and three branches with a
single shift operation, while additionally offering a code size
improvement.

@JoostK JoostK added area: performance area: core Issues related to the framework runtime target: patch This PR is targeted for the next patch release labels Jan 19, 2021
@ngbot ngbot bot modified the milestone: Backlog Jan 19, 2021
@google-cla google-cla bot added the cla: yes label Jan 19, 2021
The injector system uses a bloom filter to determine if a token is
possibly defined in the node injector tree, which is stored across
multiple bloom buckets that each represent 32 bits of the full 256-bit
wide bloom hash. This means that a computation is required to determine
the exact bloom bucket which is responsible for storing any given 32-bit
interval, which was previously computed using three bitmask operations
and three branches to derive the bloom bucket offset.

This commit exploits the observation that all bits beyond the low 5 bits
of the bloom hash are an accurate representation for the bucket offset,
if shifted right such that those bits become the least significant bits.
This reduces the three bitmask operations and three branches with a
single shift operation, while additionally offering a code size
improvement.
…re handled correctly

This commits adds additional expectations to verify that the bloom
filter is able to correctly handle token IDs that exceed the size of
the bloom filter (which is currently 256 bits).
@alfaproject
Copy link
Contributor

Sexy <3

Copy link
Contributor

@mhevery mhevery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

// Each bloom bucket in `tData` represents `BLOOM_BUCKET_BITS` number of bits of `bloomHash`.
// Any bits in `bloomHash` beyond `BLOOM_BUCKET_BITS` indicate the bucket offset that the mask
// should be written to.
(tView.data as number[])[injectorIndex + (bloomHash >> BLOOM_BUCKET_BITS)] |= mask;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that is clear! I can't believe I did not think of that! 😊

@JoostK JoostK marked this pull request as ready for review January 19, 2021 22:31
@JoostK JoostK added the action: review The PR is still awaiting reviews from at least one requested reviewer label Jan 19, 2021
Copy link
Contributor

@jessicajaniuk jessicajaniuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed-for: size-tracking

@pullapprove pullapprove bot requested a review from alxhub January 19, 2021 22:41
Copy link
Contributor

@IgorMinar IgorMinar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Very nice! Thank you.

Reviewed-for: size-tracking

@AndrewKushnir
Copy link
Contributor

Presubmit.

@AndrewKushnir AndrewKushnir added action: presubmit The PR is in need of a google3 presubmit and removed action: review The PR is still awaiting reviews from at least one requested reviewer action: presubmit The PR is in need of a google3 presubmit labels Jan 20, 2021
@JoostK JoostK added the action: merge The PR is ready for merge by the caretaker label Jan 20, 2021
jessicajaniuk pushed a commit that referenced this pull request Jan 21, 2021
The injector system uses a bloom filter to determine if a token is
possibly defined in the node injector tree, which is stored across
multiple bloom buckets that each represent 32 bits of the full 256-bit
wide bloom hash. This means that a computation is required to determine
the exact bloom bucket which is responsible for storing any given 32-bit
interval, which was previously computed using three bitmask operations
and three branches to derive the bloom bucket offset.

This commit exploits the observation that all bits beyond the low 5 bits
of the bloom hash are an accurate representation for the bucket offset,
if shifted right such that those bits become the least significant bits.
This reduces the three bitmask operations and three branches with a
single shift operation, while additionally offering a code size
improvement.

PR Close #40489
jessicajaniuk pushed a commit that referenced this pull request Jan 21, 2021
…re handled correctly (#40489)

This commits adds additional expectations to verify that the bloom
filter is able to correctly handle token IDs that exceed the size of
the bloom filter (which is currently 256 bits).

PR Close #40489
jessicajaniuk pushed a commit that referenced this pull request Jan 21, 2021
…re handled correctly (#40489)

This commits adds additional expectations to verify that the bloom
filter is able to correctly handle token IDs that exceed the size of
the bloom filter (which is currently 256 bits).

PR Close #40489
@angular-automatic-lock-bot
Copy link

This issue has been automatically locked due to inactivity.
Please file a new issue if you are encountering a similar or related problem.

Read more about our automatic conversation locking policy.

This action has been performed automatically by a bot.

@angular-automatic-lock-bot angular-automatic-lock-bot bot locked and limited conversation to collaborators Feb 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
action: merge The PR is ready for merge by the caretaker area: core Issues related to the framework runtime area: performance cla: yes target: patch This PR is targeted for the next patch release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants