Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rotate-and-reduce for linear sequences of rotations #531

Merged
merged 1 commit into from Mar 27, 2024

Conversation

j2kun
Copy link
Collaborator

@j2kun j2kun commented Mar 13, 2024

Note: rebased off #530

So this is working but... I can't say I'm particularly happy with it.

Now it recognizes sequences of rotates-and-adds that comprise an aggregation of every value in a vector (exactly once), and replaces them with the logarithmic rotate-and-reduce trick. I wrote down a few non-triggering examples, but I'm not confident it's bug-free by any means.

To the extent that this PR uses a dataflow analysis, I'm somewhat happy. It's an improvement over how I handled the version of this trick that worked on tensor.extract. However, the dataflow analysis has some limitations that left me frustrated. Specifically, I wanted to have the dataflow analysis accumulate the tensor accesses with multiplicity and across multiple tensors. The original
member variable inside RotationLattice was something like DenseMap<Value, std::unordered_multiset<int64_t>>. The join method then updated all accessed tensors and added the index access counts rather than just taking a union as it does in this PR.

However, doing this hit assertions in the dataflow framework because the Lattice class asserts that repeated applications of join are idempotent (they call it monotonic) but in my case that would accumulate. I worked around it by constructing the analysis to work with set unions, and then added an extra step in the actual pass to do more specific detailed checks for validity. I hoped that with the extra index multiplicity information, we might be able to do more sophisticated transformations, like reordering some of the ops instead of just giving up. E.g., one thing that wouldn't work here is if the IR computes twice the sum of the tensor elements. I wonder if not using the Lattice framework and going one step lower to AbstractSparseForwardDataFlowAnalysis might fix that, or just making a totally custom analysis unrelated to the dataflow engine, but I ran out of steam this week. I also suspect we could rely on some more academic ideas like e-graphs to identify more optimal transformations.

Fixes #520

Fixes #511

@j2kun j2kun self-assigned this Mar 13, 2024
@j2kun j2kun marked this pull request as draft March 13, 2024 23:30
@j2kun j2kun force-pushed the linear-to-log-rotates branch 3 times, most recently from b1c0edc to cea97e1 Compare March 15, 2024 23:42
@j2kun j2kun changed the title WIP: rotate-and-reduce for sequences of rotations Add rotate-and-reduce for linear sequences of rotations Mar 15, 2024
@j2kun j2kun marked this pull request as ready for review March 16, 2024 00:30
@j2kun j2kun removed their assignment Mar 16, 2024
@j2kun j2kun force-pushed the linear-to-log-rotates branch 3 times, most recently from 52f14c7 to 6efd0d4 Compare March 18, 2024 19:55
Copy link
Collaborator

@asraa asraa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Just a few nits

@j2kun j2kun force-pushed the linear-to-log-rotates branch 4 times, most recently from e42ae29 to 0ff09a0 Compare March 26, 2024 19:27
@j2kun j2kun added the pull_ready Indicates whether a PR is ready to pull. The copybara worker will import for internal testing label Mar 26, 2024
copybara-service bot pushed a commit that referenced this pull request Mar 26, 2024
Depends on #531

Fixes #521

PiperOrigin-RevId: 619333026
@copybara-service copybara-service bot merged commit 2f996ac into google:main Mar 27, 2024
7 checks passed
copybara-service bot pushed a commit that referenced this pull request Mar 27, 2024
This is intended to be a consistent bundling of the ported HECO optimizations, so that all tests can use them identically.

Fixes #556
Depends on #531 for changes to hamming-distance test

PiperOrigin-RevId: 619348713
copybara-service bot pushed a commit that referenced this pull request Mar 27, 2024
This is intended to be a consistent bundling of the ported HECO optimizations, so that all tests can use them identically.

Fixes #556
Depends on #531 for changes to hamming-distance test

PiperOrigin-RevId: 619348713
copybara-service bot pushed a commit that referenced this pull request Mar 27, 2024
This is intended to be a consistent bundling of the ported HECO optimizations, so that all tests can use them identically.

Fixes #556
Depends on #531 for changes to hamming-distance test

PiperOrigin-RevId: 619655075
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pull_ready Indicates whether a PR is ready to pull. The copybara worker will import for internal testing
Projects
None yet
2 participants