Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up HoistPlaintextOps #485

Merged
merged 1 commit into from Mar 6, 2024

Conversation

j2kun
Copy link
Collaborator

@j2kun j2kun commented Mar 5, 2024

Fixes #479

The two main problems:

  1. HoistPlaintextOps used std::find_if to scan the entire generic body, only to extract a single op, leading to a quadratic scan when there were a linear number of ops to hoist. When unrolling an affine.for, each iteration gets an affine.apply op, and they can all be lifted, so it nearly always causes quadratic behavior.
  2. In extractOpBeforeGeneric, the newly-created single-op generic was cloning all of the operands from the original generic op as its operands, even if it had zero secret operands. When lifting ~1k ops, this became a problem because the output of each single-op generic is added as an operand to the original generic it was lifted from, so we got into a situation like:
  • start: original generic has 1 arg
  • lift 1 op with 1 arg; original generic gets +1 arg
  • lift 1 op with 2 args; original generic gets +1 arg
  • lift 1 op with 3 args; original generic gets +1 arg
  • ... (x1000)
    After that, we have a quadratic number of unused args, so we apply a quadratic number of EraseUnusedGenericArg patterns. By more finely tracking the subset of operands needed in the new generic, we can reduce this to (# ops to lift * # operands per lifted op), which is minimal. The only tricky bit is that, if the op being extracted is a loop nest, you have to walk the op's subtree to find all the generic arguments it depends on.

The test I added ran in 15s with the bug still in place, and after these fixes it runs in 0.5s.

@j2kun j2kun requested review from asraa and removed request for asraa March 5, 2024 22:33
@j2kun j2kun requested a review from asraa March 5, 2024 23:04
@j2kun j2kun force-pushed the hoist-plaintext-speed branch 2 times, most recently from ec62a43 to b2f7a4c Compare March 5, 2024 23:11
Copy link
Collaborator

@asraa asraa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! 15 seconds to .5 seconds is great

Comment on lines +577 to 582
for (Operation &op : opRange) {
if (canHoist(op)) {
opsToHoist.push_back(&op);
hoistedAny = true;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you could consider llvm::make_filter_range here

@j2kun j2kun added the pull_ready Indicates whether a PR is ready to pull. The copybara worker will import for internal testing label Mar 6, 2024
@copybara-service copybara-service bot merged commit 04f6106 into google:main Mar 6, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pull_ready Indicates whether a PR is ready to pull. The copybara worker will import for internal testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Speed up HoistPlaintextOps
2 participants