Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fx Importer and the existing dynamo support #3033

Open
penguin-wwy opened this issue Mar 18, 2024 · 11 comments
Open

Fx Importer and the existing dynamo support #3033

penguin-wwy opened this issue Mar 18, 2024 · 11 comments

Comments

@penguin-wwy
Copy link
Collaborator

Can Fx Importer be used to replace dynamo support(e.g. import_fx_graph_as_func)?

If so, will Fx Importer retain the interface that accepts GraphModule as input, such as FxImporter.import_graph_module? I understand this to be a reasonable requirement.

@penguin-wwy
Copy link
Collaborator Author

@stellaraccident Please help me to answer my questions

@stellaraccident
Copy link
Collaborator

I think we may have missed landing a documentation patch as I remember someone having edited to answer some of this. I can't find it now, and much of the discussion happened on Discourse. We need to update the documentation to basically call out:

  • The TorchScript based dynamo support (basically everything under the pt1 directory) is deprecated. While there is no active plan to remove it, neither am I aware of anyone fixing it.
  • The FxImporter is the future and is what is getting most of the work/patches.
  • We provide very basic APIs in the torch-mlir project itself for interacting with torch.export: https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/fx.py . These are basically just examples and meant that integrators can do with them what they want.
  • I have no active plans to remove the FxImporter.import_graph_module for importing stateless FX graphs. This will likely remain the proper way to import from torch.compile with import_program being the way to import from torch.export.
  • We're just doing our best trying to follow how PyTorch is evolving on some of these points, so there is a bit of guessing and adapting as they move to more feature complete on some of this stuff.

@penguin-wwy
Copy link
Collaborator Author

  • The TorchScript based dynamo support (basically everything under the pt1 directory) is deprecated. While there is no active plan to remove it, neither am I aware of anyone fixing it.

I have previously attempted to fix issue #2998 related to pt1 dynamo support, but it seems that there is no interest, or can someone review them? If possible, we should remove or at least mark these interfaces as deprecated as soon as possible.

  • We provide very basic APIs in the torch-mlir project itself for interacting with torch.export: https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/fx.py . These are basically just examples and meant that integrators can do with them what they want.
  • I have no active plans to remove the FxImporter.import_graph_module for importing stateless FX graphs. This will likely remain the proper way to import from torch.compile with import_program being the way to import from torch.export.

In our project, we rely heavily on the functionality of torch.compile, so I still prefer to retain the FxImporter.import_graph_module and maintain it in the long term. Additionally, it would be helpful to provide a calling method similar to export_and_import in fx.py.

Regarding the e2e testing for Fx Importer, will you reuse the existing pt1 e2e testing facilities?

FYI, I tried to replace import_fx_graph_as_func with import_graph_module in the current dynamo e2e test workflow to test it, and encountered a crash issue with HBC_basic. It might be beneficial if we could add this part of the test to Fx Importer.

@stellaraccident
Copy link
Collaborator

The documentation needs more updates for sure. Much of the discussion happened here:

https://discourse.llvm.org/t/torch-mlir-pytorch2-uplift/74000/26

https://discourse.llvm.org/t/rfc-rename-torch-mlir-compile-apis-and-introduce-fx-based-analogs/76646/3

I wanted to more aggressively remove the less supported parts but received pushback. I know there are still folks using the various pieces, but it is a distributed project. I did push to organize the codebase so that the older parts that are not aligned with the pytorch roadmap are in segregated parts of the project. We should probably be more forceful in the documentation to steer people towards the more actively maintained parts.

I inherited much of the project from the former maintainers about 9 months ago and have been doing my best to try to steer it to better align with where PyTorch itself is going while also making a path for my team at AMD to work on the parts that we directly use (mostly the FX layer, linalg lowerings and onnx interop). We maintain most of the torch.compile support that we use in a downstream that leverages the FxImporter (which originated in that downstream and I donated to torch-mlir).

Sorry there's still some messiness. Contributions very welcome.

@stellaraccident
Copy link
Collaborator

Regarding the e2e testing for Fx Importer, will you reuse the existing pt1 e2e testing facilities?

Probably not long term, but no one has had time yet to redo testing infra that is still working. I'd like to see more of the fx style tests be lit tests, and we are assembling good test suites downstream that we will want to enable in torch-mlir when ready.

There are a lot of tests were assembling here that we will soon want to hook up to the torch-mlir CI: https://github.com/nod-ai/SHARK-TestSuite

@penguin-wwy
Copy link
Collaborator Author

I have two issues that need to be discussed.

First, is it possible to selectively disable torchdynamo config in the current e2e tests? If there is indeed no one maintaining dynamo support at the moment, then we should not need to continue executing dynamo's e2e tests. Otherwise, every new code addition will trigger xfail, which is unnecessary.

Secondly, regarding the testing issue of the fx importer. I briefly looked at the SHARK-TestSuite and I'm not sure if my understanding is correct. Most of its tests are based on models, and it cannot quickly help me filter out unsupported scenarios. Therefore, perhaps we still need to simply integrate the fx importer into the e2e tests, which would make it easier to reproduce and locate problems, until the SHARK-TestSuite is more complete and merged into the torch-mlir CI.

@stellaraccident
Copy link
Collaborator

Some of this was also being discussed on discord last night: https://discord.com/channels/636084430946959380/742573221882364009/1227852761769447474

I think I agree with you about updating the e2e tests to be based on fx importer. That is not the same as me knowing when that will get done or who will do it. I'd love it if someone would take that on. Eventually if no one does, the cost of keeping it like it is will raise high enough that I will have to act to transition it. But I have a lot of other things on my plate at this moment.

@stellaraccident
Copy link
Collaborator

I also agree that as a pragmatic first step, we can delete e2e test suites that are adding no value. Again, it's be great if someone could help with this. It is an inconvenience for my team but hasn't gotten bad enough that we've prioritized fixing it (and have been fixing the model level test gap vs trying to improve the op level tests).

My main requirement is that lowerings in tree get tested with real execution of some kind, ideally at the op level. I ultimately don't care how that is done. The e2e test suite is currently meeting that requirement but needs an uplift.

@penguin-wwy
Copy link
Collaborator Author

I have modified a config for FxImporter( #3151 ) based on the original torchdynamo for e2e testing. It's still quite rough at the moment, but I hope to refine it and replace the torchdynamo tests in CI with it. If this aligns with your thoughts?

@sjain-stanford
Copy link
Collaborator

Thanks @penguin-wwy for driving this work. Very happy to see we now have a way of testing e2e flows using the TorchDynamo frontend.

@penguin-wwy
Copy link
Collaborator Author

I've found a issue, it seems difficult for FxImporter to execute dynamic shape cases on the e2e-test? We can pass the dynamic_shapes to the export to generate FxGraph with dynamic dims, but you need to use the same export.Dim for the same dim, otherwise, it will throw an error.

The values of s2 = L['rhs'].size()[0] and s1 = L['lhs'].size()[1] must always be equal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants