[Bug] RPC Server Python Exception Can't Be Send to RPC Client #16707

Open
Johnson9009 opened this issue Mar 12, 2024 · 6 comments
Labels
needs-triage (PRs or issues that need to be investigated by maintainers to find the right assignees to address it), type: bug

Comments

@Johnson9009
Contributor

After upgrading to v0.15.0, we found a bug in RPC: if the RPC server raises an exception, the error message is not visible on the RPC client, as shown below.

image

After some investigation, I found that this issue occurs only when the exception is raised by a Python packed function. In the experiment below, the C++ exception is thrown to the Python side correctly, but when Python raises it again the error message is lost: the RPC client still receives the exception, but the error message in it is empty.
image

image

If the exception occurs in a pure C++ remote packed function, the error message is correctly sent back to the RPC client.

How to reproduce?
Hard-code a raise of an exception with the message "xxxxx" at the beginning of load_module in python/tvm/rpc/server.py, or in any other Python packed function, and then call that function from the RPC client.

Johnson9009 added the type: bug and needs-triage labels Mar 12, 2024
@Johnson9009
Contributor Author

Johnson9009 commented Mar 12, 2024

@tqchen @Lunderberg Can you help check whether this is related to the changes in #15596? It is important for us to fix this issue because the Q1 release is coming. Thanks.

@Johnson9009
Contributor Author

image
Is it related to this change?

@tqchen
Member

tqchen commented Mar 12, 2024

This may indeed be related to #15596. @Lunderberg, it seems we need to stringify Python errors if they are caught by RPC.
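
For illustration, stringifying a caught Python error could look like the minimal sketch below. It uses only the standard library's traceback module. The names `stringify_exception` and `failing_packed_func` are hypothetical and are not TVM APIs, and nothing here is taken from TVM's actual RPC code.

```python
import traceback


def stringify_exception(err: BaseException) -> str:
    """Flatten an exception (type, message, traceback) into one string."""
    return "".join(
        traceback.format_exception(type(err), err, err.__traceback__)
    )


def failing_packed_func():
    # Hypothetical stand-in for a Python packed function on the RPC server.
    raise RuntimeError("xxxxx")


try:
    failing_packed_func()
except RuntimeError as err:
    # A plain string like this is something any transport can carry.
    payload = stringify_exception(err)

print(payload)
```

Because the result is a plain string, the message and the stack trace travel together, at the cost of losing the structured traceback on the receiving side.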

@Lunderberg
Contributor

Agreed. The full stack trace is in the Python object, so we should be able to serialize it for RPC use. Prior to #15596, the full stack trace was embedded into the string error message, which worked with RPC transfers, but made it quite difficult to track nested levels of error messages.

I'm thinking that RPC's serialization will be the reverse of the parsing that occurs here, so that the stack trace objects can be rebuilt when received. (Though it would be missing the full Python variable information available within a non-RPC err.__traceback__.)
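
As a rough sketch of that idea, and not of TVM's actual wire format, the frames of a traceback can be reduced to plain tuples, sent as JSON, and rebuilt into a `traceback.StackSummary` on the receiving side. The helper names `serialize_error`, `deserialize_error`, and `remote_func` and the JSON layout are assumptions made for this example. As noted above, local variables are not carried across.

```python
import json
import traceback


def serialize_error(err: BaseException) -> str:
    """Encode the exception type, message, and stack frames as JSON."""
    frames = traceback.extract_tb(err.__traceback__)
    return json.dumps({
        "kind": type(err).__name__,
        "message": str(err),
        # Each frame becomes [filename, lineno, function name, source line].
        "frames": [[f.filename, f.lineno, f.name, f.line] for f in frames],
    })


def deserialize_error(payload: str):
    """Rebuild (kind, message, StackSummary) from the JSON payload."""
    data = json.loads(payload)
    stack = traceback.StackSummary.from_list(
        [tuple(frame) for frame in data["frames"]]
    )
    return data["kind"], data["message"], stack


def remote_func():
    # Hypothetical stand-in for a failing function on the RPC server.
    raise ValueError("bad module path")


try:
    remote_func()
except ValueError as err:
    wire = serialize_error(err)

kind, message, stack = deserialize_error(wire)
print(f"{kind}: {message}")
print("".join(stack.format()))
```

The rebuilt `StackSummary` can be formatted like a local one, but it holds only file names, line numbers, function names, and source lines, not live frame objects.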

@tqchen
Member

tqchen commented Mar 19, 2024

@Lunderberg can you help with this one?

@Lunderberg
Contributor

Thank you for the ping. I probably can in a few weeks, but I am currently low on available time.
