Hi, I see the following error. It looks like torch.compile worked fine, but when I invoke prediction afterwards it errors out:
predict_fn error: backend='torch_tensorrt' raised: TypeError: pybind11::init(): factory function returned nullptr
full log: torch_error.txt
Used docker image:
# use sagemaker DLC
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.1.0-gpu-py310-cu118-ubuntu20.04-sagemaker
# Install additional dependencies
RUN python -m pip install torch torch-tensorrt tensorrt --extra-index-url https://download.pytorch.org/whl/cu118
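One thing that may be worth checking (an assumption on my part, not something confirmed in this thread): the `pip install` line above has no version pins, so it could replace the torch build shipped in the base image or pull a tensorrt wheel built for a different CUDA version. A minimal sketch for listing what actually got installed inside the container, using only the standard library; the helper name `installed_versions` is made up for illustration:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from the pip install line in the Dockerfile.
    for name, ver in installed_versions(["torch", "torch-tensorrt", "tensorrt"]).items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

If the reported torch version is no longer 2.1.x, or its local version suffix is not `+cu118`, that would suggest the install step changed the CUDA stack relative to the base image.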
How the model was compiled:
model.model_body[0].auto_model = torch.compile(
    model.model_body[0].auto_model,
    backend="torch_tensorrt",
    dynamic=False,
    options={
        "truncate_long_and_double": True,
        "precision": torch.half,
        "debug": True,
        "min_block_size": 1,
        "optimization_level": 4,
        "use_python_runtime": False,
    },
)
To rule out that the issue is somewhere else, I tested with the following torch.compile call, which works fine:
model.model_body[0].auto_model = torch.compile(model.model_body[0].auto_model, mode="reduce-overhead")
Versions: pytorch 2.1
cc @ezyang @msaroufim @bdhirsh @anijain2305 @chauhang
This should probably be sent to the tensorrt repo.
@ezyang so torch.compile with tensorrt (the JIT approach) is part of the tensorrt repo?
@geraldstanje From torch_error.txt, it seems like graph capture was successful and there are a couple of errors from the tensorrt backend:
2024-05-10T21:15:33.961Z 2024-05-10T21:15:33,744 [INFO ] W-9001-model_1.0-stdout MODEL_LOG - [05/10/2024-21:15:33] [TRT] [W] CUDA initialization failure with error: 35. Please check your CUDA installation: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
...
2024-05-10T21:15:33,747 [INFO ] W-9001-model_1.0-stdout MODEL_LOG - TypeError: pybind11::init(): factory function returned nullptr
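For anyone reading logs like the one above: in the CUDA runtime API, error code 35 is `cudaErrorInsufficientDriver`, which generally means the installed NVIDIA driver is older than the CUDA runtime in use requires. Whether that is the root cause here is not confirmed in this thread. A small sketch for pulling these codes out of a log file; the function name and the lookup table (only a small subset of `cudaError_t`) are illustrative, and the regex assumes the message format shown in the excerpt:

```python
import re

# Small subset of cudaError_t values from the CUDA runtime API.
CUDA_ERROR_NAMES = {
    2: "cudaErrorMemoryAllocation",
    35: "cudaErrorInsufficientDriver",
    100: "cudaErrorNoDevice",
}

# Assumes the TensorRT warning format seen in the log excerpt.
_CUDA_INIT_RE = re.compile(r"CUDA initialization failure with error: (\d+)")


def find_cuda_init_errors(log_text):
    """Return a (code, name) pair for every CUDA init failure in log_text."""
    results = []
    for match in _CUDA_INIT_RE.finditer(log_text):
        code = int(match.group(1))
        results.append((code, CUDA_ERROR_NAMES.get(code, "unknown")))
    return results


sample = (
    "[05/10/2024-21:15:33] [TRT] [W] CUDA initialization failure "
    "with error: 35. Please check your CUDA installation"
)
print(find_cuda_init_errors(sample))  # [(35, 'cudaErrorInsufficientDriver')]
```

The pybind11 `factory function returned nullptr` TypeError appears a few milliseconds after this warning in the log, so it looks like a consequence of the failed CUDA initialization rather than a separate problem.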
Please re-open the issue if you believe the problem to be caused by the graph sent to tensorrt