Is inference possible with just a 10GB RTX 3080? #25

Open
davidmartinrius opened this issue Jun 21, 2023 · 3 comments

Comments

@davidmartinrius

Hello,

I know it is very little memory, but it is what I have for now.

By default, the demo code fails with a CUDA out-of-memory error during inference. I tried reducing the inference batch size to just 1, but that is not enough.

Do you know of a way to reduce memory consumption when running inference?

I know the best solution is to upgrade to an RTX 3090/4090/A6000, but before doing that I would like to try another approach if possible.

Thank you!

David Martin Rius

@deepanwayx
Collaborator

The required VRAM is around 13 GB for full-precision inference with a batch size of 1.

You can also try Colaboratory for inference: #10
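
For reference, a quick, generic PyTorch check (not part of this repository's code) reports how much VRAM the card actually exposes, to compare against the ~13 GB requirement; device index 0 is an assumption:

```python
import torch

# Generic check of the locally available VRAM, to compare against the
# ~13 GB needed for full-precision inference. Device index 0 is an
# assumption; adjust it if you have several GPUs.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB total VRAM")
else:
    print("No CUDA device visible")
```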

@illtellyoulater

@deepanwayx I suppose full inference precision means 32-bit, correct? If so, did you run any tests to check whether 16-bit would still deliver acceptable results?

@deepanwayx
Collaborator

Yes, the full inference precision is 32-bit. We did not test with 16-bit inference.
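
For anyone who wants to experiment with 16-bit inference anyway, here is a minimal, untested sketch in plain PyTorch; `DummyModel` is a placeholder rather than this repository's model, so behavior with the real checkpoints is unverified:

```python
import torch
import torch.nn as nn

# Placeholder module standing in for the real model; loading the actual
# checkpoint is outside the scope of this sketch.
class DummyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(512, 512)

    def forward(self, x):
        return self.net(x)

device = "cuda" if torch.cuda.is_available() else "cpu"
# Half precision roughly halves the weight memory; fall back to float32
# on CPU, where float16 kernels are not always available.
dtype = torch.float16 if device == "cuda" else torch.float32

model = DummyModel().to(device=device, dtype=dtype)

with torch.inference_mode():
    x = torch.randn(1, 512, device=device, dtype=dtype)
    y = model(x)  # forward pass runs in the reduced precision

print(y.dtype)  # torch.float16 on a CUDA device
```

Casting the real model the same way would also require its inputs and intermediate tensors to be float16, and whether output quality survives the precision drop is untested, as noted above.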
