
Gemma in Model Garden Deployment - confusing section on Chat Applications #2800

Open
afirstenberg opened this issue Mar 25, 2024 · 1 comment

Comments

@afirstenberg

Environment

  • Deployed a gemma-7b-it model on Vertex AI Model Garden using the "Deploy" button from the Gemma card. No additional tuning was done.
  • I have an instance running on a g2-standard-12 machine with an NVIDIA L4 GPU. It is visible in the Online Prediction section of my Cloud Console.
  • I am able to reach the endpoint without any issues.

Since I was unable to find any good documentation on what needs to be sent to the model and what comes back, I used the "Model Garden Gemma Deployment on Vertex" notebook to try to get an idea. (See #2799 for a related issue.)
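For context, here is a minimal sketch of how I am calling the deployed endpoint with the Vertex AI Python SDK. The `prompt`/`max_tokens` instance schema is an assumption taken from the deployment notebook, not from any official documentation (which is part of what this issue is about):

```python
# Sketch: querying a Gemma endpoint deployed from Model Garden.
# ASSUMPTION: the serving container accepts vLLM-style instances with
# "prompt", "max_tokens", and "temperature" keys, as the notebook suggests.

def build_instance(prompt: str, max_tokens: int = 256,
                   temperature: float = 0.7) -> dict:
    """Build a single prediction instance for the deployed endpoint."""
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def predict(project: str, location: str, endpoint_id: str, prompt: str):
    """Send one instance to the endpoint (needs google-cloud-aiplatform
    installed and application-default credentials configured)."""
    from google.cloud import aiplatform

    endpoint = aiplatform.Endpoint(
        endpoint_name=(
            f"projects/{project}/locations/{location}/endpoints/{endpoint_id}"
        )
    )
    response = endpoint.predict(instances=[build_instance(prompt)])
    return response.predictions
```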

Description:

  • The instructions in the section "Build chat applications with Gemma" indicate that there are templates that define the structure of a conversation.
  • The code shows how to use the template to create a prompt.
  • However, the prompt that is created isn't sent to the model. Instead, the text from a previous example is.
  • Furthermore, even if I do send the text to the model, it doesn't seem to respond any differently.
  • As with Model Garden Gemma Deployment on Vertex - incomplete documentation about prediction response format #2799, the expected response format isn't documented.

See starting
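For reference, this is my understanding of the turn format the notebook's chat template appears to produce. The `<start_of_turn>`/`<end_of_turn>` markers match Gemma's published instruction-tuned convention; how (or whether) the deployed endpoint actually interprets them is exactly what's unclear:

```python
# Sketch of Gemma's instruction-tuned chat format. Each exchange is
# wrapped in <start_of_turn>user / <start_of_turn>model blocks, and the
# prompt ends with an open model turn to cue the next generation.

def build_chat_prompt(history: list[tuple[str, str]],
                      user_message: str) -> str:
    """history: prior (user, model) exchanges; returns a prompt string
    ending with an open model turn."""
    parts = []
    for user_turn, model_turn in history:
        parts.append(f"<start_of_turn>user\n{user_turn}<end_of_turn>\n")
        parts.append(f"<start_of_turn>model\n{model_turn}<end_of_turn>\n")
    parts.append(f"<start_of_turn>user\n{user_message}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)
```

As described above, though, sending a prompt built this way did not seem to change the model's responses compared with sending plain text.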

@gericdong
Contributor

@kathyyu-google: could you please assist with this? Thanks.
