feat(vertexai): multimodal with all modalities #4110

Deleplace · 2024-04-25T10:02:54Z

Go version of the Python sample at https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/generative_ai/gemini_all_modalities.py

Region tag generativeaionvertexai_gemini_all_modalities

snippet-bot · 2024-04-25T10:02:59Z

Here is the summary of changes.

You are about to add 1 region tag.

vertexai/multimodal-all/multimodalall.go:18, tag generativeaionvertexai_gemini_all_modalities

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

Refresh this comment

jba · 2024-04-25T15:06:46Z

vertexai/multimodal-all/multimodalall.go

+// generateMultimodalContent generates a response into w, based upon the prompt
+// and video provided.
+// video and image are a Google Cloud Storage paths starting with "gs://"
+func generateMultimodalContent(w io.Writer, prompt, video, image, projectID, location, modelName string) error {


With all these string arguments, it's easy for callers to get confused. I'd define and pass a struct in this case.

Or is it possible to have this take a model and just call GenerateContent, and do the client and model creation elsewhere?

Let's try with a struct.

All the options here are compromises between sometimes conflicting goals...

grayside

Left some questions and suggestions. Questions feel like the answers could be blockers, but I'm open to LGTM with creation of follow-up bugs.

grayside · 2024-05-13T17:45:05Z

vertexai/multimodal-all/multimodalall.go

+
+// generateMultimodalContent generates a response into w, based upon the multimodal prompt
+// provided.
+func generateMultimodalContent(w io.Writer, prompt multimodalPrompt, projectID, location, modelName string) error {


I see the function name generateMultimodalContent is the same as the one used for the generativeaionvertexai_gemini_video_with_audio sample from #4110.

question: Why do both samples exist? Would it make more sense to associate the same sample code with both region tags?
suggestion: If two samples are needed, rename one for clarity in the event someone is comparing them side-by-side.

This may benefit from feedback from @gericdong

We need language parity for the doc pages. Usually Python comes first, and then other languages are added in tabs at the same place: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/send-multimodal-prompts#all_modalities

Yes, I will rename the func for clarity and disambiguation.

grayside · 2024-05-13T17:47:45Z

vertexai/multimodal-all/multimodalall_test.go

+	// The generated text may look like "The moment in the image happens at
+	// approximately 00:49 in the video. The context of the moment is..."


question: Was this meant to be captured as a test case?

No, this is intended as a hint for the reader about what kind of answer the model should provide.

Because of the very wide range of possible answers, this is not a test case.

The error coverage here of "not an error" is light, but from the Python samples it looks like the standard is "non-error, non-empty response" which this is doing.

Co-authored-by: Adam Ross <grayside@gmail.com>

feat(vertexai): multimodal with all modalities

774bf80

Deleplace requested a review from a team as a code owner April 25, 2024 10:02

product-auto-label bot added the samples Issues that are directly related to samples. label Apr 25, 2024

Merge branch 'main' into multimodal-all_2

0623293

Deleplace enabled auto-merge (squash) April 25, 2024 13:02

gericdong approved these changes Apr 25, 2024

jba requested changes Apr 25, 2024

grayside mentioned this pull request Apr 25, 2024

feat(vertexai): pdf input #4112

Merged

Custom type multimodalPrompt

12f886c

Deleplace requested a review from jba April 26, 2024 09:16

jba approved these changes Apr 26, 2024

Merge branch 'main' into multimodal-all_2

92a383a

grayside reviewed May 13, 2024

grayside mentioned this pull request May 13, 2024

feat(vertexai): Elastic Text-Embedding Model demo. #4127

Open

9 tasks

Deleplace and others added 3 commits May 14, 2024 19:31

Update vertexai/multimodal-all/multimodalall.go

b558152

Co-authored-by: Adam Ross <grayside@gmail.com>

Update vertexai/multimodal-all/multimodalall.go

861f354

Co-authored-by: Adam Ross <grayside@gmail.com>

fix(vertexai): improvements from code review.

2c36dc2

Deleplace requested a review from grayside May 14, 2024 17:39

grayside approved these changes May 14, 2024

grayside added 2 commits May 14, 2024 10:52

Merge branch 'main' into multimodal-all_2

24ea8b1

Merge branch 'main' into multimodal-all_2

6a6c4ec

Deleplace merged commit f4bb2dc into GoogleCloudPlatform:main May 14, 2024
9 checks passed
