add PreProcessor for VLM #57

wnma3mz · 2024-01-01T14:44:15Z

No description provided.

ashvardanian · 2024-01-01T23:35:10Z

Thank you for the contribution, @wnma3mz! Detaching the preprocessing code is probably the right thing to do. Give us a couple of days to merge it 🤗

VoVoR · 2024-01-05T14:09:17Z

@wnma3mz hey,

We appreciate your work on the PR!

I wanted to ask you to remove the changes from the src/ dir and keep all the updates in the scripts that are useful for the onnx/coreml runtimes. We use src/ together with our pre-training code, so we don't want to update it frequently.
We know it would be great to separate preprocessing and modeling into different classes, and we have already done it, though in a slightly different way. You can expect it in the next release in a few weeks.

Also, as far as I understand, you tested your script with model_fpath = "unum-cloud/uform-coreml-onnx", correct?

wnma3mz · 2024-01-06T02:32:57Z

Thanks for your reply. I have removed the changes in the src directory.

As you said, I tested it with scripts/example.py, so that script will be affected by your upcoming preprocessing changes. When you push the new preprocessing, feel free to ping me to update scripts/example.py so that it keeps working correctly.

VoVoR · 2024-01-15T13:08:23Z

@wnma3mz hi
I've tested the example.py script with model_fpath = "unum-cloud/uform-coreml-onnx" and it didn't work. And it shouldn't, because "get_model" won't work with our coreml/onnx HF model card.
How exactly did you run the script? Could you push the working version so I can check it?

wnma3mz · 2024-01-15T15:02:21Z

@VoVoR

I'm sorry for the trouble.
For the convenience of testing, I downloaded all the model files locally in advance. The file structure is as follows:

├── multilingual-v2.image-encoder.mlpackage
│   ├── Data
│   │   └── com.apple.CoreML
│   │       ├── model.mlmodel
│   │       └── weights
│   │           └── weight.bin
│   └── Manifest.json
├── multilingual-v2.image-encoder.mlpackage.zip
├── multilingual-v2.text-encoder.mlpackage
│   ├── Data
│   │   └── com.apple.CoreML
│   │       ├── model.mlmodel
│   │       └── weights
│   │           └── weight.bin
│   └── Manifest.json
├── multilingual-v2.text-encoder.mlpackage.zip
├── multilingual.image-encoder.onnx
├── multilingual.text-encoder.onnx
├── tokenizer.json
├── torch_config.json
└── torch_weight.pt

The current 'snapshot_download' function can interfere with testing when the network is unreliable, so I added the 'get_local_model' function to make it easy to run from local files.
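
To illustrate the idea of preferring local files over a download, here is a minimal sketch. It is not the actual 'get_local_model' from this PR: the helper names `resolve_model_dir` and `missing_files` are hypothetical, and the only external call assumed is `huggingface_hub.snapshot_download(repo_id=...)`, which is reached only when the given path is not a local directory.

```python
from pathlib import Path


def resolve_model_dir(model_fpath: str) -> Path:
    """Return a directory that holds the model files.

    Hypothetical helper for illustration, not the PR's get_local_model.
    """
    local = Path(model_fpath).expanduser()
    if local.is_dir():
        # Files were downloaded in advance: skip the network entirely.
        return local
    # Otherwise treat the string as a Hugging Face repo id.
    # Imported lazily so the local path works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=model_fpath))


def missing_files(model_dir: Path, expected: list[str]) -> list[str]:
    """List the expected file names that are absent from model_dir."""
    return [name for name in expected if not (model_dir / name).exists()]
```

With the layout above, something like `missing_files(resolve_model_dir("./models"), ["tokenizer.json", "torch_config.json", "torch_weight.pt"])` would report which torch files still need to be downloaded; the `./models` path is only an example.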

If you have any other questions, please feel free to let me know.

wnma3mz added 2 commits January 1, 2024 22:42

add PreProcessor for VLM

18e5eab

add torch,onnx,coreml example

1c7005a

ashvardanian mentioned this pull request Jan 1, 2024

[Refactor] Modular package organisation, pre-commit linting suite #58

Closed

remove models vlm PreProcessor

53de171

ashvardanian changed the base branch from main to main-dev January 11, 2024 20:13

ashvardanian assigned kimihailv and VoVoR Jan 11, 2024

ashvardanian requested review from kimihailv and VoVoR January 11, 2024 20:13

add func get_local_model for test torch model

decd1a9
