Request for MS COCO Dataset Reader for Multimodal Model Training #5474

Closed · 1 task done
Labels: question (Further information is requested)

leedaehan-kev opened this issue May 19, 2024 · 1 comment
leedaehan-kev commented May 19, 2024

Describe the question.

Hello, NVIDIA DALI team

For training multimodal models such as BLIP-2, CLIP, and SEED, an image-text pair dataset is essential. However, DALI currently appears to lack a reader for the MS COCO dataset (e.g., coco_karpathy_train.json), which is needed to preprocess such paired data.

Given how frequently paired datasets are used in multimodal tasks, this feature would be very valuable. Are there any plans to develop this reader? Or has it already been developed and I simply missed it in the documentation?

Thank you for your attention.

Best regards,

Daehan

Check for duplicates

  • I have searched the open bugs/issues and have found no duplicates for this bug report
leedaehan-kev added the question label on May 19, 2024
JanuszL (Contributor) commented May 19, 2024

Hi @leedaehan-kev,

Thank you for reaching out. As we see more and more custom data-reading patterns and a growing variety of dataset formats, it is not feasible to cover them all with dedicated readers. That is why we developed, and recommend using, the external_source operator. It lets you define data reading in Python efficiently.
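For reference, here is a minimal sketch of what that could look like for a Karpathy-split annotation file. The file paths, the JSON field names (`"image"`, `"caption"`), and the pipeline parameters are assumptions based on the common Karpathy-split layout; adjust them to your actual data.

```python
import json
import numpy as np
from nvidia.dali import pipeline_def, fn, types

BATCH_SIZE = 32  # must match the pipeline's batch_size below

class CocoKarpathySource:
    """Yields (encoded JPEG, caption) batches from a Karpathy-split JSON.

    Assumes entries shaped like {"image": "val2014/xyz.jpg", "caption": "..."}.
    """
    def __init__(self, annotation_file, image_root):
        with open(annotation_file) as f:
            self.samples = json.load(f)
        self.image_root = image_root

    def __iter__(self):
        self.i = 0
        return self

    def __next__(self):
        if self.i + BATCH_SIZE > len(self.samples):
            raise StopIteration  # drop the last partial batch for simplicity
        jpegs, captions = [], []
        for sample in self.samples[self.i:self.i + BATCH_SIZE]:
            with open(f"{self.image_root}/{sample['image']}", "rb") as f:
                jpegs.append(np.frombuffer(f.read(), dtype=np.uint8))
            # DALI tensors are numeric, so pass the text as uint8 bytes
            captions.append(np.frombuffer(sample["caption"].encode(), dtype=np.uint8))
        self.i += BATCH_SIZE
        return jpegs, captions

@pipeline_def(batch_size=BATCH_SIZE, num_threads=4, device_id=0)
def coco_pair_pipeline(source):
    # external_source feeds the raw bytes; decoding/resizing run inside DALI
    jpegs, captions = fn.external_source(source=source, num_outputs=2)
    images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
    images = fn.resize(images, resize_x=224, resize_y=224)
    return images, captions

pipe = coco_pair_pipeline(
    source=CocoKarpathySource("coco_karpathy_train.json", "/data/coco/images"))
pipe.build()
images, captions = pipe.run()  # captions come back as uint8 arrays
```

Since DALI does not have a string tensor type, the captions travel through the pipeline as encoded bytes; tokenization would typically happen either before feeding the batch or on the framework side after decoding them back to strings.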
