Start building LLM-empowered multi-agent applications in an easier way.
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks.
ModelScope: bring the notion of Model-as-a-Service to life.
GPT4V-level open-source multi-modal model based on Llama3-8B
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
State-of-the-art, multi-modal virtual assistant framework powered by LLaMA. Ame is under active development.
Code for the ICML 2024 paper: "EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence"
Efficient Retrieval Augmentation and Generation Framework
Here we track the latest multimodal AI models, including multimodal foundation models, LLMs, audio, image, video, music, and 3D content. 🔥
Open Source Routing Engine for OpenStreetMap
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 40+ Hugging Face models, and 20+ benchmarks
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Build high-performance AI models with modular building blocks
A state-of-the-art open visual language model | multi-modal pre-trained model
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable, open-source multi-modal dialogue model approaching GPT-4V performance.
[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models
Offline multi-modal RAG. Execution scripts optimized for Intel and CUDA.
Multi-modal Tree of Thoughts for DALLE-3-like automatic self-improvement