✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Updated Jun 2, 2024 - Python
A curated list of awesome Multimodal studies.
Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"
Personal Project: MPP-Qwen14B (Multimodal Pipeline Parallel-Qwen14B). Don't let poverty limit your imagination! Train your own 14B LLaVA-like MLLM on RTX3090/4090 24GB.
Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".
Speech, Language, Audio, Music Processing with Large Language Model
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
An up-to-date, curated list of state-of-the-art research work, papers & resources on LVLM hallucinations
FreeVA: Offline MLLM as Training-Free Video Assistant
✨✨Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (PRG)
[CVPR 2024] 🎬💭 chat with over 10K frames of video!
Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs
Curated papers on Large Language Models in Healthcare and Medical domain
Multimodal RAG and comparisons between language models. (Project for Deep Learning Module at the FHSWF)
UMBRAE: Unified Multimodal Decoding of Brain Signals | Unveiling the 'Dark Side' of Brain Modality