multi-modal

Here are 266 public repositories matching this topic...

TuAnh23 / MultiModalST

Limit the use of end-to-end data for Speech Translation (by leveraging Automatic Speech Recognition and Machine Translation data instead) using zero-shot multilingual text translation techniques.

multi-modal zero-shot few-shot speech-translation

Updated May 16, 2022
Python

Jakob-L-M / multi-modal-document-search

This repository provides a Streamlit application that lets a user upload a screenshot, which is then queried against a database of PDF documents. Both the image structure and any included text are used to find matching documents within a user-defined set.

multi-modal ocr-recognition embedding-vectors streamlit vector-database

Updated Dec 28, 2023
Python

lyyf2002 / ASGEA

Code for ASGEA: Exploiting Logic Rules from Align-Subgraphs for Entity Alignment

knowledge-graph alignment multi-modal entity-alignment asgea

Updated Feb 28, 2024
Python

kyegomez / qformer

Implementation of Qformer from BLIP2 in Zeta Lego blocks.

machine-learning ai machine artificial-intelligence multi-modal attention-mechanism multi-modality blip2

Updated May 17, 2024
Python

kyegomez / MegaVIT

The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters"

computer-vision artificial-intelligence multi-modal vision-and-language multi-modal-learning vision-transformer gpt4 multi-modal-fusion

Updated May 17, 2024
Python

Pruthvi-Sanghavi / air_water_land_surveillance_bot

Repository for an air, water, and land surveillance robot developed as part of the DRDO Robotics and Unmanned Systems Exposition.

quadcopter surveillance arduino-uno multi-modal differential-drive-robot

Updated Feb 10, 2021

JanTeichertKluge / DMLSim

This library provides packages on DoubleML / Causal Machine Learning and Neural Networks in Python for Simulation and Case Studies.

machine-learning deep-learning neural-network simulation transformers transformer multi-modal causal-inference case-study bert causal multimodal multimodal-deep-learning dgp causal-machine-learning beit double-machine-learning doubleml

Updated Jun 20, 2023
Python

kyegomez / VisionLLaMA

Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta

ai deep-learning vit multi-modal vision-models vision-transformers

Updated May 18, 2024
Python

kyegomez / CELESTIAL-1

Omni-Modality Processing, Understanding, and Generation

openai attention multi-modal multimodality attention-is-all-you-need attention-mechanisms multimodal multimodal-deep-learning gpt-4 gpt4 omnimodal

Updated May 3, 2024
Python

Seongwoong-sk / Multi-Mocessary

Project based on VQA (Visual Question Answering), one of the tasks in multi-modal learning

image-captioning multi-modal

Updated Nov 1, 2023

Devin-Taylor / MultiAug

Multi-modal data augmentation for machine learning

learning data machine multi-modal augmentation tabular 3d-images

Updated Jun 4, 2019
Python

BMDSoftware / MMIR

MMIR: Multimodal Image Registration

microscopy multi-modal image-registration histology pathology pathology-image

Updated Jan 4, 2024
JavaScript

THUKElab / Multi-OpenEA

Repository for the ICASSP 2023 paper "Vision, Deduction and Alignment: An Empirical Study on Multi-Modal Knowledge Graph Alignment"

knowledge-graph multi-modal entity-alignment

Updated Jul 26, 2023

kyegomez / PaLM2-VAdapter

Implementation of PaLM2-VAdapter from the multi-modal model paper "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter"

ai models ml transformers attention deeplearning multi-modal neural-nets attention-is-all-you-need attention-mechanisms