Video demo of the ECCV 2022 paper "Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection"
Public repository of our work assessing the impact of missing views in EO applications
Official Repo for "To Find Waldo You Need Contextual Cues: Debiasing Who’s Waldo", ACL 2022 (Short)
Implementation of the CLIP model with reduced capacity, for self-educational purposes only. (A minimal sketch of the contrastive objective behind CLIP-style training appears after this list.)
Implementation of the paper "InstructionGPT-4: A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4" (https://arxiv.org/abs/2308.12067)
PyTorch implementation of "Multi-domain translation between single-cell imaging and sequencing data using autoencoders" (https://www.nature.com/articles/s41467-020-20249-2) with custom models.
Learning Latent Semantic Representations of Paintings for Personalized Recommendation
Solution to one of the problems in the 2021 NeurIPS Competition: a self-supervised contrastive learning model that learns matched cell-modality embeddings from 10X Multiome data.
🐘 Uncovering social interests in wildlife
Modality Translation through Conditional Encoder-Decoder (2023-1 Machine Learning for Visual Understanding Team project)
Research repository: Disruption Prediction and Analysis through Multimodal Deep Learning in KSTAR
[AAAI24] Learning Invariant Inter-pixel Correlations for Superpixel Generation
🌈 Official code for "Spatio-Temporal Fuzzy-oriented Multi-modal Meta-learning for Fine-grained Emotion Recognition"
[IROS 2024] PGA: Personalizing Grasping Agents with Single Human-Robot Interaction (under review)
PyTorch implementation of the paper "All For One: Multi-modal Multi-Task Learning"
PyTorch implementation of a multimodal entailment baseline
(BMVC23) Paper on 3D visual question answering from the lab of Prof. Dr. Niessner at the Technical University of Munich.
Deep Symbolic Regression with Multimodal Pretraining
Contrastive-VisionVAE-Follower is a model for the multi-modal task of Vision-and-Language Navigation (VLN).
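Several entries in this list (the reduced-capacity CLIP implementation, the 10X Multiome model, and Contrastive-VisionVAE-Follower) revolve around the same building block: a symmetric contrastive (InfoNCE) objective that aligns paired embeddings produced by two modality encoders. Below is a minimal PyTorch sketch of that objective, not code taken from any repository above; the function name, embedding dimension, and the 0.07 temperature are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(a_emb: torch.Tensor, b_emb: torch.Tensor,
                    temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    a_emb, b_emb: (batch, dim) outputs of two modality encoders,
    where matched pairs share a row index. The 0.07 temperature is
    an illustrative default, not a value from any repo above.
    """
    # L2-normalize so the dot product equals cosine similarity.
    a_emb = F.normalize(a_emb, dim=-1)
    b_emb = F.normalize(b_emb, dim=-1)

    # (batch, batch) similarity matrix; the diagonal holds true pairs.
    logits = a_emb @ b_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Cross-entropy in both directions: A-to-B and B-to-A retrieval.
    loss_ab = F.cross_entropy(logits, targets)
    loss_ba = F.cross_entropy(logits.t(), targets)
    return (loss_ab + loss_ba) / 2

# Usage with random tensors standing in for encoder outputs
# (e.g. image and text, or paired cell-modality embeddings).
if __name__ == "__main__":
    a = torch.randn(8, 256)  # modality A: batch of 8, dim 256
    b = torch.randn(8, 256)  # modality B: paired row-for-row with A
    print(clip_style_loss(a, b).item())
```

The same loss applies whenever two encoders produce row-aligned pairs, which is why it recurs across the image-text, cell-modality, and navigation settings collected under this topic.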