SimMMDG: A Simple and Effective Framework for Multi-modal Domain Generalization

Olga Fink, Ismail Nejjar, Han Sun, Hao Dong
2023
Article de conférence

Résumé

In real-world scenarios, achieving domain generalization (DG) presents significant challenges as models are required to generalize to unknown target distributions. Generalizing to unseen multi-modal distributions poses even greater difficulties due to the distinct properties exhibited by different modalities. To overcome the challenges of achieving domain generalization in multi-modal scenarios, we propose SimMMDG, a simple yet effective multi-modal DG framework. We argue that mapping features from different modalities into the same embedding space impedes model generalization. To address this, we propose splitting the features within each modality into modality-specific and modality-shared components. We employ supervised contrastive learning on the modality-shared features to ensure they possess joint properties and impose distance constraints on modality-specific features to promote diversity. In addition, we introduce a cross-modal translation module to regularize the learned features, which can also be used for missing-modality generalization. We demonstrate that our framework is theoretically well-supported and achieves strong performance in multi-modal DG on the EPIC-Kitchens dataset and the novel Human-Animal-Cartoon (HAC) dataset introduced in this paper. Our source code and HAC dataset are available at https://github.com/donghao51/SimMMDG.

Source officielle

https://infoscience.epfl.ch/record/307342?ln=fr

À propos de ce résultat

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

SimMMDG: A Simple and Effective Framework for Multi-modal Domain Generalization

Graph Chatbot

Chattez avec Graph Search

Robust machine learning for neuroscientific inference

Few-shot Learning for Efficient and Effective Machine Learning Model Adaptation

Reducing Annotation Efforts in Electricity Theft Detection Through Optimal Sample Selection

Robust machine learning for neuroscientific inference

Few-shot Learning for Efficient and Effective Machine Learning Model Adaptation

Reducing Annotation Efforts in Electricity Theft Detection Through Optimal Sample Selection