Publication

SimMMDG: A Simple and Effective Framework for Multi-modal Domain Generalization

Olga Fink, Ismail Nejjar, Han Sun, Hao Dong
2023
Conference paper
Abstract

In real-world scenarios, achieving domain generalization (DG) presents significant challenges as models are required to generalize to unknown target distributions. Generalizing to unseen multi-modal distributions poses even greater difficulties due to the distinct properties exhibited by different modalities. To overcome the challenges of achieving domain generalization in multi-modal scenarios, we propose SimMMDG, a simple yet effective multi-modal DG framework. We argue that mapping features from different modalities into the same embedding space impedes model generalization. To address this, we propose splitting the features within each modality into modality-specific and modality-shared components. We employ supervised contrastive learning on the modality-shared features to ensure they possess joint properties and impose distance constraints on modality-specific features to promote diversity. In addition, we introduce a cross-modal translation module to regularize the learned features, which can also be used for missing-modality generalization. We demonstrate that our framework is theoretically well-supported and achieves strong performance in multi-modal DG on the EPIC-Kitchens dataset and the novel Human-Animal-Cartoon (HAC) dataset introduced in this paper. Our source code and HAC dataset are available at https://github.com/donghao51/SimMMDG.
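The feature split described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (their code is at the URL above), and it rests on several assumptions made only for illustration: each embedding is split in half, the contrastive term is a standard supervised contrastive loss over the pooled modality-shared features, and the distance constraint is written as the negative squared distance between the shared and specific parts of the same embedding. All function names are hypothetical, and the cross-modal translation module is omitted.

```python
import numpy as np

def split_features(z):
    """Split embeddings into (shared, specific) halves. The 50/50 split is an assumption."""
    d = z.shape[1] // 2
    return z[:, :d], z[:, d:]

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss: samples with the same label act as positives."""
    f = l2_normalize(features)
    logits = f @ f.T / temperature
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits) * not_self
    log_prob = logits - np.log(exp.sum(axis=1, keepdims=True))
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0  # skip anchors that have no positive in the batch
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[valid] / pos_count[valid]
    return -mean_log_prob_pos.mean()

def distance_loss(shared, specific):
    """Assumed form of the distance constraint: reward shared/specific parts lying far apart."""
    diff = l2_normalize(shared) - l2_normalize(specific)
    return -np.mean(np.sum(diff ** 2, axis=1))

# Usage with random stand-ins for two modality encoders (for example video and audio).
rng = np.random.default_rng(0)
labels = np.array([0, 0, 1, 1])
video = rng.normal(size=(4, 8))
audio = rng.normal(size=(4, 8))

video_shared, video_specific = split_features(video)
audio_shared, audio_specific = split_features(audio)

# Pool the shared parts of both modalities so same-class samples attract across modalities.
shared_all = np.concatenate([video_shared, audio_shared])
labels_all = np.concatenate([labels, labels])

loss_con = supervised_contrastive_loss(shared_all, labels_all)
loss_dis = distance_loss(video_shared, video_specific) + distance_loss(audio_shared, audio_specific)
print(loss_con, loss_dis)
```

In a training loop these two terms would be added, with weights, to the usual classification loss. The weighting and the exact form of each term are design choices that this sketch does not take from the paper.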
