VETIM: Expanding the Vocabulary of Text-to-Image Models only with Text
Publications associées (48)
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Abstractive summarization has seen big improvements in recent years, mostly due to advances in neural language modeling, language model pretraining, and scaling models and datasets. While large language models generate summaries that are fluent, coherent, ...
EPFL2023
Natural language processing and other artificial intelligence fields have witnessed impressive progress over the past decade. Although some of this progress is due to algorithmic advances in deep learning, the majority has arguably been enabled by scaling ...
Over the years, clinical institutes accumulated large amounts of digital slides from resected tissue specimens. These digital images, called whole slide images (WSIs), are high-resolution tissue snapshots that depict the complex interaction of cells at the ...
Whether a document is of historical or contemporary significance, typography plays a crucial role in its composition. From the early days of modern printing, typographic techniques have evolved and transformed, resulting in changes to the features of typog ...
The way biological brains carry out advanced yet extremely energy efficient signal processing remains both fascinating and unintelligible. It is known however that at least some areas of the brain perform fast and low-cost processing relying only on a smal ...
EPFL2023
,
Incomplete labels are common in multi-task learning for biomedical applications due to several practical difficulties, e.g., expensive annotation efforts by experts, limit of data collection, different sources of data. A naive approach to enable joint lear ...
New York2023
Robustness of medical image classification models is limited by its exposure to the candidate disease classes. Generalized zero shot learning (GZSL) aims at correctly predicting seen and unseen classes and most current GZSL approaches have focused on the s ...
In this work, we propose an approach to aid in mapping small settlements, which are often misclassified by models trained on a large-scale context (global or regional). We leverage pre-trained land cover models and few-shot learning to enhance the detectio ...
Vision-Language Pre-training (VLP) has advanced the performance of many visionlanguage tasks, such as image-text retrieval, visual entailment, and visual reasoning. The pre-training mostly utilizes lexical databases and image queries in English. Previous w ...
The success of self-supervised learning in computer vision and natural language processing has motivated pretraining methods on tabular data. However, most existing tabular self-supervised learning models fail to leverage information across multiple data t ...