Capsule networks, but not convolutional networks explain global configurational visual effects

In human vision, the perception of local features depends on all elements in the visual field and on their exact configuration. For example, when observers performed a vernier discrimination task, adding a surrounding square to the vernier made the task much more difficult: a classic crowding effect. Crucially, adding more flanking squares improved performance (uncrowding). In addition, in displays of squares and stars, small changes in the configuration strongly changed performance. Here, we show that convolutional neural networks fail to capture these global aspects of configuration for two reasons. First, the representations of the target and the flankers at a given layer are pooled within the receptive fields of the subsequent layer, leading to poor performance. Second, far-away elements cannot interact with the vernier to produce uncrowding. We show that capsule networks, a new kind of neural network that explicitly takes configuration into account, capture the experimental results well.
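As a rough illustration of the receptive-field argument, the sketch below computes how far a single unit "sees" into the input after each layer of a feedforward stack, using the standard receptive-field recurrence for convolution and pooling layers. The function name `receptive_field_sizes` and the toy layer stack are hypothetical and are not taken from the study; they only make concrete why elements outside a unit's receptive field cannot influence its response at that layer.

```python
def receptive_field_sizes(layers):
    """Return the receptive field size (in input pixels) after each layer.

    `layers` is a list of (kernel_size, stride) pairs, one per
    convolution or pooling layer, applied in order.
    """
    size, jump = 1, 1  # one input pixel sees itself; neighbouring units are 1 pixel apart
    sizes = []
    for kernel, stride in layers:
        size += (kernel - 1) * jump  # each extra kernel tap widens the field by the current spacing
        jump *= stride               # striding moves neighbouring units further apart in the input
        sizes.append(size)
    return sizes


# Hypothetical toy stack (an assumption, not the networks used in the study):
# 3-wide conv, 2-wide pool with stride 2, 3-wide conv, 2-wide pool with stride 2.
toy_layers = [(3, 1), (2, 2), (3, 1), (2, 2)]
sizes = receptive_field_sizes(toy_layers)
print(sizes)
```

In this toy stack a unit after the fourth layer covers only 10 input pixels, so a flanker placed further than that from the vernier cannot affect that unit, while any flanker inside the field is pooled together with the target.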

Predicting Visual Stimuli From Cortical Response Recorded With Wide-Field Imaging in a Mouse

Probing and modulating inter-areal coupling in the cortical visual motion processing pathway with non-invasive brain stimulation

Infusing structured knowledge priors in neural models for sample-efficient symbolic reasoning
