Publication

Capsule networks explain complex spatial processing

Michael Herzog, Adrien Christophe Doerig, Lynn Schmittwilken
2019
Conference paper

Abstract

Classically, visual processing is described as a cascade of local feedforward computations. This view has received striking support from the success of convolutional neural networks (CNNs). However, CNNs only roughly mimic human vision. For example, CNNs do not take the global spatial configuration of visual elements into account and thus fail at simple tasks such as explaining crowding and uncrowding. In crowding, the perception of a target deteriorates in the presence of neighboring elements. Classically, adding flanking elements is thought to always decrease performance. However, adding flankers even far away from the target can improve performance, depending on the global configuration (uncrowding). We showed previously that no classic model of crowding, including CNNs, can explain uncrowding. Here, we show that capsule networks, a type of deep network combining CNNs and object segmentation, explain both crowding and uncrowding. We trained capsule networks to recognize targets and groups of shapes. There were no crowding/uncrowding stimuli in the training set. When we subsequently tested the network on crowding/uncrowding stimuli, both crowding and uncrowding occurred. We show theoretically how crowding and uncrowding naturally emerge from neural dynamics in capsule networks. These powerful recurrent models offer a new framework to understand previously unexplained experimental results.

Official source

https://infoscience.epfl.ch/record/270836?ln=en

Ontological neighbourhood

Information engineering

Machine learning: Artificial neural networks

Related publications (66)

Task-driven neural network models predict neural dynamics of proprioception: Neural network model weights

Predicting Visual Stimuli From Cortical Response Recorded With Wide-Field Imaging in a Mouse

The neural correlates of topographical disorientation-a lesion analysis study
