Publication

Global information processing in feedforward deep networks

Abstract

While deep neural networks are state-of-the-art models of many parts of the human visual system, here we show that they fail to process global information in a humanlike manner. First, using visual crowding as a probe into global visual information processing, we found that, regardless of architecture, feedforward deep networks successfully model an elementary version of crowding but cannot exhibit its global counterpart (“uncrowding”). It is not yet well understood whether this limitation could be ameliorated by substantially larger and more naturalistic training conditions, or by attentional mechanisms. To investigate this, we studied models trained with CLIP (Contrastive Language-Image Pretraining), a procedure that trains attention-based models for zero-shot image classification by self-supervised pairing of generated labels with image inputs on a composite dataset of approximately 400 million images. CLIP models have been shown to exhibit highly abstract representations and state-of-the-art zero-shot classification performance, and to make classification errors more in line with human errors than those of previous models. Despite these advances, we show, by fitting logistic regression models to the activations of layers in CLIP models, that neither training procedure, architectural differences, nor training dataset size ameliorates feedforward networks’ inability to reproduce humanlike global information processing in an uncrowding task. This highlights an important aspect of visual information processing: feedforward computations alone are not enough to explain how humans combine visual information globally.
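To make the probing method concrete, the following is a minimal sketch, assuming layer activations have already been extracted from a CLIP model into a feature matrix. The synthetic arrays, shapes, and the binary target label are illustrative stand-ins, not the paper's actual stimuli or models:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Hypothetical stand-in for activations extracted from one layer of a
    # CLIP model: one row per crowding/uncrowding stimulus image.
    rng = np.random.default_rng(0)
    n_stimuli, n_features = 1000, 512
    activations = rng.normal(size=(n_stimuli, n_features))
    # Binary labels for the task readout (e.g., target offset left vs. right).
    labels = rng.integers(0, 2, size=n_stimuli)

    X_train, X_test, y_train, y_test = train_test_split(
        activations, labels, test_size=0.2, random_state=0)

    # A linear probe fitted on frozen activations: if the layer still encodes
    # the target despite the flankers, the probe classifies above chance.
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("probe accuracy:", probe.score(X_test, y_test))

With random activations the probe stays near chance (0.5); applied to real layer activations, probe accuracy as a function of flanker configuration is what distinguishes crowding from uncrowding.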

Related concepts (32)
Deep learning
Deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning. The adjective "deep" refers to the use of multiple layers in the network. Methods can be supervised, semi-supervised, or unsupervised.
Artificial neural network
Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural nets) are machine learning models built on principles of neuronal organization, drawn from connectionism and from the biological neural networks that constitute animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like a synapse in a biological brain, can transmit a signal to other neurons.
Feedforward neural network
A feedforward neural network (FNN) is one of the two broad types of artificial neural network, characterized by the direction of information flow between its layers. The flow is unidirectional: information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any), to the output nodes, without any cycles or loops. This contrasts with recurrent neural networks, which have a bidirectional flow.
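As an illustration of this unidirectional flow, here is a toy forward pass in Python through a single hidden layer; the layer sizes and random weights are arbitrary placeholders:

    import numpy as np

    def relu(x):
        # Elementwise nonlinearity applied at the hidden layer
        return np.maximum(0.0, x)

    rng = np.random.default_rng(0)
    # Arbitrary weights for a 4 -> 8 -> 2 network
    W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
    W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)

    x = rng.normal(size=4)   # input nodes
    h = relu(W1 @ x + b1)    # hidden nodes: information flows only forward
    y = W2 @ h + b2          # output nodes; no cycles or feedback anywhere
    print(y)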
Related publications (159)

The neural correlates of topographical disorientation-a lesion analysis study

Olaf Blanke, Lukas Heydrich, Eva Blondiaux

Topographical disorientation refers to the selective inability to orient oneself in familiar surroundings. However, to date its neural correlates remain poorly understood. Here we use quantitative lesion analysis and a lesion network mapping approach in or ...
Hoboken, 2024

SVGC-AVA: 360-Degree Video Saliency Prediction With Spherical Vector-Based Graph Convolution and Audio-Visual Attention

Pascal Frossard, Chenglin Li, Li Wei, Qin Yang, Yuelei Li

Viewers of 360-degree videos are provided with both visual modality to characterize their surrounding views and audio modality to indicate the sound direction. Though both modalities are important for saliency prediction, little work has been done by joint ...
IEEE, 2024

DeepGeo: Deep Geometric Mapping for Automated and Effective Parameterization in Aerodynamic Shape Optimization

Pascal Fua, Zhen Wei

Aerodynamic shape optimization (ASO) is a key technique in aerodynamic designs, aimed at enhancing an object’s physical performance while adhering to specific constraints. Traditional parameterization methods for ASO often require substantial manual tuning ...
2024
Related MOOCs (13)
Neuronal Dynamics 2- Computational Neuroscience: Neuronal Dynamics of Cognition
This course explains the mathematical and computational models that are used in the field of theoretical neuroscience to analyze the collective dynamics of thousands of interacting neurons.
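As a flavor of such population-level models, here is a minimal sketch of a Wilson-Cowan-style rate model with one excitatory and one inhibitory population, each summarizing the mean activity of many neurons; all parameter values are illustrative and not taken from the course:

    import numpy as np

    def f(x):
        # Sigmoidal population activation function
        return 1.0 / (1.0 + np.exp(-x))

    w_EE, w_EI, w_IE, w_II = 12.0, 10.0, 10.0, 2.0  # illustrative couplings
    tau, dt, T = 10.0, 0.1, 200.0                   # time constant and step (ms)
    E, I = 0.1, 0.1                                 # initial population rates

    for _ in range(int(T / dt)):
        dE = (-E + f(w_EE * E - w_EI * I)) / tau
        dI = (-I + f(w_IE * E - w_II * I)) / tau
        E, I = E + dt * dE, I + dt * dI
    print(f"steady-state rates: E={E:.3f}, I={I:.3f}")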
Neuronal Dynamics - Computational Neuroscience of Single Neurons
The activity of neurons in the brain and the code used by these neurons is described by mathematical neuron models at different levels of detail.
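A classic model at the simplest level of detail is the leaky integrate-and-fire neuron; below is a minimal Euler-integration sketch with illustrative parameters, not values specific to the course:

    # Leaky integrate-and-fire: tau * dV/dt = -(V - V_rest) + R * I,
    # with a spike and reset whenever V crosses V_th.
    tau, R = 10.0, 1.0                           # ms, MOhm (illustrative)
    V_rest, V_reset, V_th = -65.0, -70.0, -50.0  # mV
    dt, T = 0.1, 100.0                           # ms
    I = 20.0                                     # constant input (illustrative)

    V, spikes = V_rest, []
    for step in range(int(T / dt)):
        V += dt / tau * (-(V - V_rest) + R * I)
        if V >= V_th:                # threshold crossing -> spike
            spikes.append(step * dt)
            V = V_reset              # reset membrane potential
    print(f"{len(spikes)} spikes in {T:.0f} ms")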
