**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.

Concept# Autoencoder

Summary

An autoencoder is a type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning). An autoencoder learns two functions: an encoding function that transforms the input data, and a decoding function that recreates the input data from the encoded representation. The autoencoder learns an efficient representation (encoding) for a set of data, typically for dimensionality reduction.
Variants exist, aiming to force the learned representations to assume useful properties. Examples are regularized autoencoders (Sparse, Denoising and Contractive), which are effective in learning representations for subsequent classification tasks, and Variational autoencoders, with applications as generative models. Autoencoders are applied to many problems, including facial recognition, feature detection, anomaly detection and acquiring the meaning of words. Autoencoders are also generative models which can randomly generate new data that is similar to the input data (training data).
An autoencoder is defined by the following components: Two sets: the space of decoded messages ; the space of encoded messages . Almost always, both and are Euclidean spaces, that is, for some . Two parametrized families of functions: the encoder family , parametrized by ; the decoder family , parametrized by .For any , we usually write , and refer to it as the code, the latent variable, latent representation, latent vector, etc. Conversely, for any , we usually write , and refer to it as the (decoded) message.
Usually, both the encoder and the decoder are defined as multilayer perceptrons. For example, a one-layer-MLP encoder is:
where is an element-wise activation function such as a sigmoid function or a rectified linear unit, is a matrix called "weight", and is a vector called "bias".
An autoencoder, by itself, is simply a tuple of two functions. To judge its quality, we need a task. A task is defined by a reference probability distribution over , and a "reconstruction quality" function , such that measures how much differs from .

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related courses (20)

DH-406: Machine learning for DH

This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and imple

EE-559: Deep learning

This course explores how to design reliable discriminative and generative neural networks, the ethics of data acquisition and model deployment, as well as modern multi-modal models.

PHYS-467: Machine learning for physicists

Machine learning and data analysis are becoming increasingly central in sciences including physics. In this course, fundamental principles and methods of machine learning will be introduced and practi

Related publications (339)

Related concepts (16)

Related lectures (68)

Ontological neighbourhood

Deep learning

Deep learning is part of a broader family of machine learning methods, which is based on artificial neural networks with representation learning. The adjective "deep" in deep learning refers to the use of multiple layers in the network. Methods used can be either supervised, semi-supervised or unsupervised.

K-means clustering

k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster centers or cluster centroid), serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances (squared Euclidean distances), but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances.

Generative model

In statistical classification, two main approaches are called the generative approach and the discriminative approach. These compute classifiers by different approaches, differing in the degree of statistical modelling. Terminology is inconsistent, but three major types can be distinguished, following : A generative model is a statistical model of the joint probability distribution on given observable variable X and target variable Y; A discriminative model is a model of the conditional probability of the target Y, given an observation x; and Classifiers computed without using a probability model are also referred to loosely as "discriminative".

Kernel Methods: Neural Networks

Covers the fundamentals of neural networks, focusing on RBF kernels and SVM.

Deep Generative Models: Variational Autoencoders & GANs

Explores Variational Autoencoders and Generative Adversarial Networks for deep generative modeling.

Deep Generative Models

Covers deep generative models, including variational autoencoders, GANs, and deep convolutional GANs.

Alfio Quarteroni, Francesco Regazzoni, Stefano Pagani

Predicting the evolution of systems with spatio-temporal dynamics in response to external stimuli is essential for scientific progress. Traditional equations-based approaches leverage first principles through the numerical approximation of differential equ ...

Jan Sickmann Hesthaven, Federico Pichi

The present work proposes a framework for nonlinear model order reduction based on a Graph Convolutional Autoencoder (GCA-ROM). In the reduced order modeling (ROM) context, one is interested in obtaining real -time and many-query evaluations of parametric ...

Martin Jaggi, Vinitra Swamy, Jibril Albachir Frej, Julian Thomas Blackwell

Interpretability for neural networks is a trade-off between three key requirements: 1) faithfulness of the explanation (i.e., how perfectly it explains the prediction), 2) understandability of the explanation by humans, and 3) model performance. Most exist ...

2024