
Lecture

Dimensionality Reduction: PCA & Autoencoders

Description

This lecture covers the concepts of Principal Component Analysis (PCA) and Autoencoders for dimensionality reduction. Starting with the basics of PCA, it explains how to find the most important signal while removing noise. It then delves into Kernel PCA for nonlinear data. The discussion extends to Autoencoders, emphasizing their nonlinear mappings and applications in denoising and sparsity. The lecture also explores Convolutional Autoencoders and their role in generating new samples. Lastly, it showcases the practical use of Autoencoders for image retrieval and data generation.

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

In course

CS-233(a): Introduction to machine learning (BA3)

Machine learning and data analysis are becoming increasingly central in many sciences and applications. In this course, fundamental principles and methods of machine learning will be introduced and analyzed.

Related concepts (200)

Dimensionality reduction

Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension. Working in high-dimensional spaces can be undesirable for many reasons; raw data are often sparse as a consequence of the curse of dimensionality, and analyzing the data is usually computationally intractable (hard to control or deal with).

Nonlinear dimensionality reduction

Nonlinear dimensionality reduction, also known as manifold learning, refers to various related techniques that aim to project high-dimensional data onto lower-dimensional latent manifolds, with the goal of either visualizing the data in the low-dimensional space, or learning the mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa) itself. The techniques described below can be understood as generalizations of linear decomposition methods used for dimensionality reduction, such as singular value decomposition and principal component analysis.
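As a concrete illustration of a nonlinear generalization of PCA, the sketch below implements kernel PCA with an RBF (Gaussian) kernel in plain NumPy on a toy two-circles dataset. The dataset, the `gamma` value, and all variable names are illustrative assumptions, not material from the lecture.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two concentric circles: linear PCA cannot separate them, but
# kernel PCA with an RBF kernel can unfold this nonlinear structure.
n = 100
theta = rng.uniform(0, 2 * np.pi, size=n)
r = np.repeat([1.0, 3.0], n // 2)
X = np.c_[r * np.cos(theta), r * np.sin(theta)]

# RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
gamma = 0.5
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-gamma * sq_dists)

# Center the kernel matrix (equivalent to centering in feature space).
one_n = np.ones((n, n)) / n
K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

# Eigenvectors of the centered kernel give the nonlinear components;
# scaling by sqrt(eigenvalue) yields the projected coordinates.
eigvals, eigvecs = np.linalg.eigh(K_centered)
idx = np.argsort(eigvals)[::-1]
Z = eigvecs[:, idx[:2]] * np.sqrt(np.maximum(eigvals[idx[:2]], 0))
```

Unlike linear PCA, the mapping here is defined only through pairwise kernel values, so projecting a new point requires evaluating its kernel against the training set.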

Autoencoder

An autoencoder is a type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning). An autoencoder learns two functions: an encoding function that transforms the input data, and a decoding function that recreates the input data from the encoded representation. The autoencoder learns an efficient representation (encoding) for a set of data, typically for dimensionality reduction. Variants exist, aiming to force the learned representations to assume useful properties.
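The encoder/decoder pair described above can be sketched as a minimal linear autoencoder trained by gradient descent in NumPy. Real autoencoders insert nonlinear activations between layers; the toy data, layer sizes, and learning rate here are illustrative assumptions rather than the lecture's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 200 points lying near a 1-D line embedded in 4-D space,
# so a bottleneck of size 1 can reconstruct them almost perfectly.
t = rng.uniform(-1, 1, size=(200, 1))
X = t @ rng.normal(size=(1, 4))            # rank-1 signal
X += 0.01 * rng.normal(size=X.shape)       # small noise

d, h = 4, 1                                 # input dim, bottleneck dim
W_enc = 0.1 * rng.normal(size=(d, h))       # encoder weights
W_dec = 0.1 * rng.normal(size=(h, d))       # decoder weights
lr = 0.05

for _ in range(3000):
    Z = X @ W_enc                # encode: compress each point to h numbers
    X_hat = Z @ W_dec            # decode: reconstruct the input
    err = X_hat - X
    # Gradient descent on the mean squared reconstruction error.
    # (A nonlinear autoencoder would add activations, e.g. tanh, here.)
    grad_dec = Z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

mse = np.mean((X @ W_enc @ W_dec - X) ** 2)
```

With purely linear layers this network can do no better than PCA; the nonlinear activations are what let deep autoencoders learn richer codes.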

Data

In common usage and statistics, data (US: /ˈdætə/; UK: /ˈdeɪtə/) is a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted formally. A datum is an individual value in a collection of data. Data is usually organized into structures such as tables that provide additional context and meaning, and which may themselves be used as data in larger structures.

Principal component analysis

Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and enabling the visualization of multidimensional data. Formally, PCA is a statistical technique for reducing the dimensionality of a dataset. This is accomplished by linearly transforming the data into a new coordinate system where (most of) the variation in the data can be described with fewer dimensions than the initial data.
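The linear change of coordinates PCA performs can be sketched in a few lines of NumPy using the singular value decomposition of the centered data. The toy dataset and the choice of a single component are illustrative assumptions, not the lecture's example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 100 samples in 3-D whose variance is dominated by
# one direction, plus a little isotropic noise.
t = rng.normal(size=(100, 1))
X = t @ np.array([[2.0, 1.0, 0.5]]) + 0.05 * rng.normal(size=(100, 3))

# PCA via SVD: center the data; the right-singular vectors are the
# principal directions (the axes of the new coordinate system).
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)

k = 1                          # keep only the top principal component
Z = (X - mean) @ Vt[:k].T      # low-dimensional coordinates, shape (100, 1)
X_hat = Z @ Vt[:k] + mean      # reconstruction in the original 3-D space

# Fraction of the total variance captured by the first component.
explained = (S ** 2 / (S ** 2).sum())[0]
```

Because the noise is small relative to the signal direction, one component captures most of the variance and the reconstruction error is tiny.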

Related lectures (738)

Document Analysis: Topic Modeling (DH-406: Machine learning for DH)

Explores document analysis, topic modeling, and generative models for data generation in machine learning.

Data Representation: PCA (DH-406: Machine learning for DH)

Covers data representation using PCA for dimensionality reduction, focusing on signal preservation and noise removal.

Neural Networks Recap: Activation Functions (DH-406: Machine learning for DH)

Covers the basics of neural networks, activation functions, training, image processing, CNNs, regularization, and dimensionality reduction methods.

Understanding Autoencoders (CS-233(a): Introduction to machine learning (BA3))

Explores autoencoders, from linear mappings in PCA to nonlinear mappings, deep autoencoders, and their applications.

Dimensionality Reduction: PCA & t-SNE (DH-406: Machine learning for DH)

Explores PCA and t-SNE for reducing dimensions and visualizing high-dimensional data effectively.